SFTP and FTP: support for non-UTF-8 charsets (error message "Path could not be decoded as UTF-8") #1089

Closed
opened 2010-06-17 20:07:16 +00:00 by slush · 9 comments
slush commented 2010-06-17 20:07:16 +00:00
Owner

I'm opening a new ticket because I didn't find any reported problem with UTF-8 (except #704, which is AFAIK something different).

When I create a file or directory through the web frontend, everything goes well. But any attempt to create a directory or file with national characters in its name over the FTP/SFTP frontend fails.

WinSCP tells me "Path could not be decoded as UTF-8"; Total Commander just reports a general error.

Example directory name that fails: žluťoučký kůň úpěl ďábelské ódy (meaning "a yellow horse moaned devilish odes"), but it also fails with any other ordinary name such as "dovolená" (holiday).

Reproducibility: always
Python version: 2.6

tahoe-lafs added the code-frontend, major, defect, 1.7β labels 2010-06-17 20:07:16 +00:00
tahoe-lafs added this to the undecided milestone 2010-06-17 20:07:16 +00:00
slush commented 2010-06-17 20:22:12 +00:00
Author
Owner

And when I create a directory through the web interface, I cannot enter it using an FTP client (TC tells me "Internal server error").

davidsarah commented 2010-06-17 21:40:25 +00:00
Author
Owner

This error will occur if the file or directory name is not valid UTF-8. Polish systems often use ISO-Latin-2 locales -- is that what your filesystem uses?

If so, the SFTP specification implies that it is the responsibility of the client to convert names to UTF-8, and apparently the clients you tried aren't doing that.
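To illustrate why the server rejects such a path, here is a minimal sketch in Python 3 (not Tahoe-LAFS code; the helper name `decodes_as_utf8` is made up for this example). It encodes the reporter's example name "dovolená" once as UTF-8 and once as ISO-8859-2, assuming that is what a misconfigured client sends, and checks which byte string is valid UTF-8.

```python
# Illustration only: the same file name as two different clients might send it.
name = "dovolená"
utf8_bytes = name.encode("utf-8")          # what a UTF-8 client sends
latin2_bytes = name.encode("iso-8859-2")   # assumed output of a Latin-2 client


def decodes_as_utf8(raw: bytes) -> bool:
    """Return True if raw is a valid UTF-8 byte sequence."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(decodes_as_utf8(utf8_bytes))    # the UTF-8 form is accepted
print(decodes_as_utf8(latin2_bytes))  # the Latin-2 form is rejected
```

The Latin-2 form ends in the single byte 0xE1, which in UTF-8 can only start a multi-byte sequence, so decoding fails.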

Alternatives:

  • have an option in tahoe.cfg or the FTP/SFTP accounts file to specify another encoding.
  • declare this to be a client bug.
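The first alternative could be sketched roughly as follows in Python 3. This is only an assumption about what such an option might do: the function `decode_path` and its `fallback_encoding` parameter are hypothetical and do not exist in Tahoe-LAFS or its configuration.

```python
from typing import Optional


def decode_path(raw: bytes, fallback_encoding: Optional[str] = None) -> str:
    """Decode a path received from a client.

    UTF-8 is tried first. If that fails and a fallback encoding has been
    configured (hypothetical option), the path is decoded with it instead.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        if fallback_encoding is None:
            raise  # no fallback configured: behave like a UTF-8-only server
        return raw.decode(fallback_encoding)


# A Latin-2 client sending "dovolená" is accepted once a fallback is set.
print(decode_path(b"dovolen\xe1", fallback_encoding="iso-8859-2"))
```

One caveat with this design is that some byte strings in a legacy encoding happen to be valid UTF-8 as well, so a fallback of this kind can silently pick the wrong interpretation.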

The FTP frontend does not support Unicode at all; that is ticket #682. However, if #682 were fixed by implementing RFC 2640, then the FTP frontend would also only support UTF-8.

We can improve the error message for clients that display it.

It is a bug in Total Commander that it doesn't display the message. Also, if "Internal server error" was reported for SFTP then that description is inaccurate -- FX_FAILURE just means an error that has no more specific code. In this case we could arguably report FX_BAD_MESSAGE, though.
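As a rough sketch of the status-code choice discussed here, the Python 3 snippet below maps an undecodable path to FX_BAD_MESSAGE with a more descriptive message. It is not the Tahoe-LAFS implementation: the function `status_for_path` and the message wording are made up for illustration, and the constants are defined locally using the numeric values from the SFTP drafts.

```python
# SFTP status codes (numeric values as defined in the SFTP drafts).
FX_OK = 0
FX_FAILURE = 4       # generic error with no more specific code
FX_BAD_MESSAGE = 5   # badly formatted packet or protocol incompatibility


def status_for_path(raw: bytes):
    """Return a (status_code, message) pair for a path sent by a client."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        # Hypothetical wording; the point is to tell the user what to change.
        return (FX_BAD_MESSAGE,
                "Path could not be decoded as UTF-8; "
                "check the client's filename encoding setting")
    return (FX_OK, "")


print(status_for_path(b"dovolen\xe1"))
```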

tahoe-lafs changed title from Path could not be decoded as UTF-8 to SFTP and FTP: Path could not be decoded as UTF-8 2010-06-17 21:40:25 +00:00
davidsarah commented 2010-06-17 21:45:11 +00:00
Author
Owner

wiki/SftpFrontend#Unicodefilenames already documented this problem, but I added a reference to this ticket.

davidsarah commented 2010-06-17 21:47:24 +00:00
Author
Owner

Replying to slush:

And when I create a directory through the web interface, I cannot enter it using an FTP client (TC tells me "Internal server error").

This problem (listing a directory containing non-ASCII names in FTP) is part of #682.

slush commented 2010-06-20 23:24:02 +00:00
Author
Owner

@David:

You are right; I fixed the UTF-8 problems in WinSCP by setting the correct charset. I think this should be mentioned in the SFTP frontend docs, because many other SFTP servers I'm using work without any settings changes.

tahoe-lafs modified the milestone from undecided to soon 2010-06-21 01:55:08 +00:00
davidsarah commented 2010-06-21 01:57:02 +00:00
Author
Owner

Replying to slush:

@David:

You are right; I fixed the UTF-8 problems in WinSCP by setting the correct charset.

Please change the 'Unicode filenames' section of wiki/SftpFrontend to explain how to do this.

tahoe-lafs added 1.7.0 and removed 1.7β labels 2010-06-21 20:59:57 +00:00
davidsarah commented 2010-07-11 22:07:40 +00:00
Author
Owner

Replying to davidsarah (comment:8):

Replying to slush:

You are right; I fixed the UTF-8 problems in WinSCP by setting the correct charset.

Please change the 'Unicode filenames' section of wiki/SftpFrontend to explain how to do this.

Done.

tahoe-lafs modified the milestone from soon to eventually 2010-07-11 22:07:40 +00:00
davidsarah commented 2011-02-03 00:02:21 +00:00
Author
Owner

I'm unconvinced that supporting non-UTF-8 encodings is worth the hassle and complexity. Does anyone want to argue in favour of it?

Note that neither SFTP nor FTP has any standard by which a specific non-UTF-8 encoding could be automatically negotiated, so this would have to be manually configured. But clients that do a reasonable job of supporting non-ASCII characters at all usually have an option to select UTF-8. So I think that support for other encodings would benefit very few users.

tahoe-lafs added minor and removed major labels 2011-02-03 00:02:21 +00:00
tahoe-lafs changed title from SFTP and FTP: Path could not be decoded as UTF-8 to SFTP and FTP: support for non-UTF-8 charsets (error message "Path could not be decoded as UTF-8") 2011-02-03 00:02:21 +00:00
daira commented 2013-08-02 03:06:46 +00:00
Author
Owner

I'm wontfixing this, because SFTP and FTP simply have no support for negotiating non-UTF-8 encodings.

tahoe-lafs added the wontfix label 2013-08-02 03:06:46 +00:00
daira closed this issue 2013-08-02 03:06:46 +00:00
Reference: tahoe-lafs/trac-2024-07-25#1089