capabilities from the future could have non-ascii characters #1051

Closed
opened 2010-05-19 05:50:10 +00:00 by zooko · 24 comments

In the work around ticket #833 we implemented forward-compatibility for future capability formats. However we required all future cap formats to be expressed in ASCII without (as far as I know) actually thinking it through and deciding that we really wanted to constrain future cap formats in that way.

This ticket is to loosen that constraint on future cap formats. Note that this doesn't require future cap formats to have non-ASCII characters in them -- it just makes it so that if they do then they will still enjoy the same (limited) backward-compatibility to Tahoe-LAFS v1.7 that pure-ASCII caps from the future enjoy back to Tahoe-LAFS v1.6.1.

zooko added the
c/unknown
p/major
t/enhancement
v/1.6.1
labels 2010-05-19 05:50:10 +00:00
zooko added this to the 1.7.0 milestone 2010-05-19 05:50:10 +00:00
Author

Attachment refactor-test-web.dpatch (24338 bytes) added

Author

Please review attachment:refactor-test-web.dpatch.

zooko added
c/code
and removed
c/unknown
labels 2010-05-19 06:08:29 +00:00
Author

Attachment test-nonascii-future-caps.dpatch (35715 bytes) added

Author

Here are unit tests which trunk currently fails. They test what happens if someone gives you a cap (through wui/wapi, cli, or in a child of a dir) that has non-ascii chars in the cap itself. Please review these test patches!

Author

FWIW I agree with Brian's and David-Sarah's comments on IRC that the next version of caps (hopefully coming out this year) should not use non-ASCII chars.

I would still like to see this forward-compatibility feature in Tahoe-LAFS as soon as possible though to give our future selves and our successors more graceful options.


Tests look ok, but we don't have a patch that makes them pass. This may have to wait until 1.8.


refactor-test-web.dpatch can be applied now.


When we parse a JSON bytestring using simplejson.loads, the result is a mixture of ASCII and Unicode strings. Therefore any "rw_uri" and "ro_uri" fields in the result should be coerced using allmydata.util.stringutils.to_str, which is defined like this:

def to_str(s):
    if s is None or isinstance(s, str):
        return s
    return s.encode('utf-8')

I've been doing this for the CLI scripts as part of fixing the ticket534 branch.

(The stringutils module is likely to be renamed, maybe to encodingutil.)
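
As a rough illustration of that coercion step, here is a minimal sketch in Python 3 (the `to_str` above is Python 2, where `str` is the bytestring type). The names `to_bytes` and `coerce_cap_fields`, the flat dictionary layout, and the cap values are all hypothetical and chosen only for the example; they are not the actual CLI code.

```python
import json

def to_bytes(s):
    """Python 3 analogue of to_str: pass None and bytes through, encode text as UTF-8."""
    if s is None or isinstance(s, bytes):
        return s
    return s.encode("utf-8")

def coerce_cap_fields(node):
    """Return a copy of a JSON-decoded dict with its cap fields coerced to bytes."""
    coerced = dict(node)
    for key in ("rw_uri", "ro_uri"):
        if key in coerced:
            coerced[key] = to_bytes(coerced[key])
    return coerced

# Hypothetical JSON body; the \u8714 escape decodes to a single non-ASCII character.
body = b'{"rw_uri": "URI:DIR2:hypothetical", "ro_uri": "lafs://from_the_future_\\u8714"}'
node = coerce_cap_fields(json.loads(body))
print(node["ro_uri"])  # b'lafs://from_the_future_\xe8\x9c\x94'
```

After this step every cap field has one type, so later code never has to guess whether it is holding text or bytes.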


The test patch is testing that (some of) our internal APIs directly support URIs that can be Unicode strings as well as byte strings. I don't think we should do it that way: we should represent URIs as UTF-8 (also in the encoded form of directories), and convert when reading caps/paths from the command line, and when displaying caps to stdout/stderr and in the WUI. (I think the former may already work, due to the changes for #534 and #565.)


refactor-test-web.dpatch was applied in changeset:9e2da058372cad56.

daira modified the milestone from 1.7.0 to 1.8.0 2010-06-08 04:37:26 +00:00

I concur: filecaps are either ASCII or arbitrary bytestrings, not unicode
objects.

Actually, I think of it this way:

  • filecaps are abstract bundles of location/identification information about
    files/directories (think about our URI objects)
  • we currently have one concrete expression syntax for filecaps, call it V1,
    which starts with "URI:" and always contains printable ASCII
  • we can imagine other expression syntaxes in the future, in particular a
    dense binary form (call it V2), and a more-official
    follows-the-RFC-about-URIs URI form, with :// and everything (call
    it V3)
  • the internal dirnode-traversing code must be able to understand the syntax
    used in the dirnodes that it unpacks. The dirnodes contain a bytestring
    (packed with netstrings, not JSON). This is the first constraint on what
    future code can put into dirnodes
  • the WUI which displays caps in dirnodes shouldn't explode when it sees a
    cap it doesn't recognize. This is the second constraint.
  • the URLs passed into the webapi can contain both a filecap and child name
    (subdir path and/or filename). HTTP enforces a specific type here:
    %-encoding of a bytestring, and it is common to expect the bytestring to
    be a UTF-8 encoding of a unicode string.
  • the WAPI (and our CLI tools), in particular the t=json bodies, shouldn't
    explode when they see unrecognized caps. Furthermore, it'd be
    nice if they could treat such caps as opaque objects and still be able to
    do certain manipulations of them (like copying/moving them from one
    directory to another).

The way we build dirnodes tells us that we can put arbitrary bytestrings into
them: whatever syntax we use is not constrained by the container they are put
in, as long as any non-bytestring syntax we use is encoded into a bytestring
on the way in.
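
To make the "container does not constrain the contents" point concrete, here is a minimal sketch of netstring framing in Python 3. It shows only the length-prefixed framing, not the real dirnode entry layout, and the function names and sample caps are assumptions made for illustration.

```python
def netstring(data: bytes) -> bytes:
    """Frame a bytestring as <decimal length>:<data>,"""
    return b"%d:%s," % (len(data), data)

def split_netstrings(packed: bytes) -> list:
    """Split a concatenation of netstrings back into the original bytestrings."""
    items = []
    pos = 0
    while pos < len(packed):
        colon = packed.index(b":", pos)
        length = int(packed[pos:colon])
        start = colon + 1
        end = start + length
        if packed[end:end + 1] != b",":
            raise ValueError("netstring is missing its trailing comma")
        items.append(packed[start:end])
        pos = end + 1
    return items

# Hypothetical caps: one printable-ASCII cap and one arbitrary binary blob
# that even contains the ':' and ',' framing bytes.
caps = [b"URI:CHK:hypothetical", b"\x02\x00\xff,:\x80"]
packed = b"".join(netstring(c) for c in caps)
print(split_netstrings(packed) == caps)  # True
```

Because parsing is driven by the length prefix rather than by delimiters, the payload can be any byte sequence at all.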

The WUI display is not a significant constraint: we rarely accept filecaps as
input in the WUI, and we only display them on the "More Info" page, which
could use repr(). We don't need to construct URLs out of unrecognized caps,
because the WUI doesn't provide any operations to work with them.

The WAPI t=json encoding is a big constraint: we pass filecaps as
dictionary values in the JSON body, which limits them to being ASCII or
unicode objects, unless we change the definition of the t=json API to declare
that the values it contains are something else.

(sometimes you can retroactively change protocol definitions like this in
ways that magically retain backwards compatibility: for example, if we
declared that the V1 encoding of filecaps has in fact actually been unicode,
but that everybody thought it was just ASCII because nobody had ever created
a unicode filecap before, then the t=json body definition could similarly be
retroactively defined as "UTF-8 encoding of the V1 filecap", and all the
ASCII filecaps coming from those APIs would still look the same)

So anyways, the reason that I think filecaps are ASCII or bytestrings is
because they contain machine-readable data like cryptovalues and numbers, not
human-generated/human-readable data like names.

I'd strongly recommend that, if we're going to plan for expansion of the "V1"
filecap-as-string syntax beyond ASCII, then we should plan for them to be
bytestrings. You can reliably compare bytestrings for equality (which is not
generally the case for unicode strings), and there is an unambiguous mapping
from filecap-as-bundle-of-data to filecap-as-bytestring (which is not the case
for unicode: even if we tell everyone to use UTF-8 instead of UTF-16/etc,
there are still too many options).
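
The equality problem can be seen with a small Python 3 sketch. The sample strings are illustrative and have nothing to do with any real cap format; they only show that two Unicode strings can render identically while differing both as code point sequences and as UTF-8 bytes.

```python
import unicodedata

composed = "caf\u00e9"      # 'é' as a single code point
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent

def utf8(text: str) -> bytes:
    return text.encode("utf-8")

print(composed == decomposed)                                # False
print(utf8(composed) == utf8(decomposed))                    # False
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```

Comparing the bytestrings is always well defined; deciding whether the two texts are "the same" requires agreeing on a normalization form first.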

A related but separate issue is how to plan for expansion to the V2/V3/etc
syntaxes. The V1 syntax, as currently (narrowly) defined, is always printable
ASCII and always starts with "URI:". We could define a V2 dense-binary syntax
which, given a single leading version byte, would not overlap with the V1
syntax. Likewise a V3 real-URI syntax, which started with tahoe://,
would not overlap. We might then retroactively define the filecaps stored in
dirnodes to be bytestrings that parse in one of these three forms (allowing
smaller dirnodes with dense binary caps). If the current dirnode-handling
code can tolerate+ignore arbitrary bytestrings, then this might be safe.
(t=json might not, however).
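
A minimal sketch of how such non-overlapping syntaxes could be told apart by prefix, in Python 3. Only the "URI:" and "tahoe://" prefixes come from the discussion above; the specific version byte `0x02` and the function name are assumptions made up for this example.

```python
# Hypothetical leading version byte for a dense binary (V2) cap. Any byte
# value that cannot begin "URI:" or "tahoe://" would serve equally well.
V2_VERSION_BYTE = b"\x02"

def classify_cap(cap: bytes) -> str:
    """Guess which expression syntax a stored cap bytestring uses."""
    if cap.startswith(b"URI:"):
        return "V1"
    if cap.startswith(b"tahoe://"):
        return "V3"
    if cap[:1] == V2_VERSION_BYTE:
        return "V2"
    return "unknown"

print(classify_cap(b"URI:CHK:hypothetical"))   # V1
print(classify_cap(b"\x02\x9f\x00\x10"))       # V2
print(classify_cap(b"tahoe://hypothetical"))   # V3
```

Anything that matches none of the prefixes falls through to "unknown", which is the case older code would need to tolerate and ignore.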

Author

I need to read, understand, respond to Brian's objection.

warner was unassigned by zooko 2010-07-08 16:54:38 +00:00
zooko self-assigned this 2010-07-08 16:54:38 +00:00
Author

Okay I've read this through a few times now and I'm not sure I understand all of it.

To start with, the "related but separate issue" at the end of comment:378383 can safely go into a separate ticket, right?

Next, I'm fairly sure that this can also go into a separate ticket, possibly that same one: "I'd strongly recommend that, if we're going to plan for expansion of the "V1" filecap-as-string syntax beyond ASCII, then we should plan for them to be bytestrings."

Maybe this ticket could be named "capabilities from the future could be non-ascii and non-unicode bytestrings". Does that make sense at all?

So, to focus on what I see as the point of this ticket I would like to ask Brian and David-Sarah a few questions. "Socratic" questioning often sounds condescending and irritating to me. These are actual questions that I don't already know the "right answers" to.

Suppose Alice is running Tahoe-LAFS v1.8.0, in the year 2020, and suppose hypothetically that for some reason that is currently unimaginable to us, we have in the year 2019 defined an "expression syntax" for Tahoe-LAFS caps which are unicode, like this: lafs://from_the_future_fw-蜔쳨欝遃䝦舜琇襇邤䍏㵦☚✸킾궑蒴犏띎냔㳆㼿졨浴䒉ΐ屝稜퍙鉧.

Now suppose, that Bob has a cap like that, and he conveys it to Alice, either by sending it to her and inviting her to click on it or cut and paste it (or wave her magic wand at it or whatever they do in 2020) to enter it into her Tahoe-LAFS v1.8.0 client. Or, suppose Bob puts it into a Tahoe-LAFS directory which Alice has read-access to and asks her to look at that directory.

Question 1: What would you want to happen (in this hypothetical scenario) when Alice waves her wand at it or lists that Tahoe-LAFS directory?

Question 2: What would happen if Alice were using Tahoe-LAFS v1.7.0? (I don't know the answer to this question. Wouldn't her client incur an internal TypeError of some kind?)

Question 3: How would you write a unit test which answers Question 2? My attempt at that was attachment:test-nonascii-future-caps.dpatch , but maybe that test doesn't actually answer Question 2. I'm not sure.


My objection was slightly different to Brian's. I was objecting to the internal representation of URIs being "either a Unicode string or a bytestring", as the patch assumed. From experience, that's extremely error-prone and leads to horrible problems with implicit conversions.

Representing URIs as UTF-8 doesn't have that problem, and satisfies Brian's criteria in comment:378383 :

  • you can reliably compare UTF-8 bytestrings for equality
  • there is an unambiguous mapping from filecap-as-bundle-of-data to filecap-as-UTF-8-string (some bundles-of-data will be invalid, but that's fine)

I can't speak for Brian, but I believe from IRC conversations that he was skeptical of the whole idea of Unicode in caps. I share some of that skepticism, but I think it's relatively harmless to allow the possibility; it doesn't add much complexity.

Note that if we do use Unicode in caps in future, we should limit the character set to characters for which normalization is not an issue. (There are big blocks of Han characters with no equivalences, for example.)

Re: the JSON encoding -- JSON strings are by definition Unicode, and the JSON bodies are already assumed to be UTF-8 (which is necessary for filenames). The only other compatible option for encoding URIs in JSON would be ISO-Latin-1 (i.e. encode bytes 0x80..0xFF as \x80..\xFF), but it makes no sense to me to use a mixture of UTF-8 and ISO-Latin-1. Also see below for the current behaviour when using simplejson.dumps.

Dense cap encoding is a separate issue that has nothing to do with Unicode. At some point we will probably change the dirnode format for other reasons (e.g. to support deep-verify caps), and then we can consider whether the new format uses dense encoding. I don't think there's much space to be saved, though.

Replying to warner:

> We don't need to construct URLs out of unrecognized caps, because the WUI doesn't provide any operations to work with them.

That's not quite true; the WUI accepts unrecognized caps in the form field to link a cap into a directory.

Replying to zooko:

> Okay I've read this through a few times now and I'm not sure I understand all of it.
>
> To start with, the "related but separate issue" at the end of comment:378383 can safely go into a separate ticket, right?

I think so.

> Next, I'm fairly sure that this can also go into a separate ticket, possibly that same one: "I'd strongly recommend that, if we're going to plan for expansion of the "V1" filecap-as-string syntax beyond ASCII, then we should plan for them to be bytestrings."

No; we have to make a choice between UTF-8 and arbitrary bytestrings. That's part of this ticket.

> Maybe this ticket could be named "capabilities from the future could be non-ascii and non-unicode bytestrings". Does that make sense at all?

Err, no. All bytestrings are non-Unicode. The question is whether they represent Unicode, i.e. whether they need to be valid UTF-8.

[...]

> Suppose Alice is running Tahoe-LAFS v1.8.0, in the year 2020, and suppose hypothetically that for some reason that is currently unimaginable to us, we have in the year 2019 defined an "expression syntax" for Tahoe-LAFS caps which are unicode, like this: lafs://from_the_future_fw-蜔쳨欝遃䝦舜琇襇邤䍏㵦☚✸킾궑蒴犏띎냔㳆㼿졨浴䒉ΐ屝稜퍙鉧.

OK.

> Now suppose, that Bob has a cap like that, and he conveys it to Alice, either by sending it to her and inviting her to click on it or cut and paste it (or wave her magic wand at it or whatever they do in 2020) to enter it into her Tahoe-LAFS v1.8.0 client. Or, suppose Bob puts it into a Tahoe-LAFS directory which Alice has read-access to and asks her to look at that directory.
>
> Question 1: What would you want to happen (in this hypothetical scenario) when Alice waves her wand at it or lists that Tahoe-LAFS directory?

She should get an unlinked entry with '?', '?-IMM', or '?-RO' in the first column.

> Question 2: What would happen if Alice were using Tahoe-LAFS v1.7.0? (I don't know the answer to this question. Wouldn't her client incur an internal TypeError of some kind?)

I haven't tried an end-to-end test, but from browsing the code for HTML directory listings (web/directory.py), I believe that it won't attempt to decode the URI, so it will be treated like any other unknown URI -- i.e. she will get the desired unlinked entry. I don't know what will happen for the Info page.

For JSON directory listings, again I haven't tried an end-to-end test, but the behaviour of simplejson.dumps is to assume that bytestrings in the input are UTF-8, and (if they are valid UTF-8) encode them with a \u escape in the resulting JSON. In other words, I think this may entirely accidentally do the right thing. If the directory contains a URI that is not valid UTF-8, then a UnicodeDecodeError will probably occur here: source:src/allmydata/web/directory.py@4527#L851.
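
The two outcomes described here can be sketched in Python 3 with the standard `json` module. This is an approximation rather than a reproduction: Python 3's `json.dumps` does not accept bytestrings, so the explicit `decode("utf-8")` below stands in for the implicit UTF-8 assumption that simplejson made under Python 2. The function name and cap values are hypothetical.

```python
import json

def listing_json(cap: bytes) -> str:
    """Serialize one cap into a JSON body, treating the bytestring as UTF-8."""
    # The explicit decode models the implicit "bytestrings are UTF-8" assumption.
    return json.dumps({"ro_uri": cap.decode("utf-8")})

valid = "lafs://from_the_future_\u8714".encode("utf-8")
print(listing_json(valid))  # {"ro_uri": "lafs://from_the_future_\u8714"}

try:
    listing_json(b"lafs://\xff\xfe")  # not valid UTF-8
except UnicodeDecodeError as e:
    print("cannot serialize:", e.reason)
```

A cap that is valid UTF-8 survives as a `\u` escape in pure-ASCII JSON, while an arbitrary bytestring fails at the decode step.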

> Question 3: How would you write a unit test which answers Question 2? My attempt at that was attachment:test-nonascii-future-caps.dpatch , but maybe that test doesn't actually answer Question 2. I'm not sure.

I'd use something like that patch, but with .encode('utf-8') added to all of the Unicode URI strings. (Also I prefer to use Unicode escapes rather than UTF-8 in source files.)


Attachment test-utf8-future-caps.dpatch (162820 bytes) added

Add tests of caps from the future that have non-ASCII characters in them (encoded as UTF-8). The changes to test_uri.py, test_client.py, and test_dirnode.py add tests of non-ASCII future caps in addition to the current tests. The changes to test_web.py just replace the tests of all-ASCII future caps with tests of non-ASCII future caps. We also change uses of failUnlessEqual to failUnlessReallyEqual, in order to catch cases where the type of a string is not as expected.


Attachment behaviour-utf8-future-caps.dpatch (5160 bytes) added

Allow URIs passed in the initial JSON for t=mkdir-with-children, t=mkdir-immutable to be Unicode (this makes 'test-utf8-future-caps.dpatch' pass). Also pass the name of each child into nodemaker.create_from_cap for error reporting.

daira modified the milestone from 1.8.0 to 1.7.1 2010-07-11 20:51:52 +00:00

Incidentally, these tests show that the directory listing case already worked in 1.7.0 (as far as I can tell having only run them on Windows). I think the only thing that didn't work was passing Unicode URIs in the JSON for t=mkdir-with-children and t=mkdir-immutable.

Author

Okay, I still haven't fully understood Brian's objection. Brian: please review!

As far as I dimly understand the whole issue, with these patches we will effectively have no constraints on future representations of caps (except that they can't start with 'URI:'). Tahoe-LAFS v1.7.0 turns out to already allow any future-caps in Tahoe-LAFS dirs and just show them as '?', but would have raised an exception if you tried to write such future-caps in with the WAPI's t=mkdir-with-children and t=mkdir-immutable. With these patches, Tahoe-LAFS v1.7.1 would also accept any future-caps through the WAPI.

zooko removed their assignment 2010-07-12 04:16:02 +00:00
warner was assigned by zooko 2010-07-12 04:16:02 +00:00
Author

I reviewed attachment:behaviour-utf8-future-caps.dpatch and it looks correct to me.


David-Sarah's analysis in comment:14 is mostly in line with my thinking.

I object less to "filecaps are UTF-8 encoding of some unicode string" than
"filecaps are unicode strings". This would let us say that filecaps are
bytestrings but with a constraint that filecap.decode("utf-8") must not
throw an exception, and perhaps the additional constraint that
filecap.decode("utf-8").encode("utf-8")==filecap. If we went this way,
we should say that the UTF-8-encoded form is the primary one (i.e., if you
want to compare two filecaps, use filecap1==filecap2, not
filecap1.decode("utf-8")==filecap2.decode("utf-8")).
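A minimal sketch of the constraint described above, written for Python 3, where bytestrings are `bytes`. The function name `is_canonical_utf8_cap` and the sample caps are made up for illustration and are not part of Tahoe-LAFS. Python 3's strict UTF-8 decoder already rejects ill-formed input, so the re-encode check is mostly belt-and-braces there.

```python
def is_canonical_utf8_cap(filecap: bytes) -> bool:
    """Return True if the bytestring decodes as UTF-8 and re-encodes to itself."""
    try:
        text = filecap.decode("utf-8")
    except UnicodeDecodeError:
        return False
    # The additional round-trip constraint: decode then encode must be the identity.
    return text.encode("utf-8") == filecap


# Made-up example caps (hypothetical formats, for illustration only).
ascii_cap = b"x-example-cap:abc"
nonascii_cap = "x-example-cap:caf\u00e9".encode("utf-8")
invalid_cap = b"\xff\xfe not utf-8"

# Comparison is done on the primary (encoded) form, not on the decoded text.
same = ascii_cap == b"x-example-cap:abc"
```

Under this reading, a cap is still a bytestring everywhere inside the node; the decode is only a validity check and never replaces the stored value.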

That still feels weird, though: UTF-8 is an encoding of something else, and
in general you want to be comparing the primary form, not some encoding
thereof. And filecaps must be unambiguous. If you wanted to visually
compare two ASCII filecaps, you could do it easily (in fact the base32 takes
out the o/0 1/i/I/l/L homoglyphs). While I don't expect people to do this
much, the fact that two unicode strings simply cannot be safely compared this
way has got to be a bad sign.
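To illustrate the comparison hazard, here is a small sketch using Python 3's standard `unicodedata` module. The two strings are made-up examples, not cap formats anyone has proposed. They render identically but differ both as text and as UTF-8 bytes until they are normalized.

```python
import unicodedata

# Two spellings of the same visible text.
composed = "caf\u00e9"      # "é" as the single code point U+00E9
decomposed = "cafe\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT

equal_as_text = composed == decomposed
equal_as_utf8 = composed.encode("utf-8") == decomposed.encode("utf-8")

# Only after normalizing to a common form (NFC here) do they compare equal.
equal_after_nfc = unicodedata.normalize("NFC", decomposed) == composed
```

A cap that passed through a tool which silently normalizes text could therefore stop matching its original byte-for-byte, even though it looks unchanged on screen.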

If we really must accept more than just ASCII, then I'd prefer to accept
completely arbitrary bytestrings. The biggest problem with doing this is the
t=json WAPI: if I'd taken this issue at all seriously when I built the
webapi, I would have defined the t=json format to emit base64-encoded
filecaps or something similar. (actually, at that point I did not yet realize
that JSON could not handle arbitrary binary data.. if I had, I might have
skipped JSON altogether and used protocol buffers or netstrings or
something).
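As a rough sketch of the base64 idea, assuming a hypothetical JSON field name `rw_uri_b64` (the existing t=json format has no such field), arbitrary bytes can be carried through JSON like this in Python 3:

```python
import base64
import json


def cap_to_json_field(cap: bytes) -> str:
    """Encode an arbitrary bytestring as ASCII text that JSON can carry."""
    return base64.b64encode(cap).decode("ascii")


def cap_from_json_field(field: str) -> bytes:
    """Recover the original bytestring from the base64 text."""
    return base64.b64decode(field)


# A made-up cap containing bytes that are not valid UTF-8.
raw_cap = bytes([0x00, 0xFF, 0x80]) + b"arbitrary"

wire = json.dumps({"rw_uri_b64": cap_to_json_field(raw_cap)})
recovered = cap_from_json_field(json.loads(wire)["rw_uri_b64"])
```

The round trip is exact, which is the property the current text-valued fields cannot offer for arbitrary bytestrings.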

But one option would be to have the t=json response leave out any filecap
that cannot be expressed in printable ASCII (i.e., run a regexp against it
before populating the child-info dictionary, replace it with an "unknown cap"
marker if that fails). I can't remember if we covered this one during the
earlier caps-from-the-future discussion.
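One way the regexp filter could look, as a sketch only: the pattern, the `UNKNOWN_CAP_MARKER` value, and the function name are assumptions made for illustration, not anything the webapi defines.

```python
import re

# Printable ASCII, space through tilde (an assumed definition of "printable").
PRINTABLE_ASCII = re.compile(rb"[\x20-\x7e]+")

# Hypothetical placeholder emitted instead of a cap the JSON cannot carry safely.
UNKNOWN_CAP_MARKER = "unknown-cap"


def cap_for_json(cap: bytes) -> str:
    """Return the cap as text if it is printable ASCII, else the marker."""
    if PRINTABLE_ASCII.fullmatch(cap):
        return cap.decode("ascii")
    return UNKNOWN_CAP_MARKER


shown = cap_for_json(b"x-example-cap:abc")
hidden = cap_for_json("x-example-cap:caf\u00e9".encode("utf-8"))
```

With this approach a client reading t=json never sees a lossy rendering of a cap: it sees either the exact cap or an explicit marker.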

If we go with "filecaps are UTF-8 encoding of a unicode string", then the
t=json API doesn't give enough information to clients to compare the real
filecaps: all they can get is filecap.decode("utf-8"). In addition, at
some point inside the webapi, we'd have to convert the filecaps into unicode
before adding them to the JSON response. I'm really nervous about the
information-losing behavior of unicode conversions, and security problems
that can result.

> Note that if we do use Unicode in caps in future, we should limit the
> character set to characters for which normalization is not an issue. (There
> are big blocks of Han characters with no equivalences, for example.)

Ugh.. how can we make this safe? That is, when somebody pastes in a cap, how
do we verify that it isn't using any characters in this set? Is this set even
constant? When we're all speaking Lojban or Ilaksh or Marain or something in
the future, won't there be new codepoints which the old code can't recognize
as being non-normalizable?

> A related but separate issue is how to plan for expansion to the V2/V3/etc
> syntaxes.

While parts of this may belong in other tickets, I think it remains relevant
for this one. Your desire to plan for new things in our V1 filecaps might
actually be a desire to define and implement those V2/V3 syntaxes (and
improve the webapi to accept them, etc). So it may be better to leave the V1
syntax definition alone, leave certain Tahoe interfaces intolerant to the
potential new forms, and declare that we'll replace those interfaces with
V2+-tolerant ones before we start using those forms.

== Re: behaviour-utf8-future-caps.dpatch ==

Why the s/name/namex/g ? Did you maybe mean to say "name = unicode(namex)"
to highlight the transition from "unicode or bytestring" to "really unicode",
and then leave the other instances of "name" alone?

The writecap = to_str(propdict.get("rw_uri")) line performs the
unicode-to-UTF8 conversion. This means that webapi users calling
t=mkdir-with-children or t=mkdir-immutable are giving us unicode,
not UTF-8 bytestrings (i.e. tahoe gets
callerwritecap.decode("utf-8").encode("utf-8"), because the JSON
library is doing a decode before tahoe proper sees the data). Worse yet, the
decode and the encode are being done by different pieces of code (I'd hope
that the JSON library uses python's .decode logic, but who knows?).
That's the best way to implement the unicode-caps design, but it also makes
it clear that this is not an exact transformation.
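The decode-then-encode path can be sketched in Python 3 as follows. `to_utf8` is a hypothetical stand-in for the patch's `to_str` helper, and the request bodies are made up. The point is that two different request bodies reach the node as the same bytestring, because the JSON parser has already turned the caller's bytes into text.

```python
import json


def to_utf8(value):
    """Hypothetical stand-in for to_str: encode text as UTF-8, pass None through."""
    if value is None:
        return None
    return value.encode("utf-8")


# Two different request bodies: one uses a JSON \u escape, the other raw UTF-8.
body_escaped = b'{"rw_uri": "x-example-cap:caf\\u00e9"}'
body_literal = '{"rw_uri": "x-example-cap:caf\u00e9"}'.encode("utf-8")

# The JSON library decodes to text first; only then is the cap re-encoded.
cap_escaped = to_utf8(json.loads(body_escaped).get("rw_uri"))
cap_literal = to_utf8(json.loads(body_literal).get("rw_uri"))
```

Here both bodies happen to produce identical caps, which is harmless. The concern raised above is the general shape of the pipeline: the bytes the node stores are produced by its own encoder, not copied from what the caller sent.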

I didn't review it earlier, but nodemaker.create_from_cap(name=) is weird.
I'd be concerned about unicode creeping into an exception instance and then
causing bytestring-only logging to break (such as when it is written to
twistd.log). I'm not sure what a good solution is: I see how it's a bit
easier to pass "extraneous" information down into a function that might raise
an exception (and stuff it into the exception message down there), rather
than e.g. catch the exception higher up (where knowing name= is a bit more
natural) and somehow gluing the name into the already-constructed exception
object.

== Re: test-utf8-future-caps.dpatch ==

Hrm, could you reduce the instances of "failUnlessReallyEqual" to things that
just test caps? Seeing it on things like
(c.getServiceNamed("storage").reserved_space, 0) makes the patch
awfully big. Hm, and if there were some clever way to make it the same length
as "failUnlessEqual", that would reduce the noise even further (if you do
this, which I don't think you should, note that
len("assertTypeEqual")==len("failUnlessEqual")).

I don't think using failUnlessReallyEqual in test_dirnode.py on things
like set(metadata.keys()) does everything you want it to: it will assert
that both sides are of type Set, but it won't assert that the members of
those sets are both of type string.
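A small Python 3 analogy for this limitation, assuming the helper checks equality plus the type of the two top-level values (the actual definition in the patch may differ). In Python 3, `bytes` and `str` never compare equal, so `int` and `float` stand in here for Python 2's `str` and `unicode`, which do compare equal when they hold the same ASCII text. Both function names are hypothetical.

```python
def really_equal(a, b) -> bool:
    """Equal values whose top-level types also match (assumed helper semantics)."""
    return a == b and type(a) is type(b)


def really_equal_members(a, b) -> bool:
    """Additionally require that the two collections hold the same member types."""
    return really_equal(a, b) and {type(x) for x in a} == {type(x) for x in b}


scalars_differ = really_equal(1, 1.0)            # int vs float is caught
sets_pass = really_equal({1}, {1.0})             # both are sets, members unchecked
sets_caught = really_equal_members({1}, {1.0})   # member types now compared
```

The member check is deliberately coarse (it compares the set of types, not each element pairwise), but it is enough to show what the top-level type comparison misses.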

In test_dirnode.py, I would call the new variables
"future_unicode_write_uri", rather than "future_nonascii_write_uri", to make
it clear that this is one possible direction (and that there are others).

== Conclusions ==

behaviour-utf8-future-caps.dpatch: yes, this patch is pretty harmless, I
don't mind it going in.

test-utf8-future-caps.dpatch: I see no problems with the patch per se, but I
think the examples it uses set a bad precedent, by causing anyone reading the
test to believe that tahoe's future caps will be unicode, which I think is a
bad idea.

I don't object to these two patches going in, but I will continue to object
to the idea that the filecaps accepted by our existing interfaces (and stored
in existing dirnodes) should be defined as unicode-encoded-to-UTF8. I think
the best approaches are, in order of preference:

  1. continue to restrict filecaps to printable ASCII
  2. define filecaps as arbitrary bytestrings and replace the t=json WAPI
    interface which is unable to tolerate such a wide range

I don't want to define filecaps to be unicode. Unicode exists to represent
strings of written human languages. Filecaps are records/structs of
cryptovalues. We have more tools to manipulate printable/copypastable strings
than to manipulate abstract records of cryptovalues, so expressing filecaps
as strings is convenient, but we should pick the encoding to serve tahoe's
needs, rather than trying to make any conceivable written-human-language
string meaningful as a tahoe filecap.

That said, for users who have a solid unicode-friendly set of tools and want
to tweet their filecaps, I don't object to an encoding scheme that somehow
takes a filecap and expresses it as a string of unicode characters (this
would be a "V4", in my V1/V2/V3 scheme from comment:11). But the tahoe
interfaces that accept this need to be clearly marked, and I think the
current t=json is not one of them.

Author

Wow! Thanks for the detailed review.

I think we need to take the broader precedent-setting discussion somewhere else, such as the mailing list, and then, once it starts to gel, move it to the wiki/NewCapDesign page.

I'll try to focus on the narrower issues in this comment.

Replying to warner:

> But one option would be to have the t=json response leave out any filecap
> that cannot be expressed in printable ASCII (i.e., run a regexp against it
> before populating the child-info dictionary, replace it with an "unknown cap"
> marker if that fails). I can't remember if we covered this one during the
> earlier caps-from-the-future discussion.

What purpose would that serve?

> test-utf8-future-caps.dpatch: I see no problems with the patch per se, but I
> think the examples it uses set a bad precedent, by causing anyone reading the
> test to believe that tahoe's future caps will be unicode, which I think is a
> bad idea.

Maybe add a comment saying that we do not intend to invent caps like these--these are only examples of possibilities for testing.


behaviour-utf8-future-caps.dpatch applied in changeset:fa0fd66e17fe845b.

warner was unassigned by zooko 2010-07-18 03:38:53 +00:00
zooko self-assigned this 2010-07-18 03:38:53 +00:00
Author

applied tests as changeset:d346e0853d9b0b4b. added comment about the tests of caps "from the future" being actually from an alternate reality future: changeset:7cc98759bd1baca3

zooko added the
r/fixed
label 2010-07-18 05:51:04 +00:00
zooko closed this issue 2010-07-18 05:51:04 +00:00

Some of the tests added in changeset:d346e0853d9b0b4b were too strict in testing the type of values parsed from JSON (which is different depending on the simplejson version). This was fixed in changeset:74c41ebb8bb772c2.

Reference: tahoe-lafs/trac#1051