reorg, copy relevant notes from referenced tickets (unrolling the loops, so to speak)
[Imported from Trac: page NewCapDesign, version 9]
parent
f7db5622c6
commit
e5a2340879
126
NewCapDesign.md
|
@ -6,8 +6,9 @@ across separate tickets: this page is here to consolidate them. We should not
|
||||||
release a new filecap format without checking it against everything on this
|
release a new filecap format without checking it against everything on this
|
||||||
list.
|
list.
|
||||||
|
|
||||||
There will be a related pair of new encoding designs. The [NewImmutableEncodingDesign](NewImmutableEncodingDesign)
|
There will be a related pair of new encoding designs. The
|
||||||
and [NewMutableEncodingDesign](NewMutableEncodingDesign) pages will hold those design discussions.
|
[NewImmutableEncodingDesign](NewImmutableEncodingDesign) and [NewMutableEncodingDesign](NewMutableEncodingDesign) pages will hold those
|
||||||
|
design discussions.
|
||||||
|
|
||||||
Ticket #432 was the starting point: it contained a list of features.
|
Ticket #432 was the starting point: it contained a list of features.
|
||||||
|
|
||||||
|
@ -25,8 +26,9 @@ established sense). To make them real, we need to:
|
||||||
necessarily provide enough information to actually access it (i.e. if you
|
necessarily provide enough information to actually access it (i.e. if you
|
||||||
have a URI and somebody pointed you at a file, you could confidently tell
|
have a URI and somebody pointed you at a file, you could confidently tell
|
||||||
them whether or not it was the right file, but if you only have the URI,
|
them whether or not it was the right file, but if you only have the URI,
|
||||||
then you might not be able to find the file without additional information). If
|
then you might not be able to find the file without additional
|
||||||
the cap has both identifying and location information, it's called a URL.
|
information). If the cap has both identifying and location information,
|
||||||
|
it's called a URL.
|
||||||
* Tahoe filecaps are meant to be URLs (they are intended to provide location
|
* Tahoe filecaps are meant to be URLs (they are intended to provide location
|
||||||
information), but to really make that work, you also need to define which
|
information), but to really make that work, you also need to define which
|
||||||
grid you're talking about. So far this has always been implicit, but that
|
grid you're talking about. So far this has always been implicit, but that
|
||||||
|
@ -45,8 +47,7 @@ established sense). To make them real, we need to:
|
||||||
them, and that we have a clear procedure for starting with a filecap and
|
them, and that we have a clear procedure for starting with a filecap and
|
||||||
a gateway HTTP URL, and ending with the contents of the file.
|
a gateway HTTP URL, and ending with the contents of the file.
|
||||||
|
|
||||||
|
## make them shorter, prettier, and easier to use
|
||||||
## other features
|
|
||||||
|
|
||||||
* Short and not so ugly. This is important to enable
|
* Short and not so ugly. This is important to enable
|
||||||
cut-and-paste (see below), but also just because people are
|
cut-and-paste (see below), but also just because people are
|
||||||
|
@ -54,8 +55,14 @@ established sense). To make them real, we need to:
|
||||||
notes in which dozens of people have spontaneously complained
|
notes in which dozens of people have spontaneously complained
|
||||||
about the current URLs. By contrast, tiny URLs such as
|
about the current URLs. By contrast, tiny URLs such as
|
||||||
tinyurl.com, bit.ly, etc. are ubiquitous nowadays; users have
|
tinyurl.com, bit.ly, etc. are ubiquitous nowadays; users have
|
||||||
no problem with those -- see Twitter. See below for notes on
|
no problem with those -- see Twitter.
|
||||||
cap length.
|
* I (warner) am curious about where the suspicion comes from. Do long URLs
|
||||||
|
make people think they're being attacked, some sort of browser buffer
|
||||||
|
overrun thing? Or that they're being phished, with a URL that a human
|
||||||
|
would evaluate differently than their browser? I agree that people
|
||||||
|
(including me) don't like long URLs, but I've never pushed anyone to
|
||||||
|
explain the "suspicion" aspect. One comment in #217 says "smells a bit
|
||||||
|
spammy", and a later one says "Spooks me every time".
|
||||||
* Enable convenient cut-and-paste. If caps are too long they'll wrap in
|
* Enable convenient cut-and-paste. If caps are too long they'll wrap in
|
||||||
email. If they contain lots of word-breaking characters then you have to
|
email. If they contain lots of word-breaking characters then you have to
|
||||||
drag after you've double clicked (this is probably ok). If the word-broken
|
drag after you've double clicked (this is probably ok). If the word-broken
|
||||||
|
@ -77,18 +84,55 @@ established sense). To make them real, we need to:
|
||||||
webbrowser (i.e. when you click on `tahoe:foo`, a helper program is
|
webbrowser (i.e. when you click on `tahoe:foo`, a helper program is
|
||||||
launched with `tahoe:foo`, and that in turn launches your web browser
|
launched with `tahoe:foo`, and that in turn launches your web browser
|
||||||
with `<http://localhost:8123/foo>`). (#52)
|
with `<http://localhost:8123/foo>`). (#52)
|
||||||
|
|
||||||
|
## make them long enough to be secure
|
||||||
|
|
||||||
|
We want filecaps to be as short as possible, but no shorter. There are
|
||||||
|
several lower bounds on the length:
|
||||||
|
|
||||||
|
* confidentiality: A large computing effort should not be able
|
||||||
|
to obtain the plaintext of a tahoe file without knowing the
|
||||||
|
readcap. We require reasonable margin against improvements in
|
||||||
|
hardware speed and organization efficiency/motivation of
|
||||||
|
distributed efforts (e.g. could a million PS3 owners break a
|
||||||
|
filecap?). This currently implies a 128 bit confidentiality
|
||||||
|
field.
|
||||||
|
* integrity: a large computing effort should not be able to
|
||||||
|
produce shares which will be accepted by the readcap holder
|
||||||
|
but which do not result in the same file as created by the
|
||||||
|
original uploader (and retrieved by other downloaders). We
|
||||||
|
desire all three of the standard hash properties (collision
|
||||||
|
resistance, first-pre-image resistance, second-pre-image
|
||||||
|
resistance) to also apply to tahoe immutable files and their
|
||||||
|
filecaps. This currently implies a 128bit (or 256bit?)
|
||||||
|
integrity field.
|
||||||
|
* variable-length integrity field (#102, comment 16+17),
|
||||||
|
allowing users to decide between short caps and strong
|
||||||
|
integrity guarantees
|
||||||
|
* storage collision resistance (#753): a Tahoe grid should be
|
||||||
|
able to store trillions of files and still have a vanishingly
|
||||||
|
small chance of two files using the same storage-index (and
|
||||||
|
thus confusing each other's shares). The storage-index is
|
||||||
|
generally compressed out of the filecap, by deriving it with
|
||||||
|
various hashing stages on the other filecap parameters. The
|
||||||
|
shortest value in this derivation chain must be at least
|
||||||
|
128bits long, and preferably about 192bits long.
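
The lower bounds above can be checked with some rough arithmetic. The following is only a sketch, not part of the design: the attacker size (one million machines) and guess rate (10**9 keys per second each) are assumed figures chosen for illustration, and the function names are hypothetical. It estimates the expected exhaustive-search time for a confidentiality field, the birthday (union) bound on two files sharing a storage-index, and how many base62 characters a field of a given size would occupy in a cap.

```python
from fractions import Fraction

SECONDS_PER_YEAR = 365 * 24 * 3600


def brute_force_years(key_bits, machines, guesses_per_second):
    """Expected whole years to find one key by exhaustive search.

    On average the search covers half of the keyspace.
    """
    expected_guesses = 2 ** (key_bits - 1)
    guesses_per_year = machines * guesses_per_second * SECONDS_PER_YEAR
    return expected_guesses // guesses_per_year


def collision_probability_bound(num_files, index_bits):
    """Upper bound on the chance that any two storage-indexes collide.

    Union bound over all pairs: C(n, 2) / 2**bits.
    """
    pairs = num_files * (num_files - 1) // 2
    return Fraction(pairs, 2 ** index_bits)


def base62_chars(bits):
    """Smallest number of base62 characters that can carry `bits` bits."""
    n = 0
    while 62 ** n < 2 ** bits:
        n += 1
    return n


if __name__ == "__main__":
    # Assumed attacker: 10**6 machines, 10**9 guesses per second each.
    print("years to search a 128-bit field:",
          brute_force_years(128, 10**6, 10**9))
    # "Trillions of files" is taken here as 10**12.
    for bits in (128, 192):
        bound = collision_probability_bound(10**12, bits)
        print(bits, "bit storage-index: collision chance <=", float(bound),
              "| base62 chars:", base62_chars(bits))
```

Under these assumptions a 128-bit field costs 22 base62 characters and a 192-bit field costs 33, which is the tension between cap length and security margin that this section describes.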
|
||||||
|
|
||||||
|
|
||||||
|
## other features
|
||||||
|
|
||||||
* Self-identifying. It should be visually clear what sort of filecap the
|
* Self-identifying. It should be visually clear what sort of filecap the
|
||||||
string represents: read-write or read-only, mutable-or-immutable,
|
string represents: read-write or read-only, mutable-or-immutable,
|
||||||
file-or-directory. This is especially important when sharing tahoe objects
|
file-or-directory. This is especially important when sharing tahoe objects
|
||||||
over out-of-band channels like IM and email: it should be easy for the
|
over out-of-band channels like IM and email: it should be easy for the
|
||||||
user to tell whether they're giving away readonly access or read-write
|
user to tell whether they're giving away readonly access or read-write
|
||||||
access. We've considered prefixes like `DWM..` for "Directory
|
access. We've considered prefixes like `DWM..` for "Directory
|
||||||
Writeable Mutable" and `FRI..` for "File Readonly Immutable". If these
|
Writeable Mutable" and `FRI..` for "File Readonly Immutable" (#102
|
||||||
are jammed against the (base62) crypto bits it may be difficult to tell
|
comment 12). If these are jammed against the (base62) crypto bits it may
|
||||||
where the prefix ends and the crypto bits begin, especially because the
|
be difficult to tell where the prefix ends and the crypto bits begin,
|
||||||
crypto bits will be using the same character set (`FRIDWM...`). It
|
especially because the crypto bits will be using the same character set
|
||||||
might be a good idea to separate the type prefix from the cryptobits:
|
(`FRIDWM...`). It might be a good idea to separate the type prefix
|
||||||
`FRI-cryptobits` or `FRI/cryptobits`.
|
from the cryptobits: `FRI-cryptobits` or `FRI/cryptobits`.
|
||||||
* in addition, tahoe URIs should be distinguishable from local filenames by
|
* in addition, tahoe URIs should be distinguishable from local filenames by
|
||||||
a CLI tool, so that `tahoe cp $CAP local/foo.txt` is unambiguous.
|
a CLI tool, so that `tahoe cp $CAP local/foo.txt` is unambiguous.
|
||||||
(unfortunately, the current practice of using "tahoe:" as a default alias
|
(unfortunately, the current practice of using "tahoe:" as a default alias
|
||||||
|
@ -110,15 +154,16 @@ established sense). To make them real, we need to:
|
||||||
trivial. Another way to think about this is that if our filecaps were
|
trivial. Another way to think about this is that if our filecaps were
|
||||||
verbose s-expressions, these caps could be expressed as "(readonly
|
verbose s-expressions, these caps could be expressed as "(readonly
|
||||||
(mutable cryptobits))" and "(directory (readonly (mutable cryptobits)))".
|
(mutable cryptobits))" and "(directory (readonly (mutable cryptobits)))".
|
||||||
* provide for verifycaps, repaircaps, and traversalcaps. Repaircaps in
|
* provide for verifycaps, repaircaps, and traversalcaps (#308, #217).
|
||||||
particular may require a grant of storage authority, which might entail a
|
Repaircaps in particular may require a grant of storage authority, which
|
||||||
cap format that can accept arbitrary extra non-hierarchical fields.
|
might entail a cap format that can accept arbitrary extra non-hierarchical
|
||||||
Appendcaps or "drop-box" writecaps might fall into this same space. But
|
fields. Appendcaps or "drop-box" writecaps might fall into this same
|
||||||
remember that URIs should identify objects, not the action that you want
|
space. But remember that URIs should identify objects, not the action that
|
||||||
to do on it: a webapi scheme may use a POST/PUT/DELETE method, or append a
|
you want to do on it: a webapi scheme may use a POST/PUT/DELETE method, or
|
||||||
t=json adverb, or alternatively encode the verb/adverb into the HTTP url
|
append a t=json adverb, or alternatively encode the verb/adverb into the
|
||||||
(think `GET .../filecap/json` or `PUT unlinked/ciphertext`), but
|
HTTP url (think `GET .../filecap/json` or ```PUT
|
||||||
these are independent of the underlying filecap.
|
unlinked/ciphertext```), but these are independent of the underlying
|
||||||
|
filecap.
|
||||||
* provide ciphertext access. Reading from a verifycap should give you
|
* provide ciphertext access. Reading from a verifycap should give you
|
||||||
ciphertext. It should be possible to upload ciphertext directly.
|
ciphertext. It should be possible to upload ciphertext directly.
|
||||||
* provide for a grid-identifier, possibly on the MSB end, e.g.
|
* provide for a grid-identifier, possibly on the MSB end, e.g.
|
||||||
|
@ -127,36 +172,5 @@ established sense). To make them real, we need to:
|
||||||
mean `tahoe://grid1234/IR/cryptobits`. Something like
|
mean `tahoe://grid1234/IR/cryptobits`. Something like
|
||||||
`tahoe://grid1234/D/MR/cryptobits` should reference
|
`tahoe://grid1234/D/MR/cryptobits` should reference
|
||||||
`tahoe://grid1234/MR/cryptobits`. (#403)
|
`tahoe://grid1234/MR/cryptobits`. (#403)
|
||||||
* #102 and #217 have notes on dircaps
|
* permit multiple encodings of the same file (same k, different N) to use
|
||||||
* #678 (converge same file, same K, different M)
|
each other's shares (#678)
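
To make the self-identifying and grid-identifier bullets above more concrete, here is a minimal sketch of how a tool might read such strings. The grammar is an assumption built from the examples on this page (`FRI-cryptobits`, `tahoe://grid1234/D/MR/cryptobits`), not a decided format, and `describe_cap` and `underlying_filecap` are hypothetical names.

```python
import re

# Assumed grammar for the "FRI-cryptobits" style: kind (F/D), access (R/W),
# mutability (I/M), a "-" separator, then base62 crypto bits.
# It does not check whether a given letter combination makes sense.
CAP_RE = re.compile(r"^([FD])([RW])([IM])-([0-9A-Za-z]+)$")

KIND = {"F": "file", "D": "directory"}
ACCESS = {"R": "read-only", "W": "read-write"}
MUTABILITY = {"I": "immutable", "M": "mutable"}


def describe_cap(text):
    """Return (kind, access, mutability, cryptobits), or None if not a cap.

    A None result lets a CLI tool treat the argument as a local filename.
    """
    match = CAP_RE.match(text)
    if match is None:
        return None
    kind, access, mutability, bits = match.groups()
    return (KIND[kind], ACCESS[access], MUTABILITY[mutability], bits)


def underlying_filecap(uri):
    """Strip the directory wrapper from a grid-qualified cap.

    tahoe://grid1234/D/MR/cryptobits -> tahoe://grid1234/MR/cryptobits
    """
    prefix = "tahoe://"
    if not uri.startswith(prefix):
        raise ValueError("not a tahoe:// URI: %r" % (uri,))
    parts = uri[len(prefix):].split("/")
    # Expected layout: [grid, "D", type, cryptobits] for a directory cap.
    if len(parts) == 4 and parts[1] == "D":
        del parts[1]
    return prefix + "/".join(parts)


if __name__ == "__main__":
    print(describe_cap("FRI-abc123"))
    print(describe_cap("local/foo.txt"))
    print(underlying_filecap("tahoe://grid1234/D/MR/cryptobits"))
```

The separator after the type prefix is what makes the split unambiguous here: without it, the prefix letters would be indistinguishable from the base62 characters that follow.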
|
||||||
|
|
||||||
## filecap length
|
|
||||||
|
|
||||||
We want filecaps to be as short as possible, but no shorter. There are
|
|
||||||
several lower bounds on the length:
|
|
||||||
|
|
||||||
* confidentiality: A large computing effort should not be able
|
|
||||||
to obtain the plaintext of a tahoe file without knowing the
|
|
||||||
readcap. We require reasonable margin against improvements in
|
|
||||||
hardware speed and organization efficiency/motivation of
|
|
||||||
distributed efforts (e.g. could a million PS3 owners break a
|
|
||||||
filecap?). This currently implies a 128 bit confidentiality
|
|
||||||
parameter.
|
|
||||||
* integrity: a large computing effort should not be able to
|
|
||||||
produce shares which will be accepted by the readcap holder
|
|
||||||
but which do not result in the same file as created by the
|
|
||||||
original uploader (and retrieved by other downloaders). We
|
|
||||||
desire all three of the standard hash properties (collision
|
|
||||||
resistance, first-pre-image resistance, second-pre-image
|
|
||||||
resistance) to also apply to tahoe immutable files and their
|
|
||||||
filecaps. This currently implies a 128bit (or 256bit?) integrity
|
|
||||||
parameter.
|
|
||||||
* storage collision resistance (#753): a Tahoe grid should be
|
|
||||||
able to store trillions of files and still have a vanishingly
|
|
||||||
small chance of two files using the same storage-index (and
|
|
||||||
thus confusing each other's shares). The storage-index is
|
|
||||||
generally compressed out of the filecap, by deriving it with
|
|
||||||
various hashing stages on the other filecap parameters. The
|
|
||||||
shortest value in this derivation chain must be at least
|
|
||||||
128bits long, and preferably about 192bits long.
|
|
||||||
|
|
||||||
|
|