distributed authorization of access to nodes

warner commented

2008-01-30 21:03:41 +00:00

Owner

Zooko and I just got off the phone, and we sketched out a proposal for a
distributed introduction scheme that can accomplish the following goals:

* no single Introducer: all nodes can act as introducers, if they wish. This
  improves reliability.
* it becomes easy for nodes to *not* run a storage server, which is what
  we want for client nodes
* clients will only use storage servers that have been "blessed", to allow
  clients of the commercial grid to be confident that their shares are
  placed on reliable+available servers
* (eventually) storage servers will only accept shares from clients
  that have been "blessed", to allow the commercial grid to enforce quotas
  and reject data from non-customers

The idea is that the introduction mechanism is uncontrolled and gossip-based,
making it highly robust but unsuitable for enforcing access control. An
additional layer of public-key signatures is used to filter down the
unmanaged server list into a set of "blessed" servers.

This builds upon the scheme described in #68. Each Introducer has a publish
and subscribe interface that deals in terms of "announcements". Each
announcement is a tuple of (FURL, Purpose, Blessing). The 'Purpose' might be
a string like "storage". The 'Blessing' is a public-key signature (by some
specific "blesser" private key) of the (FURL+purpose) pair, which might
include a timestamp, validity period, revocation info, etc.
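
As a rough illustration of the announcement tuple, here is a minimal sketch of what a blessing might cover. Everything named here (`Announcement`, `signed_payload`, `bless`, `is_blessed`, the length-prefixed encoding, the example FURL) is hypothetical and not taken from the Tahoe code. The signing step uses a shared-secret HMAC from the standard library purely as a stand-in so the example runs; the proposal calls for a real public-key signature, which HMAC is not.

```python
import hashlib
import hmac
from typing import NamedTuple


class Announcement(NamedTuple):
    furl: str        # where the announced service can be reached
    purpose: str     # e.g. "storage"
    blessing: bytes  # signature over (furl, purpose) by a blesser key


def signed_payload(furl: str, purpose: str) -> bytes:
    """Bytes the blesser signs. Fields are length-prefixed so that
    ("ab", "c") and ("a", "bc") cannot produce the same payload."""
    parts = [furl.encode("utf-8"), purpose.encode("utf-8")]
    return b"".join(len(p).to_bytes(4, "big") + p for p in parts)


def toy_sign(key: bytes, payload: bytes) -> bytes:
    # STAND-IN ONLY: HMAC is a shared-secret MAC, not a public-key signature.
    return hmac.new(key, payload, hashlib.sha256).digest()


def bless(key: bytes, furl: str, purpose: str) -> Announcement:
    """What a blesser would do: sign the pair and return the full tuple."""
    return Announcement(furl, purpose, toy_sign(key, signed_payload(furl, purpose)))


def is_blessed(ann: Announcement, key: bytes) -> bool:
    """What a subscriber would do: recompute the payload and check the blessing."""
    expected = toy_sign(key, signed_payload(ann.furl, ann.purpose))
    return hmac.compare_digest(expected, ann.blessing)
```

Because the purpose is part of the signed payload, a blessing issued for "storage" does not validate if the announcement is replayed under a different purpose. A timestamp or validity period would be added to the payload in the same way.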

Subscribers will typically be connected to several Introducers. They'll tell
each one that they are interested in hearing about all announcements for a
specific Purpose which are signed by a set of blesser public keys. As an
efficiency measure, Introducers will only forward announcements that match
those restrictions, but the subscriber is ultimately responsible for checking
the signatures themselves.
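
The split of responsibilities above could be sketched roughly as follows. This is an assumption-laden illustration, not the proposed wire format: the `blesser_id` field (a label for the key that claims to have signed) is added so the Introducer can filter without verifying anything, and `verify` is a caller-supplied function standing in for real public-key signature checking.

```python
from typing import Callable, FrozenSet, Iterable, List, NamedTuple


class Announcement(NamedTuple):
    furl: str
    purpose: str
    blesser_id: str  # hypothetical: names the key that claims to have signed
    blessing: bytes


class Subscription(NamedTuple):
    purpose: str
    blesser_ids: FrozenSet[str]


def introducer_filter(anns: Iterable[Announcement],
                      sub: Subscription) -> List[Announcement]:
    """Cheap pre-filter run by the Introducer. It checks no signatures,
    so a forged announcement with the right labels still gets through."""
    return [a for a in anns
            if a.purpose == sub.purpose and a.blesser_id in sub.blesser_ids]


def subscriber_accept(anns: Iterable[Announcement],
                      sub: Subscription,
                      verify: Callable[[Announcement], bool]) -> List[Announcement]:
    """Authoritative check run by the subscriber: re-apply the filter
    (the Introducer is not trusted) and then verify each blessing."""
    return [a for a in introducer_filter(anns, sub) if verify(a)]
```

The subscriber re-runs the filter itself because a misbehaving Introducer could forward anything; the Introducer-side filter only saves bandwidth.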

As described in #68, Introducers subscribe to each other and are
announced to each other just like all other servers, and anybody can publish
announcements into the mesh. As a result, any one Introducer FURL is
sufficient to bootstrap a connection to many/all of them, and peer-to-peer
gossip will provide some sort of limited-flood broadcast of all server
announcements.
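
One way to picture the limited-flood broadcast is the in-memory sketch below. The `Introducer` class, the `hops_left` limit, and the direct method calls between peers are all illustrative assumptions; real Introducers would talk over Foolscap connections and carry full announcement tuples rather than strings.

```python
from typing import List, Set


class Introducer:
    def __init__(self, name: str) -> None:
        self.name = name
        self.peers: List["Introducer"] = []
        self.seen: Set[str] = set()

    def publish(self, announcement: str, hops_left: int = 8) -> None:
        """Record an announcement and flood it to peers, at most once per
        node and only while the hop budget lasts."""
        if announcement in self.seen:
            return  # already handled: this is what stops loops in the mesh
        self.seen.add(announcement)
        if hops_left <= 0:
            return  # budget exhausted: remember it but do not forward
        for peer in self.peers:
            peer.publish(announcement, hops_left - 1)


def connect(a: Introducer, b: Introducer) -> None:
    """Introducers subscribe to each other, so links are bidirectional."""
    a.peers.append(b)
    b.peers.append(a)
```

With four Introducers connected in a ring, publishing to any one of them reaches all four, and the `seen` set keeps the announcement from circulating forever.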

Clients will be configured with a set of "blesser pubkeys" for each purpose.
Clients will use all storage servers that arrive with correctly signed
announcements.

(eventually) To protect storage servers against unauthorized clients, we'll
change the meaning of the server FURL to point to a credential-accepting
interface. Clients will present credentials to the server to convince it to
provide a client-specific reference (the "personal storage server facet"),
and then use that facet to upload share data. These credentials will come in
the form of a signed message (TODO: signing what, exactly? a tubid? a
callback FURL? how should we identify the "sender" of a message? we must make
sure that the server can't steal the credentials and use them on other
storage servers) that blesses the client's public key as authorized to
consume storage space (and bound to an "account ID", for accounting
purposes). The server will be configured with a set of client blesser pubkeys
for this purpose.

The server will also be configured with the FURL of the server-blesser. All
storage servers for the commercial grid will be configured with the same
FURL. When the storage server boots, it submits its server FURL to the
blesser, and gets back the blessing (signed message). After that, it can
submit the pair to the introducers. The client can be configured with an
analogous client-blesser.

This means all servers will be configured with two FURLs: one that gives them
access to the introduction mesh, and a second that distinguishes them from
non-blessed storage servers. Clients get the introducer FURL and an account
FURL. All nodes get the same introducer data, but not the blessing FURLs.

There is probably a way to express this in strict ocap terms, with Sealers
and Unsealers, but right now we seem to have a better handle on the
public-key approach.

tahoe-lafs added the

labels 2008-01-30 21:03:41 +00:00

tahoe-lafs added this to the eventually milestone 2008-01-30 21:03:41 +00:00

warner commented

2008-01-30 21:24:04 +00:00

Author

Owner

The immediate questions are:

* how much engineering time will this take?
* is this the best way to accomplish our short-term (0.9.0) goals?:
  1. make it easy to not publish a storage server
  2. build a commercial grid in which the storage servers are all run by allmydata.com
  3. avoid forcing ourselves into centralization-heavy design corners

warner commented

2008-01-30 21:30:15 +00:00

Author

Owner

Oh, and also I think we really need to nail down "grid ID" issues as we build this. Moving to gossip-based distributed introduction makes it all the more important to allow clean separation between one grid and a different one. Experience with the previous allmydata.com architecture emphasized this point: several problems were attributed to a node on the test network managing to announce itself to the production network.

I don't even know how to define a "grid ID".. a random number? With a centralized introducer, it's easy.. just use the FURL or tubid of that one introducer. With decentralized introduction, I can't think of anything better to use than a random (i.e. hopefully unique) string.

This would suggest that all introduction messages include the grid id in them, and the publish interface should completely ignore any messages that use a different grid id than its own. I think it's reasonable to declare that no node exists on multiple grids at once (or at least make this a special case, that must be enabled separately.. I can imagine wanting "bridges" between otherwise-distinct grids, for scaling purposes, especially if filenode/dirnode caps included a grid id, which could point you to a suitable bridge to retrieve the data).
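
A minimal sketch of that rule, assuming the grid id is a random hex string and that every message carries it in a dedicated field. The names `new_grid_id`, `Message`, and `PublishInterface` are made up for illustration.

```python
import secrets
from typing import List, NamedTuple


def new_grid_id() -> str:
    """A random (i.e. hopefully unique) grid id: 128 bits as 32 hex digits."""
    return secrets.token_hex(16)


class Message(NamedTuple):
    grid_id: str
    body: str


class PublishInterface:
    def __init__(self, grid_id: str) -> None:
        self.grid_id = grid_id
        self.accepted: List[Message] = []

    def publish(self, msg: Message) -> bool:
        """Accept a message only if it names this grid; silently drop the rest."""
        if msg.grid_id != self.grid_id:
            return False
        self.accepted.append(msg)
        return True
```

A node from a test grid that reaches a production Introducer would carry the wrong grid id and be ignored at the publish interface. This only guards against accidents, since nothing stops a node from copying the id.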

zooko commented

2008-01-31 13:17:30 +00:00

Author

Owner

Hm. I think maybe the public key of the storage-server-blesser is the grid id. If the problem was nodes on the test network managing to announce themselves to the production network, then a solution is to require a public key signature from the production network blesser.

warner commented

2008-01-31 19:19:38 +00:00

Author

Owner

hm. My first reaction is "why should there only be one storage-server-blesser?". I.e., there might be a friendnet-type scenario in which everybody is running their own storage-server-blesser, and they achieve uniformity by accepting all the blessers at once. OTOH, it might be easier to accomplish this case by having just one blesser key and sharing the privkey with everyone in the grid.

zooko commented

2008-01-31 19:57:09 +00:00

Author

Owner

I see no reason why you wouldn't want to have more than one storage-server-blesser.

But still, wouldn't requiring a blessing from at least one of your storage-server-blessers be sufficient to solve this problem of errant nodes wandering into parties where they aren't welcome?

zooko commented

2008-04-29 21:44:43 +00:00

Author

Owner

To make it easier to deploy this, we should update clients to ignore parts of announcements which they don't know how to deal with:

http://allmydata.org/trac/tahoe/browser/src/allmydata/introducer.py?rev=55dfb697a448dbc7#L291

            (furl, service_name, ri_name, nickname, ver, oldest) = ann

could be changed to

            (furl, service_name, ri_name, nickname, ver, oldest) = ann[:6]

In the future we might prefer to use named arguments instead of positional, for example the ann[6] element could be a dict and could be used to hold arguments by name for future evolutions of introduction.
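
Both ideas (slice off the six known fields, treat a seventh element as a dict of named arguments) could be combined along these lines. The function name `parse_announcement` and the rule that a non-dict seventh element is ignored are assumptions for this sketch, not what introducer.py does.

```python
from typing import Any, Dict, Sequence, Tuple


def parse_announcement(ann: Sequence[Any]) -> Tuple[tuple, Dict[str, Any]]:
    """Unpack the six positional fields and ignore anything after them,
    except that ann[6], if present and a dict, is returned as named extras."""
    # Slicing tolerates longer announcements from newer publishers.
    (furl, service_name, ri_name, nickname, ver, oldest) = ann[:6]
    extras: Dict[str, Any] = {}
    if len(ann) > 6 and isinstance(ann[6], dict):
        extras = ann[6]
    return (furl, service_name, ri_name, nickname, ver, oldest), extras
```

An old-style six-element announcement parses with empty extras, while a longer one from a newer publisher parses without error and exposes any named arguments it carries. An announcement with fewer than six elements still raises `ValueError` on unpacking.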

tahoe-lafs modified the milestone from eventually to undecided

2008-06-01 20:48:40 +00:00

zooko commented

2008-09-24 13:22:43 +00:00

Author

Owner

Is this a duplicate of #466?

warner commented

2008-09-24 17:20:22 +00:00

Author

Owner

not exactly. Let's say that #295 is about a distributed introducer, while #466 is about signed/blessed extendable announcements. I'm changing the summary to match.

tahoe-lafs changed title from ~~distributed introduction and public-key-based server blessing~~ to distributed introduction: robust, gossip-based

2008-09-24 17:20:22 +00:00

zooko commented

2010-03-12 22:32:52 +00:00

Author

Owner

Changing the name to reflect my understanding of #68 as being about distributed introducer and this ticket as being about distributed control of access to nodes.

tahoe-lafs changed title from ~~distributed introduction: robust, gossip-based~~ to distributed control of access to nodes

2010-03-12 22:32:52 +00:00

tahoe-lafs changed title from ~~distributed control of access to nodes~~ to distributed authorization of access to nodes

2010-04-06 15:08:33 +00:00

distributed authorization of access to nodes #295