stop grovelling the whole storage backend looking for externally-added shares to add a lease to #1835


Open

opened 2012-10-30 23:02:37 +00:00 by zooko · 7 comments

zooko commented

2012-10-30 23:02:37 +00:00

Currently, storage server operators can manually add share files into the storage backend, such as with "mv" or "rsync" or what have you, and a crawler will eventually discover that share and add a lease to it.

I propose that we stop supporting this method of installing shares. If we stop supporting this, that would leave three options for if you want to add a share to a server:

1. Send it through the front door: use a tool that speaks the LAFS protocol, connects to the storage server over a network socket, and delivers the share. This will make the server write the share out to persistent storage, and also update the leasedb to reflect the share's existence, so that the share can get garbage-collected when appropriate. This would be a good way to do it if you have few shares or if they are on a remote server that can connect to this storage server over a network.
2. Copy the shares directly into place in the storage backend and then remove the leasedb. The next time the storage server starts, it will initiate a crawl that will eventually reconstruct the leasedb, and the newly reconstructed leasedb will include lease information about the new share so that it can eventually be garbage collected. This might be a reasonable thing to do when you are adding a large number of shares and it is easier/more efficient for you to add them directly to the storage backend, and you don't mind temporarily losing the lease information on the shares that are already there.
3. Copy the shares into place, but don't do anything that would register them in the leasedb. They are now immortal, unless a client subsequently adds a lease to them.

The combination of these two options *might* suffice for most real use cases. If there are use cases where these aren't good enough, i.e. it is too inconvenient or slow to send all of the shares through the LAFS storage protocol, and you don't want to destroy the extant lease information, and you don't want the new shares to possibly become immortal, then we could invent other ways to do it:

4. Copy the shares into place and then use a newly added feature of storage server which tells it to notice the existence of each new share (by storage index). This newly added feature doesn't need to be exported over the network to remote foolscap clients, it could just be a "tahoe" command-line that connects to the storage server's local WAPI. What the server does when it is informed this way about the existence of a share is check if the share is really there and then add it to the leasedb.
5. Copy the shares into place and then use a newly added feature of storage server which performs a full crawl to update the leasedb without first deleting it.

4 would be a bit more efficient than 5 when used, but a lot more complication for the server administrator, who has to figure out how to call `tahoe add-share-to-lease-db $STORAGEINDEX` for each share that he's added, or else that share will be immortal. It is also more work for us to implement.

5 is really simple both for us to implement and storage server operators to use. It is exactly like the current crawler code, except that instead of continuously restarting itself and going to look for new shares, it quiesces and doesn't restart unless the server operator invokes `tahoe resync-lease-db`.

So my proposal boils down to: change the accounting crawler never to run unless the leasedb is missing or corrupted (which also happens the first time you upgrade your server to a leasedb-capable version), or unless the operator has specifically indicated that the accounting crawler should run.

This is part of an "overarching ticket" to eliminate most uses of crawler: ticket #1834.
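The rule the proposal boils down to can be sketched as a small decision function. This is only an illustration, not code from the Tahoe-LAFS tree: `LeaseDBState` and `should_run_accounting_crawler` are hypothetical names, and how the server would actually detect a missing or corrupted leasedb is left out.

```python
from enum import Enum


class LeaseDBState(Enum):
    """Hypothetical summary of what the server found at startup."""
    OK = "ok"
    MISSING = "missing"      # also the case on first upgrade to a leasedb-capable version
    CORRUPTED = "corrupted"


def should_run_accounting_crawler(db_state: LeaseDBState,
                                  operator_requested: bool) -> bool:
    """Return True only when a full crawl of the backend is warranted.

    The crawler runs if the leasedb has to be rebuilt, or if the operator
    explicitly asked for a resync; otherwise it stays quiescent.
    """
    if db_state in (LeaseDBState.MISSING, LeaseDBState.CORRUPTED):
        return True
    return operator_requested
```

Under this sketch, a healthy leasedb with no operator request never triggers a crawl, which is the behaviour change being proposed.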

zooko added the

labels 2012-10-30 23:02:37 +00:00

zooko added this to the undecided milestone 2012-10-30 23:02:37 +00:00

daira commented

2012-10-31 00:02:40 +00:00

Replying to zooko:

> Currently, storage server operators can manually add share files into the storage backend, such as with "mv" or "rsync" or what have you, and a crawler will eventually discover that share and add a lease to it.
>
> I propose that we stop supporting this method of installing shares. If we stop supporting this, that would leave three options for if you want to add a share to a server:
>
> 1. Send it through the front door: use a tool that speaks the LAFS protocol, connects to the storage server over a network socket, and delivers the share. [...]
> 2. Copy the shares directly into place in the storage backend and then remove the leasedb.

I don't like this option because it unnecessarily loses accounting information.

> 3. Copy the shares into place, but don't do anything that would register them in the leasedb. They are now immortal, unless a client subsequently adds a lease to them.

The fact that not doing anything to register the existence of the share is safe (doesn't lose data) is a useful property.

Note that this would only work if a server still queries the backend for a share even if it does not exist in the leasedb, rather than taking the leasedb as authoritative.

Perhaps if the share is ever requested, the server could then notice that it exists and add it to the leasedb. In that case, doing a filecheck on that file would be sufficient.

> 4. Copy the shares into place and then use a newly added feature of storage server which tells it to notice the existence of each new share (by storage index). This newly added feature doesn't need to be exported over the network to remote foolscap clients, it could just be a "tahoe" command-line that connects to the storage server's local WAPI.

Note that currently, running a WAPI is optional for storage servers.

> 5. Copy the shares into place and then use a newly added feature of storage server which performs a full crawl to update the leasedb without first deleting it.

Yes.

> 4 would be a bit more efficient than 5 when used, but a lot more complication for the server administrator, who has to figure out how to call `tahoe add-share-to-lease-db $STORAGEINDEX` for each share that he's added, or else that share will be immortal. It is also more work for us to implement.

I think the variant where requesting the share is sufficient to make the server notice it is simpler.

> 5 is really simple both for us to implement and storage server operators to use. It is exactly like the current crawler code, except that instead of continuously restarting itself and going to look for new shares, it quiesces and doesn't restart unless the server operator invokes `tahoe resync-lease-db`.

We can support that as well, subject to the caveat that the storage server WAPI is optional.

zooko commented

2012-10-31 07:35:51 +00:00

Author

Very good points, David-Sarah. I agree with everything you said in comment:1. I hadn't thought of the option of treating the leasedb as authoritative, but now that you mentioned it, I like that option because it is very efficient. Especially in the cloud-backend case where querying the leasedb is a purely local, synchronous, very fast, and zero-dollar-cost operation, but querying the storage backend is asynchronous (which complicates code), slow, and might even have a (very small) monetary cost.

Unfortunately, if we *did* treat the leasedb as authoritative for the existence of a share, then the approach (which I like after reading your comment:1) would not work: using a filecheck through the storage server's foolscap interface as the way to alert the server to the existence of an externally-imported share.

Hrm… ☹

I kind of think it might be worth it, to get the improved efficiency and performance of relying on the leasedb as the authority for share existence, to separate a LAFS client saying "Please let me know if you have this share!" from a server operator saying "I know more than you know about this: go look and you might find out that you have this share now!".
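To make that separation concrete, here is a rough sketch in which the leasedb is authoritative for client queries and only an operator-initiated call ever touches the backend. Everything in it is hypothetical: `StorageServerSketch`, `client_has_share`, and `operator_notice_share` are illustrative names, the backend and leasedb are modelled as in-memory sets of storage indexes, and a real cloud backend query would be asynchronous.

```python
class StorageServerSketch:
    """Toy model: leasedb is authoritative for share existence."""

    def __init__(self, backend_shares):
        # Storage indexes physically present in the backend (stand-in for disk/S3).
        self.backend = set(backend_shares)
        # Storage indexes the leasedb knows about.
        self.leasedb = set()
        # Counts the slow, possibly billed backend lookups.
        self.backend_queries = 0

    def _backend_has(self, storage_index):
        self.backend_queries += 1
        return storage_index in self.backend

    def client_has_share(self, storage_index):
        """Client path ("do you have this share?"): consult the leasedb only."""
        return storage_index in self.leasedb

    def operator_notice_share(self, storage_index):
        """Operator path ("go look, you may have this share now").

        Checks that the share is really in the backend, then registers it
        in the leasedb. Returns True if the share was found.
        """
        if self._backend_has(storage_index):
            self.leasedb.add(storage_index)
            return True
        return False
```

In this model an externally copied share stays invisible to clients until the operator announces it, and client queries never cost a backend request.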

daira commented

2012-10-31 13:55:23 +00:00

Currently the server always does a list query to the backend. The leasedb allows us to skip that list query in the case where the share is present in the DB. If the leasedb is not authoritative, then we still do the query in the case where the share is not present in the DB, but this only prevents us from improving the latency of reporting that a server does not have a share. So, given that the downloader uses the first k servers to respond to a DYHB, it does not affect the performance of a (successful) download.

Falling back to the list query when a share is not in the DB does increase complexity, though.
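A minimal sketch of the non-authoritative lookup described above, assuming the leasedb and backend can be modelled as in-memory sets. `FallbackLookupSketch` and its methods are hypothetical names; registering a share when it is found on request is the variant suggested in comment:1, not existing behaviour.

```python
class FallbackLookupSketch:
    """Toy model: the leasedb is a fast path, the backend is the fallback."""

    def __init__(self, backend_shares):
        self.backend = set(backend_shares)  # shares physically in the backend
        self.leasedb = set()                # shares the leasedb knows about
        self.list_queries = 0               # number of backend list queries issued

    def _list_backend(self, storage_index):
        self.list_queries += 1
        return storage_index in self.backend

    def has_share(self, storage_index):
        # Fast path: a share present in the DB needs no list query.
        if storage_index in self.leasedb:
            return True
        # Slow path: the DB does not know the share, so ask the backend.
        found = self._list_backend(storage_index)
        if found:
            # Notice the externally added share and register it.
            self.leasedb.add(storage_index)
        return found
```

In this model only the negative answer keeps paying for a list query on every request, which is the latency cost discussed above; positive answers become DB-only after the first lookup.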


daira commented

2014-02-28 13:38:56 +00:00

Replying to davidsarah:

> Currently the server always does a list query to the backend. The leasedb allows us to skip that list query in the case where the share is present in the DB. If the leasedb is not authoritative, then we still do the query in the case where the share is not present in the DB, but this only prevents us from improving the latency of reporting that a server does *not* have a share. So, given that the downloader uses the first k servers to respond to a DYHB, it does not affect the performance of a (successful) download.

See /tahoe-lafs/trac/issues/26231#comment:29 for more information about what the downloader does. I think it may wait for the 10 second timeout if there are servers that haven't responded, rather than proceeding immediately after the first k servers have responded -- in which case, my above argument isn't valid unless that is fixed.

dquintela commented

2014-12-23 10:41:40 +00:00

Owner

Hello, first-time Tahoe user here.

I've been testing cloud storage for 3 or 4 days (branch 2237-cloud-backend-s4) on my Raspberry Pi, running Raspbian.
Aside from the very long startup times, I noticed that .tahoe/logs/twistd.log is filling up with lines like the ones below,
which seems to indicate that the shares crawler is being run too often. This has the nasty side effect that the Amazon billing page
already shows 80000 requests for me. As a rough estimate, this could mean 3 to 5 USD per month for an idle storage node alone.

This seems to be related to #1835 and #1886; sorry for the cross-posting.

2014-12-22 15:48:37+0000 [-] Starting factory <HTTPClientFactory: http://bucket_identifier.s3.amazonaws.com/?prefix=shares%2Fod%2F>
2014-12-22 15:48:37+0000 HTTPPageGetter,client Stopping factory <HTTPClientFactory: http://bucket_identifier.s3.amazonaws.com/?prefix=shares%2Fod%2F>
2014-12-22 15:48:37+0000 [-] Starting factory <HTTPClientFactory: http://bucket_identifier.s3.amazonaws.com/?prefix=shares%2Foe%2F>
2014-12-22 15:48:38+0000 HTTPPageGetter,client Stopping factory <HTTPClientFactory: http://bucket_identifier.s3.amazonaws.com/?prefix=shares%2Foe%2F>
2014-12-22 15:48:38+0000 [-] Starting factory <HTTPClientFactory: http://bucket_identifier.s3.amazonaws.com/?prefix=shares%2Fof%2F>
2014-12-22 15:48:38+0000 HTTPPageGetter,client Stopping factory <HTTPClientFactory: http://bucket_identifier.s3.amazonaws.com/?prefix=shares%2Fof%2F>
2014-12-22 15:48:38+0000 [-] Starting factory <HTTPClientFactory: http://bucket_identifier.s3.amazonaws.com/?prefix=shares%2Fog%2F>
2014-12-22 15:48:38+0000 HTTPPageGetter,client Stopping factory <HTTPClientFactory: http://bucket_identifier.s3.amazonaws.com/?prefix=shares%2Fog%2F>
...
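The rough monthly estimate can be reproduced with a few lines of arithmetic. The price used here is an assumption, not something stated in this thread: 0.005 USD per 1,000 S3 LIST requests. The 30-day month and the function name `monthly_cost_usd` are also illustrative choices.

```python
# ASSUMPTION: price per 1,000 LIST requests in USD; check the current S3 price list.
ASSUMED_USD_PER_1000_LIST_REQUESTS = 0.005


def monthly_cost_usd(requests_seen, days_observed,
                     usd_per_1000=ASSUMED_USD_PER_1000_LIST_REQUESTS,
                     days_per_month=30):
    """Extrapolate an observed request count to a monthly request bill."""
    requests_per_day = requests_seen / days_observed
    requests_per_month = requests_per_day * days_per_month
    return requests_per_month * usd_per_1000 / 1000


# 80000 requests observed over 3 to 4 days, as reported above.
low = monthly_cost_usd(80000, days_observed=4)
high = monthly_cost_usd(80000, days_observed=3)
print(f"estimated monthly cost: {low:.2f} to {high:.2f} USD")
```

Under that assumed price, the extrapolation comes out at roughly 3 to 4 USD per month, in line with the estimate above.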


warner commented

2016-10-19 01:15:02 +00:00

I must admit that I've never actually used this feature in practice. I've had development cycles where I'd upload a file, then corrupt or delete or move shares around manually, then re-upload or repair to see what happened. I could imagine adding a new `tahoe debug` command to delete a share, or add one, that I could use for this sort of development work instead of relying on automatic discovery of sharefiles.

I originally wanted it so that sysadmins could feel comfortable treating shares as plain files (without associated magic), so they could e.g. migrate a server to a new machine with 'scp', or merge two servers, or merge a plain backup of the shares/ directory with shares that were added later, or something. Having real databases is a super-useful performance improvement, but it does give up on this "cp-based sysadmin" technique a bit. But I don't think I could argue that it's particularly important to keep it around.

I'd be ok with either requiring a specific 'add a foreign share' command, or maybe a magic directory that you drop the share files into, instead of expecting spontaneous discovery of new shares in their final location. I think I want it to be reasonably efficient for moving a large number of shares at once (so I suspect that pushing them in over HTTP wouldn't count).

lpirl commented

2016-10-19 07:36:33 +00:00

Vote for

* keeping the functionality around (e.g. merging servers is a nice use case)
* not doing expensive operations that the user/admin didn't ask for
* a command instead of a magic folder since it is discoverable via `--help`
