improve alacrity by downloading only the part of the Merkle Tree that you need #800

Closed
opened 2009-09-04 02:45:38 +00:00 by zooko · 8 comments
zooko commented 2009-09-04 02:45:38 +00:00
Owner

The downloader currently reads the entire Merkle Tree over the blocks when it needs only a log-N-sized subset of it. Here is the function `ReadBucketProxy._get_block_hashes()`, which is told by its caller which hashes are needed but then dumbly goes ahead and downloads the entire set: source:src/allmydata/immutable/layout.py@4048#L415. Here is the caller, which figures out exactly which subset of the Merkle Tree it needs, asks the `ReadBucketProxy` for it, and then stores all the hashes it got but didn't need: source:src/allmydata/immutable/download.py@4054#L334. (Good thing it stores them, because it usually *does* need them later, when it proceeds to download the rest of the file.)

This issue was mentioned in this mailing list thread:

http://allmydata.org/pipermail/tahoe-dev/2009-September/002779.html

Ticket #670 mentioned this issue.
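For concreteness, here is a minimal sketch (not Tahoe's actual `allmydata.hashtree` API) of how the log-N-sized subset can be computed for a single block, assuming the usual flat-array layout for a complete binary Merkle tree:

```python
def needed_hash_indices(leaf_index, num_leaves):
    """Return the flat-array indices of the Merkle-tree nodes required to
    verify one leaf: the siblings of every node on the path from that leaf
    up to (but not including) the root.

    Assumes the standard flat layout for a complete binary tree: the root
    is node 0, node i has children 2*i+1 and 2*i+2, and the leaves are
    padded out to the next power of two.  Illustrative only.
    """
    # Round the leaf count up to a power of two; the real tree pads with
    # empty leaves so that every internal node has two children.
    size = 1
    while size < num_leaves:
        size *= 2
    first_leaf = size - 1            # index of leaf 0 in the flat array
    node = first_leaf + leaf_index
    needed = set()
    while node != 0:
        sibling = node + 1 if node % 2 == 1 else node - 1
        needed.add(sibling)
        node = (node - 1) // 2       # move up to the parent
    return needed                    # O(log N) indices, not the whole tree

# Example: verifying block 5 of a 16-block share needs only 4 of the 31 nodes.
# print(sorted(needed_hash_indices(5, 16)))  -> [2, 3, 10, 19]
```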

tahoe-lafs added the code, major, enhancement, 1.5.0 labels 2009-09-04 02:45:38 +00:00
tahoe-lafs added this to the undecided milestone 2009-09-04 02:45:38 +00:00
zooko commented 2009-09-04 02:47:16 +00:00
Author
Owner

I'm marking this as "easy". :-)

kevan commented 2009-09-19 10:41:43 +00:00
Author
Owner

In playing around with this, I noticed that if I change the logic in http://allmydata.org/trac/tahoe/browser/src/allmydata/immutable/layout.py?rev=4a4a4f95202ec976#L415 to download only the requested hashes, a number of unit tests that enforce upper limits on the number of reads start to fail. This makes sense -- instead of downloading the entire hash tree at once with one read operation when we need any part of it, we download (with my modifications) each chunk of the hash tree in a separate read operation, so there will of course be more reads.

I probably need to look more deeply into how block hashes are used before coming up with an opinion on this; so I guess this is an FYI note for the moment.
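One way to keep the read count down would be to group the wanted hash indices into contiguous runs and issue one read per run rather than one per hash. A rough sketch, using a hypothetical helper rather than the code from the change being discussed:

```python
def contiguous_runs(indices):
    """Group a set of wanted hash indices into (start, count) runs, so that
    each run can be fetched with a single remote read instead of one read
    per hash.  Purely illustrative; not the actual patch.
    """
    runs = []
    for i in sorted(indices):
        if runs and i == runs[-1][0] + runs[-1][1]:
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((i, 1))                          # start a new run
    return runs

# e.g. {2, 3, 10, 19} -> [(2, 2), (10, 1), (19, 1)]: three reads instead of four.
```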

zooko commented 2009-10-27 04:34:17 +00:00
Author
Owner

I think it is okay for it to use more reads, so the test should be loosened to allow it to pass even if it does. The existence of that test of the number of reads does serve to remind me, however, that multiple small reads of the hash tree *would* actually be a performance loss for small files. We should do some more performance measurements. Perhaps heuristically over-reading by fetching a few more than the required number of hashes would be a win.
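To make the small-file tradeoff concrete, here is a back-of-the-envelope comparison of the whole block hash tree versus the per-block subset, assuming 32-byte hashes and power-of-two block counts; the actual per-read overhead is exactly what would need measuring:

```python
import math

HASH_SIZE = 32  # bytes per hash-tree node; assumes 32-byte hashes

def tree_sizes(num_blocks):
    """Compare the whole block hash tree with the log-N subset needed to
    verify a single block.  Assumes num_blocks is a power of two; rough
    illustration only."""
    whole = (2 * num_blocks - 1) * HASH_SIZE
    depth = max(1, math.ceil(math.log2(num_blocks)))
    subset = depth * HASH_SIZE
    return whole, subset

for n in (4, 16, 1024):
    whole, subset = tree_sizes(n)
    print("%5d blocks: whole tree %6d B, per-block subset %4d B" % (n, whole, subset))

# At 4 blocks the whole tree is only 224 bytes, so extra round trips to save
# ~160 bytes are likely a net loss; at 1024 blocks the subset wins decisively.
```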

zooko commented 2009-12-04 05:02:09 +00:00
Author
Owner

#442 was a duplicate of this.

zooko commented 2010-02-10 04:20:45 +00:00
Author
Owner

This may fit into Brian's New Downloader so I'm assigning this ticket to him to get his attention. If it is much later than February 9, and Brian hasn't clicked "accept" on this ticket, then you can safely assume he isn't currently working on it and you can click "accept" on it yourself to indicate that you are working on it.

tahoe-lafs modified the milestone from undecided to 1.7.0 2010-02-27 06:41:55 +00:00
zooko commented 2010-05-08 19:47:36 +00:00
Author
Owner

If you like this ticket, you might also like the "Brian's New Downloader" bundle of tickets: #605 (two-hour delay to connect to a grid from Win32, if there are many storage servers unreachable), #798 (improve random-access download to retrieve/decrypt less data), #809 (Measure how segment size affects upload/download speed.), #287 (download: tolerate lost or missing servers), and #448 (download: speak to as few servers as possible).

zooko commented 2010-05-08 22:46:40 +00:00
Author
Owner

Brian's New Downloader is now planned for v1.8.0.

tahoe-lafs modified the milestone from 1.7.0 to 1.8.0 2010-05-08 22:46:40 +00:00
warner commented 2010-08-14 18:18:51 +00:00
Author
Owner

#798 (which has landed) includes this feature. Specifically, source:src/allmydata/immutable/downloader/share.py@4688#L681 (`Share._desire_block_hashes`) asks the partially-filled hashtree which nodes it needs, and only sends read requests for those.

The small reads are only coalesced if they are adjacent: except for the first few kB of the share, the downloader does not read extra not-always-needed data for the purpose of reducing the number of remote `read()` messages. That might be a nice feature to have post-1.8.0; we need to measure the performance tradeoffs: each `read()` message probably carries about 30-40 bytes of overhead, so I'd expect coalescing gaps of more than that to be a net loss. Adding a multi-segment `readv()` message to the remote-read protocol might help, but is more disruptive.
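A rough sketch of what such gap-aware coalescing might look like, with the ~40-byte threshold taken from the overhead estimate above (illustrative only, not the downloader's actual code):

```python
def coalesce(ranges, max_gap=40):
    """Merge (offset, length) read requests when the gap between them is no
    larger than max_gap bytes, trading a little wasted data for fewer remote
    read() messages.  Illustrative sketch only.
    """
    merged = []
    for offset, length in sorted(ranges):
        if merged and offset - (merged[-1][0] + merged[-1][1]) <= max_gap:
            # Close enough to the previous range: extend it to cover this one.
            end = max(merged[-1][0] + merged[-1][1], offset + length)
            merged[-1] = (merged[-1][0], end - merged[-1][0])
        else:
            merged.append((offset, length))
    return merged

# e.g. [(0, 32), (64, 32), (500, 32)] -> [(0, 96), (500, 32)]
```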

So I'm closing this ticket.

tahoe-lafs added the fixed label 2010-08-14 18:18:51 +00:00
warner closed this issue 2010-08-14 18:18:51 +00:00