improve random-access download to retrieve/decrypt less data #798

Closed
opened 2009-08-27 06:29:56 +00:00 by warner · 30 comments
warner commented 2009-08-27 06:29:56 +00:00
Owner

Currently, using a Range: header on an HTTP GET (to fetch just a portion of a file, instead of the whole thing) causes the tahoe client to download the entire file into a tmpfile, then serve out just the portion that was requested. To make this faster, we should only fetch the segments that contain the desired range. Two changes need to happen to make this work:

  • the Downloader must be rewritten, to fetch segments on demand, to (sometimes) cache previously fetched segments, and to decrypt just the necessary data
  • pycryptopp needs to provide random-access AES processing, so we can decrypt data starting at some point other than the beginning of the file (pycryptopp#18)
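As a rough illustration of what random-access AES processing involves, here is a minimal sketch of the offset arithmetic, assuming a counter-style (CTR) mode with 16-byte blocks and a counter that starts at zero for byte 0 of the file. `ctr_position` is a hypothetical helper for illustration, not a pycryptopp function.

```python
AES_BLOCK_SIZE = 16  # bytes per AES block

def ctr_position(offset):
    """Return (block_index, skip) for a byte offset into the file.

    block_index: counter value at which to start generating keystream
                 (assumes the counter is 0 for the first block).
    skip:        number of keystream bytes to discard inside that block.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(offset, AES_BLOCK_SIZE)

print(ctr_position(1000))  # (62, 8)
```

With this, decrypting from byte 1000 would mean starting the keystream at block 62 and dropping its first 8 bytes, rather than processing the 992 bytes before it.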

The new Downloader should have a couple of layers:

  • top layer receives a read(offset, length) request (maybe even a fully-general readv)
  • that layer looks at the set of cached segments, tries to satisfy the request from cache
  • if not, submit requests for segment fetches to the lower layer
  • lower layer looks to see what servers it has available, which requests are already in flight
  • it sends out more requests, or prepares new servers (querying for share presence, fetching hash trees, fetching block data)
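The top-layer steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: fixed-size segments, a cache represented as a set of segment numbers, and hypothetical names (`segments_for_range`, `plan_read`) that are not taken from any patch on this ticket.

```python
def segments_for_range(offset, length, segment_size, file_size):
    """Return the segment numbers that cover bytes [offset, offset+length)."""
    if offset < 0 or length < 0 or segment_size <= 0:
        raise ValueError("bad arguments")
    end = min(offset + length, file_size)  # clamp the read to end-of-file
    if offset >= end:
        return []  # empty read, or read entirely past EOF
    first = offset // segment_size
    last = (end - 1) // segment_size
    return list(range(first, last + 1))

def plan_read(offset, length, segment_size, file_size, cached):
    """Split the needed segments into (cache hits, segments to fetch)."""
    needed = segments_for_range(offset, length, segment_size, file_size)
    hits = [s for s in needed if s in cached]
    misses = [s for s in needed if s not in cached]
    return hits, misses

# 128 KiB segments, 1 MB file, read 100000 bytes starting at byte 200000,
# with segment 1 already cached.
print(plan_read(200000, 100000, 128 * 1024, 1000000, {1}))  # ([1], [2])
```

Only the misses would be passed to the lower layer as segment-fetch requests.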

Similar code will be needed for MDMF mutable files, since those are specified to contain multiple segments, and we'll want random-access for them too.

tahoe-lafs added the
major
enhancement
1.5.0
labels 2009-08-27 06:29:56 +00:00
tahoe-lafs added this to the eventually milestone 2009-08-27 06:29:56 +00:00
tahoe-lafs added the
code-network
label 2009-12-04 04:56:40 +00:00
tahoe-lafs modified the milestone from eventually to 1.7.0 2010-02-02 03:33:07 +00:00
davidsarah commented 2010-02-02 03:49:45 +00:00
Author
Owner

Brian is writing the new Downloader.

warner commented 2010-03-11 19:35:25 +00:00
Author
Owner

Attachment new-downloader-v1.diff (124193 bytes) added

work-in-progress of new downloader, maybe 80% complete

davidsarah commented 2010-03-14 05:12:23 +00:00
Author
Owner

As discussed in #990, a positive security side-effect of this rewrite will be to avoid caching plaintext in order to satisfy byterange requests.

warner commented 2010-04-23 23:35:11 +00:00
Author
Owner

Attachment new-downloader-v2.diff (130758 bytes) added

latest WIP patch

warner commented 2010-04-26 09:53:14 +00:00
Author
Owner

Attachment new-downloader-v3.diff (145044 bytes) added

latest WIP patch, a few tests pass

warner commented 2010-04-28 16:16:55 +00:00
Author
Owner

Attachment new-downloader-v4.diff (153208 bytes) added

one system test works

warner commented 2010-04-28 16:32:23 +00:00
Author
Owner

In the v4 patch, SystemTest.test_filesystem now passes. I had to disable the "download status" checking code, since that part isn't implemented yet.

Next steps:

  • add download status: this requires rework, since we're no longer downloading whole files at a time, and the status-display code thinks in such terms
  • integration work:
    • split download2.py into different pieces. Probably move existing download.py's LiteralFileNode into a new literal.py, let the top-level (non-literal) ImmutableFileNode live in filenode.py, move most of the rest of download2.py (the ShareFinder and Share classes) into download.py
    • get rid of the old ImmutableFileNode class
  • write specific tests for new interesting cases: guessing offsets wrong, read requests fail, others
  • implement "response overdue" behavior
warner commented 2010-04-29 19:00:21 +00:00
Author
Owner

Attachment new-downloader-v5.diff (158411 bytes) added

rest of SystemTest now passes

warner commented 2010-05-01 02:37:16 +00:00
Author
Owner

Attachment new-downloader-v6.diff (169397 bytes) added

v6 patch: most existing tests pass

warner commented 2010-05-01 02:44:37 +00:00
Author
Owner

In the v6 patch, most of the existing tests pass. Remaining old-test work to do:

  • test_hung_download (touches internals, needs rewrite)
  • test_immutable (not sure, looks interesting)
  • test_web (format of NotEnoughSharesError changed)
    • either need to update tests to match, or change format to match old one
  • test_system (disabled download-status)

And all the TODOs from above.

I've been thinking about the download-status data. What I'd like to record is a list of messages, one per share, with (send time, start+offset, response size, response time) for each one (plus information about the initial DYHB query). For now I'd just dump this information as text in the download-status page, but eventually I'd like a scrolling JS/Canvas-based diagram. It would have time along the X axis, and probably (server,share) along the Y axis. Each message send would be a horizontal line that spans from the send to the receipt, maybe with a thickness related to the number of bytes being requested. There will be lots of overlapping requests, so they must be handled cleanly.
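A minimal sketch of what such a per-request record and the plain-text dump might look like. The field names, the `RequestEvent` type and the `dump_events` helper are all assumptions made for illustration (the "start+offset" above is read here as a start position plus a requested length); none of them come from the patch.

```python
from collections import namedtuple

# One record per request sent to a (server, share) pair.
RequestEvent = namedtuple("RequestEvent", [
    "server", "share", "send_time", "start", "length",
    "response_size", "response_time",
])

def dump_events(events):
    """Render events as text, one line per request, ordered by send time."""
    lines = []
    for ev in sorted(events, key=lambda e: e.send_time):
        elapsed = ev.response_time - ev.send_time
        lines.append("%s sh%d [%d:+%d] -> %d bytes in %.3fs" % (
            ev.server, ev.share, ev.start, ev.length,
            ev.response_size, elapsed))
    return "\n".join(lines)

events = [
    RequestEvent("srvB", 3, 2.0, 4096, 1024, 1024, 2.5),
    RequestEvent("srvA", 0, 1.0, 0, 4096, 4096, 1.25),
]
print(dump_events(events))
```

The same records would carry everything a timeline diagram needs: the send and response times give the horizontal extent of each line, and the requested length could drive its thickness.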

There should be vertical lines to indicate when each completed segment is delivered to the caller (running from the top-most active share to the bottom of the chart, with an arrow pointing downwards). There should also be vertical lines to indicate when the caller requests the next segment: when callers are stalling (e.g. streaming music not buffering too much) it should be obvious by looking at the chart.

Another sort of chart that would be interesting would have byte-offset-in-share as the Y axis, showing how we fetch different pieces of the share at different times.

davidsarah commented 2010-05-01 03:13:51 +00:00
Author
Owner

Replying to warner:

In the v6 patch, most of the existing tests pass. Remaining old-test work to do:

  • test_hung_download (touches internals, needs rewrite)
  • test_immutable (not sure, looks interesting)
    [...]

You might consider merging these two. When I wrote test_hung_download, I wasn't familiar enough with the existing tests to see that it was partially duplicating some of test_immutable.

zooko commented 2010-05-08 20:20:19 +00:00
Author
Owner

If you like this ticket, you might also like the "Brian's New Downloader" bundle of tickets: #605 (two-hour delay to connect to a grid from Win32, if there are many storage servers unreachable), #800 (improve alacrity by downloading only the part of the Merkle Tree that you need), #809 (Measure how segment size affects upload/download speed.), #287 (download: tolerate lost or missing servers), and #448 (download: speak to as few servers as possible).

zooko commented 2010-05-08 22:47:48 +00:00
Author
Owner

Brian's New Downloader is now planned for v1.8.0.

tahoe-lafs modified the milestone from 1.7.0 to 1.8.0 2010-05-08 22:47:48 +00:00
warner commented 2010-05-10 03:55:47 +00:00
Author
Owner

Attachment new-downloader-v7.diff (224567 bytes) added

12 uncovered lines left, some tests disabled

warner commented 2010-05-10 03:58:31 +00:00
Author
Owner

I did a lot of code-coverage work, so the v7 patch fixes a number of real bugs in the previous ones. There are still some significant functional things to fix, though, notably the state=OVERDUE heuristic is missing, and ShareFinder isn't sending requests in parallel. Both fronts will benefit from download-status displays. I found an interesting JS library which might be good for generating the display: http://www.simile-widgets.org/timeline/

warner commented 2010-05-27 23:59:51 +00:00
Author
Owner

Attachment new-downloader-v8.diff (288500 bytes) added

more integration, refactoring

warner commented 2010-05-28 00:12:07 +00:00
Author
Owner

The v8 patch has a lot of integration work: the new downloader now mostly
lives in immutable/downloader/ , all the filenodes live in
immutable/filenode.py (except for Literal, which was moved into
immutable/literal.py). Repairer was rewritten to use the new downloader (and
it's waaaay simpler now). Some of the old downloader code was removed (except
that Verifier still uses it). The download-status display is functional, but
shows mostly raw request-response timestamps (the SIMILE Timeline work is not
in this patch, and might not really want to be in the Tahoe node at all,
maybe it should go into a separate tool).

At this point, the new downloader is almost as good as the old downloader, so
I'm starting to think about landing it after 1.7 is released. It doesn't yet
handle servers hanging very well, but the old downloader didn't either. The
biggest problem is that the new downloader will basically pick share-holding
servers at random, rather than preferring the fastest responders. The old
downloader would, I think, stick with the first 'k' servers that responded
positively to the DYHB requests, so even though it couldn't tolerate the loss
of any server, it would use the fastest-responding ones. The new downloader
could easily pick the slowest responders by accident.
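For comparison, one possible policy that prefers fast responders can be sketched in a few lines. This is an illustration only, not what either downloader does: `pick_servers` and the `response_times` mapping (server id to seconds taken to answer the DYHB query) are hypothetical.

```python
def pick_servers(response_times, k):
    """Pick the k servers with the lowest DYHB response time.

    response_times maps server id -> seconds; servers that have not
    answered yet are simply absent from the mapping.
    """
    # Sort by time, breaking ties on server id so the result is stable.
    ranked = sorted(response_times, key=lambda s: (response_times[s], s))
    return ranked[:k]

print(pick_servers({"a": 0.3, "b": 0.1, "c": 0.2, "d": 0.9}, 3))  # ['b', 'c', 'a']
```

Unlike sticking with the first k positive responses, a ranking like this can be recomputed as more timing data arrives.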

The biggest feature I want to write next is the hanging-server handling
(state=OVERDUE). I know that it will take some fiddling with the heuristics
before we find something that feels right. I'd prefer to have the timeline
visualization in place to support this work, but it's looking to be a decent
amount of work, so I may plunge ahead without it, or find some alternative
approach (GD or some python image-drawing library, probably without zooming).
To do state=OVERDUE right will also provoke structural changes to the way we
manage remote servers (specifically I want an object with a lifetime equal to
the TCP connection's, which remembers how long requests took in the past, so
it can help guess if an outstanding request is overdue or merely slow). I
might want to land this patch first, before starting on that work, since I'm
really far out on a branch right now. My patch is updated to current trunk,
but if anyone gets busy and makes some deep changes to the tree, I may have a
challenging merge job ahead. So merging sooner rather than later feels like a
good idea.
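To make the "remembers how long requests took in the past" idea concrete, here is a minimal sketch of such a per-connection object. The class name, the median-times-a-factor rule and the default values are assumptions chosen for illustration; the actual heuristic is exactly the part that would need fiddling.

```python
class ServerTimings:
    """Remembers past request durations for one connection (hypothetical)."""

    def __init__(self, factor=3.0, min_samples=3):
        self.factor = factor            # how many times slower than typical counts as overdue
        self.min_samples = min_samples  # history needed before we judge at all
        self.durations = []

    def record(self, duration):
        """Note how long a completed request took, in seconds."""
        self.durations.append(duration)

    def is_overdue(self, elapsed):
        """Guess whether a request outstanding for `elapsed` seconds is overdue."""
        if len(self.durations) < self.min_samples:
            return False  # not enough history: treat it as merely slow
        ordered = sorted(self.durations)
        typical = ordered[len(ordered) // 2]  # upper median of past durations
        return elapsed > self.factor * typical

timings = ServerTimings()
for d in (0.1, 0.2, 0.3):
    timings.record(d)
print(timings.is_overdue(0.5), timings.is_overdue(1.0))  # False True
```

Tying the object's lifetime to the TCP connection means the history is discarded on reconnect, when old timings may no longer be representative.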

warner commented 2010-06-28 22:32:24 +00:00
Author
Owner

Attachment new-downloader-v9.diff (409611 bytes) added

more small improvements

warner commented 2010-07-26 02:26:52 +00:00
Author
Owner

Attachment new-downloader-v10.diff (388251 bytes) added

added OVERDUE for get_buckets calls, reenabled some "hung server" tests

zooko commented 2010-07-29 21:27:57 +00:00
Author
Owner

Whoo! Ready for review! And user testing. Try it out!

davidsarah commented 2010-08-01 05:31:30 +00:00
Author
Owner

I needed to fix a few minor issues when converting this to a darcs patch:

  • the patch to client.py touches a line next to another line changed in the ticket1074 branch (no real conflict)

  • 'patch' failed to create the empty file src/allmydata/download/__init__.py

  • pyflakes warning:
    src\allmydata\test\test_hung_server.py:12: '_corrupt_share_data' imported but unused

  • DEFAULT_MAX_SEGMENT_SIZE missing from interfaces.py

  • some uses of find_shares (two in test/no_network.py and one in test/test_download.py)
    needed to be changed to find_uri_shares.

  • 'print r_ev' statement in util/spans.py should be commented out.

With these changes it doesn't fail any tests on my machine.

davidsarah commented 2010-08-01 05:49:45 +00:00
Author
Owner

Attachment new-downloader-v10a.dpatch (274169 bytes) added

Brian's New Downloader, for testing in 1.8beta (or alpha)

davidsarah commented 2010-08-09 01:51:42 +00:00
Author
Owner

Applied to trunk in changeset:cbcb728e7ea0031d changeset:88d7ec2d5451a00c changeset:22a07e9bbe682d6e changeset:797828f47fe1aa44 changeset:7b7b0c9709d8ade6 changeset:63b61ce7bd112af7 changeset:20847dd8768a1622 changeset:919938dd95ded529 (corresponding to 1.8.0beta), then changeset:abcd6e0e96298a76 changeset:2a05aa2d9142ceea changeset:fa34e4dd16813923 changeset:2bd87498498d7c44.

The bugs in ticket:1154 were fixed in changeset:8844655705e4fb76 changeset:43c5032105288a58 changeset:f6f9a97627d210a6, and changeset:a0124e95eee4c1fd reenabled some commented-out tests.

Accepting for review.

davidsarah commented 2010-08-09 05:12:53 +00:00
Author
Owner

Reviewing changeset:cbcb728e7ea0031d:

  • the (start, length) form of the SimpleSpans constructor is not used outside test code (and the test code can be changed to pass a [(start, length)] array). Removing this would slightly simplify the constructor and avoid a possibly error-prone overloading.

  • in the Spans class comment, ", frequently used to represent .newsrc contents" is out-of-context and not needed.

  • in the _check method of Spans, if assertions are switched on then the self._spans array is re-sorted in order to check whether it is ordered. This is unnecessary: if you add an assert length > 0, length in the loop, then the loop will be checking a condition that is stronger than the array being ordered, given that the starts and lengths are numbers. (The sorting actually takes O(n) time rather than O(n log n) on each call to _check, because Timsort will detect the ordered input, but it's still unnecessary overhead.)

  • the assert statements should include the variables they use in the exception message, e.g. assert start > prev_end, (start, prev_end).

  • "overlap(s_start, s_length, start, length) or adjacent(s_start, s_length, start, length)" is equivalent to "overlap(s_start, s_length, start-1, length+2)".

  • in the only other use of adjacent (in DataSpans.add), only the start0 < start1 case should be able to occur. Inline and simplify.

  • in the loop over enumerate(self._spans), you could exit early when s_start > start+length. At that point you know where to insert the (start, length) span without having to re-sort.

  • a Spans object behaves somewhat like a set of the elements in all of its spans, but the __contains__ and __iter__ methods are not consistent with that (instead viewing it as a set of (start, length) pairs). I realize this may allow for more convenient use of in and iteration, but it should at least be documented.

  • _check and assert_invariants do similar things; give them the same name.

  • DataSpans._dump is poorly named.

  • DataSpans.assert_invariants should check that none of the data strings are zero-length.

  • is it intentional that DataSpans.add calls self.assert_invariants() but remove (and pop, although that's much simpler) don't?

  • if s_start <= start < s_end: I find this Python construct too clever by half. Anyway, at this point s_start <= start is always true (otherwise we won't get past the first continue), so I would write this as

        assert s_start <= start, (s_start, start)
        if start < s_end:

  • Perhaps rename s_* to old_*.

  • DataSpans.add: if I understand correctly, case A also covers:

    OLD      OLD    OLD    OLD
NEW       NEW      NEWW   NEEWW

This isn't immediately clear from the comment. Depicting it as

    OLD
NEW.....

might help. Also, use uppercase consistently for the cases.

  • suffix_len in case D has a different meaning to suffix_len in case E. Maybe rename it to replace_len for case D.
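The suggestion about _check (no re-sort, and assertions that report their operands) could look roughly like the sketch below. It assumes spans are stored as a list of (start, length) tuples and that prev_end is the exclusive end of the previous span, so that `start > prev_end` also rejects adjacent spans that should have been merged. `check_spans` is a hypothetical stand-alone function, not the method in spans.py.

```python
def check_spans(spans):
    """Assert that spans are positive-length, ordered, and neither
    overlapping nor adjacent. Runs in one pass, with no sorting."""
    prev_end = None
    for (start, length) in spans:
        assert length > 0, length
        if prev_end is not None:
            # Strictly greater: a span starting exactly at prev_end is
            # adjacent to its predecessor and should have been merged.
            assert start > prev_end, (start, prev_end)
        prev_end = start + length

check_spans([(0, 3), (5, 2), (10, 1)])  # passes silently
```

Because every length is positive and every start exceeds the previous end, the starts are necessarily increasing, which is why no separate ordering check is needed.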

Tests:

  • The Spans class is tested by ByteSpans, but it just stores spans of integers, not necessarily byte offsets. I would suggest s/ByteSpans/TestSpans/ and s/StringSpans/TestDataSpans/.

  • I could be wrong, but I don't think the deterministic tests are covering all cases in add and remove. I'd prefer to see those tests have full coverage rather than relying on the randomized tests to make up the gap.
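As an example of a deterministic, exhaustive check in the spirit of the test suggestions above, the claimed equivalence between "overlap or adjacent" and a widened overlap test can be verified over every small case. The `overlap` and `adjacent` functions here are stand-ins written for this sketch on the assumption that spans are half-open integer ranges with positive length; they are not copied from spans.py.

```python
def overlap(start0, length0, start1, length1):
    """True if [start0, start0+length0) and [start1, start1+length1) intersect."""
    return start0 < start1 + length1 and start1 < start0 + length0

def adjacent(start0, length0, start1, length1):
    """True if one span ends exactly where the other begins."""
    return start0 + length0 == start1 or start1 + length1 == start0

def check_equivalence(limit=6):
    """Compare both forms for every pair of small positive-length spans."""
    count = 0
    for s0 in range(limit):
        for l0 in range(1, limit):
            for s1 in range(limit):
                for l1 in range(1, limit):
                    a = overlap(s0, l0, s1, l1) or adjacent(s0, l0, s1, l1)
                    b = overlap(s0, l0, s1 - 1, l1 + 2)
                    assert a == b, (s0, l0, s1, l1)
                    count += 1
    return count

print(check_equivalence())  # 900
```

The equivalence relies on integer starts and positive lengths, which is why the loops start the lengths at 1.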

warner commented 2010-08-09 20:10:51 +00:00
Author
Owner

good points! I think I'll implement most of them, but after the 1.8.0 release.

zooko commented 2010-08-14 05:33:40 +00:00
Author
Owner

This patch broke some docs:

  • architecture.txt (source:trunk/docs/architecture.txt@4481#L232) says:

When downloading a file, the current version just asks all known servers for any shares they might have…

  • performance.txt (source:trunk/docs/performance.txt@4341#L41) says:

When asked to read an arbitrary range of an immutable file, Tahoe-LAFS will download from the beginning of the file up until it has enough of the file to satisfy the requested read…

I think these documentation issues should be fixed before we release Tahoe-LAFS v1.8.0 final.

warner commented 2010-08-14 18:20:02 +00:00
Author
Owner

Let's make a new ticket for the improvements suggested in comment:114638, so we can close this ticket as soon as the docs fixes in comment:114640 are resolved.

davidsarah commented 2010-08-30 03:46:28 +00:00
Author
Owner

Attachment 798-docs.dpatch (4007 bytes) added

docs/performance.txt, architecture.txt: updates taking into account new downloader. refs #798

zooko commented 2010-09-10 19:36:56 +00:00
Author
Owner

Replying to warner:

Let's make a new ticket for the improvements suggested in comment:114638, so we can close this ticket as soon as the docs fixes in comment:114640 are resolved.

Okay I created #1196 (clean up and optimize spans).

david-sarah@jacaranda.org commented 2010-09-10 20:14:33 +00:00
Author
Owner

In changeset:f32dddbcedea3c7c:

docs/frontends/FTP-and-SFTP.txt: docs/performance.txt, architecture.txt: updates taking into account new downloader (revised). refs #798
tahoe-lafs added the
fixed
label 2010-09-10 20:54:17 +00:00
zooko closed this issue 2010-09-10 20:54:17 +00:00
Reference: tahoe-lafs/trac-2024-07-25#798