Wed Jan 27 16:34:17 MST 2010 zooko@zooko.com * immutable: download from the first servers which provide at least K buckets instead of waiting for all servers to reply This should put an end to the phenomenon I've been seeing that a single hung server can cause all downloads on a grid to hang. Also it should speed up all downloads by (a) not-waiting for responses to queries that it doesn't need, and (b) downloading shares from the servers which answered the initial query the fastest. Also, do not count how many buckets you've gotten when deciding whether the download has enough shares or not -- instead count how many buckets to *unique* shares that you've gotten. This appears to improve a slightly weird behavior in the current download code in which receiving >= K different buckets all to the same sharenumber would make it think it had enough to download the file when in fact it hadn't. This patch needs tests before it is actually ready for trunk. New patches: [immutable: download from the first servers which provide at least K buckets instead of waiting for all servers to reply zooko@zooko.com**20100127233417 Ignore-this: c855355a40d96827e1d0c469a8d8ab3f This should put an end to the phenomenon I've been seeing that a single hung server can cause all downloads on a grid to hang. Also it should speed up all downloads by (a) not-waiting for responses to queries that it doesn't need, and (b) downloading shares from the servers which answered the initial query the fastest. Also, do not count how many buckets you've gotten when deciding whether the download has enough shares or not -- instead count how many buckets to *unique* shares that you've gotten. This appears to improve a slightly weird behavior in the current download code in which receiving >= K different buckets all to the same sharenumber would make it think it had enough to download the file when in fact it hadn't. This patch needs tests before it is actually ready for trunk. 
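The bookkeeping described above can be sketched outside of Twisted as follows. This is a minimal illustration, not the code in this patch: the class name ShareLocator, the on_done callback and the string "buckets" are hypothetical stand-ins for the Deferred and the ReadBucketProxy objects that download.py uses. It assumes that buckets are grouped by share number, that "enough" means at least K unique share numbers, and that the notification fires only once, either as soon as K unique shares are known or when every query has been answered or has failed.

```python
class ShareLocator:
    """Hypothetical helper: track replies to share-location queries and
    notify the caller exactly once."""

    def __init__(self, needed_shares, queries_sent, on_done):
        self.needed_shares = needed_shares   # K
        self.queries_sent = queries_sent
        self.responses_received = 0
        self.queries_failed = 0
        self.share_buckets = {}              # k: sharenum, v: list of buckets
        self._on_done = on_done              # cleared after it fires

    def _fire(self, enough):
        # Clear the callback before calling it so it can never fire twice.
        on_done, self._on_done = self._on_done, None
        if on_done is not None:
            on_done(enough)

    def _check(self):
        # Count unique share numbers, not the total number of buckets.
        if len(self.share_buckets) >= self.needed_shares:
            self._fire(True)
        elif self.responses_received + self.queries_failed == self.queries_sent:
            # No queries are outstanding and we still lack K unique shares.
            self._fire(False)

    def got_response(self, buckets):
        self.responses_received += 1
        for sharenum, bucket in buckets.items():
            self.share_buckets.setdefault(sharenum, []).append(bucket)
        self._check()

    def got_error(self):
        self.queries_failed += 1
        self._check()


# K = 2, four servers queried; the fourth server never answers.
results = []
locator = ShareLocator(needed_shares=2, queries_sent=4, on_done=results.append)
locator.got_response({0: "server1-sh0"})
locator.got_response({0: "server2-sh0"})  # same share again: still 1 unique
locator.got_response({1: "server3-sh1"})  # 2 unique shares: fires with True

# K = 2, two servers queried; one has a single share, the other fails.
short = []
locator2 = ShareLocator(needed_shares=2, queries_sent=2, on_done=short.append)
locator2.got_response({0: "server1-sh0"})
locator2.got_error()                      # nothing outstanding: fires with False
```

In the first run the caller is notified after the third reply, without waiting for the hung fourth server. In the second run the failure is counted together with the successful reply, so the caller still hears back once no queries remain outstanding.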
] { hunk ./src/allmydata/immutable/download.py 791 self._opened = False self.active_buckets = {} # k: shnum, v: bucket - self._share_buckets = [] # list of (sharenum, bucket) tuples + self._share_buckets = {} # k: sharenum, v: list of buckets self._share_vbuckets = {} # k: shnum, v: set of ValidatedBuckets self._fetch_failures = {"uri_extension": 0, "crypttext_hash_tree": 0, } hunk ./src/allmydata/immutable/download.py 872 return d def _get_all_shareholders(self): - dl = [] + """ Once the number of buckets that I know about is >= K then I + callback the Deferred that I return. + + If all of the get_buckets deferreds have fired (whether callback or + errback) and I still don't have enough buckets then I'll callback the + Deferred that I return. + """ + self._wait_for_enough_buckets_d = defer.Deferred() + + self._queries_sent = 0 + self._responses_received = 0 + self._queries_failed = 0 sb = self._storage_broker servers = sb.get_servers_for_index(self._storage_index) if not servers: hunk ./src/allmydata/immutable/download.py 892 self.log(format="sending DYHB to [%(peerid)s]", peerid=idlib.shortnodeid_b2a(peerid), level=log.NOISY, umid="rT03hg") + self._queries_sent += 1 d = ss.callRemote("get_buckets", self._storage_index) d.addCallbacks(self._got_response, self._got_error, callbackArgs=(peerid,)) hunk ./src/allmydata/immutable/download.py 896 - dl.append(d) - self._responses_received = 0 - self._queries_sent = len(dl) if self._status: self._status.set_status("Locating Shares (%d/%d)" % (self._responses_received, hunk ./src/allmydata/immutable/download.py 900 self._queries_sent)) - return defer.DeferredList(dl) + return self._wait_for_enough_buckets_d def _got_response(self, buckets, peerid): self.log(format="got results from [%(peerid)s]: shnums %(shnums)s", hunk ./src/allmydata/immutable/download.py 918 for sharenum, bucket in buckets.iteritems(): b = layout.ReadBucketProxy(bucket, peerid, self._storage_index) self.add_share_bucket(sharenum, b) + # If we just got 
enough buckets for the first time, then fire the + # deferred. Then remove it from self so that we don't fire it + # again. + if self._wait_for_enough_buckets_d and len(self._share_buckets) >= self._verifycap.needed_shares: + self._wait_for_enough_buckets_d.callback(True) + self._wait_for_enough_buckets_d = None + + # Else, if we ran out of outstanding requests then fire it and + # remove it from self. + assert (self._responses_received+self._queries_failed) <= self._queries_sent + if self._wait_for_enough_buckets_d and (self._responses_received+self._queries_failed) == self._queries_sent: + self._wait_for_enough_buckets_d.callback(False) + self._wait_for_enough_buckets_d = None if self._results: if peerid not in self._results.servermap: hunk ./src/allmydata/immutable/download.py 939 def add_share_bucket(self, sharenum, bucket): # this is split out for the benefit of test_encode.py - self._share_buckets.append( (sharenum, bucket) ) + self._share_buckets.setdefault(sharenum, []).append(bucket) def _got_error(self, f): level = log.WEIRD hunk ./src/allmydata/immutable/download.py 947 level = log.UNUSUAL self.log("Error during get_buckets", failure=f, level=level, umid="3uuBUQ") + # If we ran out of outstanding requests then errback it and remove it + # from self. 
+ self._queries_failed += 1 + assert (self._responses_received+self._queries_failed) <= self._queries_sent + if self._wait_for_enough_buckets_d and self._responses_received == self._queries_sent: + self._wait_for_enough_buckets_d.errback() + self._wait_for_enough_buckets_d = None def bucket_failed(self, vbucket): shnum = vbucket.sharenum hunk ./src/allmydata/immutable/download.py 996 uri_extension_fetch_started = time.time() vups = [] - for sharenum, bucket in self._share_buckets: - vups.append(ValidatedExtendedURIProxy(bucket, self._verifycap, self._fetch_failures)) + for sharenum, buckets in self._share_buckets.iteritems(): + for bucket in buckets: + vups.append(ValidatedExtendedURIProxy(bucket, self._verifycap, self._fetch_failures)) vto = ValidatedThingObtainer(vups, debugname="vups", log_id=self._parentmsgid) d = vto.start() hunk ./src/allmydata/immutable/download.py 1034 def _get_crypttext_hash_tree(self, res): vchtps = [] - for sharenum, bucket in self._share_buckets: - vchtp = ValidatedCrypttextHashTreeProxy(bucket, self._crypttext_hash_tree, self._vup.num_segments, self._fetch_failures) - vchtps.append(vchtp) + for sharenum, buckets in self._share_buckets.iteritems(): + for bucket in buckets: + vchtp = ValidatedCrypttextHashTreeProxy(bucket, self._crypttext_hash_tree, self._vup.num_segments, self._fetch_failures) + vchtps.append(vchtp) _get_crypttext_hash_tree_started = time.time() if self._status: hunk ./src/allmydata/immutable/download.py 1088 def _download_all_segments(self, res): - for sharenum, bucket in self._share_buckets: - vbucket = ValidatedReadBucketProxy(sharenum, bucket, self._share_hash_tree, self._vup.num_segments, self._vup.block_size, self._vup.share_size) - self._share_vbuckets.setdefault(sharenum, set()).add(vbucket) + for sharenum, buckets in self._share_buckets.iteritems(): + for bucket in buckets: + vbucket = ValidatedReadBucketProxy(sharenum, bucket, self._share_hash_tree, self._vup.num_segments, self._vup.block_size, 
self._vup.share_size) + self._share_vbuckets.setdefault(sharenum, set()).add(vbucket) # after the above code, self._share_vbuckets contains enough # buckets to complete the download, and some extra ones to } Context: [test_runner: cleanup, refactor common code into a non-executable method Brian Warner **20100127224040 Ignore-this: 4cb4aada87777771f688edfd8129ffca Having both test_node() and test_client() (one of which calls the other) felt confusing to me, so I changed it to have test_node(), test_client(), and a common do_create() helper method. ] [scripts/runner.py: simplify David-Sarah's clever grouped-commands usage trick Brian Warner **20100127223758 Ignore-this: 70877ebf06ae59f32960b0aa4ce1d1ae ] [tahoe backup: skip all symlinks, with warning. Fixes #850, addresses #641. Brian Warner **20100127223517 Ignore-this: ab5cf05158d32a575ca8efc0f650033f ] [NEWS: update with all recent user-visible changes Brian Warner **20100127222209 Ignore-this: 277d24568018bf4f3fb7736fda64eceb ] ["tahoe backup": fix --exclude-vcs docs to include Git Brian Warner **20100127201044 Ignore-this: 756a58dde21bdc65aa62b81803605b5 ] [docs: fix references to --no-storage, explanation of [storage] section Brian Warner **20100127200956 Ignore-this: f4be1763a585e1ac6299a4f1b94a59e0 ] [docs: further CREDITS level-ups for Nils, Kevan, David-Sarah zooko@zooko.com**20100126170021 Ignore-this: 1e513e85cf7b7abf57f056e6d7544b38 ] [Patch to accept t=set-children as well as t=set_children david-sarah@jacaranda.org**20100124030020 Ignore-this: 2c061f12af817cdf77feeeb64098ec3a ] [Fix boodlegrid use of set_children david-sarah@jacaranda.org**20100126063414 Ignore-this: 3aa2d4836f76303b2bacecd09611f999 ] [ftpd: clearer error message if Twisted needs a patch (by Nils Durner) zooko@zooko.com**20100126143411 Ignore-this: 440e6831ae6da5135c1edd081c93871f ] [Add 'docs/performance.txt', which (for the moment) describes mutable file performance issues Kevan Carstensen **20100115204500 Ignore-this: 
ade4e500217db2509aee35aacc8c5dbf ] [docs: more CREDITS for François, Kevan, and David-Sarah zooko@zooko.com**20100126132133 Ignore-this: f37d4977c13066fcac088ba98a31b02e ] [tahoe_backup.py: display warnings on errors instead of stopping the whole backup. Fix #729. francois@ctrlaltdel.ch**20100120094249 Ignore-this: 7006ea4b0910b6d29af6ab4a3997a8f9 This patch displays a warning to the user in two cases: 1. When special files like symlinks, fifos, devices, etc. are found in the local source. 2. If files or directories are not readables by the user running the 'tahoe backup' command. In verbose mode, the number of skipped files and directories is printed at the end of the backup. Exit status returned by 'tahoe backup': - 0 everything went fine - 1 the backup failed - 2 files were skipped during the backup ] [Warn about test failures due to setting FLOG* env vars david-sarah@jacaranda.org**20100124220629 Ignore-this: 1c25247ca0f0840390a1b7259a9f4a3c ] [Message saying that we couldn't find bin/tahoe should say where we looked david-sarah@jacaranda.org**20100116204556 Ignore-this: 1068576fd59ea470f1e19196315d1bb ] [Change running.html to describe 'tahoe run' david-sarah@jacaranda.org**20100112044409 Ignore-this: 23ad0114643ce31b56e19bb14e011e4f ] [cli: merge the better version of David-Sarah's split-usage-and-help patch with the earlier version that I mistakenly committed zooko@zooko.com**20100126044559 Ignore-this: 284d188e13b7901013cbb650168e6447 ] [Split tahoe --help options into groups. 
david-sarah@jacaranda.org**20100112043935 Ignore-this: 610f9c41b00e6863e3cd047379733e3a ] [cli: split usage strings into groups (patch by David-Sarah Hopwood) zooko@zooko.com**20100126043921 Ignore-this: 51928d266a7292b873f87f7d53c9a01e ] [Add create-node CLI command, and make create-client equivalent to create-node --no-storage (fixes #760) david-sarah@jacaranda.org**20100116052055 Ignore-this: 47d08b18c69738685e13ff365738d5a ] [Remove replace= parameter to mkdir-immutable and mkdir-with-children david-sarah@jacaranda.org**20100124224325 Ignore-this: 25207bcc946c0c43d9528718e76ba7b ] [contrib/fuse/runtests.py: Fix #888, configure settings in tahoe.cfg and don't treat warnings as failure francois@ctrlaltdel.ch**20100109123010 Ignore-this: 2590d44044acd7dfa3690c416cae945c Fix a few bitrotten pieces in the FUSE test script. It now configures tahoe node settings by editing tahoe.cfg which is the new supported method. It alos tolerate warnings issued by the mount command, the cause of these warnings is the same as in #876 (contrib/fuse/runtests.py doesn't tolerate deprecations warnings). ] [Fix webapi t=mkdir with multpart/form-data, as on the Welcome page. Closes #919. Brian Warner **20100121065052 Ignore-this: 1f20ea0a0f1f6d6c1e8e14f193a92c87 ] [tahoe_add_alias.py: minor refactoring Brian Warner **20100115064220 Ignore-this: 29910e81ad11209c9e493d65fd2dab9b ] [test_dirnode.py: reduce scope of a Client instance, suggested by Kevan. Brian Warner **20100115062713 Ignore-this: b35efd9e6027e43de6c6f509bfb4ccaa ] [test_provisioning: STAN is not always a list. Fix by David-Sarah Hopwood. Brian Warner **20100115014632 Ignore-this: 9989de7f1e00907706d2b63153138219 ] [web/directory.py mkdir-immutable: hush pyflakes, add TODO for #903 behavior Brian Warner **20100114222804 Ignore-this: 717cd3b9a1c8aeee76938c9641db7356 ] [hush pyflakes-0.4.0 warnings: slightly less-trivial fixes. Closes #900. 
Brian Warner **20100114221719 Ignore-this: f774f4637e256ad55502659413a811a8 This includes one fix (in test_web) which was testing the wrong thing. ] [hush pyflakes-0.4.0 warnings: remove trivial unused variables. For #900. Brian Warner **20100114221529 Ignore-this: e96106c8f1a99fbf93306fbfe9a294cf ] [tahoe add-alias/create-alias: don't corrupt non-newline-terminated alias Brian Warner **20100114210246 Ignore-this: 9c994792e53a85159d708760a9b1b000 file. Closes #741. ] [change docs and --help to use "grid" instead of "virtual drive": closes #892. Brian Warner **20100114201119 Ignore-this: a20d4a4dcc4de4e3b404ff72d40fc29b Thanks to David-Sarah Hopwood for the patch. ] [backupdb.txt: fix ST_CTIME reference Brian Warner **20100114194052 Ignore-this: 5a189c7a1181b07dd87f0a08ea31b6d3 ] [client.py: fix/update comments on KeyGenerator Brian Warner **20100113004226 Ignore-this: 2208adbb3fd6a911c9f44e814583cabd ] [Clean up log.err calls, for one of the issues in #889. Brian Warner **20100112013343 Ignore-this: f58455ce15f1fda647c5fb25d234d2db allmydata.util.log.err() either takes a Failure as the first positional argument, or takes no positional arguments and must be invoked in an exception handler. Fixed its signature to match both foolscap.logging.log.err and twisted.python.log.err . Included a brief unit test. ] [tidy up DeadReferenceError handling, ignore them in add_lease calls Brian Warner **20100112000723 Ignore-this: 72f1444e826fd0b9db6d318f89603c38 Stop checking separately for ConnectionDone/ConnectionLost, since those have been folded into DeadReferenceError since foolscap-0.3.1 . Write rrefutil.trap_deadref() in terms of rrefutil.trap_and_discard() to improve code coverage. ] [NEWS: improve "tahoe backup" notes, mention first-backup-after-upgrade duration Brian Warner **20100111190132 Ignore-this: 10347c590b3375964579ba6c2b0edb4f Thanks to Francois Deppierraz for the suggestion. 
] [test_repairer: add (commented-out) test_each_byte, to see exactly what the Brian Warner **20100110203552 Ignore-this: 8e84277d5304752edeff052b97821815 Verifier misses The results (described in #819) match our expectations: it misses corruption in unused share fields and in most container fields (which are only visible to the storage server, not the client). 1265 bytes of a 2753 byte share (hosting a 56-byte file with an artifically small segment size) are unused, mostly in the unused tail of the overallocated UEB space (765 bytes), and the allocated-but-unwritten plaintext_hash_tree (480 bytes). ] [repairer: fix some wrong offsets in the randomized verifier tests, debugged by Brian zooko@zooko.com**20100110203721 Ignore-this: 20604a609db8706555578612c1c12feb fixes #819 ] [test_repairer: fix colliding basedir names, which caused test inconsistencies Brian Warner **20100110084619 Ignore-this: b1d56dd27e6ab99a7730f74ba10abd23 ] [repairer: add deterministic test for #819, mark as TODO zooko@zooko.com**20100110013619 Ignore-this: 4cb8bb30b25246de58ed2b96fa447d68 ] [contrib/fuse/runtests.py: Tolerate the tahoe CLI returning deprecation warnings francois@ctrlaltdel.ch**20100109175946 Ignore-this: 419c354d9f2f6eaec03deb9b83752aee Depending on the versions of external libraries such as Twisted of Foolscap, the tahoe CLI can display deprecation warnings on stdout. The tests should not interpret those warnings as a failure if the node is in fact correctly started. See http://allmydata.org/trac/tahoe/ticket/859 for an example of deprecation warnings. fixes #876 ] [docs: CREDITS: add David-Sarah to the CREDITS file zooko@zooko.com**20100109060435 Ignore-this: 896062396ad85f9d2d4806762632f25a ] [mutable/publish: don't loop() right away upon DeadReferenceError. 
Closes #877 Brian Warner **20100102220841 Ignore-this: b200e707b3f13aa8251981362b8a3e61 The bug was that a disconnected server could cause us to re-enter the initial loop() call, sending multiple queries to a single server, provoking an incorrect UCWE. To fix it, stall the loop() with an eventual.fireEventually() ] [immutable/checker.py: oops, forgot some imports. Also hush pyflakes. Brian Warner **20091229233909 Ignore-this: 4d61bd3f8113015a4773fd4768176e51 ] [mutable repair: return successful=False when numshares**20091229233746 Ignore-this: d881c3275ff8c8bee42f6a80ca48441e instead of weird errors. Closes #874 and #786. Previously, if the file had 0 shares, this would raise TypeError as it tried to call download_version(None). If the file had some shares but fewer than 'k', it would incorrectly raise MustForceRepairError. Added get_successful() to the IRepairResults API, to give repair() a place to report non-code-bug problems like this. ] [node.py/interfaces.py: minor docs fixes Brian Warner **20091229230409 Ignore-this: c86ad6342ef0f95d50639b4f99cd4ddf ] [NEWS: fix 1.4.1 announcement w.r.t. add-lease behavior in older releases Brian Warner **20091229230310 Ignore-this: bbbbb9c961f3bbcc6e5dbe0b1594822 ] [checker: don't let failures in add-lease affect checker results. Closes #875. Brian Warner **20091229230108 Ignore-this: ef1a367b93e4d01298c2b1e6ca59c492 Mutable servermap updates and the immutable checker, when run with add_lease=True, send both the do-you-have-block and add-lease commands in parallel, to avoid an extra round trip time. Many older servers have problems with add-lease and raise various exceptions, which don't generally matter. The client-side code was catching+ignoring some of them, but unrecognized exceptions were passed through to the DYHB code, concealing the DYHB results from the checker, making it think the server had no shares. The fix is to separate the code paths. 
Both commands are sent at the same time, but the errback path from add-lease is handled separately. Known exceptions are ignored, the others (both unknown-remote and all-local) are logged (log.WEIRD, which will trigger an Incident), but neither will affect the DYHB results. The add-lease message is sent first, and we know that the server handles them synchronously. So when the checker is done, we can be sure that all the add-lease messages have been retired. This makes life easier for unit tests. ] [test_cli: verify fix for "tahoe get" not creating empty file on error (#121) Brian Warner **20091227235444 Ignore-this: 6444d52413b68eb7c11bc3dfdc69c55f ] [addendum to "Fix 'tahoe ls' on files (#771)" Brian Warner **20091227232149 Ignore-this: 6dd5e25f8072a3153ba200b7fdd49491 tahoe_ls.py: tolerate missing metadata web/filenode.py: minor cleanups test_cli.py: test 'tahoe ls FILECAP' ] [Fix 'tahoe ls' on files (#771). Patch adapted from Kevan Carstensen. Brian Warner **20091227225443 Ignore-this: 8bf8c7b1cd14ea4b0ebd453434f4fe07 web/filenode.py: also serve edge metadata when using t=json on a DIRCAP/childname object. tahoe_ls.py: list file objects as if we were listing one-entry directories. Show edge metadata if we have it, which will be true when doing 'tahoe ls DIRCAP/filename' and false when doing 'tahoe ls FILECAP' ] [tahoe_get: don't create the output file on error. Closes #121. Brian Warner **20091227220404 Ignore-this: 58d5e793a77ec6e87d9394ade074b926 ] [webapi: don't accept zero-length childnames during traversal. Closes #358, #676. Brian Warner **20091227201043 Ignore-this: a9119dec89e1c7741f2289b0cad6497b This forbids operations that would implicitly create a directory with a zero-length (empty string) name, like what you'd get if you did "tahoe put local /oops/blah" (#358) or "POST /uri/CAP//?t=mkdir" (#676). The error message is fairly friendly too. Also added code to "tahoe put" to catch this error beforehand and suggest the correct syntax (i.e. 
without the leading slash). ] [CLI: send 'Accept:' header to ask for text/plain tracebacks. Closes #646. Brian Warner **20091227195828 Ignore-this: 44c258d4d4c7dac0ed58adb22f73331 The webapi has been looking for an Accept header since 1.4.0, but it treats a missing header as equal to */* (to honor RFC2616). This change finally modifies our CLI tools to ask for "text/plain, application/octet-stream", which seems roughly correct (we either want a plain-text traceback or error message, or an uninterpreted chunk of binary data to save to disk). Some day we'll figure out how JSON fits into this scheme. ] [Makefile: upload-tarballs: switch from xfer-client to flappclient, closes #350 Brian Warner **20091227163703 Ignore-this: 3beeecdf2ad9c2438ab57f0e33dcb357 I've also set up a new flappserver on source@allmydata.org to receive the tarballs. We still need to replace the gutsy buildslave (which is where the tarballs used to be generated+uploaded) and give it the new FURL. ] [misc/ringsim.py: make it deterministic, more detail about grid-is-full behavior Brian Warner **20091227024832 Ignore-this: a691cc763fb2e98a4ce1767c36e8e73f ] [misc/ringsim.py: tool to discuss #302 Brian Warner **20091226060339 Ignore-this: fc171369b8f0d97afeeb8213e29d10ed ] [contrib: fix fuse_impl_c to use new Python API zooko@zooko.com**20100109174956 Ignore-this: 51ca1ec7c2a92a0862e9b99e52542179 original patch by Thomas Delaet, fixed by François, reviewed by Brian, committed by me ] [docs/stats.txt: add TOC, notes about controlling gatherer's listening port Brian Warner **20091224202133 Ignore-this: 8eef63b0e18db5aa8249c2eafde02c05 Thanks to Jody Harris for the suggestions. ] [Add docs/stats.py, explaining Tahoe stats, the gatherer, and the munin plugins. 
Brian Warner **20091223052400 Ignore-this: 7c9eeb6e5644eceda98b59a67730ccd5 ] [more #859: avoid deprecation warning for unit tests too, hush pyflakes Brian Warner **20091215000147 Ignore-this: 193622e24d31077da825a11ed2325fd3 * factor maybe-import-sha logic into util.hashutil ] [docs: fix helper.txt to describe new config style zooko@zooko.com**20091224223522 Ignore-this: 102e7692dc414a4b466307f7d78601fe ] [use hashlib module if available, thus avoiding a DeprecationWarning for importing the old sha module; fixes #859 zooko@zooko.com**20091214212703 Ignore-this: 8d0f230a4bf8581dbc1b07389d76029c ] [docs: reflow architecture.txt to 78-char lines zooko@zooko.com**20091208232943 Ignore-this: 88f55166415f15192e39407815141f77 ] [mutable/retrieve.py: stop reaching into private MutableFileNode attributes Brian Warner **20091208172921 Ignore-this: 61e548798c1105aed66a792bf26ceef7 ] [mutable/servermap.py: stop reaching into private MutableFileNode attributes Brian Warner **20091208172608 Ignore-this: b40a6b62f623f9285ad96fda139c2ef2 ] [mutable/servermap.py: oops, query N+e servers in MODE_WRITE, not k+e Brian Warner **20091208171156 Ignore-this: 3497f4ab70dae906759007c3cfa43bc under normal conditions, this wouldn't cause any problems, but if the shares are really sparse (perhaps because new servers were added), then file-modifies might stop looking too early and leave old shares in place ] [control.py: fix speedtest: use download_best_version (not read) on mutable nodes Brian Warner **20091207060512 Ignore-this: 7125eabfe74837e05f9291dd6414f917 ] [FTP-and-SFTP.txt: fix ssh-keygen pointer Brian Warner **20091207052803 Ignore-this: bc2a70ee8c58ec314e79c1262ccb22f7 ] [remove MutableFileNode.download(), prefer download_best_version() instead Brian Warner **20091201225438 Ignore-this: 5733eb373a902063e09fd52cc858dec0 ] [Simplify immutable download API: use just filenode.read(consumer, offset, size) Brian Warner **20091201225330 Ignore-this: bdedfb488ac23738bf52ae6d4ab3a3fb * 
remove Downloader.download_to_data/download_to_filename/download_to_filehandle * remove download.Data/FileName/FileHandle targets * remove filenode.download/download_to_data/download_to_filename methods * leave Downloader.download (the whole Downloader will go away eventually) * add util.consumer.MemoryConsumer/download_to_data, for convenience (this is mostly used by unit tests, but it gets used by enough non-test code to warrant putting it in allmydata.util) * update tests * removes about 180 lines of code. Yay negative code days! Overall plan is to rewrite immutable/download.py and leave filenode.read() as the sole read-side API. ] [server.py: undo my bogus 'correction' of David-Sarah's comment fix Brian Warner **20091201024607 Ignore-this: ff4bb58f6a9e045b900ac3a89d6f506a and move it to a better line ] [Implement more coherent behavior when copying with dircaps/filecaps (closes #761). Patch by Kevan Carstensen. "Brian Warner "**20091130211009] [storage.py: update comment "Brian Warner "**20091130195913] [storage server: detect disk space usage on Windows too (fixes #637) david-sarah@jacaranda.org**20091121055644 Ignore-this: 20fb30498174ce997befac7701fab056 ] [make status of finished operations consistently "Finished" david-sarah@jacaranda.org**20091121061543 Ignore-this: 97d483e8536ccfc2934549ceff7055a3 ] [docs: update the about.html a little zooko@zooko.com**20091208212737 Ignore-this: 3fe2d9653c6de0727d3e82bd70f2a8ed ] [setup: ignore _darcs in the "test-clean" test and make the "clean" step remove all .egg's in the root dir zooko@zooko.com**20091206184835 Ignore-this: 6066bd160f0db36d7bf60aba405558d2 ] [NEWS: update with all user-visible changes since the last release Brian Warner **20091127224217 Ignore-this: 741da6cd928e939fb6d21a61ea3daf0b ] [update "tahoe backup" docs, and webapi.txt's mkdir-with-children Brian Warner **20091127055900 Ignore-this: defac1fb9a2335b0af3ef9dbbcc67b7e ] [Add dirnodes to backupdb and "tahoe backup", closes #606. 
Brian Warner **20091126234257 Ignore-this: fa88796fcad1763c6a2bf81f56103223 * backups now share dirnodes with any previous backup, in any location, so renames and moves are handled very efficiently * "tahoe backup" no longer bothers reading the previous snapshot * if you switch grids, you should delete ~/.tahoe/private/backupdb.sqlite, to force new uploads of all files and directories ] [webapi: fix t=check for DIR2-LIT (i.e. empty immutable directories) Brian Warner **20091126232731 Ignore-this: 8513c890525c69c1eca0e80d53a231f8 ] [PipelineError: fix str() on python2.4 . Closes #842. Brian Warner **20091124212512 Ignore-this: e62c92ea9ede2ab7d11fe63f43b9c942 ] [test_uri.py: s/NewDirnode/Dirnode/ , now that they aren't "new" anymore Brian Warner **20091120075553 Ignore-this: 61c8ef5e45a9d966873a610d8349b830 ] [interface name cleanups: IFileNode, IImmutableFileNode, IMutableFileNode Brian Warner **20091120075255 Ignore-this: e3d193c229e2463e1d0b0c92306de27f The proper hierarchy is: IFilesystemNode +IFileNode ++IMutableFileNode ++IImmutableFileNode +IDirectoryNode Also expand test_client.py (NodeMaker) to hit all IFilesystemNode types. 
] [class name cleanups: s/FileNode/ImmutableFileNode/ Brian Warner **20091120072239 Ignore-this: 4b3218f2d0e585c62827e14ad8ed8ac1 also fix test/bench_dirnode.py for recent dirnode changes ] [Use DIR-IMM and t=mkdir-immutable for "tahoe backup", for #828 Brian Warner **20091118192813 Ignore-this: a4720529c9bc6bc8b22a3d3265925491 ] [web/directory.py: use "DIR-IMM" to describe immutable directories, not DIR-RO Brian Warner **20091118191832 Ignore-this: aceafd6ab4bf1cc0c2a719ef7319ac03 ] [web/info.py: hush pyflakes Brian Warner **20091118191736 Ignore-this: edc5f128a2b8095fb20686a75747c8 ] [make get_size/get_current_size consistent for all IFilesystemNode classes Brian Warner **20091118191624 Ignore-this: bd3449cf96e4827abaaf962672c1665a * stop caching most_recent_size in dirnode, rely upon backing filenode for it * start caching most_recent_size in MutableFileNode * return None when you don't know, not "?" * only render None as "?" in the web "more info" page * add get_size/get_current_size to UnknownNode ] [ImmutableDirectoryURIVerifier: fix verifycap handling Brian Warner **20091118164238 Ignore-this: 6bba5c717b54352262eabca6e805d590 ] [Add t=mkdir-immutable to the webapi. Closes #607. Brian Warner **20091118070900 Ignore-this: 311e5fab9a5f28b9e8a28d3d08f3c0d * change t=mkdir-with-children to not use multipart/form encoding. Instead, the request body is all JSON. t=mkdir-immutable uses this format too. 
* make nodemaker.create_immutable_dirnode() get convergence from SecretHolder, but let callers override it * raise NotDeepImmutableError instead of using assert() * add mutable= argument to DirectoryNode.create_subdirectory(), default True ] [move convergence secret into SecretHolder, next to lease secret Brian Warner **20091118015444 Ignore-this: 312f85978a339f2d04deb5bcb8f511bc ] [nodemaker: implement immutable directories (internal interface), for #607 Brian Warner **20091112002233 Ignore-this: d09fccf41813fdf7e0db177ed9e5e130 * nodemaker.create_from_cap() now handles DIR2-CHK and DIR2-LIT * client.create_immutable_dirnode() is used to create them * no webapi yet ] [stop using IURI()/etc as an adapter Brian Warner **20091111224542 Ignore-this: 9611da7ea6a4696de2a3b8c08776e6e0 ] [clean up uri-vs-cap terminology, emphasize cap instances instead of URI strings Brian Warner **20091111222619 Ignore-this: 93626385f6e7f039ada71f54feefe267 * "cap" means a python instance which encapsulates a filecap/dircap (uri.py) * "uri" means a string with a "URI:" prefix * FileNode instances are created with (and retain) a cap instance, and generate uri strings on demand * .get_cap/get_readcap/get_verifycap/get_repaircap return cap instances * .get_uri/get_readonly_uri return uri strings * add filenode.download_to_filename() for control.py, should find a better way * use MutableFileNode.init_from_cap, not .init_from_uri * directory URI instances: use get_filenode_cap, not get_filenode_uri * update/cleanup bench_dirnode.py to match, add Makefile target to run it ] [add parser for immutable directory caps: DIR2-CHK, DIR2-LIT, DIR2-CHK-Verifier Brian Warner **20091104181351 Ignore-this: 854398cc7a75bada57fa97c367b67518 ] [wui: s/TahoeLAFS/Tahoe-LAFS/ zooko@zooko.com**20091029035050 Ignore-this: 901e64cd862e492ed3132bd298583c26 ] [docs: remove obsolete doc file "codemap.txt" zooko@zooko.com**20091113163033 Ignore-this: 16bc21a1835546e71d1b344c06c61ebb I started to update this to reflect 
the current codebase, but then I thought (a) nobody seemed to notice that it hasn't been updated since December 2007, and (b) it will just bit-rot again, so I'm removing it. ] [dirnode.pack_children(): add deep_immutable= argument Brian Warner **20091026162809 Ignore-this: d5a2371e47662c4bc6eff273e8181b00 This will be used by DIR2:CHK to enforce the deep-immutability requirement. ] [webapi: use t=mkdir-with-children instead of a children= arg to t=mkdir . Brian Warner **20091026011321 Ignore-this: 769cab30b6ab50db95000b6c5a524916 This is safer: in the earlier API, an old webapi server would silently ignore the initial children, and clients trying to set them would have to fetch the newly-created directory to discover the incompatibility. In the new API, clients using t=mkdir-with-children against an old webapi server will get a clear error. ] [tests: bump up the timeout on test_repairer to see if 120 seconds was too short for François's ARM box to do the test even when it was doing it right. zooko@zooko.com**20091027224800 Ignore-this: 95e93dc2e018b9948253c2045d506f56 ] [nodemaker.create_new_mutable_directory: pack_children() in initial_contents= Brian Warner **20091020005118 Ignore-this: bd43c4eefe06fd32b7492bcb0a55d07e instead of creating an empty file and then adding the children later. This should speed up mkdir(initial_children) considerably, removing two roundtrips and an entire read-modify-write cycle, probably bringing it down to a single roundtrip. A quick test (against the volunteergrid) suggests a 30% speedup. 
test_dirnode: add new tests to enforce the restrictions that interfaces.py claims for create_new_mutable_directory(): no UnknownNodes, metadata dicts ] [test_dirnode.py: add tests of initial_children= args to client.create_dirnode Brian Warner **20091017194159 Ignore-this: 2e2da28323a4d5d815466387914abc1b and nodemaker.create_new_mutable_directory ] [update many dirnode interfaces to accept dict-of-nodes instead of dict-of-caps Brian Warner **20091017192829 Ignore-this: b35472285143862a856bf4b361d692f0 interfaces.py: define INodeMaker, document argument values, change create_new_mutable_directory() to take dict-of-nodes. Change dirnode.set_nodes() and dirnode.create_subdirectory() too. nodemaker.py: use INodeMaker, update create_new_mutable_directory() client.py: have create_dirnode() delegate initial_children= to nodemaker dirnode.py (Adder): take dict-of-nodes instead of list-of-nodes, which updates set_nodes() and create_subdirectory() web/common.py (convert_initial_children_json): create dict-of-nodes web/directory.py: same web/unlinked.py: same test_dirnode.py: update tests to match ] [dirnode.py: move pack_children() out to a function, for eventual use by others Brian Warner **20091017180707 Ignore-this: 6a823fb61f2c180fd38d6742d3196a7a ] [move dirnode.CachingDict to dictutil.AuxValueDict, generalize method names, Brian Warner **20091017180005 Ignore-this: b086933cf429df0fcea16a308d2640dd improve tests. Let dirnode _pack_children accept either dict or AuxValueDict. ] [test/common.py: update FakeMutableFileNode to new contents= callable scheme Brian Warner **20091013052154 Ignore-this: 62f00a76454a2190d1c8641c5993632f ] [The initial_children= argument to nodemaker.create_new_mutable_directory is Brian Warner **20091013031922 Ignore-this: 72e45317c21f9eb9ec3bd79bd4311f48 now enabled. 
] [client.create_mutable_file(contents=) now accepts a callable, which is Brian Warner **20091013031232 Ignore-this: 3c89d2f50c1e652b83f20bd3f4f27c4b invoked with the new MutableFileNode and is supposed to return the initial contents. This can be used by e.g. a new dirnode which needs the filenode's writekey to encrypt its initial children. create_mutable_file() still accepts a bytestring too, or None for an empty file. ] [webapi: t=mkdir now accepts initial children, using the same JSON that t=json Brian Warner **20091013023444 Ignore-this: 574a46ed46af4251abf8c9580fd31ef7 emits. client.create_dirnode(initial_children=) now works. ] [replace dirnode.create_empty_directory() with create_subdirectory(), which Brian Warner **20091013021520 Ignore-this: 6b57cb51bcfcc6058d0df569fdc8a9cf takes an initial_children= argument ] [dirnode.set_children: change return value: fire with self instead of None Brian Warner **20091013015026 Ignore-this: f1d14e67e084e4b2a4e25fa849b0e753 ] [dirnode.set_nodes: change return value: fire with self instead of None Brian Warner **20091013014546 Ignore-this: b75b3829fb53f7399693f1c1a39aacae ] [dirnode.set_children: take a dict, not a list Brian Warner **20091013002440 Ignore-this: 540ce72ce2727ee053afaae1ff124e21 ] [dirnode.set_uri/set_children: change signature to take writecap+readcap Brian Warner **20091012235126 Ignore-this: 5df617b2d379a51c79148a857e6026b1 instead of a single cap. The webapi t=set_children call benefits too. ] [replace Client.create_empty_dirnode() with create_dirnode(), in anticipation Brian Warner **20091012224506 Ignore-this: cbdaa4266ecb3c6496ffceab4f95709d of adding initial_children= argument. Includes stubbed-out initial_children= support. 
] [test_web.py: use a less-fake client, making test harness smaller Brian Warner **20091012222808 Ignore-this: 29e95147f8c94282885c65b411d100bb ] [webapi.txt: document t=set_children, other small edits Brian Warner **20091009200446 Ignore-this: 4d7e76b04a7b8eaa0a981879f778ea5d ] [Verifier: check the full crypttext-hash tree on each share. Removed .todos Brian Warner **20091005221849 Ignore-this: 6fb039c5584812017d91725e687323a5 from the last few test_repairer tests that were waiting on this. ] [Verifier: check the full block-hash-tree on each share Brian Warner **20091005214844 Ignore-this: 3f7ccf6d253f32340f1bf1da27803eee Removed the .todo from two test_repairer tests that check this. The only remaining .todos are on the three crypttext-hash-tree tests. ] [Verifier: check the full share-hash chain on each share Brian Warner **20091005213443 Ignore-this: 3d30111904158bec06a4eac22fd39d17 Removed the .todo from two test_repairer tests that check this. ] [test_repairer: rename Verifier test cases to be more precise and less verbose Brian Warner **20091005201115 Ignore-this: 64be7094e33338c7c2aea9387e138771 ] [immutable/checker.py: rearrange code a little bit, make it easier to follow Brian Warner **20091005200252 Ignore-this: 91cc303fab66faf717433a709f785fb5 ] [test/common.py: wrap docstrings to 80cols so I can read them more easily Brian Warner **20091005200143 Ignore-this: b180a3a0235cbe309c87bd5e873cbbb3 ] [immutable/download.py: wrap to 80cols, no functional changes Brian Warner **20091005192542 Ignore-this: 6b05fe3dc6d78832323e708b9e6a1fe ] [CHK-hashes.svg: cross out plaintext hashes, since we don't include Brian Warner **20091005010803 Ignore-this: bea2e953b65ec7359363aa20de8cb603 them (until we finish #453) ] [docs: a few licensing clarifications requested by Ubuntu zooko@zooko.com**20090927033226 Ignore-this: 749fc8c9aeb6dc643669854a3e81baa7 ] [setup: remove binary WinFUSE modules zooko@zooko.com**20090924211436 Ignore-this: 8aefc571d2ae22b9405fc650f2c2062 I 
would prefer to have just source code, or indications of what 3rd-party packages are required, under revision control, and have the build process generate or acquire the binaries as needed. Also, having these in our release tarballs is interfering with getting Tahoe-LAFS uploaded into Ubuntu Karmic. (Technically, they would accept binary modules as long as they came with the accompanying source so that they could satisfy their obligations under GPL2+ and TGPPL1+, but it is easier for now to remove the binaries from the source tree.) In this case, the binaries are from the tahoe-w32-client project: http://allmydata.org/trac/tahoe-w32-client , from which you can also get the source. ] [setup: remove binary _fusemodule.so 's zooko@zooko.com**20090924211130 Ignore-this: 74487bbe27d280762ac5dd5f51e24186 I would prefer to have just source code, or indications of what 3rd-party packages are required, under revision control, and have the build process generate or acquire the binaries as needed. Also, having these in our release tarballs is interfering with getting Tahoe-LAFS uploaded into Ubuntu Karmic. (Technically, they would accept binary modules as long as they came with the accompanying source so that they could satisfy their obligations under GPL2+ and TGPPL1+, but it is easier for now to remove the binaries from the source tree.) In this case, these modules come from the MacFUSE project: http://code.google.com/p/macfuse/ ] [doc: add a copy of LGPL2 for documentation purposes for ubuntu zooko@zooko.com**20090924054218 Ignore-this: 6a073b48678a7c84dc4fbcef9292ab5b ] [setup: remove a convenience copy of figleaf, to ease inclusion into Ubuntu Karmic Koala zooko@zooko.com**20090924053215 Ignore-this: a0b0c990d6e2ee65c53a24391365ac8d We need to carefully document the licence of figleaf in order to get Tahoe-LAFS into Ubuntu Karmic Koala. However, figleaf isn't really a part of Tahoe-LAFS per se -- this is just a "convenience copy" of a development tool. 
The quickest way to make Tahoe-LAFS acceptable for Karmic then, is to remove figleaf from the Tahoe-LAFS tarball itself. People who want to run figleaf on Tahoe-LAFS (as everyone should want) can install figleaf themselves. I haven't tested this -- there may be incompatibilities between upstream figleaf and the copy that we had here... ] [setup: shebang for misc/build-deb.py to fail quickly zooko@zooko.com**20090819135626 Ignore-this: 5a1b893234d2d0bb7b7346e84b0a6b4d Without this patch, when I ran "chmod +x ./misc/build-deb.py && ./misc/build-deb.py" then it hung indefinitely. (I wonder what it was doing.) ] [docs: Shawn Willden grants permission for his contributions under GPL2+|TGPPL1+ zooko@zooko.com**20090921164651 Ignore-this: ef1912010d07ff2ffd9678e7abfd0d57 ] [docs: Csaba Henk granted permission to license fuse.py under the same terms as Tahoe-LAFS itself zooko@zooko.com**20090921154659 Ignore-this: c61ba48dcb7206a89a57ca18a0450c53 ] [setup: mark setup.py as having utf-8 encoding in it zooko@zooko.com**20090920180343 Ignore-this: 9d3850733700a44ba7291e9c5e36bb91 ] [doc: licensing cleanups zooko@zooko.com**20090920171631 Ignore-this: 7654f2854bf3c13e6f4d4597633a6630 Use nice utf-8 © instead of "(c)". Remove licensing statements on utility modules that have been assigned to allmydata.com by their original authors. (Nattraverso was not assigned to allmydata.com -- it was LGPL'ed -- but I checked and src/allmydata/util/iputil.py was completely rewritten and doesn't contain any line of code from nattraverso.) Add notes to misc/debian/copyright about licensing on files that aren't just allmydata.com-licensed. ] [build-deb.py: run darcsver early, otherwise we get the wrong version later on Brian Warner **20090918033620 Ignore-this: 6635c5b85e84f8aed0d8390490c5392a ] [new approach for debian packaging, sharing pieces across distributions. Still experimental, still only works for sid. 
warner@lothar.com**20090818190527 Ignore-this: a75eb63db9106b3269badbfcdd7f5ce1 ] [new experimental deb-packaging rules. Only works for sid so far. Brian Warner **20090818014052 Ignore-this: 3a26ad188668098f8f3cc10a7c0c2f27 ] [setup.py: read _version.py and pass to setup(version=), so more commands work Brian Warner **20090818010057 Ignore-this: b290eb50216938e19f72db211f82147e like "setup.py --version" and "setup.py --fullname" ] [test/check_speed.py: fix shebang line Brian Warner **20090818005948 Ignore-this: 7f3a37caf349c4c4de704d0feb561f8d ] [de-Service-ify Helper, pass in storage_broker and secret_holder directly. Brian Warner **20090815201737 Ignore-this: 86b8ac0f90f77a1036cd604dd1304d8b This makes it more obvious that the Helper currently generates leases with the Helper's own secrets, rather than getting values from the client, which is arguably a bug that will likely be resolved with the Accounting project. ] [immutable.Downloader: pass StorageBroker to constructor, stop being a Service Brian Warner **20090815192543 Ignore-this: af5ab12dbf75377640a670c689838479 child of the client, access with client.downloader instead of client.getServiceNamed("downloader"). The single "Downloader" instance is scheduled for demolition anyways, to be replaced by individual filenode.download calls. ] [tests: double the timeout on test_runner.RunNode.test_introducer since feisty hit a timeout zooko@zooko.com**20090815160512 Ignore-this: ca7358bce4bdabe8eea75dedc39c0e67 I'm not sure if this is an actual timing issue (feisty is running on an overloaded VM if I recall correctly), or if there is a deeper bug. ] [stop making History be a Service, it wasn't necessary Brian Warner **20090815114415 Ignore-this: b60449231557f1934a751c7effa93cfe ] [Overhaul IFilesystemNode handling, to simplify tests and use POLA internally. 
Brian Warner **20090815112846 Ignore-this: 1db1b9c149a60a310228aba04c5c8e5f * stop using IURI as an adapter * pass cap strings around instead of URI instances * move filenode/dirnode creation duties from Client to new NodeMaker class * move other Client duties to KeyGenerator, SecretHolder, History classes * stop passing Client reference to dirnode/filenode constructors - pass less-powerful references instead, like StorageBroker or Uploader * always create DirectoryNodes by wrapping a filenode (mutable for now) * remove some specialized mock classes from unit tests Detailed list of changes (done one at a time, then merged together) always pass a string to create_node_from_uri(), not an IURI instance always pass a string to IFilesystemNode constructors, not an IURI instance stop using IURI() as an adapter, switch on cap prefix in create_node_from_uri() client.py: move SecretHolder code out to a separate class test_web.py: hush pyflakes client.py: move NodeMaker functionality out into a separate object LiteralFileNode: stop storing a Client reference immutable Checker: remove Client reference, it only needs a SecretHolder immutable Upload: remove Client reference, leave SecretHolder and StorageBroker immutable Repairer: replace Client reference with StorageBroker and SecretHolder immutable FileNode: remove Client reference mutable.Publish: stop passing Client mutable.ServermapUpdater: get StorageBroker in constructor, not by peeking into Client reference MutableChecker: reference StorageBroker and History directly, not through Client mutable.FileNode: removed unused indirection to checker classes mutable.FileNode: remove Client reference client.py: move RSA key generation into a separate class, so it can be passed to the nodemaker move create_mutable_file() into NodeMaker test_dirnode.py: stop using FakeClient mockups, use NoNetworkGrid instead. This simplifies the code, but takes longer to run (17s instead of 6s). 
This should come down later when other cleanups make it possible to use simpler (non-RSA) fake mutable files for dirnode tests. test_mutable.py: clean up basedir names client.py: move create_empty_dirnode() into NodeMaker dirnode.py: get rid of DirectoryNode.create remove DirectoryNode.init_from_uri, refactor NodeMaker for customization, simplify test_web's mock Client to match stop passing Client to DirectoryNode, make DirectoryNode.create_with_mutablefile the normal DirectoryNode constructor, start removing client from NodeMaker remove Client from NodeMaker move helper status into History, pass History to web.Status instead of Client test_mutable.py: fix minor typo ] [setup: remove bundled version of darcsver-1.2.1 zooko@zooko.com**20090816233432 Ignore-this: 5357f26d2803db2d39159125dddb963a That version of darcsver emits a scary error message when the darcs executable or the _darcs subdirectory is not found. This error is hidden (unless the --loud option is passed) in darcsver >= 1.3.1. Fixes #788. 
] [docs: edits for docs/running.html from Sam Mason zooko@zooko.com**20090809201416 Ignore-this: 2207e80449943ebd4ed50cea57c43143 ] [docs: install.html: instruct Debian users to use this document and not to go find the DownloadDebianPackages page, ignore the warning at the top of it, and try it zooko@zooko.com**20090804123840 Ignore-this: 49da654f19d377ffc5a1eff0c820e026 http://allmydata.org/pipermail/tahoe-dev/2009-August/002507.html ] [docs: about.html: fix English usage noticed by Amber zooko@zooko.com**20090802050533 Ignore-this: 89965c4650f9bd100a615c401181a956 ] [docs: fix mis-spelled word in about.html zooko@zooko.com**20090802050320 Ignore-this: fdfd0397bc7cef9edfde425dddeb67e5 ] [docs: relnotes.txt: reflow to 63 chars wide because google groups and some web forms seem to wrap to that zooko@zooko.com**20090802135016 Ignore-this: 53b1493a0491bc30fb2935fad283caeb ] [TAG allmydata-tahoe-1.5.0 zooko@zooko.com**20090802031303 Ignore-this: 94e5558e7225c39a86aae666ea00f166 ] Patch bundle hash: b1544ff825b87b13e124e9e8a632a69351fb4d9a