1 patch for repository http://tahoe-lafs.org/source/tahoe/trunk:

Tue Sep 20 18:17:37 BST 2011  david-sarah@jacaranda.org
  * docs/backends: document the configuration options for the pluggable backends scheme. refs #999

New patches:

[docs/backends: document the configuration options for the pluggable backends scheme. refs #999
david-sarah@jacaranda.org**20110920171737
 Ignore-this: 5947e864682a43cb04e557334cda7c19
] {
adddir ./docs/backends
addfile ./docs/backends/S3.rst
hunk ./docs/backends/S3.rst 1
+====================================================
+Storing Shares in Amazon Simple Storage Service (S3)
+====================================================
+
+S3 is a commercial storage service provided by Amazon, described at
+``_.
+
+The Tahoe-LAFS storage server can be configured to store its shares in
+an S3 bucket, rather than on the local filesystem. To enable this, add the
+following keys to the server's ``tahoe.cfg`` file:
+
+``[storage]``
+
+``backend = s3``
+
+    This turns off the local filesystem backend and enables use of S3.
+
+``s3.access_key_id = (string, required)``
+``s3.secret_access_key = (string, required)``
+
+    These two give the storage server permission to access your Amazon
+    Web Services account, allowing it to upload and download shares
+    from S3.
+
+``s3.bucket = (string, required)``
+
+    This controls which bucket will be used to hold shares. The Tahoe-LAFS
+    storage server will only modify and access objects in the configured S3
+    bucket.
+
+``s3.url = (URL string, optional)``
+
+    This URL tells the storage server how to access the S3 service. It
+    defaults to ``http://s3.amazonaws.com``, but by setting it to something
+    else, you may be able to use some other S3-like service if it is
+    sufficiently compatible.
+
+``s3.max_space = (str, optional)``
+
+    This tells the server to limit how much space can be used in the S3
+    bucket. Before each share is uploaded, the server will ask S3 for the
+    current bucket usage, and will only accept the share if it does not cause
+    the usage to grow above this limit.
+
+    The string contains a number, with an optional case-insensitive scale
+    suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So
+    "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the
+    same thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same
+    thing.
+
+    If ``s3.max_space`` is omitted, the default behavior is to allow
+    unlimited usage.
+
+
+Once configured, the WUI "storage server" page will provide information about
+how much space is being used and how many shares are being stored.
+
+
+Issues
+------
+
+Objects in an S3 bucket cannot be read for free. As a result, when Tahoe-LAFS
+is configured to store shares in S3 rather than on local disk, some common
+operations may behave differently:
+
+* Lease crawling/expiration is not yet implemented. As a result, shares will
+  be retained forever, and the Storage Server status web page will not show
+  information about the number of mutable/immutable shares present.
+
+* Enabling ``s3.max_space`` causes an extra S3 usage query to be sent for
+  each share upload, causing the upload process to run slightly slower and
+  incur more S3 request charges.
addfile ./docs/backends/disk.rst
hunk ./docs/backends/disk.rst 1
+====================================
+Storing Shares on a Local Filesystem
+====================================
+
+The "disk" backend stores shares on the local filesystem. Versions of
+Tahoe-LAFS <= 1.9.0 always stored shares in this way.
+
+``[storage]``
+
+``backend = disk``
+
+    This enables use of the disk backend, and is the default.
+
+``reserved_space = (str, optional)``
+
+    If provided, this value defines how much disk space is reserved: the
+    storage server will not accept any share that causes the amount of free
+    disk space to drop below this value. (The free space is measured by a
+    call to statvfs(2) on Unix, or GetDiskFreeSpaceEx on Windows, and is the
+    space available to the user account under which the storage server runs.)
+
+    This string contains a number, with an optional case-insensitive scale
+    suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So
+    "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the
+    same thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same
+    thing.
+
+    "``tahoe create-node``" generates a tahoe.cfg with
+    "``reserved_space=1G``", but you may wish to raise, lower, or remove the
+    reservation to suit your needs.
+
+``expire.enabled =``
+
+``expire.mode =``
+
+``expire.override_lease_duration =``
+
+``expire.cutoff_date =``
+
+``expire.immutable =``
+
+``expire.mutable =``
+
+    These settings control garbage collection, causing the server to
+    delete shares that no longer have an up-to-date lease on them. Please
+    see ``_ for full details.
hunk ./docs/configuration.rst 412
 `_ for the current status of this bug. The default value is ``False``.
-``reserved_space = (str, optional)``
+``backend = (string, optional)``
hunk ./docs/configuration.rst 414
-    If provided, this value defines how much disk space is reserved: the
-    storage server will not accept any share that causes the amount of free
-    disk space to drop below this value. (The free space is measured by a
-    call to statvfs(2) on Unix, or GetDiskFreeSpaceEx on Windows, and is the
-    space available to the user account under which the storage server runs.)
+    Storage servers can store their data in different "backends". Clients
+    need not be aware of which backend is used by a server. The default
+    value is ``disk``.
hunk ./docs/configuration.rst 418
-    This string contains a number, with an optional case-insensitive scale
-    suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So
-    "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the
-    same thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same
-    thing.
+``backend = disk``
hunk ./docs/configuration.rst 420
-    "``tahoe create-node``" generates a tahoe.cfg with
-    "``reserved_space=1G``", but you may wish to raise, lower, or remove the
-    reservation to suit your needs.
+    The default is to store shares on the local filesystem (in
+    BASEDIR/storage/shares/). For configuration details (including how to
+    reserve a minimum amount of free space), see ``_.
hunk ./docs/configuration.rst 424
-``expire.enabled =``
+``backend = s3``
hunk ./docs/configuration.rst 426
-``expire.mode =``
-
-``expire.override_lease_duration =``
-
-``expire.cutoff_date =``
-
-``expire.immutable =``
-
-``expire.mutable =``
-
-    These settings control garbage collection, in which the server will
-    delete shares that no longer have an up-to-date lease on them. Please see
-    ``_ for full details.
+    The storage server can store all shares in an Amazon Simple Storage
+    Service (S3) bucket. For configuration details, see ``_.
 Running A Helper
}

Context:

[Make platform-detection code tolerate linux-3.0, patch by zooko.
Brian Warner **20110915202620
 Ignore-this: af63cf9177ae531984dea7a1cad03762
 
 Otherwise address-autodetection can't find ifconfig. refs #1536
]
[test_web.py: fix a bug in _count_leases that was causing us to check only the lease count of one share file, not of all share files as intended.
david-sarah@jacaranda.org**20110915185126
 Ignore-this: d96632bc48d770b9b577cda1bbd8ff94
]
[docs: insert a newline at the beginning of known_issues.rst to see if this makes it render more nicely in trac
zooko@zooko.com**20110914064728
 Ignore-this: aca15190fa22083c5d4114d3965f5d65
]
[docs: remove the coding: utf-8 declaration at the to of known_issues.rst, since the trac rendering doesn't hide it
zooko@zooko.com**20110914055713
 Ignore-this: 941ed32f83ead377171aa7a6bd198fcf
]
[docs: more cleanup of known_issues.rst -- now it passes "rst2html --verbose" without comment
zooko@zooko.com**20110914055419
 Ignore-this: 5505b3d76934bd97d0312cc59ed53879
]
[docs: more formatting improvements to known_issues.rst
zooko@zooko.com**20110914051639
 Ignore-this: 9ae9230ec9a38a312cbacaf370826691
]
[docs: reformatting of known_issues.rst
zooko@zooko.com**20110914050240
 Ignore-this: b8be0375079fb478be9d07500f9aaa87
]
[docs: fix formatting error in docs/known_issues.rst
zooko@zooko.com**20110914045909
 Ignore-this: f73fe74ad2b9e655aa0c6075acced15a
]
[merge Tahoe-LAFS v1.8.3 release announcement with trunk
zooko@zooko.com**20110913210544
 Ignore-this: 163f2c3ddacca387d7308e4b9332516e
]
[docs: release notes for Tahoe-LAFS v1.8.3
zooko@zooko.com**20110913165826
 Ignore-this: 84223604985b14733a956d2fbaeb4e9f
]
[tests: bump up the timeout in this test that fails on FreeStorm's CentOS in order to see if it is just very slow
zooko@zooko.com**20110913024255
 Ignore-this: 6a86d691e878cec583722faad06fb8e4
]
[interfaces: document that the 'fills-holes-with-zero-bytes' key should be used to detect whether a storage server has that behavior. refs #1528
david-sarah@jacaranda.org**20110913002843
 Ignore-this: 1a00a6029d40f6792af48c5578c1fd69
]
[CREDITS: more CREDITS for Kevan and David-Sarah
zooko@zooko.com**20110912223357
 Ignore-this: 4ea8f0d6f2918171d2f5359c25ad1ada
]
[merge NEWS about the mutable file bounds fixes with NEWS about work-in-progress
zooko@zooko.com**20110913205521
 Ignore-this: 4289a4225f848d6ae6860dd39bc92fa8
]
[doc: add NEWS item about fixes to potential palimpsest issues in mutable files
zooko@zooko.com**20110912223329
 Ignore-this: 9d63c95ddf95c7d5453c94a1ba4d406a
 ref. #1528
]
[merge the NEWS about the security fix (#1528) with the work-in-progress NEWS
zooko@zooko.com**20110913205153
 Ignore-this: 88e88a2ad140238c62010cf7c66953fc
]
[doc: add NEWS entry about the issue which allows unauthorized deletion of shares
zooko@zooko.com**20110912223246
 Ignore-this: 77e06d09103d2ef6bb51ea3e5d6e80b0
 ref. #1528
]
[doc: add entry in known_issues.rst about the issue which allows unauthorized deletion of shares
zooko@zooko.com**20110912223135
 Ignore-this: b26c6ea96b6c8740b93da1f602b5a4cd
 ref. #1528
]
[storage: more paranoid handling of bounds and palimpsests in mutable share files
zooko@zooko.com**20110912222655
 Ignore-this: a20782fa423779ee851ea086901e1507
 * storage server ignores requests to extend shares by sending a new_length
 * storage server fills exposed holes (created by sending a write vector whose offset begins after the end of the current data) with 0 to avoid "palimpsest" exposure of previous contents
 * storage server zeroes out lease info at the old location when moving it to a new location
 ref. #1528
]
[storage: test that the storage server ignores requests to extend shares by sending a new_length, and that the storage server fills exposed holes with 0 to avoid "palimpsest" exposure of previous contents
zooko@zooko.com**20110912222554
 Ignore-this: 61ebd7b11250963efdf5b1734a35271
 ref. #1528
]
[immutable: prevent clients from reading past the end of share data, which would allow them to learn the cancellation secret
zooko@zooko.com**20110912222458
 Ignore-this: da1ebd31433ea052087b75b2e3480c25
 Declare explicitly that we prevent this problem in the server's version dict.
 
 fixes #1528 (there are two patches that are each a sufficient fix to #1528 and this is one of them)
]
[storage: remove the storage server's "remote_cancel_lease" function
zooko@zooko.com**20110912222331
 Ignore-this: 1c32dee50e0981408576daffad648c50
 We're removing this function because it is currently unused, because it is dangerous, and because the bug described in #1528 leaks the cancellation secret, which allows anyone who knows a file's storage index to abuse this function to delete shares of that file.
 
 fixes #1528 (there are two patches that are each a sufficient fix to #1528 and this is one of them)
]
[storage: test that the storage server does *not* have a "remote_cancel_lease" function
zooko@zooko.com**20110912222324
 Ignore-this: 21c652009704652d35f34651f98dd403
 We're removing this function because it is currently unused, because it is dangerous, and because the bug described in #1528 leaks the cancellation secret, which allows anyone who knows a file's storage index to abuse this function to delete shares of that file.
 
 ref. #1528
]
[immutable: test whether the server allows clients to read past the end of share data, which would allow them to learn the cancellation secret
zooko@zooko.com**20110912221201
 Ignore-this: 376e47b346c713d37096531491176349
 Also test whether the server explicitly declares that it prevents this problem.
 
 ref #1528
]
[Retrieve._activate_enough_peers: rewrite Verify logic
Brian Warner **20110909181150
 Ignore-this: 9367c11e1eacbf025f75ce034030d717
]
[Retrieve: implement/test stopProducing
Brian Warner **20110909181150
 Ignore-this: 47b2c3df7dc69835e0a066ca12e3c178
]
[move DownloadStopped from download.common to interfaces
Brian Warner **20110909181150
 Ignore-this: 8572acd3bb16e50341dbed8eb1d90a50
]
[retrieve.py: remove vestigal self._validated_readers
Brian Warner **20110909181150
 Ignore-this: faab2ec14e314a53a2ffb714de626e2d
]
[Retrieve: rewrite flow-control: use a top-level loop() to catch all errors
Brian Warner **20110909181150
 Ignore-this: e162d2cd53b3d3144fc6bc757e2c7714
 
 This ought to close the potential for dropped errors and hanging downloads.
 Verify needs to be examined, I may have broken it, although all tests pass.
]
[Retrieve: merge _validate_active_prefixes into _add_active_peers
Brian Warner **20110909181150
 Ignore-this: d3ead31e17e69394ae7058eeb5beaf4c
]
[Retrieve: remove the initial prefix-is-still-good check
Brian Warner **20110909181150
 Ignore-this: da66ee51c894eaa4e862e2dffb458acc
 
 This check needs to be done with each fetch from the storage server, to
 detect when someone has changed the share (i.e. our servermap goes stale).
 Doing it just once at the beginning of retrieve isn't enough: a write might
 occur after the first segment but before the second, etc.
 
 _try_to_validate_prefix() was not removed: it will be used by the future
 check-with-each-fetch code.
 
 test_mutable.Roundtrip.test_corrupt_all_seqnum_late was disabled, since it
 fails until this check is brought back. (the corruption it applies only
 touches the prefix, not the block data, so the check-less retrieve actually
 tolerates it). Don't forget to re-enable it once the check is brought back.
]
[MDMFSlotReadProxy: remove the queue
Brian Warner **20110909181150
 Ignore-this: 96673cb8dda7a87a423de2f4897d66d2
 
 This is a neat trick to reduce Foolscap overhead, but the need for an
 explicit flush() complicates the Retrieve path and makes it prone to
 lost-progress bugs.
 
 Also change test_mutable.FakeStorageServer to tolerate multiple reads of the
 same share in a row, a limitation exposed by turning off the queue.
]
[rearrange Retrieve: first step, shouldn't change order of execution
Brian Warner **20110909181149
 Ignore-this: e3006368bfd2802b82ea45c52409e8d6
]
[CLI: test_cli.py -- remove an unnecessary call in test_mkdir_mutable_type. refs #1527
david-sarah@jacaranda.org**20110906183730
 Ignore-this: 122e2ffbee84861c32eda766a57759cf
]
[CLI: improve test for 'tahoe mkdir --mutable-type='. refs #1527
david-sarah@jacaranda.org**20110906183020
 Ignore-this: f1d4598e6c536f0a2b15050b3bc0ef9d
]
[CLI: make the --mutable-type option value for 'tahoe put' and 'tahoe mkdir' case-insensitive, and change --help for these commands accordingly. fixes #1527
david-sarah@jacaranda.org**20110905020922
 Ignore-this: 75a6df0a2df9c467d8c010579e9a024e
]
[cli: make --mutable-type imply --mutable in 'tahoe put'
Kevan Carstensen **20110903190920
 Ignore-this: 23336d3c43b2a9554e40c2a11c675e93
]
[SFTP: add a comment about a subtle interaction between OverwriteableFileConsumer and GeneralSFTPFile, and test the case it is commenting on.
david-sarah@jacaranda.org**20110903222304
 Ignore-this: 980c61d4dd0119337f1463a69aeebaf0
]
[improve the storage/mutable.py asserts even more
warner@lothar.com**20110901160543
 Ignore-this: 5b2b13c49bc4034f96e6e3aaaa9a9946
]
[storage/mutable.py: special characters in struct.foo arguments indicate standard as opposed to native sizes, we should be using these characters in these asserts
wilcoxjg@gmail.com**20110901084144
 Ignore-this: 28ace2b2678642e4d7269ddab8c67f30
]
[docs/write_coordination.rst: fix formatting and add more specific warning about access via sshfs.
david-sarah@jacaranda.org**20110831232148
 Ignore-this: cd9c851d3eb4e0a1e088f337c291586c
]
[test_mutable.Version: consolidate some tests, reduce runtime from 19s to 15s
warner@lothar.com**20110831050451
 Ignore-this: 64815284d9e536f8f3798b5f44cf580c
]
[mutable/retrieve: handle the case where self._read_length is 0.
Kevan Carstensen **20110830210141
 Ignore-this: fceafbe485851ca53f2774e5a4fd8d30
 
 Note that the downloader will still fetch a segment for a zero-length
 read, which is wasteful. Fixing that isn't specifically required to fix
 #1512, but it should probably be fixed before 1.9.
]
[NEWS: added summary of all changes since 1.8.2. Needs editing.
Brian Warner **20110830163205
 Ignore-this: 273899b37a899fc6919b74572454b8b2
]
[test_mutable.Update: only upload the files needed for each test. refs #1500
Brian Warner **20110829072717
 Ignore-this: 4d2ab4c7523af9054af7ecca9c3d9dc7
 
 This first step shaves 15% off the runtime: from 139s to 119s on my laptop.
 It also fixes a couple of places where a Deferred was being dropped, which
 would cause two tests to run in parallel and also confuse error reporting.
]
[Let Uploader retain History instead of passing it into upload(). Fixes #1079.
Brian Warner **20110829063246
 Ignore-this: 3902c58ec12bd4b2d876806248e19f17
 
 This consistently records all immutable uploads in the Recent Uploads And
 Downloads page, regardless of code path. Previously, certain webapi upload
 operations (like PUT /uri/$DIRCAP/newchildname) failed to pass the History
 object and were left out.
]
[Fix mutable publish/retrieve timing status displays. Fixes #1505.
Brian Warner **20110828232221
 Ignore-this: 4080ce065cf481b2180fd711c9772dd6
 
 publish:
 * encrypt and encode times are cumulative, not just current-segment
 
 retrieve:
 * same for decrypt and decode times
 * update "current status" to include segment number
 * set status to Finished/Failed when download is complete
 * set progress to 1.0 when complete
 
 More improvements to consider:
 * progress is currently 0% or 100%: should calculate how many segments are
   involved (remembering retrieve can be less than the whole file) and set it
   to a fraction
 * "fetch" time is fuzzy: what we want is to know how much of the delay is not
   our own fault, but since we do decode/decrypt work while waiting for more
   shares, it's not straightforward
]
[Teach 'tahoe debug catalog-shares about MDMF. Closes #1507.
Brian Warner **20110828080931
 Ignore-this: 56ef2951db1a648353d7daac6a04c7d1
]
[debug.py: remove some dead comments
Brian Warner **20110828074556
 Ignore-this: 40e74040dd4d14fd2f4e4baaae506b31
]
[hush pyflakes
Brian Warner **20110828074254
 Ignore-this: bef9d537a969fa82fe4decc4ba2acb09
]
[MutableFileNode.set_downloader_hints: never depend upon order of dict.values()
Brian Warner **20110828074103
 Ignore-this: caaf1aa518dbdde4d797b7f335230faa
 
 The old code was calculating the "extension parameters" (a list) from the
 downloader hints (a dictionary) with hints.values(), which is not stable, and
 would result in corrupted filecaps (with the 'k' and 'segsize' hints
 occasionally swapped). The new code always uses [k,segsize].
]
[layout.py: fix MDMF share layout documentation
Brian Warner **20110828073921
 Ignore-this: 3f13366fed75b5e31b51ae895450a225
]
[teach 'tahoe debug dump-share' about MDMF and offsets. refs #1507
Brian Warner **20110828073834
 Ignore-this: 3a9d2ef9c47a72bf1506ba41199a1dea
]
[test_mutable.Version.test_debug: use splitlines() to fix buildslaves
Brian Warner **20110828064728
 Ignore-this: c7f6245426fc80b9d1ae901d5218246a
 
 Any slave running in a directory with spaces in the name was miscounting
 shares, causing the test to fail.
]
[test_mutable.Version: exercise 'tahoe debug find-shares' on MDMF. refs #1507
Brian Warner **20110828005542
 Ignore-this: cb20bea1c28bfa50a72317d70e109672
 
 Also changes NoNetworkGrid to put shares in storage/shares/ .
]
[test_mutable.py: oops, missed a .todo
Brian Warner **20110828002118
 Ignore-this: fda09ae86481352b7a627c278d2a3940
]
[test_mutable: merge davidsarah's patch with my Version refactorings
warner@lothar.com**20110827235707
 Ignore-this: b5aaf481c90d99e33827273b5d118fd0
]
[Make the immutable/read-only constraint checking for MDMF URIs identical to that for SSK URIs. refs #393
david-sarah@jacaranda.org**20110823012720
 Ignore-this: e1f59d7ff2007c81dbef2aeb14abd721
]
[Additional tests for MDMF URIs and for zero-length files. refs #393
david-sarah@jacaranda.org**20110823011532
 Ignore-this: a7cc0c09d1d2d72413f9cd227c47a9d5
]
[Additional tests for zero-length partial reads and updates to mutable versions. refs #393
david-sarah@jacaranda.org**20110822014111
 Ignore-this: 5fc6f4d06e11910124e4a277ec8a43ea
]
[test_mutable.Version: factor out some expensive uploads, save 25% runtime
Brian Warner **20110827232737
 Ignore-this: ea37383eb85ea0894b254fe4dfb45544
]
[SDMF: update filenode with correct k/N after Retrieve. Fixes #1510.
Brian Warner **20110827225031
 Ignore-this: b50ae6e1045818c400079f118b4ef48
 
 Without this, we get a regression when modifying a mutable file that was
 created with more shares (larger N) than our current tahoe.cfg . The
 modification attempt creates new versions of the (0,1,..,newN-1) shares, but
 leaves the old versions of the (newN,..,oldN-1) shares alone (and throws a
 assertion error in SDMFSlotWriteProxy.finish_publishing in the process).
 
 The mixed versions that result (some shares with e.g. N=10, some with N=20,
 such that both versions are recoverable) cause problems for the Publish code,
 even before MDMF landed. Might be related to refs #1390 and refs #1042.
]
[layout.py: annotate assertion to figure out 'tahoe backup' failure
Brian Warner **20110827195253
 Ignore-this: 9b92b954e3ed0d0f80154fff1ff674e5
]
[Add 'tahoe debug dump-cap' support for MDMF, DIR2-CHK, DIR2-MDMF. refs #1507.
Brian Warner **20110827195048
 Ignore-this: 61c6af5e33fc88e0251e697a50addb2c
 
 This also adds tests for all those cases, and fixes an omission in uri.py
 that broke parsing of DIR2-MDMF-Verifier and DIR2-CHK-Verifier.
]
[MDMF: more writable/writeable consistentifications
warner@lothar.com**20110827190602
 Ignore-this: 22492a9e20c1819ddb12091062888b55
]
[MDMF: s/Writable/Writeable/g, for consistency with existing SDMF code
warner@lothar.com**20110827183357
 Ignore-this: 9dd312acedbdb2fc2f7bef0d0fb17c0b
]
[setup.cfg: remove no-longer-supported test_mac_diskimage alias. refs #1479
david-sarah@jacaranda.org**20110826230345
 Ignore-this: 40e908b8937322a290fb8012bfcad02a
]
[test_mutable.Update: increase timeout from 120s to 400s, slaves are failing
Brian Warner **20110825230140
 Ignore-this: 101b1924a30cdbda9b2e419e95ca15ec
]
[tests: fix check_memory test
zooko@zooko.com**20110825201116
 Ignore-this: 4d66299fa8cb61d2ca04b3f45344d835
 fixes #1503
]
[TAG allmydata-tahoe-1.9.0a1
warner@lothar.com**20110825161122
 Ignore-this: 3cbf49f00dbda58189f893c427f65605
]
Patch bundle hash:
79e2ac1239861ebdcc3f88bf8eb2a168ccb5cbe2
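
Reviewer note on the size strings documented in this patch for ``s3.max_space`` and ``reserved_space``: the rules as written (a number, an optional case-insensitive "K"/"M"/"G" scale, an optional "B" or "iB" suffix) can be illustrated with a short sketch. This is not the parser Tahoe-LAFS uses; the function name ``parse_size`` is hypothetical. It also assumes, since the docs do not say, that "iB" without a scale letter is an error.

```python
import re

# Hypothetical helper, written only to illustrate the documented format.
# Groups: digits, optional scale letter (K/M/G), optional "B" or "iB" unit.
_SIZE_RE = re.compile(r"^\s*(\d+)\s*([kmg]?)(i?b)?\s*$", re.IGNORECASE)

_EXPONENTS = {"": 0, "K": 1, "M": 2, "G": 3}


def parse_size(text):
    """Return the number of bytes described by a string such as '100MB'."""
    match = _SIZE_RE.match(text)
    if match is None:
        raise ValueError("unparseable size string: %r" % (text,))
    number, scale, unit = match.groups()
    scale = scale.upper()
    # "iB" selects powers of 1024; plain "B" or no unit selects powers of 1000.
    binary = unit is not None and unit.lower() == "ib"
    if binary and not scale:
        # Assumption: "5iB" is rejected; the docs do not cover this case.
        raise ValueError("'iB' needs a K, M, or G scale: %r" % (text,))
    base = 1024 if binary else 1000
    return int(number) * base ** _EXPONENTS[scale]


if __name__ == "__main__":
    for example in ("100MB", "100M", "100000000B", "100000kb", "1MiB", "1024KiB"):
        print(example, "->", parse_size(example))
```

Under these assumptions the five decimal examples from the docs all come out to 100000000 bytes, and the three binary examples all come out to 1048576 bytes.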