1 patch for repository /Users/warner2/stuff/tahoe/t4:

Sun Feb 27 18:10:56 PST 2011  warner@lothar.com
  * test_mutable.py: add test to exercise fencepost bug

New patches:

[test_mutable.py: add test to exercise fencepost bug
warner@lothar.com**20110228021056
 Ignore-this: d2f9cf237ce6db42fb250c8ad71a4fc3
] {
hunk ./src/allmydata/test/test_mutable.py 2
-import os
+import os, re
 from cStringIO import StringIO
 from twisted.trial import unittest
 from twisted.internet import defer, reactor
hunk ./src/allmydata/test/test_mutable.py 2931
         self.set_up_grid()
         self.c = self.g.clients[0]
         self.nm = self.c.nodemaker
-        self.data = "test data" * 100000 # about 900 KiB; MDMF
+        self.data = "testdata " * 100000 # about 900 KiB; MDMF
         self.small_data = "test data" * 10 # about 90 B; SDMF
         return self.do_upload()
hunk ./src/allmydata/test/test_mutable.py 2981
                       self.failUnlessEqual(results, new_data))
         return d
+    def test_replace_segstart1(self):
+        offset = 128*1024+1
+        new_data = "NNNN"
+        expected = self.data[:offset]+new_data+self.data[offset+4:]
+        d = self.mdmf_node.get_best_mutable_version()
+        d.addCallback(lambda mv:
+                      mv.update(MutableData(new_data), offset))
+        d.addCallback(lambda ignored:
+                      self.mdmf_node.download_best_version())
+        def _check(results):
+            if results != expected:
+                print
+                print "got: %s ... %s" % (results[:20], results[-20:])
+                print "exp: %s ... %s" % (expected[:20], expected[-20:])
+                self.fail("results != expected")
+        d.addCallback(_check)
+        return d
+
+    def _check_differences(self, got, expected):
+        # displaying arbitrary file corruption is tricky for a
+        # 1MB file of repeating data, so look for likely places
+        # with problems and display them separately
+        gotmods = [mo.span() for mo in re.finditer('([A-Z]+)', got)]
+        expmods = [mo.span() for mo in re.finditer('([A-Z]+)', expected)]
+        gotspans = ["%d:%d=%s" % (start,end,got[start:end])
+                    for (start,end) in gotmods]
+        expspans = ["%d:%d=%s" % (start,end,expected[start:end])
+                    for (start,end) in expmods]
+        #print "expecting: %s" % expspans
+
+        SEGSIZE = 128*1024
+        if got != expected:
+            print "differences:"
+            for segnum in range(len(expected)//SEGSIZE):
+                start = segnum * SEGSIZE
+                end = (segnum+1) * SEGSIZE
+                got_ends = "%s .. %s" % (got[start:start+20], got[end-20:end])
+                exp_ends = "%s .. %s" % (expected[start:start+20], expected[end-20:end])
+                if got_ends != exp_ends:
+                    print "expected[%d]: %s" % (start, exp_ends)
+                    print "got [%d]: %s" % (start, got_ends)
+            if expspans != gotspans:
+                print "expected: %s" % expspans
+                print "got : %s" % gotspans
+            open("EXPECTED","wb").write(expected)
+            open("GOT","wb").write(got)
+            print "wrote data to EXPECTED and GOT"
+            self.fail("didn't get expected data")
+
+
+    def test_replace_locations(self):
+        # exercise fencepost conditions
+        expected = self.data
+        SEGSIZE = 128*1024
+        suspects = range(SEGSIZE-3, SEGSIZE+1)+range(2*SEGSIZE-3, 2*SEGSIZE+1)
+        letters = iter("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
+        d = defer.succeed(None)
+        for offset in suspects:
+            new_data = letters.next()*2 # "AA", then "BB", etc
+            expected = expected[:offset]+new_data+expected[offset+2:]
+            d.addCallback(lambda ign:
+                          self.mdmf_node.get_best_mutable_version())
+            def _modify(mv, offset=offset, new_data=new_data):
+                # close over 'offset','new_data'
+                md = MutableData(new_data)
+                return mv.update(md, offset)
+            d.addCallback(_modify)
+            d.addCallback(lambda ignored:
+                          self.mdmf_node.download_best_version())
+            d.addCallback(self._check_differences, expected)
+        return d
+
     def test_replace_and_extend(self):
         # We should be able to replace data in the middle of a mutable
}

Context:

[web/filenode.py: avoid calling req.finish() on closed HTTP connections. Closes #1366
"Brian Warner "**20110221061544
 Ignore-this: 799d4de19933f2309b3c0c19a63bb888
]
[update MDMF code with StorageFarmBroker changes
"Brian Warner "**20110221061004
 Ignore-this: a693b201d31125b391cebe0412ddd027
]
[resolve more conflicts with current trunk
"Brian Warner "**20110221055600
 Ignore-this: 77ad038a478dbf5d9b34f7a68159a3e0
]
[Refactor StorageFarmBroker handling of servers
Brian Warner **20110221015804
 Ignore-this: 842144ed92f5717699b8f580eab32a51
 Pass around IServer instance instead of (peerid, rref) tuple. Replace
 "descriptor" with "server". Other replacements:
  get_all_servers -> get_connected_servers/get_known_servers
  get_servers_for_index -> get_servers_for_psi (now returns IServers)
 This change still needs to be pushed further down: lots of code is now
 getting the IServer and then distributing (peerid, rref) internally.
 Instead, it ought to distribute the IServer internally and delay
 extracting a serverid or rref until the last moment.
 no_network.py was updated to retain parallelism.
]
[Add unit tests for cross_check_pkg_resources_versus_import, and a regression test for ref #1355. This requires a little refactoring to make it testable.
david-sarah@jacaranda.org**20110221015817
 Ignore-this: 51d181698f8c20d3aca58b057e9c475a
]
[allmydata/__init__.py: .name was used in place of the correct .__name__ when printing an exception. Also, robustify string formatting by using %r instead of %s in some places. fixes #1355.
david-sarah@jacaranda.org**20110221020125
 Ignore-this: b0744ed58f161bf188e037bad077fc48
]
[mutable/filenode.py: fix create_mutable_file('string')
"Brian Warner "**20110221014659
 Ignore-this: dc6bdad761089f0199681eeb784f1001
]
[resolve conflicts between 393-MDMF patches and trunk as of 1.8.2
"Brian Warner "**20110220230201
 Ignore-this: 9bbf5d26c994e8069202331dcb4cdd95
]
[tests:
Kevan Carstensen **20100819003531
 Ignore-this: 314e8bbcce532ea4d5d2cecc9f31cca0
 - A lot of existing tests relied on aspects of the mutable file
   implementation that were changed. This patch updates those tests
   to work with the changes.
 - This patch also adds tests for new features.
]
[mutable/servermap.py: Alter the servermap updater to work with MDMF files
Kevan Carstensen **20100819003439
 Ignore-this: 7e408303194834bd59a2f27efab3bdb
 These modifications were basically all to the end of having the
 servermap updater use the unified MDMF + SDMF read interface whenever
 possible -- this reduces the complexity of the code, making it easier to
 read and maintain. To do this, I needed to modify the process of
 updating the servermap a little bit.
 To support partial-file updates, I also modified the servermap updater
 to fetch the block hash trees and certain segments of files while it
 performed a servermap update (this can be done without adding any new
 roundtrips because of batch-read functionality that the read proxy has).
]
[mutable/retrieve.py: Modify the retrieval process to support MDMF
Kevan Carstensen **20100819003409
 Ignore-this: c03f4e41aaa0366a9bf44847f2caf9db
 The logic behind a mutable file download had to be adapted to work with
 segmented mutable files; this patch performs those adaptations. It also
 exposes some decoding and decrypting functionality to make partial-file
 updates a little easier, and supports efficient random-access downloads
 of parts of an MDMF file.
]
[mutable/layout.py and interfaces.py: add MDMF writer and reader
Kevan Carstensen **20100819003304
 Ignore-this: 44400fec923987b62830da2ed5075fb4
 The MDMF writer is responsible for keeping state as plaintext is
 gradually processed into share data by the upload process. When the
 upload finishes, it will write all of its share data to a remote server,
 reporting its status back to the publisher.
 The MDMF reader is responsible for abstracting an MDMF file as it sits
 on the grid from the downloader; specifically, by receiving and
 responding to requests for arbitrary data within the MDMF file.
 The interfaces.py file has also been modified to contain an interface
 for the writer.
]
[docs: update docs to mention MDMF
Kevan Carstensen **20100814225644
 Ignore-this: 1c3caa3cd44831007dcfbef297814308
]
[nodemaker.py: Make nodemaker expose a way to create MDMF files
Kevan Carstensen **20100819003509
 Ignore-this: a6701746d6b992fc07bc0556a2b4a61d
]
[mutable/publish.py: Modify the publish process to support MDMF
Kevan Carstensen **20100819003342
 Ignore-this: 2bb379974927e2e20cff75bae8302d1d
 The inner workings of the publishing process needed to be reworked to a
 large extent to cope with segmented mutable files, and to cope with
 partial-file updates of mutable files. This patch does that. It also
 introduces wrappers for uploadable data, allowing the use of
 filehandle-like objects as data sources, in addition to strings. This
 reduces memory inefficiency when dealing with large files through the
 webapi, and clarifies update code there.
]
[mutable/filenode.py: add versions and partial-file updates to the mutable file node
Kevan Carstensen **20100819003231
 Ignore-this: b7b5434201fdb9b48f902d7ab25ef45c
 One of the goals of MDMF as a GSoC project is to lay the groundwork for
 LDMF, a format that will allow Tahoe-LAFS to deal with and encourage
 multiple versions of a single cap on the grid.
 In line with this, there is now a distinction between an overriding
 mutable file (which can be thought to correspond to the cap/unique
 identifier for that mutable file) and versions of the mutable file
 (which we can download, update, and so on). All download, upload, and
 modification operations end up happening on a particular version of a
 mutable file, but there are shortcut methods on the object representing
 the overriding mutable file that perform these operations on the best
 version of the mutable file (which is what code should be doing until we
 have LDMF and better support for other paradigms).
 Another goal of MDMF was to take advantage of segmentation to give
 callers more efficient partial file updates or appends. This patch
 implements methods that do that, too.
]
[mutable/checker.py and mutable/repair.py: Modify checker and repairer to work with MDMF
Kevan Carstensen **20100819003216
 Ignore-this: d3bd3260742be8964877f0a53543b01b
 The checker and repairer required minimal changes to work with the MDMF
 modifications made elsewhere. The checker duplicated a lot of the code
 that was already in the downloader, so I modified the downloader
 slightly to expose this functionality to the checker and removed the
 duplicated code. The repairer only required a minor change to deal with
 data representation.
]
[client.py: learn how to create different kinds of mutable files
Kevan Carstensen **20100814225711
 Ignore-this: 61ff665bc050cba5f58bf2ed779d692b
]
[web: Alter the webapi to get along with and take advantage of the MDMF changes
Kevan Carstensen **20100814081012
 Ignore-this: 96c2ed4e4a9f450fb84db5d711d10bd6
 The main benefit that the webapi gets from MDMF, at least initially, is
 the ability to do a streaming download of an MDMF mutable file. It also
 exposes a way (through the PUT verb) to append to or otherwise modify
 (in-place) an MDMF mutable file.
]
[scripts: tell 'tahoe put' about MDMF
Kevan Carstensen **20100813234957
 Ignore-this: c106b3384fc676bd3c0fb466d2a52b1b
]
[immutable/literal.py: implement the same interfaces as other filenodes
Kevan Carstensen **20100810000633
 Ignore-this: b50dd5df2d34ecd6477b8499a27aef13
]
[immutable/filenode.py: Make the immutable file node implement the same interfaces as the mutable one
Kevan Carstensen **20100810000619
 Ignore-this: 93e536c0f8efb705310f13ff64621527
]
[frontends/sftpd.py: Modify the sftp frontend to work with the MDMF changes
Kevan Carstensen **20100809233535
 Ignore-this: 2d25e2cfcd0d7bbcbba660c7e1da12f
]
[interfaces.py: Add #993 interfaces
Kevan Carstensen **20100809233244
 Ignore-this: b58621ac5cc86f1b4b4149f9e6c6a1ce
]
[TAG allmydata-tahoe-1.8.2
warner@lothar.com**20110131020101]
Patch bundle hash:
85ba2dfc67d9e255e8b82316f147ac92a2b7896e
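
Reviewer note: the sketch below reproduces, outside Tahoe and Twisted, how test_replace_locations builds its expected string and how _check_differences summarizes the modified regions. It is an illustration only, written for Python 3 rather than the Python 2 of the patch. The helper names (suspect_offsets, apply_overwrites, modified_spans) are hypothetical and do not appear in the patch. Only SEGSIZE, the "testdata " payload, the offset ranges, and the [A-Z]+ pattern are taken from it.

```python
import re

SEGSIZE = 128 * 1024  # segment size assumed by the tests in the patch


def suspect_offsets(segsize=SEGSIZE, boundaries=(1, 2)):
    """Offsets from 3 bytes before each segment boundary up to the boundary."""
    offsets = []
    for n in boundaries:
        offsets.extend(range(n * segsize - 3, n * segsize + 1))
    return offsets


def apply_overwrites(data, offsets, width=2):
    """Overwrite `width` characters at each offset with "AA", "BB", etc."""
    letters = iter("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
    for offset in offsets:
        new_data = next(letters) * width
        data = data[:offset] + new_data + data[offset + width:]
    return data


def modified_spans(data):
    """Describe each run of uppercase letters as "start:end=text"."""
    return ["%d:%d=%s" % (mo.start(), mo.end(), mo.group(0))
            for mo in re.finditer("[A-Z]+", data)]


data = "testdata " * 100000  # lowercase payload, as in the patched setUp
expected = apply_overwrites(data, suspect_offsets())
spans = modified_spans(expected)
print(spans)
```

Because consecutive two-byte writes overlap by one byte, each group of four writes collapses into a single five-character uppercase run that straddles a segment boundary. A lowercase payload keeps those runs easy to locate with the regular expression.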