error doing a check --verify on files bigger than about 1Gbyte #1395

Closed
opened 2011-05-05 20:02:27 +00:00 by sickness · 35 comments
sickness commented 2011-05-05 20:02:27 +00:00
Owner

on a grid with 10 storage nodes, all on win32, and a gateway/client on win32, with all the nodes up and all the shares ok, I can download all the files ok, and check all of them ok.
only thing that doesn't work is check --verify of files bigger than 1Gbyte

here's the error log of the gateway, and after it of a storage node:

######################################################################

2011-05-05 20:33:34+0200 [HTTPChannel,42,10.100.9.71] [Failure instance: Traceback: <type 'exceptions.IOError'>: [Errno 22] invalid mode ('wb') or filename: u'C:\\tahoeclient\\logs\\incidents\\incident-2011-05-05--18:33:34Z-w2qn32q.flog'
2011-05-05 20:33:34+0200 [HTTPChannel,42,10.100.9.71] c:\tahoe-lafs\support\Lib\site-packages\twisted-8.2.0-py2.6-win32.egg\twisted\internet\defer.py:106:maybeDeferred
2011-05-05 20:33:34+0200 [HTTPChannel,42,10.100.9.71] c:\tahoe-lafs\src\allmydata\web\common.py:302:renderHTTP
2011-05-05 20:33:34+0200 [HTTPChannel,42,10.100.9.71] c:\tahoe-lafs\src\allmydata\web\root.py:124:render_POST
2011-05-05 20:33:34+0200 [HTTPChannel,42,10.100.9.71] c:\tahoe-lafs\support\Lib\site-packages\foolscap-0.6.1-py2.6.egg\foolscap\logging\log.py:210:msg
2011-05-05 20:33:34+0200 [HTTPChannel,42,10.100.9.71] --- <exception caught here> ---
2011-05-05 20:33:34+0200 [HTTPChannel,42,10.100.9.71] c:\tahoe-lafs\support\Lib\site-packages\foolscap-0.6.1-py2.6.egg\foolscap\logging\log.py:265:add_event
2011-05-05 20:33:34+0200 [HTTPChannel,42,10.100.9.71] c:\tahoe-lafs\support\Lib\site-packages\foolscap-0.6.1-py2.6.egg\foolscap\logging\incident.py:35:event
2011-05-05 20:33:34+0200 [HTTPChannel,42,10.100.9.71] c:\tahoe-lafs\support\Lib\site-packages\foolscap-0.6.1-py2.6.egg\foolscap\logging\log.py:278:declare_incident
2011-05-05 20:33:34+0200 [HTTPChannel,42,10.100.9.71] c:\tahoe-lafs\support\Lib\site-packages\foolscap-0.6.1-py2.6.egg\foolscap\logging\incident.py:85:incident_declared
2011-05-05 20:33:34+0200 [HTTPChannel,42,10.100.9.71] ]
2011-05-05 20:42:33+0200 [HTTPChannel,43,10.100.9.71] Unhandled Error
	Traceback (most recent call last):
	  File "c:\tahoe-lafs\support\Lib\site-packages\nevow-0.10.0-py2.6.egg\nevow\rend.py", line 569, in _renderHTTP
	    return self.flattenFactory(doc, ctx, writer, finisher)
	  File "c:\tahoe-lafs\support\Lib\site-packages\nevow-0.10.0-py2.6.egg\nevow\rend.py", line 528, in <lambda>
	    flattenFactory = lambda self, *args: flat.flattenFactory(*args)
	  File "c:\tahoe-lafs\support\Lib\site-packages\nevow-0.10.0-py2.6.egg\nevow\flat\__init__.py", line 14, in flattenFactory
	    return deferflatten(stan, ctx, writer).addCallback(finisher)
	  File "c:\tahoe-lafs\support\Lib\site-packages\nevow-0.10.0-py2.6.egg\nevow\flat\twist.py", line 63, in deferflatten
	    _drive(iterable, finished)
	--- <exception caught here> ---
	  File "c:\tahoe-lafs\support\Lib\site-packages\nevow-0.10.0-py2.6.egg\nevow\flat\twist.py", line 24, in _drive
	    next = iterable.next()
	  File "c:\tahoe-lafs\support\Lib\site-packages\nevow-0.10.0-py2.6.egg\nevow\flat\ten.py", line 85, in iterflatten
	    straccum.append(item)
	exceptions.MemoryError: 

######################################################################

now for the storage log:

######################################################################

2011-05-05 20:29:01+0200 [Negotiation,8,10.100.26.102] Unhandled Error
	Traceback (most recent call last):
	  File "c:\tahoe-lafs\support\Lib\site-packages\twisted-8.2.0-py2.6-win32.egg\twisted\python\log.py", line 84, in callWithLogger
	    return callWithContext({"system": lp}, func, *args, **kw)
	  File "c:\tahoe-lafs\support\Lib\site-packages\twisted-8.2.0-py2.6-win32.egg\twisted\python\log.py", line 69, in callWithContext
	    return context.call({ILogContext: newCtx}, func, *args, **kw)
	  File "c:\tahoe-lafs\support\Lib\site-packages\twisted-8.2.0-py2.6-win32.egg\twisted\python\context.py", line 59, in callWithContext
	    return self.currentContext().callWithContext(ctx, func, *args, **kw)
	  File "c:\tahoe-lafs\support\Lib\site-packages\twisted-8.2.0-py2.6-win32.egg\twisted\python\context.py", line 37, in callWithContext
	    return func(*args,**kw)
	--- <exception caught here> ---
	  File "c:\tahoe-lafs\support\Lib\site-packages\twisted-8.2.0-py2.6-win32.egg\twisted\internet\selectreactor.py", line 146, in _doReadOrWrite
	    why = getattr(selectable, method)()
	  File "c:\tahoe-lafs\support\Lib\site-packages\twisted-8.2.0-py2.6-win32.egg\twisted\internet\tcp.py", line 180, in doWrite
	    return Connection.doWrite(self)
	  File "c:\tahoe-lafs\support\Lib\site-packages\twisted-8.2.0-py2.6-win32.egg\twisted\internet\tcp.py", line 431, in doWrite
	    result = abstract.FileDescriptor.doWrite(self)
	  File "c:\tahoe-lafs\support\Lib\site-packages\twisted-8.2.0-py2.6-win32.egg\twisted\internet\abstract.py", line 101, in doWrite
	    self.dataBuffer = buffer(self.dataBuffer, self.offset) + "".join(self._tempDataBuffer)
	exceptions.MemoryError: 

######################################################################

the gateway/client process quickly grow over 1.5 Gbyte in ram, then it shrunk to 180 Mbyte, the storage process grow to 300 Mbyte and stayed there

on a grid with 10 storage nodes, all on win32, and a gateway/client on win32, with all the nodes up and all the shares ok, I can download all the files ok, and check all of them ok. only thing that doesn't work is check --verify of files bigger than 1Gbyte here's the error log of the gateway, and after it of a storage node: ###################################################################### ``` 2011-05-05 20:33:34+0200 [HTTPChannel,42,10.100.9.71] [Failure instance: Traceback: <type 'exceptions.IOError'>: [Errno 22] invalid mode ('wb') or filename: u'C:\\tahoeclient\\logs\\incidents\\incident-2011-05-05--18:33:34Z-w2qn32q.flog' 2011-05-05 20:33:34+0200 [HTTPChannel,42,10.100.9.71] c:\tahoe-lafs\support\Lib\site-packages\twisted-8.2.0-py2.6-win32.egg\twisted\internet\defer.py:106:maybeDeferred 2011-05-05 20:33:34+0200 [HTTPChannel,42,10.100.9.71] c:\tahoe-lafs\src\allmydata\web\common.py:302:renderHTTP 2011-05-05 20:33:34+0200 [HTTPChannel,42,10.100.9.71] c:\tahoe-lafs\src\allmydata\web\root.py:124:render_POST 2011-05-05 20:33:34+0200 [HTTPChannel,42,10.100.9.71] c:\tahoe-lafs\support\Lib\site-packages\foolscap-0.6.1-py2.6.egg\foolscap\logging\log.py:210:msg 2011-05-05 20:33:34+0200 [HTTPChannel,42,10.100.9.71] --- <exception caught here> --- 2011-05-05 20:33:34+0200 [HTTPChannel,42,10.100.9.71] c:\tahoe-lafs\support\Lib\site-packages\foolscap-0.6.1-py2.6.egg\foolscap\logging\log.py:265:add_event 2011-05-05 20:33:34+0200 [HTTPChannel,42,10.100.9.71] c:\tahoe-lafs\support\Lib\site-packages\foolscap-0.6.1-py2.6.egg\foolscap\logging\incident.py:35:event 2011-05-05 20:33:34+0200 [HTTPChannel,42,10.100.9.71] c:\tahoe-lafs\support\Lib\site-packages\foolscap-0.6.1-py2.6.egg\foolscap\logging\log.py:278:declare_incident 2011-05-05 20:33:34+0200 [HTTPChannel,42,10.100.9.71] c:\tahoe-lafs\support\Lib\site-packages\foolscap-0.6.1-py2.6.egg\foolscap\logging\incident.py:85:incident_declared 2011-05-05 20:33:34+0200 [HTTPChannel,42,10.100.9.71] ] 2011-05-05 20:42:33+0200 [HTTPChannel,43,10.100.9.71] Unhandled Error Traceback (most recent call last): File "c:\tahoe-lafs\support\Lib\site-packages\nevow-0.10.0-py2.6.egg\nevow\rend.py", line 569, in _renderHTTP return self.flattenFactory(doc, ctx, writer, finisher) File "c:\tahoe-lafs\support\Lib\site-packages\nevow-0.10.0-py2.6.egg\nevow\rend.py", line 528, in <lambda> flattenFactory = lambda self, *args: flat.flattenFactory(*args) File "c:\tahoe-lafs\support\Lib\site-packages\nevow-0.10.0-py2.6.egg\nevow\flat\__init__.py", line 14, in flattenFactory return deferflatten(stan, ctx, writer).addCallback(finisher) File "c:\tahoe-lafs\support\Lib\site-packages\nevow-0.10.0-py2.6.egg\nevow\flat\twist.py", line 63, in deferflatten _drive(iterable, finished) --- <exception caught here> --- File "c:\tahoe-lafs\support\Lib\site-packages\nevow-0.10.0-py2.6.egg\nevow\flat\twist.py", line 24, in _drive next = iterable.next() File "c:\tahoe-lafs\support\Lib\site-packages\nevow-0.10.0-py2.6.egg\nevow\flat\ten.py", line 85, in iterflatten straccum.append(item) exceptions.MemoryError: ``` ###################################################################### now for the storage log: ###################################################################### ``` 2011-05-05 20:29:01+0200 [Negotiation,8,10.100.26.102] Unhandled Error Traceback (most recent call last): File "c:\tahoe-lafs\support\Lib\site-packages\twisted-8.2.0-py2.6-win32.egg\twisted\python\log.py", line 84, in callWithLogger return callWithContext({"system": lp}, func, *args, **kw) File "c:\tahoe-lafs\support\Lib\site-packages\twisted-8.2.0-py2.6-win32.egg\twisted\python\log.py", line 69, in callWithContext return context.call({ILogContext: newCtx}, func, *args, **kw) File "c:\tahoe-lafs\support\Lib\site-packages\twisted-8.2.0-py2.6-win32.egg\twisted\python\context.py", line 59, in callWithContext return self.currentContext().callWithContext(ctx, func, *args, **kw) File "c:\tahoe-lafs\support\Lib\site-packages\twisted-8.2.0-py2.6-win32.egg\twisted\python\context.py", line 37, in callWithContext return func(*args,**kw) --- <exception caught here> --- File "c:\tahoe-lafs\support\Lib\site-packages\twisted-8.2.0-py2.6-win32.egg\twisted\internet\selectreactor.py", line 146, in _doReadOrWrite why = getattr(selectable, method)() File "c:\tahoe-lafs\support\Lib\site-packages\twisted-8.2.0-py2.6-win32.egg\twisted\internet\tcp.py", line 180, in doWrite return Connection.doWrite(self) File "c:\tahoe-lafs\support\Lib\site-packages\twisted-8.2.0-py2.6-win32.egg\twisted\internet\tcp.py", line 431, in doWrite result = abstract.FileDescriptor.doWrite(self) File "c:\tahoe-lafs\support\Lib\site-packages\twisted-8.2.0-py2.6-win32.egg\twisted\internet\abstract.py", line 101, in doWrite self.dataBuffer = buffer(self.dataBuffer, self.offset) + "".join(self._tempDataBuffer) exceptions.MemoryError: ``` ###################################################################### the gateway/client process quickly grow over 1.5 Gbyte in ram, then it shrunk to 180 Mbyte, the storage process grow to 300 Mbyte and stayed there
tahoe-lafs added the
unknown
minor
defect
1.8.2
labels 2011-05-05 20:02:27 +00:00
tahoe-lafs added this to the undecided milestone 2011-05-05 20:02:27 +00:00
davidsarah commented 2011-05-05 21:45:16 +00:00
Author
Owner

I suspect that this isn't specific to win32, although the limit on usable process address space for 32-bit Windows might be smaller than for other 32-bit platforms.

There was also an error while reporting the incident:

[Failure instance: Traceback: <type 'exceptions.IOError'>: [Errno 22] invalid mode ('wb') or filename: u'C:\\tahoeclient\\logs\\incidents\\incident-2011-05-05--18:33:34Z-w2qn32q.flog'

but it isn't clear whether that's due to the low-memory condition or an independent issue.
'wb' is a valid mode, so probably the path is incorrect. Does C:\tahoeclient\logs\incidents\ exist?

I suspect that this isn't specific to win32, although the limit on usable process address space for 32-bit Windows might be smaller than for other 32-bit platforms. There was also an error while reporting the incident: ``` [Failure instance: Traceback: <type 'exceptions.IOError'>: [Errno 22] invalid mode ('wb') or filename: u'C:\\tahoeclient\\logs\\incidents\\incident-2011-05-05--18:33:34Z-w2qn32q.flog' ``` but it isn't clear whether that's due to the low-memory condition or an independent issue. 'wb' is a valid mode, so probably the path is incorrect. Does `C:\tahoeclient\logs\incidents\` exist?
sickness commented 2011-05-05 21:55:00 +00:00
Author
Owner

yeah, I can confirm that C:\tahoeclient\logs\incidents\ exists

yeah, I can confirm that C:\tahoeclient\logs\incidents\ exists
davidsarah commented 2011-05-06 15:50:59 +00:00
Author
Owner

I see the problem. ':' is not valid in a Windows filename. Will file another ticket for that.

I see the problem. '`:`' is not valid in a Windows filename. Will file another ticket for that.
davidsarah commented 2011-05-06 16:49:29 +00:00
Author
Owner

Replying to davidsarah:

I see the problem. ':' is not valid in a Windows filename. Will file another ticket for that.

#1396 and http://foolscap.lothar.com/trac/ticket/177

Replying to [davidsarah](/tahoe-lafs/trac-2024-07-25/issues/1395#issuecomment-124995): > I see the problem. '`:`' is not valid in a Windows filename. Will file another ticket for that. #1396 and <http://foolscap.lothar.com/trac/ticket/177>
tahoe-lafs added
code-encoding
and removed
unknown
labels 2011-05-06 16:53:21 +00:00
tahoe-lafs changed title from error doing a check --verify on files bigger than 1Gb on win32 to error doing a check --verify on files bigger than about 1Gbyte 2011-05-06 16:53:21 +00:00
davidsarah commented 2011-05-06 16:55:06 +00:00
Author
Owner

How was this reported by tahoe check --verify?

How was this reported by `tahoe check --verify`?
sickness commented 2011-05-06 18:14:56 +00:00
Author
Owner

it simply reported that 0 shares could be find for that file

it simply reported that 0 shares could be find for that file
zooko commented 2011-05-06 18:36:13 +00:00
Author
Owner

So both the gateway and the storage node had MemoryError. Can you reproduce it and tell what the Process Explorer says about how much memory those two different processes are using at different times in the operation? Are they using a large amount of memory continuously, or does it go up just before this failure, or what?

Thank you!

So both the gateway and the storage node had MemoryError. Can you reproduce it and tell what the Process Explorer says about how much memory those two different processes are using at different times in the operation? Are they using a large amount of memory continuously, or does it go up just before this failure, or what? Thank you!
davidsarah commented 2011-05-06 19:18:10 +00:00
Author
Owner

#1229 is another checker-related memory issue. I've no evidence that it's the same issue, but it's possible that they might interact (i.e. the fact that we need memory proportional to the file size might increase the size of the #1229 memory leak, or the leak might increase the memory required for a single check).

Anyway, we should try to reproduce this on a non-Windows system to confirm that it's not win32-specific.

#1229 is another checker-related memory issue. I've no evidence that it's the same issue, but it's possible that they might interact (i.e. the fact that we need memory proportional to the file size might increase the size of the #1229 memory leak, or the leak might increase the memory required for a single check). Anyway, we should try to reproduce this on a non-Windows system to confirm that it's not win32-specific.
davidsarah commented 2011-05-06 19:23:33 +00:00
Author
Owner

Note that performance.rst claims:

memory footprint: N/K*S

for verifying a file (where S is the segment size). That seems to be contradicted by the behaviour in this bug.

Note that [performance.rst](http://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/performance.rst?rev=3523b50716e65575#performing-a-file-verify-on-an-a-byte-file) claims: memory footprint: N/K*S for verifying a file (where S is the segment size). That seems to be contradicted by the behaviour in this bug.
warner commented 2011-05-07 18:33:32 +00:00
Author
Owner

Not that it's likely to be the problem here, but we should update that claim to be "N/K*S times a small multiple". I think the multiple is currently about 2 or 3. During encryption, we hold both a plaintext share and a ciphertext share in RAM at the same time (so 2S), then we drop the plaintext. During erasure-coding, we hold a whole S of ciphertext in memory at the same time as the N/KS shares, then we drop the ciphertext before pushing. We also pipeline the sends a little bit, I think 10kB or 50kB per server, to get better utilization out of a non-zero-latency wire.

Also Python's memory-management strategy interacts weirdly. Dropping the plaintext segment may not be enough: Python might not re-use that memory space for anything else right away. Although I'd expect it to de-fragment or coalesce free blocks before asking the OS for so much memory that it crashed.

Not that it's likely to be the problem here, but we should update that claim to be "`N/K*S` times a small multiple". I think the multiple is currently about 2 or 3. During encryption, we hold both a plaintext share and a ciphertext share in RAM at the same time (so 2*S), then we drop the plaintext. During erasure-coding, we hold a whole S of ciphertext in memory at the same time as the N/K*S shares, then we drop the ciphertext before pushing. We also pipeline the sends a little bit, I think 10kB or 50kB per server, to get better utilization out of a non-zero-latency wire. Also Python's memory-management strategy interacts weirdly. Dropping the plaintext segment may not be enough: Python might not re-use that memory space for anything else right away. Although I'd expect it to de-fragment or coalesce free blocks before asking the OS for so much memory that it crashed.
zooko commented 2011-05-08 01:43:40 +00:00
Author
Owner

Replying to warner:

Not that it's likely to be the problem here, but we should update that claim to be "N/K*S times a small multiple". I think the multiple is currently about 2 or 3.

I would be willing to update these docs to be more precise or more correct, but I'm not entirely sure what you want them to say. (N/K)*S*3+50KB? (But only for immutable repair.)

Note that some of the other numbers in there are marked as approximate by a preceding tilde ~, e.g. performance.rst "Repairing an A-byte file". Maybe we should use the computer science tradition of ignoring constant factors which are independent of the variables (K, N, S, A, B, and G). However, I would rather follow that tradition only when the constant factors that we're ignoring are sufficiently small that our users will be willing to ignore them too. :-)

So in short: +0 from me, but you would need to write a patch for performance.rst yourself. Your attention to that document would be much appreciated by me because I would like for your admirable concern for precision in resource usage to be better represented there.

The pipeline size, which applies only to immutable objects and only to uploads, is 50 KB: [immutable/layout.py WriteBucketProxy](http://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/src/allmydata/immutable/layout.py?annotate=blame&rev=4655#L99).

Replying to [warner](/tahoe-lafs/trac-2024-07-25/issues/1395#issuecomment-125005): > Not that it's likely to be the problem here, but we should update that claim to be "`N/K*S` times a small multiple". I think the multiple is currently about 2 or 3. I would be willing to update these docs to be more precise or more correct, but I'm not entirely sure what you want them to say. `(N/K)*S*3+50KB`? (But only for immutable repair.) Note that some of the other numbers in there are marked as approximate by a preceding tilde `~`, e.g. [performance.rst "Repairing an A-byte file"](http://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/performance.rst?rev=3523b50716e65575#repairing-an-a-byte-file-mutable-or-immutable). Maybe we should use the computer science tradition of ignoring constant factors which are independent of the variables (`K`, `N`, `S`, `A`, `B`, and `G`). However, I would rather follow that tradition only when the constant factors that we're ignoring are sufficiently small that our users will be willing to ignore them too. :-) So in short: +0 from me, but you would need to write a patch for `performance.rst` yourself. Your attention to that document would be much appreciated by me because I would like for your admirable concern for precision in resource usage to be better represented there. The pipeline size, which applies only to immutable objects and only to uploads, is 50 KB: [immutable/layout.py [WriteBucketProxy](wiki/WriteBucketProxy)](http://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/src/allmydata/immutable/layout.py?annotate=blame&rev=4655#L99).
zooko commented 2011-05-08 12:47:51 +00:00
Author
Owner

I've opened #1398 (make docs/performance.rst more precise and accurate) for the documentation issues and attached a patch there. I do hope Brian looks at it, because my previous pass seems to have left a lot of incorrect stuff in that document and I would like the next version of it to be less wrong.

I've opened #1398 (make docs/performance.rst more precise and accurate) for the documentation issues and attached a patch there. I do hope Brian looks at it, because my previous pass seems to have left a lot of incorrect stuff in that document and I would like the next version of it to be less wrong.
warner commented 2011-05-08 20:47:00 +00:00
Author
Owner

Attachment serialize-verifier.diff (1614 bytes) added

patch to serialize block-fetches in Verifier. Multiple shares are still done in parallel.

**Attachment** serialize-verifier.diff (1614 bytes) added patch to serialize block-fetches in Verifier. Multiple shares are still done in parallel.
warner commented 2011-05-08 21:02:54 +00:00
Author
Owner

I think I see the problem. The immutable Verifier code (here in checker.py) is overly parallelized. It uses a DeferredList to work on all shares in parallel, and each share worker uses a DeferredList to work on all blocks in parallel. The result is that every single byte of every single share is fetched at the same time, completely blowing our memory budget. As to why the server is crashing, I suspect that when the server gets a gigantic batch of requests for every single byte of the file, it responds to all of them, queueing a massive amount of data in the output buffers, which blows the memory space. A separate issue is to protect our servers against this sort of DoS, but I'm not sure how (we'd need to delay responding to a request if there were more than a certain number of bytes sitting in the output queue for that connection, which jumps wildly across the abstraction boundaries).

The Verifier should work just like the Downloader: one segment at a time, all blocks for a single segment being fetched in parallel. That approach gives a memory footprint of about S*N/k (whereas regular download is about S). We could reduce the footprint to S/k (at the expense of speed) by doing just one block at a time (i.e. completely verify share 1 before touching share 2, and within share 1 we completely verify block 1 before touching block 2), but I think that's too much.

I've attached a patch which limits parallelism to approximately the right thing, given the slightly funky design of the Verifier (the verifier iterates primarily over shares, not segments). The patch continues to verify all shares in parallel. However, within each share, it serializes the handling of blocks, so that each share-handler will only look at one block at a time.

The patch needs tests, which should verify a moderate-size artificially-small-segment (thus high-number-of-segment) file, probably with N=1 for simplicity. It needs to confirm that one block is completed before the next begins: I don't know an easy way to do that.. probably needs some instrumentation in checker.py. My manual tests just added some printfs, one just before the call to vrbp.get_block(), another inside _discard_result(), and noticed that there were lots of get_blocks without interleaved _discard_results.

I think I see the problem. The immutable Verifier code ([here](http://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/src/allmydata/immutable/checker.py?rev=ffd296fc5ab8007f#L620) in checker.py) is overly parallelized. It uses a DeferredList to work on all shares in parallel, and each share worker uses a DeferredList to work on all blocks in parallel. The result is that every single byte of every single share is fetched at the same time, completely blowing our memory budget. As to why the server is crashing, I suspect that when the server gets a gigantic batch of requests for every single byte of the file, it responds to all of them, queueing a massive amount of data in the output buffers, which blows the memory space. A separate issue is to protect our servers against this sort of DoS, but I'm not sure how (we'd need to delay responding to a request if there were more than a certain number of bytes sitting in the output queue for that connection, which jumps wildly across the abstraction boundaries). The Verifier should work just like the Downloader: one segment at a time, all blocks for a single segment being fetched in parallel. That approach gives a memory footprint of about `S*N/k` (whereas regular download is about `S`). We could reduce the footprint to `S/k` (at the expense of speed) by doing just one block at a time (i.e. completely verify share 1 before touching share 2, and within share 1 we completely verify block 1 before touching block 2), but I think that's too much. I've attached a patch which limits parallelism to approximately the right thing, given the slightly funky design of the Verifier (the verifier iterates primarily over *shares*, not segments). The patch continues to verify all shares in parallel. However, within each share, it serializes the handling of blocks, so that each share-handler will only look at one block at a time. The patch needs tests, which should verify a moderate-size artificially-small-segment (thus high-number-of-segment) file, probably with N=1 for simplicity. It needs to confirm that one block is completed before the next begins: I don't know an easy way to do that.. probably needs some instrumentation in `checker.py`. My manual tests just added some printfs, one just before the call to `vrbp.get_block()`, another inside `_discard_result()`, and noticed that there were lots of `get_block`s without interleaved `_discard_result`s.
davidsarah commented 2011-05-08 21:51:14 +00:00
Author
Owner

Could this patch cause a performance regression for small files, or is each server going to serialize the requests to it anyway?

Could this patch cause a performance regression for small files, or is each server going to serialize the requests to it anyway?
zooko commented 2011-05-28 21:09:36 +00:00
Author
Owner

This is a major bug. I really think we should fix this for v1.9.0.

This is a major bug. I really think we should fix this for v1.9.0.
tahoe-lafs modified the milestone from undecided to 1.9.0 2011-05-28 21:09:36 +00:00
davidsarah commented 2011-05-28 21:58:57 +00:00
Author
Owner

+1 on fixing this for 1.9.0.

+1 on fixing this for 1.9.0.
tahoe-lafs added
major
and removed
minor
labels 2011-05-28 21:59:21 +00:00
sickness commented 2011-06-10 11:01:45 +00:00
Author
Owner

YAY, this patch solves my problem! (and also speeds up the testing and greatly reduces the memory usate of the storage node =)

YAY, this patch solves my problem! (and also speeds up the testing and greatly reduces the memory usate of the storage node =)
ChosenOne commented 2011-06-10 14:35:42 +00:00
Author
Owner

Cannot reproduce the bug on my virtual machine: Windows 7 x64, 1G RAM. Test files were 1G and 2G.

Cannot reproduce the bug on my virtual machine: Windows 7 x64, 1G RAM. Test files were 1G and 2G.
warner commented 2011-06-11 18:50:08 +00:00
Author
Owner

ok, so sickness' report suggests this was the right problem to fix. Now to figure out how to write a test for it.

ok, so sickness' report suggests this was the right problem to fix. Now to figure out how to write a test for it.
warner commented 2011-06-11 19:37:12 +00:00
Author
Owner

Attachment 1395-overparallel.diff (4744 bytes) added

serialize block-fetches, add test

**Attachment** 1395-overparallel.diff (4744 bytes) added serialize block-fetches, add test
warner commented 2011-06-11 19:39:11 +00:00
Author
Owner

Ok, that latest patch should be ready: it adds a test which includes a tiny bit of instrumentation on the immutable verifier (to measure how many block fetches are active at any one time), and it fails without the verifier fix, and passes once that fix is applied. Ready for review.

Ok, that latest patch should be ready: it adds a test which includes a tiny bit of instrumentation on the immutable verifier (to measure how many block fetches are active at any one time), and it fails without the verifier fix, and passes once that fix is applied. Ready for review.
davidsarah commented 2011-06-12 14:59:27 +00:00
Author
Owner

comment:125000 "How was this reported by tahoe check --verify?"

comment:125001 "it simply reported that 0 shares could be find for that file"

That seems like an error-reporting bug that wouldn't be fixed by attachment:1395-overparallel.diff, even if it would happen in fewer cases.

[comment:125000](/tahoe-lafs/trac-2024-07-25/issues/1395#issuecomment-125000) "How was this reported by `tahoe check --verify?`" [comment:125001](/tahoe-lafs/trac-2024-07-25/issues/1395#issuecomment-125001) "it simply reported that 0 shares could be find for that file" That seems like an error-reporting bug that wouldn't be fixed by attachment:1395-overparallel.diff, even if it would happen in fewer cases.
warner commented 2011-06-15 17:40:46 +00:00
Author
Owner

davidsarah: hmm, if the storage server got clobbered by all the parallel fetches, then it would throw MemoryError (and either crash completely, or at least return Failures to the remote_read() calls). From the client's point of view, this is identical to there being no shares available: crashing servers are not providing shares.

Maybe the CLI command output could be extended to mention how many servers were contacted, and how many answered, so the user could distinguish crashing-servers (or no-network) from solid NAKs.

This patch is still important to apply, though.. zooko: please don't defer reviewing it because of the CLI deficiency. It'll be easier to test a CLI change with special-purpose test code (building a client with no servers, or with intentionally-failing servers) than by trying to provoke a MemoryError.

davidsarah: hmm, if the storage server got clobbered by all the parallel fetches, then it would throw `MemoryError` (and either crash completely, or at least return Failures to the `remote_read()` calls). From the client's point of view, this is identical to there being no shares available: crashing servers are not providing shares. Maybe the CLI command output could be extended to mention how many servers were contacted, and how many answered, so the user could distinguish crashing-servers (or no-network) from solid NAKs. This patch is still important to apply, though.. zooko: please don't defer reviewing it because of the CLI deficiency. It'll be easier to test a CLI change with special-purpose test code (building a client with no servers, or with intentionally-failing servers) than by trying to provoke a `MemoryError`.
zooko commented 2011-06-17 06:09:26 +00:00
Author
Owner

Attachment not-too-parallel-test-by-patching.diff (6944 bytes) added

**Attachment** not-too-parallel-test-by-patching.diff (6944 bytes) added
zooko commented 2011-06-17 06:11:20 +00:00
Author
Owner

Okay, I've reviewed the patch and it is good. Please review this alternate patch — not-too-parallel-test-by-patching.diff — in which the test patches the ValidatedReadBucketClass instead of the code under test (Checker) having code in it for testing purposes. Unfortunately this patch depends on a variant of mock.py named mockutil.py which knows what to do when a test case returns a Deferred instance, so I don't think we should commit this patch as is. I've submitted a patch to the mock project (http://code.google.com/p/mock/issues/detail?id=96 ) and I'll proceed to work up a patch for Tahoe-LAFS which uses a lower-tech patching tool. But please review this patch because it most clearly shows why I like this approach. (If you ignore the whole mockutil.py part.)

Okay, I've reviewed the patch and it is good. Please review this alternate patch — [not-too-parallel-test-by-patching.diff](/tahoe-lafs/trac-2024-07-25/attachments/000078ac-64be-d117-196c-c5f1d247fb88) — in which the test patches the ValidatedReadBucketClass instead of the code under test (Checker) having code in it for testing purposes. Unfortunately this patch depends on a variant of `mock.py` named `mockutil.py` which knows what to do when a test case returns a Deferred instance, so I don't think we should commit this patch as is. I've submitted a patch to the `mock` project (<http://code.google.com/p/mock/issues/detail?id=96> ) and I'll proceed to work up a patch for Tahoe-LAFS which uses a lower-tech patching tool. But please review this patch because it most clearly shows why I like this approach. (If you ignore the whole `mockutil.py` part.)
zooko commented 2011-06-17 06:15:36 +00:00
Author
Owner

Attachment not-too-parallel-test-by-low-tech-patching.diff (5149 bytes) added

**Attachment** not-too-parallel-test-by-low-tech-patching.diff (5149 bytes) added
zooko commented 2011-06-17 06:18:12 +00:00
Author
Owner

Okay here's a version that doesn't depend on mockutil.py: attachment:not-too-parallel-test-by-low-tech-patching.diff. As you can see it makes the test code a little uglier and it doesn't handle exceptions (it'll leave the ValidatedReadBucketProxy patched out for the rest of the trial process if there is an exception raised from the test). I guess I'm a bit on the fence about which of these two patches I prefer. What's your preference?

Okay here's a version that doesn't depend on `mockutil.py`: attachment:not-too-parallel-test-by-low-tech-patching.diff. As you can see it makes the test code a little uglier and it doesn't handle exceptions (it'll leave the `ValidatedReadBucketProxy` patched out for the rest of the `trial` process if there is an exception raised from the test). I guess I'm a bit on the fence about which of these two patches I prefer. What's your preference?
zooko commented 2011-06-20 20:31:49 +00:00
Author
Owner

To clarify, the original patch 1395-overparallel.diff passes my review and I approve of applying it to trunk. I also think that not-too-parallel-test-by-low-tech-patching.diff is better, and hope you (warner) will consider it as a replacement for attachment:1395-overparallel.diff. I also like not-too-parallel-test-by-patching.diff , but it has the flaw that it will probably break if someone uses it with a newer version of mock.py, so we should probably avoid it.

To clarify, the original patch [1395-overparallel.diff](/tahoe-lafs/trac-2024-07-25/attachments/000078ac-64be-d117-196c-d6d7eed2c0fa) passes my review and I approve of applying it to trunk. I *also* think that [not-too-parallel-test-by-low-tech-patching.diff](/tahoe-lafs/trac-2024-07-25/attachments/000078ac-64be-d117-196c-0120d1708ea2) is better, and hope you (warner) will consider it as a replacement for attachment:1395-overparallel.diff. I also like [not-too-parallel-test-by-patching.diff](/tahoe-lafs/trac-2024-07-25/attachments/000078ac-64be-d117-196c-c5f1d247fb88) , but it has the flaw that it will probably break if someone uses it with a newer version of `mock.py`, so we should probably avoid it.
warner commented 2011-06-23 21:39:09 +00:00
Author
Owner

I'm -1 on a test that leaves the code in a broken/patched state after an
exception.. too much opportunity for confusion and hair-tearing debugging of
other problems later. And I'm -1 on a test that depends deeply upon internals
of a library that we don't include a copy of (note that I'm historically
inconsistent about this one, q.v. Foolscap vs Twisted, but I'm trying to mend
my ways).

At first, I found your -low-tech-patching test hard to follow, but looking
more closely, I'm growing fond of it. The original code isn't too hard to
patch in the right way, so the mock subclass isn't too too weird. It might be
nice if the new variable names were shorter (and since they only appear in the
context of the TooParallel unit test, they don't need to be as
fully-qualified as if they were in the original checker.py). But after
getting used to it, I think I'm ok with the way it stands.

I'd be ok with simple partial fix to the cleanup-after-exception problem.
Instead of using setUp/tearDown or pushing the whole middle of the test
method into a sub-method, just put the set_up_grid part into a small
function so you can get it into the Deferred chain. Something like:

    d = defer.succeed(None)
    def _start(ign):
        self.set_up_grid(num_servers=4)
        c0 = self.g.clients[0]
        c0.DEFAULT_ENCODING_PARAMETERS = { "k": 1, "happy": 4, "n": 4, 
                                           "max_segment_size": 5, } 
        self.c0 = c0
        self.uris = {} 
 	DATA = "data" * 100 # 400/5 = 80 blocks 
 	return c0.upload(Data(DATA, convergence="")) 
    d.addCallback(_start)
    def _do_check(ur):
        n = self.c0.create_node_from_uri(ur.uri)
        return n.check(Monitor(), verify=True)
    d.addCallback(_do_check)

Since the "critical region" where exceptions could cause problems doesn't
start until after the ...checker.ValidatedReadBucketProxy = make_mock_VRBP
line, it's good enough to just capture the first chunk of the code after that
in a function which is run from within the Deferred chain. Any exceptions in
set_up_grid or c0.upload (which are fairly big and complex) will
still get caught, so the cleanup can happen.

I'm -1 on a test that leaves the code in a broken/patched state after an exception.. too much opportunity for confusion and hair-tearing debugging of other problems later. And I'm -1 on a test that depends deeply upon internals of a library that we don't include a copy of (note that I'm historically inconsistent about this one, q.v. Foolscap vs Twisted, but I'm trying to mend my ways). At first, I found your -low-tech-patching test hard to follow, but looking more closely, I'm growing fond of it. The original code isn't too hard to patch in the right way, so the mock subclass isn't too too weird. It might be nice if the new variable names were shorter (and since they only appear in the context of the TooParallel unit test, they don't need to be as fully-qualified as if they were in the original `checker.py`). But after getting used to it, I think I'm ok with the way it stands. I'd be ok with simple partial fix to the cleanup-after-exception problem. Instead of using `setUp/tearDown` or pushing the whole middle of the test method into a sub-method, just put the `set_up_grid` part into a small function so you can get it into the Deferred chain. Something like: ``` d = defer.succeed(None) def _start(ign): self.set_up_grid(num_servers=4) c0 = self.g.clients[0] c0.DEFAULT_ENCODING_PARAMETERS = { "k": 1, "happy": 4, "n": 4, "max_segment_size": 5, } self.c0 = c0 self.uris = {} DATA = "data" * 100 # 400/5 = 80 blocks return c0.upload(Data(DATA, convergence="")) d.addCallback(_start) def _do_check(ur): n = self.c0.create_node_from_uri(ur.uri) return n.check(Monitor(), verify=True) d.addCallback(_do_check) ``` Since the "critical region" where exceptions could cause problems doesn't start until after the `...checker.ValidatedReadBucketProxy = make_mock_VRBP` line, it's good enough to just capture the first chunk of the code after that in a function which is run from within the Deferred chain. Any exceptions in `set_up_grid` or `c0.upload` (which are fairly big and complex) will still get caught, so the cleanup can happen.
zooko commented 2011-06-30 14:59:54 +00:00
Author
Owner

I'll modify the low-tech patcher approach (attachment:not-too-parallel-test-by-low-tech-patching.diff) to clean up in case of exception, like Brian suggested.

Note that Michael Foord, author of mock.py, has proposed a patch for mock.py that would make it do everything we need. He just wants code-review and unit tests. See the mock.py ticket to help with that.

Once a version of mock.py is out with that functionality and we are willing to make Tahoe-LAFS depends on that version of mock then we can have the more succinct not-too-parallel-test-by-patching.diff style of mockery in our unit tests.

I'll modify the low-tech patcher approach (attachment:not-too-parallel-test-by-low-tech-patching.diff) to clean up in case of exception, like Brian suggested. Note that Michael Foord, author of mock.py, has proposed a patch for mock.py that would make it do everything we need. He just wants code-review and unit tests. See [the mock.py ticket](http://code.google.com/p/mock/issues/detail?id=96) to help with that. Once a version of mock.py is out with that functionality and we are willing to make Tahoe-LAFS depends on that version of mock then we can have the more succinct [not-too-parallel-test-by-patching.diff](/tahoe-lafs/trac-2024-07-25/attachments/000078ac-64be-d117-196c-c5f1d247fb88) style of mockery in our unit tests.
zooko@zooko.com commented 2011-08-02 06:44:56 +00:00
Author
Owner

In changeset:f426e82287c11b11:

verifier: serialize the fetching of blocks within a share so that we don't use too much RAM

Shares are still verified in parallel, but within a share, don't request a
block until the previous block has been verified and the memory we used to hold
it has been freed up.

Patch originally due to Brian. This version has a mockery-patchery-style test
which is "low tech" (it implements the patching inline in the test code instead
of using an extension of the mock.patch() function from the mock library) and
which unpatches in case of exception.

fixes #1395
In changeset:f426e82287c11b11: ``` verifier: serialize the fetching of blocks within a share so that we don't use too much RAM Shares are still verified in parallel, but within a share, don't request a block until the previous block has been verified and the memory we used to hold it has been freed up. Patch originally due to Brian. This version has a mockery-patchery-style test which is "low tech" (it implements the patching inline in the test code instead of using an extension of the mock.patch() function from the mock library) and which unpatches in case of exception. fixes #1395 ```
tahoe-lafs added the
fixed
label 2011-08-02 06:44:56 +00:00
zooko@zooko.com closed this issue 2011-08-02 06:44:56 +00:00
david-sarah@jacaranda.org commented 2011-08-02 18:58:09 +00:00
Author
Owner

In changeset:9f8d34e63aa1aeeb:

test_repairer: add a deterministic test of share data corruption that always flips the bits of the last byte of the share data. refs #1395
In changeset:9f8d34e63aa1aeeb: ``` test_repairer: add a deterministic test of share data corruption that always flips the bits of the last byte of the share data. refs #1395 ```
tahoe-lafs added
critical
and removed
major
fixed
labels 2011-08-02 19:20:10 +00:00
zooko reopened this issue 2011-08-02 19:20:10 +00:00
david-sarah@jacaranda.org commented 2011-08-02 19:39:48 +00:00
Author
Owner

In changeset:c7f65ee8ad254f3f:

verifier: correct a bug introduced in changeset [changeset:f426e82287c11b11] that caused us to only verify the first block of a file. refs #1395
In changeset:c7f65ee8ad254f3f: ``` verifier: correct a bug introduced in changeset [changeset:f426e82287c11b11] that caused us to only verify the first block of a file. refs #1395 ```
davidsarah commented 2011-08-02 20:59:40 +00:00
Author
Owner

changeset:c7f65ee8ad254f3f seems to have fixed the failures of allmydata.test.test_repairer.Verifier.test_corrupt_sharedata on the builders. (Some remaining nondeterministic failures are due to #1084.)

changeset:c7f65ee8ad254f3f seems to have fixed the failures of `allmydata.test.test_repairer.Verifier.test_corrupt_sharedata` on the builders. (Some remaining nondeterministic failures are due to #1084.)
tahoe-lafs added the
fixed
label 2011-08-02 20:59:40 +00:00
davidsarah closed this issue 2011-08-02 20:59:40 +00:00
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: tahoe-lafs/trac-2024-07-25#1395
No description provided.