diff --git a/KnownIssues.md b/KnownIssues.md
new file mode 100644
index 0000000..acb2246
--- /dev/null
+++ b/KnownIssues.md
@@ -0,0 +1,71 @@
+# Known Issues
+
+This page describes known problems in recent releases of Tahoe. Issues are
+fixed as quickly as possible, but users of older releases may still need to
+be aware of these problems until they upgrade to a release that resolves
+them.
+
+## Issues in [Tahoe 1.1](/tahoe-lafs/trac-2024-07-25/milestone/127) (not quite released)
+
+### Servers which run out of space
+
+If a Tahoe storage server runs out of space, writes will fail with an
+`IOError` exception. In some situations, Tahoe-1.1 clients do not react to
+this well:
+
+ * If the exception occurs during an immutable-share write, that share will
+   be broken. The client will detect this, and will declare the upload a
+   failure if too few shares can be placed (this "shares of happiness"
+   threshold defaults to 7 out of 10). The code does not yet search for new
+   servers to replace the full ones. If the upload fails, the server's
+   upload-already-in-progress routines may interfere with a subsequent
+   upload of the same file.
+ * If the exception occurs during a mutable-share write, the old share will
+   be left in place (and a new home for the share will be sought). If enough
+   old shares are left around, subsequent reads may see the file in its
+   earlier state, known as a "rollback" fault. Writing a new version of the
+   file should find the newer shares correctly, although it will take longer
+   (more roundtrips) than usual.
+
+The out-of-space handling code is not yet complete, and we do not yet have a
+space-limiting solution that is suitable for large storage nodes. The
+`sizelimit` configuration uses a `/usr/bin/du`-style query at node startup,
+which takes a long time (tens of minutes) on storage nodes that offer 100GB
+or more, making it unsuitable for highly-available servers.
+
+In lieu of `sizelimit`, server admins are advised to create the
+`NODEDIR/readonly_storage` file on their storage nodes (removing
+`sizelimit` and restarting the node) before space is exhausted. This will
+stop the influx of immutable shares. Mutable shares will continue to
+arrive, but since these are mainly used by directories, the amount of space
+consumed will be smaller.
+
+Eventually we will have a better solution for this.
+
+## Issues in Tahoe 1.0
+
+### Servers which run out of space
+
+In addition to the problems described above, Tahoe-1.0 clients which
+experience out-of-space errors while writing mutable files are likely to
+think the write succeeded when it in fact failed. This can cause data loss.
+
+### Large directories or mutable files in a specific range of sizes
+
+A mismatched pair of size limits causes a problem when a client attempts to
+upload a large mutable file with a size between 3139275 and 3500000 bytes
+(mutable files larger than 3.5MB are refused outright). The symptom is very
+high memory usage (3GB) and 100% CPU for about 5 minutes, after which the
+attempted write fails, although the client may think that it succeeded. For
+a directory, this size corresponds to roughly 9000 entries.
+
+This was fixed in 1.1 as ticket #379. Files up to 3.5MB should now work
+properly, and files above that size should be rejected cleanly. Both
+servers and clients must be upgraded to resolve the problem, although once
+the client is upgraded to 1.1 the memory-usage and false-success problems
+should be fixed.
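+Until that upgrade happens, a client-side guard can avoid the problematic
+range. The following is a minimal sketch, not part of Tahoe itself: the
+constants come from the limits quoted above, and the function name is
+hypothetical.
+
+```python
+# Hypothetical pre-flight check for Tahoe-1.0 clients: refuse mutable
+# uploads whose size falls in the range that triggers ticket #379.
+PROBLEM_MIN = 3139275   # lower bound of the problematic range, in bytes
+MUTABLE_MAX = 3500000   # mutable files above this are refused outright
+
+def check_mutable_size(num_bytes):
+    if num_bytes > MUTABLE_MAX:
+        raise ValueError("mutable files larger than 3.5MB are refused")
+    if num_bytes >= PROBLEM_MIN:
+        raise ValueError("size %d is in the 3139275-3500000 byte range "
+                         "that triggers the ticket #379 failure on "
+                         "Tahoe-1.0" % num_bytes)
+```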
+
+### pycryptopp compile errors resulting in corruption
+
+Certain combinations of compiler, linker, and pycryptopp versions can
+miscompile the cipher code, causing decryption to silently produce
+corrupted plaintext.
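+One way to catch a miscompiled pycryptopp before it corrupts data is a
+known-answer test at startup. The following is a minimal sketch, assuming
+pycryptopp's CTR-mode `cipher.aes.AES(key=...).process(data)` interface;
+the expected value is the standard AES-128 test vector for an all-zero key
+and all-zero block, which CTR mode with a zero counter reproduces.
+
+```python
+# A sketch of a known-answer self-test, not Tahoe's actual check.
+from binascii import unhexlify
+from pycryptopp.cipher import aes
+
+def pycryptopp_self_test():
+    key = "\x00" * 16
+    expected = unhexlify("66e94bd4ef8a2c3b884cfa59ca342b2e")
+    # CTR mode with a zero counter: encrypting 16 zero bytes yields the
+    # raw AES encryption of the zero block.
+    if aes.AES(key=key).process("\x00" * 16) != expected:
+        raise AssertionError("pycryptopp appears to be miscompiled: "
+                             "AES known-answer test failed")
+```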