[Imported from Trac: page KnownIssues, version 1]

warner 2008-06-05 20:02:29 +00:00
# Known Issues
This page describes known problems in recent releases of Tahoe. Issues are
fixed as quickly as possible, but users of older releases may still need to
be aware of these problems until they upgrade to a release which resolves
them.
## Issues in Tahoe [1.1](/tahoe-lafs/trac-2024-07-25/milestone/127) (not quite released)
### Servers which run out of space
If a Tahoe storage server runs out of space, writes will fail with an
`IOError` exception. In some situations, Tahoe-1.1 clients do not handle
this failure well:

* if the exception occurs during an immutable-share write, that share will
  be broken. The client will detect this, and will declare the upload a
  failure if insufficient shares can be placed (this "shares of happiness"
  threshold defaults to 7 out of 10). The code does not yet search for new
  servers to replace the full ones. If the upload fails, the server's
  upload-already-in-progress routines may interfere with a subsequent
  upload.
* if the exception occurs during a mutable-share write, the old share will
  be left in place (and a new home for the share will be sought). If enough
  old shares are left around, subsequent reads may see the file in its
  earlier state, known as a "rollback" fault. Writing a new version of the
  file should find the newer shares correctly, although it will take longer
  (more roundtrips) than usual.
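The "shares of happiness" rule above can be sketched as follows. This is illustrative code only, not Tahoe's actual implementation; the function name and share-set representation are invented for the example:

```python
# Illustrative sketch (not Tahoe's actual code): an upload is declared a
# failure when fewer than the "shares of happiness" threshold of the
# erasure-coded shares could be placed on servers.
def upload_meets_happiness(placed_shares, happiness=7):
    """Return True when at least `happiness` distinct shares were placed."""
    return len(set(placed_shares)) >= happiness

# Nine of ten shares placed: above the default threshold of 7 out of 10.
print(upload_meets_happiness({0, 1, 2, 3, 4, 5, 6, 7, 8}))  # True
# Full servers broke four shares, leaving only six: the upload fails.
print(upload_meets_happiness({0, 1, 2, 3, 4, 5}))           # False
```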

The out-of-space handling code is not yet complete, and we do not yet have a
space-limiting solution that is suitable for large storage nodes. The
"sizelimit" configuration uses a `du`-style scan of the storage directory at
node startup, which takes a long time (tens of minutes) on storage nodes
that offer 100GB or more, making it unsuitable for highly-available servers.
In lieu of `sizelimit`, server admins are advised to create the
`NODEDIR/readonly_storage` flag file on their storage nodes (and remove
`sizelimit` from the configuration, then restart the node) before space is
exhausted. This will stop the influx of immutable shares. Mutable shares
will continue to arrive, but since these are mainly used by directories,
the amount of space consumed will be smaller. Eventually we will have a
better solution for this.
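The workaround amounts to creating one empty flag file in the node's base directory. The sketch below is illustrative only (it uses a throwaway temporary directory as a stand-in for the real `NODEDIR`, which is typically something like `~/.tahoe`):

```python
# Illustrative sketch of the recommended workaround; on a real storage node,
# point nodedir at the node's actual base directory (e.g. ~/.tahoe).
import pathlib
import tempfile

nodedir = pathlib.Path(tempfile.mkdtemp())   # stand-in for the real NODEDIR

# Creating this empty flag file stops the node from accepting new immutable
# shares (mutable shares will still arrive).
(nodedir / "readonly_storage").touch()

print((nodedir / "readonly_storage").exists())  # True
# Afterwards: remove any 'sizelimit' setting and restart the node.
```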
## Issues in Tahoe 1.0
### Servers which run out of space
In addition to the problems described above, Tahoe-1.0 clients which
experience out-of-space errors while writing mutable files are likely to
believe the write succeeded when it in fact failed. This can cause data
loss.
### Large directories or mutable files in a specific range of sizes
A mismatched pair of size limits causes a problem when a client attempts to
upload a large mutable file with a size between 3139275 and 3500000 bytes.
(Mutable files larger than 3.5MB are refused outright). The symptom is very
high memory usage (3GB) and 100% CPU for about 5 minutes. The attempted write
will fail, but the client may think that it succeeded. This size corresponds
to roughly 9000 entries in a directory.
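The sizes quoted above can be sanity-checked with a bit of arithmetic (illustrative only; the per-entry figure is simply the quoted lower bound divided by the quoted entry count):

```python
# Illustrative arithmetic: check the sizes quoted above against each other.
lower, upper = 3139275, 3500000   # problematic mutable-file size range (bytes)
entries = 9000                    # approximate directory entries at the lower bound

bytes_per_entry = lower // entries
print(bytes_per_entry)            # roughly 348 bytes of serialized data per entry

# A directory that serializes to 3.2MB falls inside the danger zone.
print(lower <= 3_200_000 <= upper)  # True
```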
This was fixed in 1.1, as ticket #379. Files up to 3.5MB should now work
properly, and files above that size should be rejected properly. Both servers
and clients must be upgraded to resolve the problem, although once the client
is upgraded to 1.1 the memory usage and false-success problems should be
fixed.
### pycryptopp compile errors resulting in corruption
Certain combinations of compiler, linker, and pycryptopp versions may cause
corruption errors during decryption, resulting in corrupted plaintext.