startup failures non-obvious #2802

Closed
opened 2016-08-08 21:41:32 +00:00 by meejah · 2 comments
meejah commented 2016-08-08 21:41:32 +00:00
Owner

If you start up a node with a magic-folder and for some reason the shared-folder can't be accessed you'll get a long exception in `logs/twistd.log` ending with `allmydata.mutable.common.UnrecoverableFileError: no recoverable versions`.

However, the node keeps running but will not do any magic-folder "stuff" any more. Whatever else we do, the error printed to the log should be something suitable for a user (e.g. "can't access magic-folder directory X" or similar).

We should either:

  • kill the entire node if this happens
  • keep re-trying (and re-logging the error)

I have seen this occasionally when running `check_magicfolder_smoke.py` tests but am unsure how to reliably repeat it. The obvious thing of turning off all the storage servers and re-starting the Alice (or Bob) client doesn't appear to be sufficient.
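A minimal sketch of the second option above (keep re-trying, and re-log a user-friendly error instead of a raw traceback). All names here are hypothetical stand-ins, not real Tahoe-LAFS APIs: `fetch_dmd` represents whatever startup step fails with `UnrecoverableFileError`, and `log` is any callable that records a message.

```python
import time

class UnrecoverableFileError(Exception):
    """Stand-in for allmydata.mutable.common.UnrecoverableFileError."""

def start_magic_folder(fetch_dmd, log, retries=3, delay=0.0):
    """Try to read the magic-folder DMD; on failure, log a
    user-suitable message and retry instead of silently stopping."""
    for attempt in range(1, retries + 1):
        try:
            return fetch_dmd()
        except UnrecoverableFileError:
            log("can't access magic-folder directory "
                "(attempt %d of %d); will retry" % (attempt, retries))
            time.sleep(delay)
    # Roughly the first option above: give up loudly rather than
    # keep the node running in a half-broken state.
    raise RuntimeError("magic-folder startup failed after %d attempts"
                       % retries)
```

The point is only that each failure produces a readable log line and the node either recovers or fails loudly, never silently.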

tahoe-lafs added the code-frontend-magic-folder, normal, defect, 1.11.0 labels 2016-08-08 21:41:32 +00:00
tahoe-lafs added this to the undecided milestone 2016-08-08 21:41:32 +00:00
meejah commented 2016-08-08 22:30:35 +00:00
Author
Owner

Okay, there's a way to repeat it: run `check_magicfolder_smoke.py`, then run it with `kill` argv, then start just one of the storage nodes, then start Alice.

Alice's client will fire `when_connected_enough` because there will be 2 storage servers connected (Alice, plus the one storage node), so `MagicFolder.ready` is called, but this fails because the DMD can't be retrieved...
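A minimal model of the race described above, with hypothetical names (this is not the real Tahoe-LAFS `when_connected_enough` implementation): the callback fires as soon as the connected-server count reaches the threshold, even though the servers actually holding the DMD's shares may still be offline, so the read attempted from the callback can fail.

```python
class ConnectionGate:
    """Fires registered callbacks once `threshold` servers connect,
    regardless of whether those servers hold the needed shares."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.connected = 0
        self._callbacks = []

    def when_connected_enough(self, cb):
        self._callbacks.append(cb)
        self._maybe_fire()

    def server_connected(self):
        self.connected += 1
        self._maybe_fire()

    def _maybe_fire(self):
        if self.connected >= self.threshold:
            callbacks, self._callbacks = self._callbacks, []
            for cb in callbacks:
                cb()  # e.g. MagicFolder.ready, which then fails
```

With a threshold of 2, connecting Alice plus a single storage node is enough to trigger the callback, matching the repro steps above.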

exarkun commented 2020-06-30 13:45:05 +00:00
Author
Owner

magic-folder has been split out into a separate project - https://github.com/leastauthority/magic-folder

tahoe-lafs added the somebody else's problem label 2020-06-30 13:45:05 +00:00
exarkun closed this issue 2020-06-30 13:45:05 +00:00
Reference: tahoe-lafs/trac-2024-07-25#2802