connection-hint incompatibility with debian/jesse (foolscap-0.6.5) and current version #2831

Closed
opened 2016-09-15 17:53:14 +00:00 by warner · 1 comment
warner commented 2016-09-15 17:53:14 +00:00
Owner

IRC user "edi" was using the debian/jesse version of Tahoe (1.11.0) and Foolscap (0.6.5), and happened to configure tub.location using the modern recommendations (so tub.location = tcp:HOST:PORT). Unfortunately foolscap-0.6.5 can't handle that format: it can handle HOST:PORT, and tcp:host=HOST:port=PORT, but not tcp:HOST:PORT.

0.6.5 dates from about the time we were experimenting with using client-endpoint strings as connection hints. Five weeks later, we backed away from this in 0.7.0, which accepts HOST:PORT and tcp:HOST:PORT but not the tcp:host=HOST:port=PORT form.

Unfortunately debian/jesse froze the one foolscap release that behaves this way. The symptom is a ValueError as it tries to look for "=" in the connection hint:

      File "/usr/lib/python2.7/dist-packages/allmydata/node.py", line 385, in _setup_tub
        self.tub.setLocation(location)
      File "/usr/lib/python2.7/dist-packages/foolscap/pb.py", line 510, in setLocation
        self._maybeCreateLogPortFURLFile()
      File "/usr/lib/python2.7/dist-packages/foolscap/pb.py", line 450, in _maybeCreateLogPortFURLFile
        ignored = self.getLogPortFURL()
      File "/usr/lib/python2.7/dist-packages/foolscap/pb.py", line 461, in getLogPortFURL
        furlFile=furlfile)
      File "/usr/lib/python2.7/dist-packages/foolscap/pb.py", line 712, in registerReference
        sr = SturdyRef(oldfurl)
      File "/usr/lib/python2.7/dist-packages/foolscap/referenceable.py", line 897, in __init__
        decode_furl(url)
      File "/usr/lib/python2.7/dist-packages/foolscap/referenceable.py", line 850, in decode_furl
        location_hints = decode_location_hints(hints)
      File "/usr/lib/python2.7/dist-packages/foolscap/referenceable.py", line 820, in decode_location_hints
        fields = dict([f.split("=") for f in pieces[1:]])
    exceptions.ValueError: dictionary update sequence element #0 has length 1; 2 is required
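For anyone who wants to see the failure in isolation, here is a minimal sketch of the parsing behavior. Only the `dict([f.split("=") ...])` line is taken from the traceback; the function name `decode_hint_065_style`, the `tcp` prefix check, and the return shape are assumptions made for illustration and are not the actual foolscap-0.6.5 code.

```python
def decode_hint_065_style(hint):
    """Hypothetical approximation of how 0.6.5 treats one connection hint."""
    pieces = hint.split(":")
    if pieces[0] == "tcp":
        # This line mirrors the one in the traceback: every component after
        # "tcp" is expected to look like KEY=VALUE.
        fields = dict([f.split("=") for f in pieces[1:]])
        return ("tcp", fields["host"], int(fields["port"]))
    # Old-style HOST:PORT hint (assumed handling).
    host, port = pieces
    return ("tcp", host, int(port))


# Both of these parse:
decode_hint_065_style("example.org:1234")
decode_hint_065_style("tcp:host=example.org:port=1234")

# The modern format does not: "example.org".split("=") yields a one-element
# list, so dict() raises ValueError.
try:
    decode_hint_065_style("tcp:example.org:1234")
except ValueError as e:
    print("ValueError:", e)
```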

This error can happen in three cases: when a client hears one of these FURLs from the introducer, when a client or server is configured with an introducer.furl in this format, or when any older node is configured with tub.location=tcp:HOST:PORT and is then booted a second time.

The second-boot issue is that upon first boot, nodes write some of their generated FURLs into a "furlfile". This is where persistent FURLs (like the Introducer's private/introducer.furl, or a storage server's private/storage.furl) remember their "swissnum", the secret portion after the last slash. These recorded FURLs include tub.location as their connection hints. The node doesn't immediately care what's in these hints; it just advertises the FURLs to the introducer and writes them into the furlfiles.

But on the second boot, if the furlfile exists, the node reads the contents back into memory and parses them (to get the swissnum). While parsing, as a side-effect, the node passes the connection hints to decode_location_hints(), which then raises an exception when it sees the tcp: prefix but not any = in the components.

This exception means anything that is waiting for the Tub to be ready just won't get run, so the Introducer Client isn't started and the storage server isn't announced. And since the node has already daemonized by then, the original tahoe start doesn't show the error: you have to look in twistd.log to find out why we haven't heard anything from the introducer. (This behavior is much better in current tahoe: tahoe start prints the traceback to stderr and then exits with an error.)

So even if you fix tahoe.cfg to set tub.location = HOST:PORT, you must also edit (or delete) those furlfiles (private/introducer.furl, private/storage.furl, private/logport.furl, maybe others I've missed), or the exception will happen again.
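A rough sketch of how one might locate the affected furlfiles before editing or deleting them. It assumes the usual FURL layout `pb://TUBID@HINT,HINT/SWISSNUM` and that the furlfiles live in the node's `private/` directory with a `.furl` suffix; the helper names (`hints_of`, `is_plain_tcp_hint`, `find_problem_furlfiles`) are made up for this example and are not part of Tahoe or Foolscap.

```python
import os


def hints_of(furl):
    """Return the connection hints of a FURL (assumed pb://TUBID@HINTS/SWISSNUM)."""
    furl = furl.strip()
    if not furl.startswith("pb://") or "@" not in furl or "/" not in furl[5:]:
        return []
    body = furl[len("pb://"):]
    tubid_and_hints, _swissnum = body.rsplit("/", 1)
    _tubid, hints = tubid_and_hints.split("@", 1)
    return [h for h in hints.split(",") if h]


def is_plain_tcp_hint(hint):
    """True for tcp:HOST:PORT, the form foolscap-0.6.5 cannot parse."""
    return hint.startswith("tcp:") and "=" not in hint


def find_problem_furlfiles(private_dir):
    """List *.furl files in private_dir that contain a tcp:HOST:PORT hint."""
    bad = []
    for name in sorted(os.listdir(private_dir)):
        if not name.endswith(".furl"):
            continue
        with open(os.path.join(private_dir, name)) as f:
            furl = f.read()
        if any(is_plain_tcp_hint(h) for h in hints_of(furl)):
            bad.append(name)
    return bad
```

Calling `find_problem_furlfiles` on a node's `private/` directory returns the file names that would need to be edited or removed; it only reads the files and changes nothing.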

The worst part is that I think this represents an incompatibility between Tahoe nodes created under debian/jesse and modern ones. If the modern nodes intentionally advertise old-style HOST:PORT hints, then they'll all work, but if they stick with the default tcp:HOST:PORT, then the jesse nodes will ignore those servers.

I was hoping that we'd got the forwards-compatibility right, but Jesse captured the wrong version of Foolscap. If only we'd gotten Foolscap-0.7.0 out a few weeks earlier, this wouldn't be a problem.

So I guess the resolution for this ticket is to add a note to the docs under "Compatibility", pointing out that you have to configure your modern introducer or servers with old-style hints if you want them to be usable by jesse-based servers or clients.
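As an illustration of what "old-style hints" means in practice, the following sketch rewrites a comma-separated `tub.location` value so that each `tcp:HOST:PORT` hint becomes plain `HOST:PORT`. The function name `to_old_style_location` is hypothetical, and the sketch assumes plain hostnames or IPv4 addresses; other hint types are passed through untouched.

```python
def to_old_style_location(location):
    """Rewrite tcp:HOST:PORT hints as HOST:PORT; leave other hints unchanged."""
    converted = []
    for hint in location.split(","):
        hint = hint.strip()
        pieces = hint.split(":")
        # Only touch the exact three-part tcp:HOST:PORT form.
        if len(pieces) == 3 and pieces[0] == "tcp" and "=" not in hint:
            hint = "%s:%s" % (pieces[1], pieces[2])
        converted.append(hint)
    return ",".join(converted)
```

The resulting string is the kind of value that both foolscap-0.6.5 and 0.7.0 and later should accept for `tub.location`, per the description above.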

tahoe-lafs added the
code-network
normal
defect
1.11.0
labels 2016-09-15 17:53:14 +00:00
tahoe-lafs added this to the 1.12.0 milestone 2016-09-15 17:53:14 +00:00
tahoe-lafs changed title from connection-hint incompatibility with tahoe-1.11.0 and current versions? to connection-hint incompatibility with debian/jesse (foolscap-0.6.5) and current version 2016-09-15 17:53:56 +00:00
Brian Warner <warner@lothar.com> commented 2016-12-11 19:09:02 +00:00
Author
Owner

In 2e1a39e/trunk:

NEWS: more edits

Documents and closes ticket:2831 (Debian/Jesse FURL incompatibility).
tahoe-lafs added the
fixed
label 2016-12-11 19:09:02 +00:00
Brian Warner <warner@lothar.com> closed this issue 2016-12-11 19:09:02 +00:00
Reference: tahoe-lafs/trac-2024-07-25#2831