describe the effects of the debian SSL bug on Tahoe

[Imported from Trac: page TahoeVsDebianBuggyOpenSsl, version 1]
2008-05-22 21:12:51 +00:00 · 2008-05-22 21:12:51 +00:00 · da99220f25
parent 161fee0949
commit da99220f25
1 changed files with 175 additions and 0 deletions
--- a/TahoeVsDebianBuggyOpenSsl.md
+++ b/TahoeVsDebianBuggyOpenSsl.md
@ -0,0 +1,175 @@
 The Debian OpenSSL bug that was announced last week has some effects on
 Foolscap security, detailed by the Foolscap trac page:
 <http://foolscap.lothar.com/trac/wiki/DebianOpenSslBug>
 Now, what are the consequences for Tahoe?
 In summary: not very severe. Once you've upgraded to the fixed OpenSSL
 library, the lingering effects of weak keys are (starting with the most
 severe):
 1. a successful man-in-the-middle (MitM) attack could allow the attacker to
    delete (or roll back) mutable file shares for which they do not have the
    write-cap.
 2. clients who were not given the introducer.furl could use a MitM attack to
    connect to the introducer anyway, and from there get access to storage
    servers.
 3. clients who were not given a helper.furl could use a MitM attack to
    connect to (and use) a helper process.
 4. clients who were not given a key-generator.furl could use a MitM attack
    to connect to (and drain the keys out of) a key generator. This is a DoS
    attack only.
 5. attackers could mount a MitM attack between a node and its log-gatherer,
    allowing the attacker to view the node's logs (which contain no secrets,
    but which would assist a traffic-analysis attack).
 The only vulnerable component of Tahoe is the Foolscap TubID. All other
 random numbers are either generated by Crypto++ or by calling os.urandom()
 (which uses the kernel's /dev/urandom RNG): this includes the AES and RSA
 keys used for write-caps, and the unguessable swissnums used to grant access
 to Referenceables.
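As an illustration of the os.urandom() path mentioned above, here is a minimal sketch of how an unguessable token such as a swissnum could be produced without involving OpenSSL. The function name `make_swissnum`, the 20-byte length, and the base32 encoding are assumptions made for this example, not a description of Tahoe's or Foolscap's actual code.

```python
import os
from base64 import b32encode


def make_swissnum(num_bytes: int = 20) -> str:
    """Return an unguessable token drawn from the kernel RNG.

    Hypothetical helper: the name, length, and encoding are illustrative.
    """
    raw = os.urandom(num_bytes)  # kernel RNG; OpenSSL is not involved
    # 20 bytes encode to exactly 32 base32 characters, so no padding remains.
    return b32encode(raw).decode("ascii").lower().rstrip("=")


swissnum = make_swissnum()
```

Because the randomness comes from the kernel, a token generated this way is unaffected by the weakened OpenSSL RNG, which is the property the paragraph above relies on.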
 Tahoe benefits immensely from its conservative "trust nobody" design: none of
 the important secrets leave the user's computer. We were somewhat lucky that
 OpenSSL was not used to generate any of these important secrets. The remaining
 problems are described below.
 ## Mutable File Share write-secrets
 The authority to modify a mutable file is expressed in its "write-cap", which
 includes enough information to obtain an RSA private (signing) key. Anyone
 who can sign shares with the right key will be able to modify the file any
 way they please.
 These shares are stored on untrusted servers, who could damage or delete them
 (since there are extensive cryptographic hashes checked on each share,
 culminating in the RSA signature, damaging a share is equivalent to deleting
 it). The servers could also "roll back" the share to an earlier state. If
 enough servers do this, a client could see the file revert to an earlier
 version. Rollback is the one way in which the servers can exert a form of
 "write authority" over a mutable file. Other parties are not supposed to have
 any such power.
 To reduce storage server workload, and to reduce version dependencies, the
 servers do not actually check this signature at upload/modify time (clients
 who are downloading the mutable file are the only ones who check it).
 Instead, when the mutable file's shares are created for the first time, the
 original uploader creates a set of "write secrets", one for each server,
 which are derived from the hash of the write-cap and the server's peerid. The
 server will accept an update from anyone who can provide the same secret.
 These secrets are different for each server, so serverA has no authority over
 a different share of the same file on serverB.
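The write-secret scheme can be sketched roughly as follows. This is an illustration under stated assumptions, not Tahoe's implementation: the SHA-256 calls and the tag strings stand in for whatever hash construction Tahoe really uses, and `derive_write_secret` and `ShareSlot` are hypothetical names.

```python
import hashlib
import hmac


def derive_write_secret(write_cap: bytes, peerid: bytes) -> bytes:
    """Derive a per-server secret from the hash of the write-cap and the peerid.

    Assumption: plain SHA-256 with made-up tags, for illustration only.
    """
    cap_hash = hashlib.sha256(b"hypothetical-writecap-tag:" + write_cap).digest()
    return hashlib.sha256(b"hypothetical-secret-tag:" + cap_hash + peerid).digest()


class ShareSlot:
    """Server-side view of one mutable share (hypothetical)."""

    def __init__(self, write_secret: bytes, data: bytes) -> None:
        self.write_secret = write_secret  # stored when the share is first created
        self.data = data

    def update(self, offered_secret: bytes, new_data: bytes) -> bool:
        # The server checks only the shared secret, never the RSA signature.
        if not hmac.compare_digest(self.write_secret, offered_secret):
            return False
        self.data = new_data
        return True


write_cap = b"example-write-cap"
secret_a = derive_write_secret(write_cap, b"peerid-of-serverA")
secret_b = derive_write_secret(write_cap, b"peerid-of-serverB")
slot_a = ShareSlot(secret_a, b"version 1")
```

Since the peerid is an input to the derivation, the same write-cap yields a different secret for each server, so a secret learned by (or from) serverA is rejected by serverB.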
 Since these shared secrets are sent over the Foolscap connection with no
 further encryption, a successful MitM attack (accomplished against a storage
 server that uses a Tub certificate generated by the buggy version of OpenSSL)
 could reveal these secrets to the attacker. This attacker would then get the
 authority to make changes to those shares. They would be unable to forge
 valid signatures, so they would be limited to the same deletion-or-rollback
 attacks that the server could perform. They could only perform these attacks
 on the servers that had weak Tub certificates.
 ## Unauthorized Access To introducer/helper/key-generator
 Several configuration controls use FURLs to provide/limit access to certain
 grid services. The main one is the introducer.furl: clients use this to
 contact the Introducer, from which they get access to all storage servers. In
 the current release, access to the storage servers can be withheld by not
 publishing the introducer.furl. (We plan to change this: once Accounting is
 in place, the introducer will be more public, and access to storage servers
 will be controlled by a signed and authorized private key.)
 If the Introducer was created with the buggy version of OpenSSL, its TubID
 will be guessable. This enables a man-in-the-middle attack between an
 authorized client and the Introducer, from which the attacker can learn the
 unguessable swissnum that protects access to the Introducer. A successful
 attack would thus allow an unauthorized party to connect to the Introducer
 and therefore use storage services.
 Similarly, access to the Helper and the Key-Generator is enabled/protected by
 distributing FURLs, and when these FURLs use guessable Tub certificates, an
 attacker will be able to perform a successful MitM attack against a user of
 the service. From this, the attacker can learn the swissnum, and thus gain
 access to the service.
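To make the two parts of a FURL concrete, the sketch below splits one into its TubID, location, and swissnum, assuming the usual `pb://TubID@location/swissnum` layout. The `parse_furl` helper and the example FURL are made up for illustration; Foolscap has its own parsing code, which is not shown here.

```python
from typing import NamedTuple


class Furl(NamedTuple):
    tub_id: str    # identifies the Tub certificate (guessable if the key is weak)
    location: str  # connection hints, e.g. host:port
    swissnum: str  # unguessable secret naming the object inside the Tub


def parse_furl(furl: str) -> Furl:
    """Split a FURL into its parts (hypothetical helper, minimal validation)."""
    prefix = "pb://"
    if not furl.startswith(prefix):
        raise ValueError("not a pb:// FURL")
    tub_id, sep, rest = furl[len(prefix):].partition("@")
    if not sep:
        raise ValueError("FURL has no TubID")
    location, sep, swissnum = rest.partition("/")
    if not sep or not swissnum:
        raise ValueError("FURL has no swissnum")
    return Furl(tub_id, location, swissnum)


# Entirely fictional FURL, used only to exercise the parser.
example = parse_furl("pb://exampletubid@introducer.example.org:12345/exampleswissnum")
```

The TubID is what lets a client authenticate the server end of the connection; the swissnum is the secret that travels over that connection. A weak Tub key undermines the first, which in turn exposes the second to a man in the middle.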
 Unauthorized access to the Helper means the attacker gets to upload files and
 consume the Helper's CPU time (which may have been intended to be reserved
 for paying customers).
 The "key generator" is a small process that creates RSA keypairs, intended to
 offload mutable file creation work from a webapi server. (The RSA key
 generation process involves 0.5s to 3.0s of blocking CPU time, so the webapi
 machine's responsiveness to other requests is improved by passing the work to
 a separate process.) It pre-generates a small pool of keys to respond faster.
 An attacker who uses a MitM attack to gain access to the key generator could
 request a lot of keys, causing extra CPU load and draining this pool, which
 would slow down legitimate requests.
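The pool-draining effect can be shown with a toy model. Everything here is hypothetical: `KeyPool` is not the real key generator, the "keys" are just labelled strings rather than RSA keypairs, and a real pool would presumably refill itself in the background.

```python
from collections import deque
from typing import Callable


class KeyPool:
    """Toy pre-generated key pool (hypothetical; no refill logic)."""

    def __init__(self, size: int, generate: Callable[[], str]) -> None:
        self._generate = generate
        # Pay the generation cost up front so requests can be answered quickly.
        self._pool = deque(generate() for _ in range(size))
        self.slow_requests = 0

    def get_key(self) -> str:
        if self._pool:
            return self._pool.popleft()  # fast path: key was made in advance
        self.slow_requests += 1          # slow path: caller waits for generation
        return self._generate()


# Stand-in generator: returns labels instead of spending seconds on RSA.
counter = iter(range(1000))
pool = KeyPool(size=3, generate=lambda: f"key-{next(counter)}")

attacker_keys = [pool.get_key() for _ in range(3)]  # attacker drains the pool
legit_key = pool.get_key()                          # legitimate request is now slow
```

In this model the attacker gains nothing secret, since the keys they drain are never used by anyone else; the only cost is the extra CPU load and latency, which is why this is a DoS attack only.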
 ## Log Gatherer
 Tahoe nodes can be configured with a log-gatherer.furl, which directs the
 node to connect to the given gatherer and offer its "log port". The log port
 can be used to retrieve stored log messages, and to subscribe to new ones.
 Grid managers can use this to record verbose information about uploads and
 downloads.
 If the log-gatherer is using a weak Tub certificate, an attacker could mount
 a successful MitM attack between the node and the gatherer, revealing the
 swissnum of the node's logport. This would allow the attacker to see the same
 log messages that the gatherer sees.
 By design, Tahoe nodes do not log secrets. Instead, most upload/download
 operations refer to the Storage Index of the file being processed, which is
 public information (storage servers and several diagnostic web pages show the
 SI values). However, the logs do contain file sizes, and the information
 therein would be useful to an attacker interested in performing a
 traffic-analysis attack: it could help them learn who is interested in the
 same file, or who is downloading a file that someone else uploaded. So, while
 exposing the logs does not threaten data confidentiality or integrity, you
 still wouldn't want to publish them to the world, which is why the
 log-gatherer.furl is meant to control who receives them.
 ## Fixing The Problems
 To fix these problems, server operators need to regenerate any Tub
 certificates that were created while the buggy version of OpenSSL was
 installed. However, there are several operational problems that may make this
 more difficult than it sounds.
 * introducer.furl: All clients need to be updated with the new FURL, which may
   require touching hundreds of client machines. Since the Introducer FURL is
   the primary entry point, Tahoe does not have a mechanism to automatically
   update it from some other server.
 * helper.furl: same problem. Eventually, Helpers will be accessed through the
   Introducer, but in the current release, the helper is configured by writing
   to the helper.furl file, so it must be updated as well.
 * storage servers: Storage Server FURLs are distributed through the
   Introducer, so it would seem straightforward to delete the server's
   "node.pem" file, restart it, and allow it to generate a new one: the server
   would connect to the introducer and appear as a brand new server (that
   happens to have the same shares as it did before).
  * However, there are two problems that will result if this is done with the
    current release. The most significant is that clients use shared secrets
    derived partially from the storage server's TubID. The most important of
    these secrets is the mutable-share write-secret, which allows clients to
    modify mutable
    files (including modifying directories). If the storage server's TubID no
    longer matches the secret that was stored in the share, then clients will
    get errors when they attempt to modify those shares. In many cases, this
    will prevent users from modifying their directories.
  * There are plans to fix this: the error message includes the TubID that was
    used to generate the secret, so the plan is to add a storage API that
    allows the client to change the shared secret (by providing both the old
    one and the new one). This will allow clients to tolerate shares being
    moved from one server to another, which would be the effect of
    regenerating the Tub certificate for those storage servers.
  * The second problem is that the peer selection algorithm would now see
    shares in non-optimal places. This would look a lot like large-scale
    churn: shares being moved to random servers, not necessarily the same
    servers that the node would expect to find them on. The peer selection
    algorithm is designed to tolerate this, but the effect will be a
    slowdown: nodes will be looking for their shares in the wrong place, so
    they'll have to search further than usual, and this will take additional
     round trips. So changing the servers' TubIDs will also affect client
    download performance. To address this, a file-repair step that moves
    shares to their ideal locations needs to be written.
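The churn described in the last bullet can be illustrated with a small simulation. It assumes, purely for illustration, that peer selection ranks servers by a hash of the storage index combined with each server's ID; the text above does not spell out the real algorithm, and `permuted_order`, the SHA-256 choice, and all IDs below are hypothetical.

```python
import hashlib
from typing import List


def permuted_order(storage_index: bytes, server_ids: List[bytes]) -> List[bytes]:
    """Rank servers for one file (assumed scheme: sort by hash of index + ID)."""
    return sorted(
        server_ids,
        key=lambda sid: hashlib.sha256(storage_index + sid).digest(),
    )


old_ids = [b"old-tubid-%d" % i for i in range(5)]
# Every server regenerates its certificate and so gets a new ID,
# but each one keeps the shares it already holds.
new_id_of = {old: b"new-" + old for old in old_ids}


def extra_probes(storage_index: bytes) -> int:
    """How many servers a client queries before reaching the share's holder."""
    holder = permuted_order(storage_index, old_ids)[0]  # ideal spot at upload time
    new_order = permuted_order(storage_index, list(new_id_of.values()))
    return new_order.index(new_id_of[holder])           # 0 means still first


probes = [extra_probes(b"storage-index-%d" % i) for i in range(50)]
```

Under this assumed scheme, a share that used to sit on the first server a client would ask can end up anywhere in the new ordering, so some downloads need extra queries. A repair step that moves shares back to the front of the new ordering would bring `extra_probes` back to zero.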