LAFSify

[Imported from Trac: page FAQ, version 24]
terrell 2010-07-01 14:48:33 +00:00
parent 779bb0381c
commit 2cb83f972a

18
FAQ.md

@@ -12,7 +12,7 @@ A: You know how with RAID-5 you can lose any one drive and still recover? And t
 A: There isn't currently a way to disable or skip the encryption phase, but if you watch the status page on your local tahoe-lafs node for uploads, you'll see that the encryption time is orders (yes, plural) of magnitude smaller than the upload time, so there isn't much performance to be gained by skipping the encryption. We prefer 'secure by default', so without a compelling reason to allow insecure operation, our plan is to leave encryption turned on all the time.
-**Q: Where should I look for current documentation about Tahoe's protocols?**
+**Q: Where should I look for current documentation about the Tahoe-LAFS protocols?**
 A: <http://tahoe-lafs.org/source/tahoe/trunk/docs/architecture.txt>
@@ -39,16 +39,16 @@ A: Not directly. Each storage server has a single "base directory" which we abbr
 **Q: Would it make sense to just use RAID-0 and let Tahoe-LAFS deal with the redundancy?**
-A: The Allmydata grid didn't bother with RAID at all: each Tahoe storage server node used a single spindle.
+A: The Allmydata grid didn't bother with RAID at all: each Tahoe-LAFS storage server node used a single spindle.
-The "RAID and/or Tahoe" question depends upon how much you trust RAID vs how much you trust Tahoe, and how expensive the different forms of
+The "RAID and/or Tahoe-LAFS" question depends upon how much you trust RAID vs how much you trust Tahoe-LAFS, and how expensive the different forms of
-repair would be. Tahoe can correctly be thought of as a form of "application-level RAID", with more flexibility than the usual RAID0/4/5
+repair would be. Tahoe-LAFS can correctly be thought of as a form of "application-level RAID", with more flexibility than the usual RAID0/4/5
 styles (I think RAID-0 is equivalent to 1-of-2 encoding, and RAID-5 is like 2-of-3).
 Using RAID to achieve your redundancy gets you fairly fast repair, because it's all being handled by a controller that sits right on top of
-the raw drive. Tahoe's repair is a lot slower, because it is driven by a client that's examining one file at a time, and since there are a lot of
+the raw drive. Tahoe-LAFS's repair is a lot slower, because it is driven by a client that's examining one file at a time, and since there are a lot of
 network roundtrips for each file. Doing a repair of a 1TB RAID-5 drive can easily be finished in a day. If that 1TB drive is filled with a
-million Tahoe files, the repair could take a month. On the other hand, many RAID configurations degrade significantly when a drive is lost, and
+million Tahoe-LAFS files, the repair could take a month. On the other hand, many RAID configurations degrade significantly when a drive is lost, and
-Tahoe's read performance is nearly unaffected. So repair events may be infrequent enough to just let them happen quietly in the background and
+Tahoe-LAFS's read performance is nearly unaffected. So repair events may be infrequent enough to just let them happen quietly in the background and
 not care much about how long they take.
 The optimal choice is a complicated one. Given inputs of:
@@ -58,11 +58,11 @@ The optimal choice is a complicated one. Given inputs of:
 * server/datacenter layout, inter/intra-colo bandwidth, costs<br>
 * drive/hardware costs<br>
-it becomes a tradeoff between money (number of tahoe storage nodes, what sort of RAID [any]if you use for them, how many disks that means, how
+it becomes a tradeoff between money (number of Tahoe-LAFS storage nodes, what sort of RAID [any]if you use for them, how many disks that means, how
 much those disks cost, how many computers you need to host them, how much bandwidth you spend doing upload/download/repair), bandwidth costs,
 read/write performance, and probability of file loss due to failures happening faster than repair.
-In addition, Tahoe's current repair code is not particularly clever: it doesn't put the new shares in exactly the right places, so you can
+In addition, Tahoe-LAFS's current repair code is not particularly clever: it doesn't put the new shares in exactly the right places, so you can
 easily get shares doubled up and not distributed as evenly as if you'd done a single upload. This is being tracked in ticket #610.
 '''Q: Suppose I have a file of 100GB and 2 storage nodes each with 75GB available, will I be able to store the file or does it have to fit