[Imported from Trac: page Performance, version 2]

warner 2007-09-09 00:09:46 +00:00
parent 5e10593467
commit b5e274edf2

@ -36,20 +36,24 @@ upload algorithm sends data to all shareholders in parallel, but these 9
phases are done sequentially. The phases are:
1. allocate_buckets
1. send_subshare (once per segment)
1. send_plaintext_hash_tree
1. send_crypttext_hash_tree
1. send_subshare_hash_trees
1. send_share_hash_trees
1. send_UEB
1. close
1. dirnode update
2. send_subshare (once per segment)
3. send_plaintext_hash_tree
4. send_crypttext_hash_tree
5. send_subshare_hash_trees
6. send_share_hash_trees
7. send_UEB
8. close
9. dirnode update
We need to keep the send_subshare calls sequential (to keep our memory
footprint down), and we need a barrier between the close and the dirnode
update (for robustness and clarity), but the others could be pipelined.
Nine sequential phases at 14ms each cost 9*14ms=126ms, which accounts for
about 15% of the measured upload time.
Doing steps 2-8 in parallel (using the attached pipeline-sends.diff patch)
does indeed seem to bring the time-per-file down from 900ms to about 800ms,
although the results aren't conclusive.
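
As a rough sanity check on those numbers, the round-trip arithmetic can be sketched as follows. This is an idealized model, not the uploader's code: it assumes a single-segment file, that each phase costs exactly one 14ms round trip, and that overlapped phases together cost one round trip. The helper names `sequential_cost_ms` and `pipelined_cost_ms` are hypothetical.

```python
# Idealized latency model for the nine upload phases.
# Assumption: every phase costs one round trip (14ms, the figure used above).
RTT_MS = 14

PHASES = [
    "allocate_buckets",
    "send_subshare",
    "send_plaintext_hash_tree",
    "send_crypttext_hash_tree",
    "send_subshare_hash_trees",
    "send_share_hash_trees",
    "send_UEB",
    "close",
    "dirnode update",
]


def sequential_cost_ms(phases, rtt_ms=RTT_MS):
    """Each phase waits for the previous one: one round trip per phase."""
    return len(phases) * rtt_ms


def pipelined_cost_ms(phases, first, last, rtt_ms=RTT_MS):
    """Phases first..last (1-indexed, inclusive) overlap into one round trip.

    The phases outside that range still run one after another.
    """
    overlapped = last - first + 1
    return (len(phases) - overlapped + 1) * rtt_ms


sequential = sequential_cost_ms(PHASES)       # 9 round trips
pipelined = pipelined_cost_ms(PHASES, 2, 8)   # steps 2-8 overlapped
print(sequential, pipelined, sequential - pipelined)
```

Under these assumptions the model predicts a saving of 84ms per file, the same order of magnitude as the roughly 100ms drop observed, though the model ignores multi-segment files and any per-phase processing time.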
## Storage Servers
ext3 (on tahoebs1) refuses to create more than 32000 subdirectories in a