diff --git a/Performance.md b/Performance.md
index 5a6103f..ccdf40a 100644
--- a/Performance.md
+++ b/Performance.md
@@ -36,20 +36,24 @@ upload algorithm sends data to all shareholders in parallel, but these 9
 phases are done sequentially. The phases are:
 
  1. allocate_buckets
- 1. send_subshare (once per segment)
- 1. send_plaintext_hash_tree
- 1. send_crypttext_hash_tree
- 1. send_subshare_hash_trees
- 1. send_share_hash_trees
- 1. send_UEB
- 1. close
- 1. dirnode update
+ 2. send_subshare (once per segment)
+ 3. send_plaintext_hash_tree
+ 4. send_crypttext_hash_tree
+ 5. send_subshare_hash_trees
+ 6. send_share_hash_trees
+ 7. send_UEB
+ 8. close
+ 9. dirnode update
 
 We need to keep the send_subshare calls sequential (to keep our memory
 footprint down), and we need a barrier between the close and the dirnode
 update (for robustness and clarity), but the others could be pipelined.
 9*14ms=126ms, which accounts for about 15% of the measured upload time.
 
+Doing steps 2-8 in parallel (using the attached pipeline-sends.diff patch)
+does indeed seem to bring the time-per-file down from 900ms to about 800ms,
+although the results aren't conclusive.
+
 ## Storage Servers
 
 ext3 (on tahoebs1) refuses to create more than 32000 subdirectories in a