diff --git a/Performance/Sep2011.md b/Performance/Sep2011.md
new file mode 100644
index 0000000..a72be0f
--- /dev/null
+++ b/Performance/Sep2011.md
@@ -0,0 +1,95 @@
+
+## The atlasperf1 grid
+
+All these performance tests are run on a four-machine grid, using hardware
+generously provided by Atlas Networks. Each machine is a dual-core 3GHz P4,
+connected with gigabit(?) ethernet links. Three machines are servers, running
+two servers each (six storage servers in all), each on a separate disk. The
+fourth machine is a client. The storage servers are running a variety of
+versions.
+
+## Versions
+
+These tests were conducted from 19-Sep-2011 to 22-Sep-2011, against Tahoe
+versions 1.7.1, 1.8.2, and trunk (circa 19-Sep-2011, about changeset:8e69b94588c1c0e7).
+
+## Overall Speed
+
+With the default encoding (k=3), trunk MDMF downloads on this grid run at
+4.0MBps. Trunk CHK downloads run at 2.6MBps. (For historical comparison, the
+old CHK downloader from 1.7.1 runs at 4.4MBps). CHK performance drops
+significantly with larger k.
+
+## MDMF (trunk)
+
+MDMF is fast! Trunk downloads 1MB/10MB/100MB MDMF files at around 4MBps. The
+download speed seems unaffected by k (from 1 to 60). Partial reads take the
+expected amount of time: O(data_read), slightly quantized near the 128KiB
+segment size.
+
+* MDMF read versus k, 100MB [MDMF-100MB-vs-k.png](../raw/attachments/Performance/Sep2011/MDMF-100MB-vs-k.png) (timing5.out)
+* MDMF partial reads, 100MB [MDMF-100MB-partial.png](../raw/attachments/Performance/Sep2011/MDMF-100MB-partial.png) (timing6.out)
+* MDMF partial reads, 1MB [MDMF-1MB-partial.png](../raw/attachments/Performance/Sep2011/MDMF-1MB-partial.png) (timing6.out)
+
+
+## CHK (trunk)
+
+The new-downloader (introduced in 1.8.0) does not saturate these fast
+connections. Compared to the old-downloader (in 1.7.1), downloads tend to be
+about 3x slower.
+(This delay is probably completely hidden on slow WAN links; only the fast LAN
+connections of the atlasperf1 grid expose it.) In addition, both old and new
+downloaders suffer from a linear slowdown as k increases. On the new-downloader,
+k=60 takes roughly 3x as long as k=15. Trunk contains a fix for #1268 that might
+improve speeds by 5% compared to 1.8.2. Partial reads take the expected amount of
+time, although the segsize quantization was nowhere near as clear as with MDMF.
+
+
+* CHK (1.7.1/1.8.2/trunk) read versus k, 1MB [CHK-1MB-vs-k.png](../raw/attachments/Performance/Sep2011/CHK-1MB-vs-k.png) (t4/t/t3)
+* CHK (1.7.1/1.8.2/trunk) read versus k, 100MB [CHK-100MB-vs-k.png](../raw/attachments/Performance/Sep2011/CHK-100MB-vs-k.png) (t4/t/t3)
+
+* CHK (1.8.2) read versus segsize, 1MB [CHK-1MB-vs-segsize.png](../raw/attachments/Performance/Sep2011/CHK-1MB-vs-segsize.png) (t2)
+* CHK (1.8.2) read versus segsize, 100MB [CHK-100MB-vs-segsize.png](../raw/attachments/Performance/Sep2011/CHK-100MB-vs-segsize.png) (t2)
+
+* CHK (trunk) partial reads, 1MB [CHK-1MB-partial.png](../raw/attachments/Performance/Sep2011/CHK-1MB-partial.png) (t7)
+* CHK (trunk) partial reads, k=3, 1MB [CHK-1MB-k3-partial.png](../raw/attachments/Performance/Sep2011/CHK-1MB-k3-partial.png) (t7)
+* CHK (trunk) partial reads, 100MB [CHK-100MB-partial.png](../raw/attachments/Performance/Sep2011/CHK-100MB-partial.png) (t7)
+
+
+Likely problems include:
+
+* high k and default segsize=128KiB mean tiny blocks, about 2KiB each at k=60
+* lots of reads and lots of Foolscap messages, and marshalling is probably slow
+* disk seeks to gather hash nodes from all over the share
+
+Likely fixes include:
+
+* add a readv() API, to reduce the number of Foolscap messages in flight
+* prefetch hash-tree nodes, by reading larger chunks of the tree at once.
+  The old-downloader cheated by reading the whole hash tree at once, violating
+  the memory-footprint goals (it requires O(numsegments) memory), but this is
+  probably tolerable unless the filesize is really large.
+* encourage use of larger segsize for large files (at the expense of
+  alacrity)
+* use unencrypted HTTP for share reads
+
+readv() is the least-work/most-promising fix, since MDMF has readv() and
+achieves high speeds. The first step is to do whatever MDMF is doing.
+
+## Future Tests
+
+* measure alacrity: ask for a random single byte, measure elapsed time
+* measure partial-read speeds for CHK
+* measure SDMF/MDMF modification times
+* measure upload times
+* using existing data as a baseline, detect outliers in real-time during the
+  benchmark run, and capture more information about them (their "Recent
+  Uploads And Downloads" event timeline, for starters)
+
+## Additional Notes
+
+Some graphs were added to
+ .
+
+Complete benchmark toolchain and data included in
+[benchmark.git.tar.gz](../raw/attachments/Performance/Sep2011/benchmark.git.tar.gz)
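
To put rough numbers on the first item under "Likely problems" (high k with the default 128KiB segsize), here is a minimal sketch. It is not part of the benchmark toolchain. It assumes each segment is divided evenly into k blocks (ignoring padding and hash overhead), that a download fetches k blocks per segment, and that "100MB" means 100,000,000 bytes. The function names are hypothetical.

```python
import math

KIB = 1024
DEFAULT_SEGSIZE = 128 * KIB  # default segment size cited in the report


def block_size(k, segsize=DEFAULT_SEGSIZE):
    """Approximate bytes per block, assuming a segment is split evenly k ways."""
    return math.ceil(segsize / k)


def block_fetches(filesize, k, segsize=DEFAULT_SEGSIZE):
    """Approximate block fetches per download, assuming k blocks per segment."""
    return math.ceil(filesize / segsize) * k


if __name__ == "__main__":
    filesize = 100 * 1000 * 1000  # assumed meaning of "100MB"
    for k in (3, 15, 60):
        print(f"k={k}: ~{block_size(k)} bytes/block, "
              f"~{block_fetches(filesize, k)} block fetches")
```

Under these assumptions, going from k=3 to k=60 shrinks each block by 20x and multiplies the number of block fetches by 20x. That fits the suggestion that per-message overhead, rather than raw bandwidth, dominates at high k.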