add report on performance tests

[Imported from Trac: page Performance/Sep2011, version 1]
warner 2011-09-23 06:17:45 +00:00
## The atlasperf1 grid
All these performance tests were run on a four-machine grid, using hardware
generously provided by Atlas Networks. Each machine is a dual-core 3GHz P4,
connected with gigabit(?) ethernet links. Three machines are servers, each
running two storage servers (six storage servers in all), each on a separate
disk. The fourth machine is the client. The storage servers run a variety of
versions.
## Versions
These tests were conducted from 19-Sep-2011 to 22-Sep-2011, against Tahoe
versions 1.7.1, 1.8.2, and trunk (circa 19-Sep-2011, about changeset:8e69b94588c1c0e7).
## Overall Speed
With the default encoding (k=3), trunk MDMF downloads on this grid run at
4.0MBps. Trunk CHK downloads run at 2.6MBps. (For historical comparison, the
old CHK downloader from 1.7.1 runs at 4.4MBps). CHK performance drops
significantly with larger k.
## MDMF (trunk)
MDMF is fast! Trunk downloads 1MB/10MB/100MB MDMF files at around 4MBps. The
download speed seems unaffected by k (from 1 to 60). Partial reads take the
expected amount of time: O(data_read), slightly quantized near the 128KiB
segment size.
* MDMF read versus k, 100MB [MDMF-100MB-vs-k.png](../raw/attachments/Performance/Sep2011/MDMF-100MB-vs-k.png) (timing5.out)
* MDMF partial reads, 100MB [MDMF-100MB-partial.png](../raw/attachments/Performance/Sep2011/MDMF-100MB-partial.png) (timing6.out)
* MDMF partial reads, 1MB [MDMF-1MB-partial.png](../raw/attachments/Performance/Sep2011/MDMF-1MB-partial.png) (timing6.out)
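The quantization near the segment size follows from the downloader having to
fetch and decode whole segments: a partial read costs as many segments as its
byte range touches. A small sketch of that cost model (a hypothetical helper,
not Tahoe code, assuming the default 128KiB segsize):

```python
SEGSIZE = 128 * 1024  # default Tahoe segment size (128 KiB)

def segments_touched(offset, length, segsize=SEGSIZE):
    """Number of whole segments the downloader must fetch and decode to
    satisfy a partial read of `length` bytes starting at `offset`."""
    if length <= 0:
        return 0
    first = offset // segsize          # segment holding the first byte
    last = (offset + length - 1) // segsize  # segment holding the last byte
    return last - first + 1

# A 1-byte read costs a full segment; a read straddling a segment
# boundary costs two, which produces the observed quantization.
```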
## CHK (trunk)
The new downloader (introduced in 1.8.0) does not saturate these fast
connections. Compared to the old downloader (in 1.7.1), downloads tend to be
about 3x slower. (Note that this delay is probably completely hidden on slow
WAN links; only the fast LAN connections of the atlasperf1 grid expose it.)
In addition, both old and new downloaders suffer a linear slowdown as k
increases: on the new downloader, k=60 takes roughly 3x more time than k=15.
Trunk contains a fix for #1268 that might improve speeds by 5% over 1.8.2.
Partial reads take the expected amount of time, although the
segsize quantization was nowhere near as clear as with MDMF.
* CHK (1.7.1/1.8.2/trunk) read versus k, 1MB [CHK-1MB-vs-k.png](../raw/attachments/Performance/Sep2011/CHK-1MB-vs-k.png) (t4/t/t3)
* CHK (1.7.1/1.8.2/trunk) read versus k, 100MB [CHK-100MB-vs-k.png](../raw/attachments/Performance/Sep2011/CHK-100MB-vs-k.png) (t4/t/t3)
* CHK (1.8.2) read versus segsize, 1MB [CHK-1MB-vs-segsize.png](../raw/attachments/Performance/Sep2011/CHK-1MB-vs-segsize.png) (t2)
* CHK (1.8.2) read versus segsize, 100MB [CHK-100MB-vs-segsize.png](../raw/attachments/Performance/Sep2011/CHK-100MB-vs-segsize.png) (t2)
* CHK (trunk) partial reads, 1MB [CHK-1MB-partial.png](../raw/attachments/Performance/Sep2011/CHK-1MB-partial.png) (t7)
* CHK (trunk) partial reads, k=3, 1MB [CHK-1MB-k3-partial.png](../raw/attachments/Performance/Sep2011/CHK-1MB-k3-partial.png) (t7)
* CHK (trunk) partial reads, 100MB [CHK-100MB-partial.png](../raw/attachments/Performance/Sep2011/CHK-100MB-partial.png) (t7)
Likely problems include:
* high k with the default segsize=128KiB means tiny per-share blocks, like 2KB when k=60
* lots of reads, lots of Foolscap messages, and marshalling is probably slow
* disk seeks to gather hash nodes from all over the share
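The "tiny blocks" arithmetic can be checked directly. With the default 128KiB
segsize, each share stores roughly segsize/k bytes per segment (zfec pads the
last block, so ceiling division is an approximation); a minimal sketch:

```python
import math

SEGSIZE = 128 * 1024  # default Tahoe segment size (128 KiB)

def block_size(k, segsize=SEGSIZE):
    """Approximate bytes stored in each share per segment: the segment is
    split into k blocks, rounding up for padding."""
    return math.ceil(segsize / k)

for k in (3, 15, 60):
    print(k, block_size(k))
# k=60 yields 2185-byte blocks, i.e. the ~2KB reads mentioned above
```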
Likely fixes include:
* add a readv() API, to reduce the number of Foolscap messages in flight
* prefetch hash-tree nodes, by reading larger chunks of the tree at once.
  (The old downloader cheated by reading the whole hash tree at once, which
  violates the memory-footprint goals by requiring O(numsegments) memory,
  but is probably tolerable unless the file is really large.)
* encourage use of a larger segsize for large files (at the expense of alacrity)
* use unencrypted HTTP for share reads
readv() is the least-work/most-promising fix, since MDMF already has readv()
and achieves high speeds. The first step is to replicate whatever MDMF is doing.
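The point of readv() is to collapse many small read round trips into a single
message. A plain-Python sketch of the idea (a toy stand-in, not the actual
Foolscap RemoteInterface, whose details are omitted here): the server-side
readv takes a vector of (offset, length) pairs and answers them in one response.

```python
class ShareReader:
    """Toy stand-in for a storage server's share-reading interface."""
    def __init__(self, share_data):
        self._data = share_data

    def read(self, offset, length):
        # one network round trip per call in the current protocol
        return self._data[offset:offset + length]

    def readv(self, vector):
        # one round trip for the whole batch: the proposed API shape
        return [self._data[o:o + l] for (o, l) in vector]

# Fetching three hash nodes costs three round trips with read(),
# but only one with readv():
r = ShareReader(b"0123456789" * 10)
batch = r.readv([(0, 2), (10, 2), (50, 5)])
```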
## Future Tests
* measure alacrity: ask for a random single byte, measure elapsed time
* measure partial-read speeds for CHK
* measure SDMF/MDMF modification times
* measure upload times
* using existing data as a baseline, detect outliers in real-time during the
  benchmark run, and capture more information about them (their "Recent
  Uploads And Downloads" event timeline, for starters)
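The alacrity test could be as simple as timing one-byte reads at random
offsets. A sketch of the harness with a stub reader; a real run would
substitute a call into the Tahoe client (e.g. an HTTP Range request against
the gateway, details left open here):

```python
import random
import time

def measure_alacrity(read_byte, filesize, trials=10):
    """Time single-byte reads at random offsets; return per-trial seconds."""
    timings = []
    for _ in range(trials):
        offset = random.randrange(filesize)
        start = time.monotonic()
        read_byte(offset)  # a real run would fetch one byte from the grid
        timings.append(time.monotonic() - start)
    return timings

# stub in place of a real grid read, just to exercise the harness
data = bytes(100)
times = measure_alacrity(lambda off: data[off:off + 1], len(data))
```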
## Additional Notes
Some graphs were added to
<http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1264#comment:17> .
The complete benchmark toolchain and data are included in
[benchmark.git.tar.gz](../raw/attachments/Performance/Sep2011/benchmark.git.tar.gz)