add report on performance tests

[Imported from Trac: page Performance/Sep2011, version 1]
warner 2011-09-23 06:17:45 +00:00
## The atlasperf1 grid
All these performance tests were run on a four-machine grid, using hardware
generously provided by Atlas Networks. Each machine is a dual-core 3GHz P4,
connected with gigabit(?) ethernet links. Three machines are servers, each
running two storage servers (six storage servers in all), each on a separate
disk. The fourth machine is the client. The storage servers run a variety of
versions.
## Versions
These tests were conducted from 19-Sep-2011 to 22-Sep-2011, against Tahoe
versions 1.7.1, 1.8.2, and trunk (circa 19-Sep-2011, about changeset:8e69b94588c1c0e7).
## Overall Speed
With the default encoding (k=3), trunk MDMF downloads on this grid run at
4.0MBps. Trunk CHK downloads run at 2.6MBps. (For historical comparison, the
old CHK downloader from 1.7.1 runs at 4.4MBps). CHK performance drops
significantly with larger k.
## MDMF (trunk)
MDMF is fast! Trunk downloads 1MB/10MB/100MB MDMF files at around 4MBps. The
download speed seems unaffected by k (from 1 to 60). Partial reads take the
expected amount of time: O(data_read), slightly quantized near the 128KiB
segment size.
* MDMF read versus k, 100MB [MDMF-100MB-vs-k.png](../raw/attachments/Performance/Sep2011/MDMF-100MB-vs-k.png) (timing5.out)
* MDMF partial reads, 100MB [MDMF-100MB-partial.png](../raw/attachments/Performance/Sep2011/MDMF-100MB-partial.png) (timing6.out)
* MDMF partial reads, 1MB [MDMF-1MB-partial.png](../raw/attachments/Performance/Sep2011/MDMF-1MB-partial.png) (timing6.out)
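The quantization near the segment size follows from the downloader having to
fetch and decode whole segments: a partial read costs as many segments as its
byte range touches. A small sketch of that cost model (a hypothetical helper,
not Tahoe code, assuming the default 128KiB segsize):

```python
SEGSIZE = 128 * 1024  # default Tahoe segment size (128 KiB)

def segments_touched(offset, length, segsize=SEGSIZE):
    """Number of whole segments the downloader must fetch and decode to
    satisfy a partial read of `length` bytes starting at `offset`."""
    if length <= 0:
        return 0
    first = offset // segsize          # segment holding the first byte
    last = (offset + length - 1) // segsize  # segment holding the last byte
    return last - first + 1

# A 1-byte read costs a full segment; a read straddling a segment
# boundary costs two, which produces the observed quantization.
```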
## CHK (trunk)
The new downloader (introduced in 1.8.0) does not saturate these fast
connections. Compared to the old downloader (in 1.7.1), downloads tend to be
about 3x slower. (Note that this delay is probably completely hidden on slow
WAN links; only the fast LAN connections of the atlasperf1 grid expose it.)
In addition, both old and new downloaders suffer a linear slowdown as k
increases: on the new downloader, k=60 takes roughly 3x more time than k=15.
Trunk contains a fix for #1268 that might improve speeds by 5% over 1.8.2.
Partial reads take the expected amount of time, although the
segsize quantization was nowhere near as clear as with MDMF.
* CHK (1.7.1/1.8.2/trunk) read versus k, 1MB [CHK-1MB-vs-k.png](../raw/attachments/Performance/Sep2011/CHK-1MB-vs-k.png) (t4/t/t3)
* CHK (1.7.1/1.8.2/trunk) read versus k, 100MB [CHK-100MB-vs-k.png](../raw/attachments/Performance/Sep2011/CHK-100MB-vs-k.png) (t4/t/t3)
* CHK (1.8.2) read versus segsize, 1MB [CHK-1MB-vs-segsize.png](../raw/attachments/Performance/Sep2011/CHK-1MB-vs-segsize.png) (t2)
* CHK (1.8.2) read versus segsize, 100MB [CHK-100MB-vs-segsize.png](../raw/attachments/Performance/Sep2011/CHK-100MB-vs-segsize.png) (t2)
* CHK (trunk) partial reads, 1MB [CHK-1MB-partial.png](../raw/attachments/Performance/Sep2011/CHK-1MB-partial.png) (t7)
* CHK (trunk) partial reads, k=3, 1MB [CHK-1MB-k3-partial.png](../raw/attachments/Performance/Sep2011/CHK-1MB-k3-partial.png) (t7)
* CHK (trunk) partial reads, 100MB [CHK-100MB-partial.png](../raw/attachments/Performance/Sep2011/CHK-100MB-partial.png) (t7)
Likely problems include:
* high k with the default segsize=128KiB means tiny per-share blocks, like 2KB when k=60
* lots of reads, lots of Foolscap messages, and marshalling is probably slow
* disk seeks to gather hash nodes from all over the share
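The "tiny blocks" arithmetic can be checked directly. With the default 128KiB
segsize, each share stores roughly segsize/k bytes per segment (zfec pads the
last block, so ceiling division is an approximation); a minimal sketch:

```python
import math

SEGSIZE = 128 * 1024  # default Tahoe segment size (128 KiB)

def block_size(k, segsize=SEGSIZE):
    """Approximate bytes stored in each share per segment: the segment is
    split into k blocks, rounding up for padding."""
    return math.ceil(segsize / k)

for k in (3, 15, 60):
    print(k, block_size(k))
# k=60 yields 2185-byte blocks, i.e. the ~2KB reads mentioned above
```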
Likely fixes include:
* add a readv() API, to reduce the number of Foolscap messages in flight
* prefetch hash-tree nodes, by reading larger chunks of the tree at once.
  (The old downloader cheated by reading the whole hash tree at once, which
  violates the memory-footprint goals by requiring O(numsegments) memory,
  but is probably tolerable unless the file is really large.)
* encourage use of a larger segsize for large files (at the expense of alacrity)
* use unencrypted HTTP for share reads
readv() is the least-work/most-promising fix, since MDMF already has readv()
and achieves high speeds. The first step is to replicate whatever MDMF is doing.
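The point of readv() is to collapse many small read round trips into a single
message. A plain-Python sketch of the idea (a toy stand-in, not the actual
Foolscap RemoteInterface, whose details are omitted here): the server-side
readv takes a vector of (offset, length) pairs and answers them in one response.

```python
class ShareReader:
    """Toy stand-in for a storage server's share-reading interface."""
    def __init__(self, share_data):
        self._data = share_data

    def read(self, offset, length):
        # one network round trip per call in the current protocol
        return self._data[offset:offset + length]

    def readv(self, vector):
        # one round trip for the whole batch: the proposed API shape
        return [self._data[o:o + l] for (o, l) in vector]

# Fetching three hash nodes costs three round trips with read(),
# but only one with readv():
r = ShareReader(b"0123456789" * 10)
batch = r.readv([(0, 2), (10, 2), (50, 5)])
```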
## Future Tests
* measure alacrity: ask for a random single byte, measure elapsed time
* measure partial-read speeds for CHK
* measure SDMF/MDMF modification times
* measure upload times
* using existing data as a baseline, detect outliers in real-time during the
  benchmark run, and capture more information about them (their "Recent
  Uploads And Downloads" event timeline, for starters)
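The alacrity test could be as simple as timing one-byte reads at random
offsets. A sketch of the harness with a stub reader; a real run would
substitute a call into the Tahoe client (e.g. an HTTP Range request against
the gateway, details left open here):

```python
import random
import time

def measure_alacrity(read_byte, filesize, trials=10):
    """Time single-byte reads at random offsets; return per-trial seconds."""
    timings = []
    for _ in range(trials):
        offset = random.randrange(filesize)
        start = time.monotonic()
        read_byte(offset)  # a real run would fetch one byte from the grid
        timings.append(time.monotonic() - start)
    return timings

# stub in place of a real grid read, just to exercise the harness
data = bytes(100)
times = measure_alacrity(lambda off: data[off:off + 1], len(data))
```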
## Additional Notes
Some graphs were added to
<http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1264#comment:17> .
The complete benchmark toolchain and data are included in
[benchmark.git.tar.gz](../raw/attachments/Performance/Sep2011/benchmark.git.tar.gz)