add report on performance tests
[Imported from Trac: page Performance/Sep2011, version 1]
parent
ff0c943b89
commit
53a3a54bf4
95
Performance/Sep2011.md
Normal file
@ -0,0 +1,95 @@

## The atlasperf1 grid

All of these performance tests were run on a four-machine grid, using hardware
generously provided by Atlas Networks. Each machine is a dual-core 3GHz P4,
connected with gigabit(?) ethernet links. Three machines are servers, running
two storage servers each (six storage servers in all), each on a separate
disk. The fourth machine is a client. The storage servers are running a
variety of versions.

## Versions

These tests were conducted from 19-Sep-2011 to 22-Sep-2011, against Tahoe
versions 1.7.1, 1.8.2, and trunk (circa 19-Sep-2011, about changeset:8e69b94588c1c0e7).

## Overall Speed

With the default encoding (k=3), trunk MDMF downloads on this grid run at
4.0MBps. Trunk CHK downloads run at 2.6MBps. (For historical comparison, the
old CHK downloader from 1.7.1 runs at 4.4MBps.) CHK performance drops
significantly with larger k.

## MDMF (trunk)

MDMF is fast! Trunk downloads 1MB/10MB/100MB MDMF files at around 4MBps. The
download speed seems unaffected by k (from 1 to 60). Partial reads take the
expected amount of time: O(data_read), slightly quantized near the 128KiB
segment size.

* MDMF read versus k, 100MB [MDMF-100MB-vs-k.png](../raw/attachments/Performance/Sep2011/MDMF-100MB-vs-k.png) (timing5.out)
* MDMF partial reads, 100MB [MDMF-100MB-partial.png](../raw/attachments/Performance/Sep2011/MDMF-100MB-partial.png) (timing6.out)
* MDMF partial reads, 1MB [MDMF-1MB-partial.png](../raw/attachments/Performance/Sep2011/MDMF-1MB-partial.png) (timing6.out)
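
The quantization of partial reads near the segment size can be illustrated with a small sketch. It assumes (this is not stated in the report) that the downloader always fetches whole segments, and it ignores the possibly short final segment of a file. The function names are hypothetical and are not part of Tahoe.

```python
SEGMENT_SIZE = 128 * 1024  # default segsize quoted in this report (128KiB)


def segments_touched(offset, length, segsize=SEGMENT_SIZE):
    """Count the segments overlapped by a read of `length` bytes at `offset`."""
    if length <= 0:
        return 0
    first = offset // segsize
    last = (offset + length - 1) // segsize
    return last - first + 1


def bytes_fetched(offset, length, segsize=SEGMENT_SIZE):
    """Bytes transferred if every touched segment is fetched in full (assumption)."""
    return segments_touched(offset, length, segsize) * segsize


if __name__ == "__main__":
    # Reads that straddle a segment boundary cost one extra segment.
    for offset, length in [(0, 1), (0, SEGMENT_SIZE), (SEGMENT_SIZE - 1, 2)]:
        print(offset, length, segments_touched(offset, length))
```

Under these assumptions the elapsed time grows in steps of one segment rather than smoothly with the number of bytes requested, which matches the staircase shape described above.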

## CHK (trunk)

The new-downloader (introduced in 1.8.0) does not saturate these fast
connections. Compared to the old-downloader (in 1.7.1), downloads tend to be
about 3x slower. (Note that this delay is probably completely hidden on slow
WAN links; only the fast LAN connections of the atlasperf1 grid expose it.)
In addition, both old and new downloaders suffer from a
linear slowdown as k increases. On the new-downloader, k=60 takes roughly 3x
more time than k=15. Trunk contains a fix for #1268 that might improve speeds
by 5% compared to 1.8.2. Partial reads take the expected amount of time,
although the segsize-quantization was nowhere near as clear as with MDMF.

* CHK (1.7.1/1.8.2/trunk) read versus k, 1MB [CHK-1MB-vs-k.png](../raw/attachments/Performance/Sep2011/CHK-1MB-vs-k.png) (t4/t/t3)
* CHK (1.7.1/1.8.2/trunk) read versus k, 100MB [CHK-100MB-vs-k.png](../raw/attachments/Performance/Sep2011/CHK-100MB-vs-k.png) (t4/t/t3)

* CHK (1.8.2) read versus segsize, 1MB [CHK-1MB-vs-segsize.png](../raw/attachments/Performance/Sep2011/CHK-1MB-vs-segsize.png) (t2)
* CHK (1.8.2) read versus segsize, 100MB [CHK-100MB-vs-segsize.png](../raw/attachments/Performance/Sep2011/CHK-100MB-vs-segsize.png) (t2)

* CHK (trunk) partial reads, 1MB [CHK-1MB-partial.png](../raw/attachments/Performance/Sep2011/CHK-1MB-partial.png) (t7)
* CHK (trunk) partial reads, k=3, 1MB [CHK-1MB-k3-partial.png](../raw/attachments/Performance/Sep2011/CHK-1MB-k3-partial.png) (t7)
* CHK (trunk) partial reads, 100MB [CHK-100MB-partial.png](../raw/attachments/Performance/Sep2011/CHK-100MB-partial.png) (t7)

Likely problems include:

* high k and default segsize=128KiB means tiny blocks (segsize/k), like 2KB when k=60.
* lots of reads, lots of foolscap messages, and marshalling is probably slow
* disk seeks to gather hash nodes from all over the share
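
The "2KB when k=60" figure follows from dividing the segment size by k. The sketch below works through that arithmetic and counts how many block fetches a full download would need. It assumes that each segment is split into k equal pieces (rounded up), that one fetch is issued per piece, and that "100MB" means 10^8 bytes. The helper names are hypothetical, and the real encoder may pad differently.

```python
SEGSIZE = 128 * 1024  # default segsize from this report


def block_size(k, segsize=SEGSIZE):
    """Bytes per block if a segment is split into k equal pieces (rounded up)."""
    return -(-segsize // k)  # integer ceiling division


def block_fetches(filesize, k, segsize=SEGSIZE):
    """Block reads needed for a whole file: one per (segment, share) pair (assumption)."""
    num_segments = -(-filesize // segsize)
    return num_segments * k


if __name__ == "__main__":
    for k in (3, 15, 60):
        print(k, block_size(k), block_fetches(100_000_000, k))
```

With these assumptions, going from k=3 to k=60 shrinks each block from about 43KB to about 2KB and multiplies the number of block reads by 20, which is consistent with per-read overhead dominating at high k.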

Likely fixes include:

* add a readv() API, to reduce the number of Foolscap messages in flight
* prefetch hash-tree nodes, by reading larger chunks of the tree at once.
  The old-downloader cheated by reading the whole hash tree at once, which
  violates the memory footprint goals (it requires O(numsegments) memory) but
  is probably tolerable unless the filesize is really large.
* encourage use of larger segsize for large files (at the expense of
  alacrity)
* use unencrypted HTTP for share reads

readv() is the least-work/most-promising option, since MDMF has readv() and
achieves high speeds. The first step is to do whatever MDMF is doing.
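
To make the readv() idea concrete, here is a minimal sketch of a vectored read: the caller collects every (offset, length) span it needs, merges adjacent or overlapping spans, and sends them in one request instead of one message per span. This is an illustration only. The function names and the in-memory `share` bytes are hypothetical, and this is not the Tahoe or Foolscap API.

```python
def coalesce(vector):
    """Merge overlapping or adjacent (offset, length) spans into fewer spans."""
    merged = []
    for offset, length in sorted(vector):
        if length <= 0:
            continue
        if merged and offset <= merged[-1][0] + merged[-1][1]:
            prev_offset, prev_length = merged[-1]
            end = max(prev_offset + prev_length, offset + length)
            merged[-1] = (prev_offset, end - prev_offset)
        else:
            merged.append((offset, length))
    return merged


def readv(share, vector):
    """Answer every span in one call; stands in for a single remote message."""
    return [share[offset:offset + length] for offset, length in vector]


if __name__ == "__main__":
    share = bytes(range(256))          # stand-in for share data on a server
    wanted = [(0, 10), (10, 5), (100, 4)]
    spans = coalesce(wanted)           # 3 requested spans become 2
    print(spans, [len(chunk) for chunk in readv(share, spans)])
```

The point of the sketch is the message count: three separate reads would cost three round trips, while the vectored form costs one regardless of how many spans are requested.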

## Future Tests

* measure alacrity: ask for random single byte, measure elapsed time
* measure partial-read speeds for CHK
* measure SDMF/MDMF modification times
* measure upload times
* using existing data as a baseline, detect outliers in real-time during the
  benchmark run, and capture more information about them (their "Recent
  Uploads And Downloads" event timeline, for starters)
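
The last item could be approached in several ways. One simple option, sketched below, flags a run whose elapsed time is more than a few standard deviations from the mean of earlier runs. The threshold of 3.0 and the function name are assumptions for illustration; the benchmark toolchain itself may use something entirely different.

```python
import statistics


def make_outlier_check(baseline, threshold=3.0):
    """Build a predicate that flags timings far from the baseline mean.

    `baseline` is a list of at least two earlier elapsed times (seconds).
    """
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)

    def is_outlier(elapsed):
        if stdev == 0:
            # All baseline runs were identical, so any difference stands out.
            return elapsed != mean
        return abs(elapsed - mean) / stdev > threshold

    return is_outlier


if __name__ == "__main__":
    check = make_outlier_check([10.0, 10.2, 9.8, 10.1, 9.9])
    for elapsed in (10.3, 25.0):
        print(elapsed, check(elapsed))
```

A benchmark driver could call such a check after each download and, when it fires, save the node's event timeline for that run before moving on.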

## Additional Notes

Some graphs were added to
<http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1264#comment:17>.

The complete benchmark toolchain and data are included in
[benchmark.git.tar.gz](../raw/attachments/Performance/Sep2011/benchmark.git.tar.gz).