evaluate different share-storage schemes #464

Open
opened 2008-06-14 17:04:03 +00:00 by warner · 0 comments
warner commented 2008-06-14 17:04:03 +00:00
Owner

Our current share-storage scheme is simple and easy to understand, but may cause performance problems at some point. It would be nice to know how fast it will run, and how fast some different schemes might be.

So the task is:

  • look at an existing storage server to get a count and size-distribution of files (just a histogram of file sizes) (first sketch after this list)
  • look at the logs to get a traffic mix: what percentage of operations are reads vs. writes, and what percentage of the reads are for shares that the server has (rather than ones that the server is missing)
  • use this information to create a tool that uses a `StorageServer` instance to create a similar share directory, of a configurable size. We should be able to create 1GB, 10GB, 100GB, or 1TB of shares in a similar ratio to a real store. (second sketch below)
  • use the traffic-mix information to create a tool that queries the `StorageServer` instance with the same traffic characteristics that real servers see, at a configurable rate: simulate 10 average clients, 100 clients, 1000 clients, etc. (third sketch below)
  • measure the performance of the server:
    • how long the queries take (milliseconds per query; look at the mean, median, and 90th percentile, which the third sketch below reports)
    • kernel-level disk IO stats: blocks per second, and see if we can count seeks per second (fourth sketch below)
    • space consumed (as measured by 'df') vs. the total size of the shares that were written: measure the filesystem overhead, including minimum block size and extra directories (fifth sketch below)
    • the filesystem type (ext3, xfs, reiser, etc.) must be recorded with each measurement, along with the storage scheme in use
  • evaluate other filesystem types
  • evaluate other storage schemes (sixth sketch below):
    • the current scheme is 2-level: ab/abcdef../SHNUM
    • try 3-level, maybe up to 10-level
    • pack small shares for different SIs into one file, using an offset table to locate each share
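
First sketch: the size-histogram tool, assuming only that the share files live somewhere under a single directory tree (the path argument). The power-of-two bucketing is just one reasonable choice.

```python
#!/usr/bin/env python
"""Histogram the sizes of share files under an existing share directory."""

import os
import sys
from collections import Counter

def size_bucket(size):
    # bucket sizes by powers of two: <=1KiB, <=2KiB, <=4KiB, ...
    bucket = 1024
    while size > bucket:
        bucket *= 2
    return bucket

def histogram(sharedir):
    counts = Counter()
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(sharedir):
        for name in filenames:
            size = os.path.getsize(os.path.join(dirpath, name))
            counts[size_bucket(size)] += 1
            total_bytes += size
    return counts, total_bytes

if __name__ == "__main__":
    counts, total_bytes = histogram(sys.argv[1])
    for bucket in sorted(counts):
        print("<= %10d bytes: %6d shares" % (bucket, counts[bucket]))
    print("total: %d shares, %d bytes" % (sum(counts.values()), total_bytes))
```

The (bucket, count) pairs it prints are also a usable size mix for the populate sketch below.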
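
Second sketch: populating a share tree with a realistic size mix up to a target total. This writes files directly in the current 2-level layout rather than driving a `StorageServer` instance over the storage protocol, and `make_storage_index()` plus the example histogram in `__main__` are made up for illustration.

```python
"""Populate a share tree with a given size mix, up to a target total size."""

import base64
import os
import random

def make_storage_index():
    # a fake storage index: 16 random bytes, lowercase base32, no padding
    return base64.b32encode(os.urandom(16)).decode("ascii").lower().rstrip("=")

def populate(sharedir, size_histogram, target_bytes, sharenums=(0,)):
    """size_histogram: list of (size_in_bytes, count_or_weight) pairs,
    e.g. the output of the histogram sketch above."""
    sizes, weights = zip(*size_histogram)
    written = 0
    while written < target_bytes:
        si = make_storage_index()
        size = random.choices(sizes, weights=weights)[0]
        bucketdir = os.path.join(sharedir, si[:2], si)   # current 2-level layout
        os.makedirs(bucketdir, exist_ok=True)
        for shnum in sharenums:
            with open(os.path.join(bucketdir, str(shnum)), "wb") as f:
                f.write(os.urandom(size))
            written += size
    return written

if __name__ == "__main__":
    # e.g. roughly 10GB of shares in an assumed ratio of small/medium/large
    populate("shares", [(1024, 50), (65536, 30), (1048576, 20)],
             target_bytes=10 * 10**9)
```

Writing through the real `StorageServer` API instead would also exercise any bookkeeping it does, at the cost of a slower setup.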
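
Third sketch: a load generator with a configurable read fraction, hit fraction, and query rate, reporting mean/median/90th-percentile latency. It reads the share files directly instead of speaking the real storage protocol, and the default mix numbers are placeholders to be replaced with what the logs say.

```python
"""Query a share tree with a configurable traffic mix and report latencies."""

import os
import random
import statistics
import time

def pick_existing_si(sharedir):
    prefix = random.choice(os.listdir(sharedir))
    return random.choice(os.listdir(os.path.join(sharedir, prefix)))

def one_read(sharedir, hit_fraction):
    if random.random() < hit_fraction:
        si = pick_existing_si(sharedir)
    else:
        si = "nosuchstorageindexatall"   # a share the server does not have
    bucketdir = os.path.join(sharedir, si[:2], si)
    if os.path.isdir(bucketdir):
        for shnum in os.listdir(bucketdir):
            with open(os.path.join(bucketdir, shnum), "rb") as f:
                f.read()

def run(sharedir, queries_per_second=10, duration=60,
        read_fraction=0.9, hit_fraction=0.8):
    latencies_ms = []
    deadline = time.time() + duration
    while time.time() < deadline:
        start = time.time()
        if random.random() < read_fraction:
            one_read(sharedir, hit_fraction)
        else:
            pass   # a write would reuse the populate sketch's code path
        latencies_ms.append((time.time() - start) * 1000.0)
        # crude pacing toward the requested query rate
        time.sleep(max(0.0, 1.0 / queries_per_second - (time.time() - start)))
    latencies_ms.sort()
    print("mean %.2f ms, median %.2f ms, 90th percentile %.2f ms" %
          (statistics.mean(latencies_ms), statistics.median(latencies_ms),
           latencies_ms[int(0.9 * len(latencies_ms))]))
```

Simulating N clients is then a matter of multiplying the per-client query rate by N, or running N copies in parallel to get real concurrency effects.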
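
Fourth sketch: sampling /proc/diskstats before and after a run. The counters are cumulative since boot, so the interesting number is the delta; completed read/write requests are the closest thing to a seek count available here, and 'sda' in the usage comment is just an example device.

```python
"""Snapshot per-device I/O counters from /proc/diskstats."""

def read_diskstats(device):
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            if fields[2] == device:
                return {
                    "reads_completed": int(fields[3]),
                    "sectors_read": int(fields[5]),
                    "writes_completed": int(fields[7]),
                    "sectors_written": int(fields[9]),
                }
    raise KeyError("device %s not found in /proc/diskstats" % device)

def delta(before, after):
    return dict((key, after[key] - before[key]) for key in before)

# usage:
#   before = read_diskstats("sda")
#   ... run the load generator ...
#   print(delta(before, read_diskstats("sda")))
```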
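
Fifth sketch: measuring filesystem overhead. os.statvfs() reports the same numbers 'df' does; the assumption is that nothing else on the filesystem changes between the two samples.

```python
"""Compare blocks actually consumed with the logical bytes of share data."""

import os

def fs_bytes_used(mountpoint):
    st = os.statvfs(mountpoint)
    return (st.f_blocks - st.f_bfree) * st.f_frsize

def logical_share_bytes(sharedir):
    total = 0
    for dirpath, _dirnames, filenames in os.walk(sharedir):
        for name in filenames:
            total += os.path.getsize(os.path.join(dirpath, name))
    return total

def report_overhead(mountpoint, sharedir, used_before):
    # used_before: fs_bytes_used(mountpoint) taken before populating
    consumed = fs_bytes_used(mountpoint) - used_before
    logical = logical_share_bytes(sharedir)
    print("consumed %d bytes for %d logical bytes (overhead %.1f%%)" %
          (consumed, logical, 100.0 * (consumed - logical) / logical))
```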
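
Sixth sketch: the two alternative layouts worth benchmarking. nested_path() generalizes the current prefix scheme to deeper nesting (how the intermediate prefixes are chosen is an open choice; progressively longer prefixes are shown here), and the pack/read pair shows one possible packed-file format with an offset table at the end. Neither is a proposed format, just something concrete to measure.

```python
"""Illustrative alternative layouts: deeper nesting, and packed shares."""

import os
import struct

def nested_path(sharedir, si, shnum, levels=2):
    # levels=2 reproduces the current ab/abcdef../SHNUM layout; higher
    # values insert extra, progressively longer prefix directories
    prefixes = [si[:2 * (i + 1)] for i in range(levels - 1)]
    return os.path.join(sharedir, *(prefixes + [si, str(shnum)]))

_ENTRY = ">32sIQQ"   # storage index (null-padded), shnum, offset, length
_TRAILER = ">QI"     # offset of the table, number of entries

def pack_shares(packfile, shares):
    """shares: dict mapping (si, shnum) -> bytes. Share data first, then
    the offset table, then a fixed-size trailer pointing at the table."""
    with open(packfile, "wb") as f:
        table = []
        for (si, shnum), data in sorted(shares.items()):
            table.append((si, shnum, f.tell(), len(data)))
            f.write(data)
        table_offset = f.tell()
        for si, shnum, offset, length in table:
            f.write(struct.pack(_ENTRY, si.encode("ascii"), shnum, offset, length))
        f.write(struct.pack(_TRAILER, table_offset, len(table)))

def read_share(packfile, si, shnum):
    with open(packfile, "rb") as f:
        f.seek(-struct.calcsize(_TRAILER), os.SEEK_END)
        table_offset, count = struct.unpack(_TRAILER,
                                            f.read(struct.calcsize(_TRAILER)))
        f.seek(table_offset)
        for _ in range(count):
            entry_si, entry_shnum, offset, length = struct.unpack(
                _ENTRY, f.read(struct.calcsize(_ENTRY)))
            if (entry_si.rstrip(b"\0").decode("ascii"), entry_shnum) == (si, shnum):
                f.seek(offset)
                return f.read(length)
    raise KeyError((si, shnum))
```

The linear scan of the offset table is fine for a benchmark; anything real would keep the table sorted and bisect, or hold it in memory.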

When we're done with this, we should have a good idea about how many simultaneous clients our existing scheme can handle before we run out of disk bandwidth (or seek bandwidth), at which point we'll need to switch to something more sophisticated.

tahoe-lafs added the code-storage, major, task, 1.1.0 labels 2008-06-14 17:04:03 +00:00
tahoe-lafs added this to the undecided milestone 2008-06-14 17:04:03 +00:00
Reference: tahoe-lafs/trac-2024-07-25#464