peer selection could be more uniform #132

Closed
opened 2007-09-16 23:57:06 +00:00 by warner · 1 comment
warner commented 2007-09-16 23:57:06 +00:00
Owner

Our new Tahoe2 peer selection algorithm is a lot more uniform than the previous Tahoe3 one. But
the way I implemented it could still be improved a bit.

The issue is when N (the total number of shares being created) is not an exact multiple of S (the number of servers available). For example, on our current testnet, we have 3-of-10 encoding (so N=10) and 3 servers (so S=3). I'd really like the share allocation to be 4+3+3, because then you could lose any two servers and still recover the file. But the current code does 4+4+2 instead, which still leaves a lose-two-servers case that fails.

The reason for this is the way I built the peer-selection loop. We try to avoid requiring more than two passes around the circle of peers. During the first pass, we ask each server to hold a single share. When we get back to the start, if we see that we still have shares left, we do `shares_per_server = div_ceil(remaining_shares, num_servers)`, and then ask each peer to hold that many shares.

For our N=10/S=3 example, this places one share per server on the first loop, then starts the second loop with `remaining_shares=7`. Since `div_ceil(7,3)` is 3, it gives 3 additional shares (for a total of 4) to the first server, does the same for the second server, then sees that only one unplaced share is left and gives just that to the last server, resulting in an allocation of 4+4+2.
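To make the 4+4+2 behavior concrete, here is a minimal standalone sketch of the loop described above. It is not the actual upload code; `allocate_current`, `div_ceil`, and the list-of-counts representation are just illustrative.

```python
def div_ceil(n, d):
    """Integer division, rounding up."""
    return (n + d - 1) // d

def allocate_current(num_shares, num_servers):
    """Illustrative model (not the real upload code) of the current loop:
    one share per server on the first pass, then a fixed quota of
    div_ceil(remaining, num_servers) per server, recomputed only at the
    top of each pass."""
    allocation = [0] * num_servers
    remaining = num_shares

    # first pass: one share per server
    for i in range(num_servers):
        if remaining == 0:
            break
        allocation[i] += 1
        remaining -= 1

    # later passes: shares_per_server is fixed for the whole pass
    while remaining > 0:
        shares_per_server = div_ceil(remaining, num_servers)
        for i in range(num_servers):
            take = min(shares_per_server, remaining)
            allocation[i] += take
            remaining -= take
            if remaining == 0:
                break
    return allocation

print(allocate_current(10, 3))  # [4, 4, 2]
```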

We only recalculate `shares_per_server` at the top of each loop. I think the way to address this is going to involve using `div_floor` instead of `div_ceil` and distributing the remainder by sending an extra share to every Nth server, somehow. It would probably be easier if we had a convenient way to know how many servers we've looped through so far. If we knew how many servers were left to ask on this particular loop, we could do `div_floor(remaining_shares, remaining_servers)` and that would probably get us the desired properties. To do this, as we make the second pass, we should move servers from `self.contacted_servers` to a holding list, and then move the whole batch of them back into `self.contacted_servers` when we exhaust that list.
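Here is a minimal sketch of that idea, under the assumption that we track how many servers are still to be asked on the current pass; `allocate_proposed` and the counters are hypothetical names for illustration only.

```python
def allocate_proposed(num_shares, num_servers):
    """Illustrative model (not the real upload code) of the proposed loop:
    on the second pass, recompute div_floor(remaining_shares,
    remaining_servers) for each server still to be asked, so the remainder
    is spread out rather than front-loaded."""
    allocation = [0] * num_servers
    remaining = num_shares

    # first pass: one share per server, as before
    for i in range(num_servers):
        if remaining == 0:
            break
        allocation[i] += 1
        remaining -= 1

    # second pass: quota recomputed per server; the last server asked
    # always absorbs whatever is left (remaining // 1 == remaining)
    remaining_servers = num_servers
    for i in range(num_servers):
        if remaining == 0:
            break
        take = remaining // remaining_servers  # div_floor
        allocation[i] += take
        remaining -= take
        remaining_servers -= 1
    return allocation

print(allocate_proposed(10, 3))  # [3, 3, 4] -- i.e. the 4+3+3 distribution
```

With the quota recomputed per server, the remainder gets spread across the tail of the loop instead of being front-loaded, so the N=10/S=3 case comes out as 4+3+3 (in some order).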

tahoe-lafs added the code-peerselection, minor, defect, 0.5.1 labels 2007-09-16 23:57:06 +00:00
tahoe-lafs added this to the eventually milestone 2007-09-16 23:57:06 +00:00
warner commented 2007-09-17 00:14:08 +00:00
Author
Owner

fixed, with changeset:808f85158989ebc7. It was easier than I thought it'd be.

tahoe-lafs added the fixed label 2007-09-17 00:14:08 +00:00
warner closed this issue 2007-09-17 00:14:08 +00:00