collect server capacities and put them on the welcome page #648

Closed
opened 2009-02-27 18:07:35 +00:00 by zooko · 23 comments
zooko commented 2009-02-27 18:07:35 +00:00
Owner

As we're setting up the Volunteer Grid, I want to see a summary of the total and free storage capacity of each server on the introducer's [gateway's] welcome page.

tahoe-lafs added the
code
major
enhancement
1.3.0
labels 2009-02-27 18:07:35 +00:00
tahoe-lafs added this to the undecided milestone 2009-02-27 18:07:35 +00:00
warner commented 2009-03-01 04:26:56 +00:00
Author
Owner

Yeah! I've been thinking of two approaches:

  • add methods to the existing storage server remote API to query for total-space, space-available, etc (basically all the storage-related things you can get from the current stats gatherer). Have the introducer (or anyone else who's interested) query this interface and aggregate the results.
  • add a new service class (in addition to the one "storage" one that we have now), with a separate remote API, that just provides space-available information. Publish this through the introducer. Have the introducer (or anyone else who's interested) query this interface and aggregate the results.

The first approach feels a bit weird because it would conflate server access (upload/download shares) with a purely informational interface, and getting access to one should not necessarily provide access to the other. The second approach feels cleaner, but I've been holding off on implementing it until #466 is done (signed/extensible introducer messages, which is blocked on ECDSA). It doesn't strictly require #466, though.. maybe we could build it first.

Another approach would be to use the extensible-message part of #466 and publish space-available information in each announcement, but this would never be updated/updateable as quickly as having a remotely-callable query interface.

In any case, the information could be used by either the introducer, or by a separate disk-watcher process, not unlike the one we have right now. The existing disk-watcher queries the HTTP-based stats interface on each node to construct total-available, total-left, and rate-of-space-usage averages. One annoying aspect of this HTTP-based approach is that it must be configured manually: each time you add a server, you have to add its /statistics URL to the list. A process which used the introducer announcements to locate storage servers to query would be a lot easier to use.
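A minimal sketch of the aggregation step both approaches end with, assuming each server's answer has already been collected somehow. The `ServerSpace` record and the `aggregate` helper are hypothetical names for illustration, not part of any existing Tahoe-LAFS API.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ServerSpace:
    """One server's self-reported numbers (hypothetical record type)."""
    server_id: str
    total_bytes: int
    free_bytes: int


def aggregate(reports: Iterable[ServerSpace]) -> dict:
    """Sum per-server reports into grid-wide totals."""
    reports = list(reports)
    return {
        "servers": len(reports),
        "total_bytes": sum(r.total_bytes for r in reports),
        "free_bytes": sum(r.free_bytes for r in reports),
    }
```

Whoever runs this (the introducer or a separate disk-watcher) would only need the list of servers from the introducer announcements, which removes the manual per-server URL configuration.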

tahoe-lafs added
code-frontend-web
and removed
code
labels 2009-03-08 22:08:49 +00:00
imhavoc commented 2009-12-12 20:03:56 +00:00
Author
Owner

As a user/grid administrator, I would be happy enough with an aggregation of node-reported statistics. Even though it would not be immediately up to date, it would be able to report, in round gigabytes (TB, PB?), the approximate status and available capacity of the grid. This "out of date" information would be "cheap" and much better than a) no information or b) "expensive" and immediate information. Each node could update hourly, which would be fine-grained enough for most applications.

kpreid commented 2009-12-19 20:39:53 +00:00
Author
Owner

I would like to see this too, per-server — I think it should show up automatically in the table of storage servers on every node's welcome page.

zooko commented 2009-12-26 15:19:06 +00:00
Author
Owner

Kevin: I agree it should show up automatically on the welcome page.

Author
Owner

This would be nice for making sure you have enough storage space on your tahoe network. It would also be good to add it to the sshfs interface so that it shows up in the 'df' report.

davidsarah commented 2010-09-01 18:49:57 +00:00
Author
Owner

The code that determines what SFTP outputs for 'df' is at lines [1757]source:src/allmydata/frontends/sftpd.py@4545#L1757 and [1879]source:src/allmydata/frontends/sftpd.py@4545#L1879 of sftpd.py. It currently has to fake some values to keep sshfs happy.

tahoe-lafs changed title from show server capacities on introducer welcome page to collect server capacities and put them on introducer welcome page, output of 'df' for SFTP, etc. 2010-09-01 18:49:57 +00:00
tahoe-lafs changed title from collect server capacities and put them on introducer welcome page, output of 'df' for SFTP, etc. to collect server capacities and put them on the welcome page, output of 'df' for SFTP, etc. 2010-09-18 16:41:33 +00:00
zooko commented 2010-10-14 17:15:01 +00:00
Author
Owner

Replying to warner:

Yeah! I've been thinking of two approaches:

  • add methods to the existing storage server remote API to query for total-space, space-available, etc (basically all the storage-related things you can get from the current stats gatherer). Have the introducer (or anyone else who's interested) query this interface and aggregate the results.
  • add a new service class (to the one "storage" one that we have now), with a separate remote API, that just does space-available information. Publish this through the introducer. Have the introducer (or anyone else who's interested) query this interface and aggregate the results.

Don't storage servers already announce their space available to the introducer and doesn't the introducer already send that information to each client?

Let's see...

Yeah, there in [remote_get_version()]source:trunk/src/allmydata/storage/server.py?annotate=blame&rev=33e2d2962e2bc6ccf0f8619d5ea67baee1aebde1#L221:

                    { "maximum-immutable-share-size": remaining_space,

So the introducers and the clients could just display that information on their web pages.

In addition to that, we could get a lot more information if each storage server would, by default, automatically send its stats to a stats-gatherer, and each storage client (or else each introducer) would automatically run a stats-gatherer and give the stats-gatherer's furl to each storage server:
[stats.txt]source:trunk/docs/stats.rst?rev=67ad0175cd3e48703b81737abdcf531d167e8daa
(And then the storage client or introducer would publish a web page with aggregated information in JSON, and then someone would write a nice JavaScript tool using protovis to visualize that information...)
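A hedged sketch of how a client or introducer might pull the advertised space out of a `get_version` response. Only the `"maximum-immutable-share-size"` key comes from the snippet quoted above; the overall shape of the dictionary (flat, or nested one level under a protocol key) is an assumption, and `advertised_space` is a hypothetical helper.

```python
from typing import Optional


def advertised_space(version: dict) -> Optional[int]:
    """Return the advertised remaining space in bytes, or None if absent.

    Assumption: the key sits either at the top level of the response or
    one level down inside a nested dict; the real layout may differ.
    """
    key = "maximum-immutable-share-size"
    if key in version:
        return int(version[key])
    for value in version.values():
        if isinstance(value, dict) and key in value:
            return int(value[key])
    return None
```

A web page could then render this value per server, falling back to a placeholder when the function returns `None`.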

warner commented 2010-10-14 20:56:21 +00:00
Author
Owner

Replying to davidsarah:

The code that determines what SFTP outputs for 'df' is at lines [1757]source:src/allmydata/frontends/sftpd.py@4545#L1757 and [1879]source:src/allmydata/frontends/sftpd.py@4545#L1879 of sftpd.py. It currently has to fake some values to keep sshfs happy.

Wait, what? What's the relationship between server-space available and the number that SFTP reports as available to any given client? Not trivial, I think.

If we do this, let's make it clear that we're providing only a very rough approximation of the client-side space. Adding together all of the raw server space and dividing by the expansion factor is pretty rough, especially with the servers-of-happiness change (e.g. one server has 14TB free, but you can't upload anything because everyone else is full: SFTP should announce 0).

Also let's make room for Accounting APIs to generate this data (since really it's a function of accounting: how much space an individual "user" is allowed to consume, which may be far less than the sum of all server capacities). At least let's be thinking in that direction when we name the functions.

davidsarah commented 2010-12-29 21:31:39 +00:00
Author
Owner

Replying to [warner]comment:12:

Replying to davidsarah:

The code that determines what SFTP outputs for 'df' is at lines [1757]source:src/allmydata/frontends/sftpd.py@4545#L1757 and [1879]source:src/allmydata/frontends/sftpd.py@4545#L1879 of sftpd.py. It currently has to fake some values to keep sshfs happy.

Wait, what? What's the relationship between server-space available and the number that SFTP reports as available to any given client? Not trivial, I think.

Agreed that estimating the total available space is nontrivial. I've split it out into ticket #1285 (SFTP: put an approximation of grid capacity and available space in the 'df' output).

zooko commented 2011-04-27 16:38:24 +00:00
Author
Owner

#1206 (node status page does not indicate per server if it is taking shares) was a duplicate of this. In that ticket, gdt wrote:

A very important indicator of the health of a server in a grid is whether it will take new shares. A client node has enough information (or could record it) to know this. It should show somehow if a node is not taking shares (either if it says it won't or if it actually doesn't). The lack of this feature makes it almost impossible to assess if files can be uploaded without trying it.

Whether a server is accepting shares is determined like this: if the server is configured to be in read-only mode then it sets its "available space" to 0: [StorageServer.get_available_space()]source:trunk/src/allmydata/storage/server.py?annotate=blame&rev=33e2d2962e2bc6ccf0f8619d5ea67baee1aebde1#L196. If "reserved space" is set then it subtracts that much space from its available space: [fileutil.get_disk_stats()]source:trunk/src/allmydata/util/fileutil.py?annotate=blame&rev=ff64a0fef5879d3651bc3db6ca0522d96b217d45#L338. It includes the resulting "available space" in the metadata about itself that it sends back in response to get_version requests: [StorageServer.remote_get_version()]source:trunk/src/allmydata/storage/server.py?annotate=blame&rev=33e2d2962e2bc6ccf0f8619d5ea67baee1aebde1#L221.

The client invokes get_version on each server as soon as it connects to that server, but it doesn't do so ever again as long as it stays connected: [storage_client.NativeStorageServer]source:trunk/src/allmydata/storage_client.py?annotate=blame&rev=68b7f9e979158dcb9f2fbc1bea74183c6897d46e#L161.

So, this ticket is basically a superset of #1206. The client is already learning (once, at connection establishment time) how much space the server is offering, which is equal to 0 if and only if the server is either in read-only mode or is full. If the client would display this information to the user in a nice comprehensible way then both #1206 and this ticket would be fixed.

patch-needed! :-)
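The rule described above can be sketched as a pure function. This is an illustration of the described behaviour, not the actual `StorageServer` code; the function names are hypothetical, and clamping negative results to 0 is an assumption.

```python
def available_space(disk_avail: int, reserved: int, readonly: bool) -> int:
    """Space a server would advertise, following the description above.

    - read-only servers advertise 0
    - otherwise the reserved space is subtracted from what the disk reports
    - clamping at 0 is an assumption made for this sketch
    """
    if readonly:
        return 0
    return max(0, disk_avail - reserved)


def accepting_shares(advertised: int) -> bool:
    """A client can treat a server as accepting shares iff it advertises > 0."""
    return advertised > 0
```

Because the client only learns this value at connection time, whatever it displays is a snapshot and may be stale for long-lived connections.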

zooko commented 2011-04-27 16:47:50 +00:00
Author
Owner

Hm, once we've fixed this ticket, then we should add to ticket #816 (Add ping-all-servers button to welcome page). That ticket is to make a button titled "ping all servers". When you click that button it will issue get_version requests to all servers and update the display of how much space they are offering.

zooko commented 2011-05-31 17:37:33 +00:00
Author
Owner

Moving the part about df in the SFTP server over to its own ticket: #1285.

tahoe-lafs changed title from collect server capacities and put them on the welcome page, output of 'df' for SFTP, etc. to collect server capacities and put them on the welcome page 2011-05-31 17:37:33 +00:00
davidsarah commented 2011-10-11 02:33:42 +00:00
Author
Owner

addos asked about this on #tahoe-lafs (http://fred.submusic.ch/irc/tahoe-lafs/2011-10-09#i_296689 username irclogs, password irclogs):

in the status of the storage grid display in the web interface, why does it not show the storage of each node?

I guess I have to go to each node and visit /storage?

It would be nice if each node in the status of the storage grid table, had a link to that node's /storage

The suggestion of a link to the node's /storage page is a nice one; maybe one of the columns could be linked to that, so as not to take up any extra space.

Author
Owner

When I set up storage servers, the WUI of the server is not accessible beyond localhost. So having links to storage server web pages is, at least for me, a non-solution. From a usability point of view, I want to see free space per server in the main server table of my local client WUI. This would also enable using the current k/H/N values to estimate the available grid space: basically sort the servers by free space, number them from 1 to M, and find item H in that list, more or less. That's wrong, of course, and the real free space depends on the packing algorithm, but it's a conservative indication. Or perhaps show that as a lower bound and totalfree/(N/k) as the upper bound.
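A rough sketch of the two indicators suggested above, assuming "sort by free space" means largest first and that per-server free space is already known in bytes. `space_bounds` is a hypothetical helper; as the comment itself says, neither number is exact, and for very small grids the "lower" indicator can exceed the "upper" one.

```python
from typing import List, Tuple


def space_bounds(free_per_server: List[int], k: int, happy: int, n: int) -> Tuple[int, int]:
    """Return (lower_indicator, upper_estimate) in bytes.

    lower_indicator: free space of the H-th server when ranked by free
                     space, largest first (0 if fewer than H servers).
    upper_estimate:  total free space divided by the expansion factor N/k.
    """
    ranked = sorted(free_per_server, reverse=True)
    lower = ranked[happy - 1] if len(ranked) >= happy else 0
    upper = sum(ranked) * k // n
    return lower, upper
```

With one server holding 14 TB free and nine full servers, `space_bounds([14 * 10**12] + [0] * 9, k=3, happy=7, n=10)` gives a lower indicator of 0, which matches the earlier point in this thread that such a grid cannot actually accept uploads.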

Author
Owner

This branch adds a "Space Available" column to the welcome page:

https://github.com/leif/tahoe-lafs/compare/master...ticket648

davidsarah commented 2012-09-04 23:11:46 +00:00
Author
Owner

When the available space for a given server is the fixed maximum or the server wasn't able to determine it (I think it sets the space to the maximum in that case), we should not print that literally in the "Space Available" column. Other than that, leif's patch looks like a good implementation, so I'm putting this ticket in 1.11.

Anyone please feel free to accept the ticket if you intend to write tests.
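One way the "Space Available" cell could avoid printing the fixed maximum literally is sketched below. This is not leif's patch: `ASSUMED_FIXED_MAX` is a placeholder value chosen for illustration (the real constant is not given in this ticket), and `format_space` is a hypothetical helper.

```python
from typing import Optional

# Placeholder for the server's "fixed maximum"; the real value is an assumption.
ASSUMED_FIXED_MAX = 2**64


def format_space(available: Optional[int], fixed_max: int = ASSUMED_FIXED_MAX) -> str:
    """Render an advertised space value for a table cell.

    Unknown values, and values at or above the fixed maximum, are shown
    as "N/A" instead of a misleading literal number.
    """
    if available is None or available >= fixed_max:
        return "N/A"
    value = float(available)
    unit = "B"
    for next_unit in ("kB", "MB", "GB", "TB", "PB"):
        if value < 1000:
            break
        value /= 1000
        unit = next_unit
    return f"{int(value)} B" if unit == "B" else f"{value:.2f} {unit}"
```

Decimal (power-of-1000) units are used here only for simplicity; the page could equally use binary units.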

tahoe-lafs modified the milestone from undecided to 1.11.0 2012-09-04 23:11:46 +00:00
zooko commented 2012-12-14 20:26:40 +00:00
Author
Owner

See also #940 which is about the storage server displaying to its user its own space-usage/space-available stats.

Author
Owner

I intend to write tests for this and hope to get it in 1.11.

Lcstyle commented 2014-09-24 03:10:05 +00:00
Author
Owner

I like this idea, but I'd like to suggest a FR to go along with this Enhancement. Specifically I am concerned that the storage nodes have no way to restrict how much disk space they use on a file system, other than the negative value provided by the reserved_space config option.

If there were a disk space setting as I am suggesting, the reported disk space value for a storage node would be more accurately represented. For example, a storage node run on a volume group could all of a sudden find itself growing unintentionally if an admin added more PVs to an LV or VG. The reserved_space value would then allow the available space to grow to the newly available node's capacity, perhaps beyond that which a server admin originally intended.

It's easy to grow a storage node's capacity, but how does one shrink it after the fact?
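To make the difference concrete, here is a sketch contrasting the existing "negative" limit (reserved space) with the kind of absolute cap being requested. `size_limit` and `capped_available` are hypothetical names for illustration only, not an existing configuration option or function.

```python
from typing import Optional


def capped_available(disk_avail: int, reserved: int, used_by_shares: int,
                     size_limit: Optional[int] = None) -> int:
    """Advertised space under both kinds of limit.

    reserved:   space to leave free on the filesystem (the existing idea)
    size_limit: hypothetical absolute cap on space used by shares
    """
    by_reserve = max(0, disk_avail - reserved)
    if size_limit is None:
        return by_reserve
    by_limit = max(0, size_limit - used_by_shares)
    return min(by_reserve, by_limit)
```

With only `reserved` set, growing the underlying volume from 1000 to 5000 units raises the advertised space accordingly; with a `size_limit` of 500 and 300 already used, the advertised space stays at 200 regardless of how large the disk becomes.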

zooko commented 2014-09-24 04:06:31 +00:00
Author
Owner

Lcstyle: ticket #671 is about adding a configuration option to limit how much disk space the storage server can use. There is a patch, by markberger! Go forth and review it! :-)

cipherpunks commented 2014-11-21 04:36:24 +00:00
Author
Owner

Attachment 648_tests.patch (1947 bytes) added

tests for leif's ticket648 branch

Author
Owner

I just pushed a squash-merged version of this along with the tests from the previous comment and another test here:

https://github.com/leif/tahoe-lafs/compare/master...ticket648-rebased

and opened a pull request here:

https://github.com/tahoe-lafs/tahoe-lafs/pull/127

If this patch is accepted, I suggest closing this ticket despite it not displaying the sum total space available because I don't think that is a particularly meaningful value.

daira commented 2014-11-23 06:29:52 +00:00
Author
Owner

Fixed in [335c2ed06ab97443e1809819bb77b9946bec405c/trunk] and preceding.

tahoe-lafs added the
fixed
label 2014-11-23 06:29:52 +00:00
tahoe-lafs modified the milestone from soon to 1.11.0 2014-11-23 06:29:52 +00:00
daira closed this issue 2014-11-23 06:29:52 +00:00
Reference: tahoe-lafs/trac-2024-07-25#648