From d5a99e6219dd1e0995c4444f272cf046370401da Mon Sep 17 00:00:00 2001
From: meejah <>
Date: Wed, 12 Jun 2019 20:18:37 +0000
Subject: [PATCH] [Imported from Trac: page Accounting, version 1]

---
 Accounting.md | 133 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 133 insertions(+)
 create mode 100644 Accounting.md

diff --git a/Accounting.md b/Accounting.md
new file mode 100644
index 0000000..93a3de1
--- /dev/null
+++ b/Accounting.md
@@ -0,0 +1,133 @@
+
+(This was copied from a [LeastAuthority](LeastAuthority) wiki page, summarizing steps and desire to get cloud-backend things into master .. mostly related directly to the S4 service, but is fairly general)
+
+# background
+
+We wish to get the 2237-cloud-backend branch onto master. The
+cloud-backend branch was built off of a minimal Accounting prototype
+(warner/accounting-2) so that the new "lease-db" could have somewhere
+to hang.
+
+## currently
+
+As far as leases and accounting go, 2237 / accounting-3 have the
+following design:
+
+ - Accountant holds accounts. There are just 2 accounts and no way
+   (yet) to create or manage them:
+
+   - "starter" account
+   - "anonymous" account
+
+ - an Account object now implements RIStorageServer (formerly
+   implemented by [StorageServer](StorageServer)). So from a client perspective,
+   nothing changes: they contact a fURL that implements the
+   RIStorageServer API. During client setup, that fURL is now pointed
+   at the anonymous Account instance (instead of the [StorageServer](StorageServer)
+   instance).
+
+ - leases are stored in a local sqlite database
+ - new "starter" leases are created for anything which lacks a lease
+ - all the code that reads/writes leases to the shares themselves is gone
+
+ - the Accountant and Account objects have access to the leasedb
+ - the Account object manages leases
+ - an [AccountingCrawler](AccountingCrawler) replaces the [LeaseCheckingCrawler](LeaseCheckingCrawler).
+   This new crawler will:
+   - Remove leases that are past their expiration time.
+   - Delete objects containing unleased shares.
+   - Discover shares that have been manually added to storage.
+   - Discover shares that are present when a storage server is upgraded from
+     a pre-leasedb version, and give them "starter leases".
+   - Recover from a situation where the leasedb is lost or detectably
+     corrupted. This is handled in the same way as upgrading.
+   - Detect shares that have unexpectedly disappeared from storage.
+
+## problems
+
+
+There are a few problems with this:
+
+### database durability, ops burden
+
+ - ultimately, cloud-backend uses "not local disk" for storage
+ - ...but the leasedb is "a thing that should be backed up", but isn't
+   stored in the "not local disk" storage. That is, if we're using an
+   S3 thing, it would be best to have the lease-db in S3 (or AWS
+   database)
+
+ - this is "okay" for now, because the lease-db is built to recover
+   from "zero leases". Basically:
+   - if there's no lease for a share, add a "starter" one
+   - eventually (after the default-30-days expiry) we will either
+     learn which clients care about that share (because they renewed
+     their leases) or the starter lease expires (and we delete the
+     share)
+   - ...but this means we can't use the lease-db to definitively answer
+     the question "how much space is Alice using" if our lease-db is
+     younger than "default-expiry-time".
+
+### non-async APIs
+
+ - the current LeaseDB API is synchronous. This is "sort of fine if
+   you squint" for a local sqlite database (although still not
+   correct, because a database read can take an arbitrary amount of
+   time). Ideally the LeaseDB API should be async.
+   - e.g. by using twisted.enterprise.adbapi (or similar "general-purpose
+     Twisted database API" -- is there a better one?)
+
+
+### "database as cache"
+
+ - currently, the database is completely throw-away
+ - that may limit future designs (i.e.
+   we can't put anything "permanent" in the leasedb)
+ - is this a problem? (if so, is it a problem we *can't* easily fix
+   later? i.e. if and when we want to add a feature that needs durable
+   lease-db data?)
+ - I *think* we decided in last Nuts&Bolts that treating the database
+   as "mostly disposable" is okay
+
+
+## the future
+
+### Remote API Design
+
+ - obviously, to support "not yet upgraded" clients, the
+   "anonymous-storage-FURL" API can't change. That is, it must
+   implement RIStorageServer.
+ - but maybe having Account directly implement that isn't great.
+ - Consider this:
+
+   - we want introducers to go away
+   - thus, "tahoe storage servers" need to stay (as "the" smart thing)
+   - what if we call these "tahoe servers" instead, and they provide services
+   - one of those services is "storage"
+   - (another service might be e.g. a "membrane" that provides
+     temporary access to a read-cap)
+   - (another might be a payment API of some kind, to pay for "storage" or other services)
+
+ - ...so I think a better API might be this:
+
+   - Account just provides a "services" API
+   - "storage" is one of those services (the only one we provide right now)
+   - ...and "storage" implements RIStorageServer
+
+ - not much changes, except the shape of the code: during client
+   setup, we get the "anonymous-storage-FURL" from the "storage"
+   service of the anonymous Account (instead of it just *being* the
+   Account directly).
+
+
+### Backing up the Database
+
+ - one thing suggested was to just periodically (e.g. every hour) back
+   up the sqlite database to "whatever storage the backend is
+   using". That is, a "storage backend" has an API to backup (and
+   restore) an sqlite file.
+   - then we can "mostly" still answer the "how much space is Alice using"
+     stuff (except for the possibility that shares were added by Alice
+     after the last database backup)
+   - ...but you get fast, local queries most of the time for other things
+   - (I still think we should make the LeaseDB API async even if we're
+     "always" using sqlite)
+
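
A rough sketch of the "periodically back up the sqlite file" idea above, assuming Python 3.7+ (for `sqlite3.Connection.backup`). The names `snapshot_leasedb`, `restore_leasedb`, and `InMemoryBackend` (a stand-in for "whatever storage the backend is using") are hypothetical and only for illustration; none of this is the actual leasedb schema or a real Tahoe-LAFS backend API.

```python
import os
import sqlite3
import tempfile


class InMemoryBackend:
    """Hypothetical stand-in for a storage backend that keeps one leasedb backup."""

    def __init__(self):
        self._blob = None

    def store_backup(self, data):
        self._blob = data

    def fetch_backup(self):
        return self._blob


def snapshot_leasedb(live_path, backend):
    """Take a consistent snapshot of the live sqlite file and hand it to the backend."""
    fd, snap_path = tempfile.mkstemp(suffix=".sqlite")
    os.close(fd)
    try:
        src = sqlite3.connect(live_path)
        dst = sqlite3.connect(snap_path)
        try:
            # sqlite's online backup API copies a consistent view of the
            # database even if another connection is writing to it.
            src.backup(dst)
        finally:
            dst.close()
            src.close()
        with open(snap_path, "rb") as f:
            backend.store_backup(f.read())
    finally:
        os.remove(snap_path)


def restore_leasedb(live_path, backend):
    """Write the last backup to live_path; return False if no backup exists."""
    data = backend.fetch_backup()
    if data is None:
        # Nothing to restore: the server would fall back to "starter" leases.
        return False
    with open(live_path, "wb") as f:
        f.write(data)
    return True
```

Something like an hourly Twisted `LoopingCall` could drive `snapshot_leasedb`. Any lease written after the last snapshot is missing after a restore, which is the "mostly" caveat in the list above.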