[Imported from Trac: page Accounting, version 1]

meejah 2019-06-12 20:18:37 +00:00
parent f25059b9a5
commit d5a99e6219

Accounting.md Normal file

@ -0,0 +1,133 @@
(This was copied from a [LeastAuthority](LeastAuthority) wiki page summarizing the steps needed, and the desire, to get cloud-backend things into master. It relates mostly to the S4 service, but is fairly general.)
# background
We wish to get the 2237-cloud-backend branch onto master. The
cloud-backend branch was built off of a minimal Accounting prototype
(warner/accounting-2) so that the new "lease-db" could have somewhere
to hang.
## currently
As far as leases and accounting go, 2237 / accounting-3 have the
following design:
- the Accountant holds accounts. There are just 2 accounts and no way
(yet) to create or manage them:
  - "starter" account
  - "anonymous" account
- an Account object now implements RIStorageServer (formerly
implemented by [StorageServer](StorageServer)). So from a client perspective,
nothing changes: they contact a fURL that implements the
RIStorageServer API. During client setup, that fURL is now pointed
at the anonymous Account instance (instead of the [StorageServer](StorageServer)
instance).
- leases are stored in a local sqlite database
- new "starter" leases are created for anything which lacks a lease
- all the code that reads/writes leases to the shares themselves is gone
- the Accountant and Account objects have access to the leasedb
- the Account object manages leases
- an [AccountingCrawler](AccountingCrawler) replaces the [LeaseCheckingCrawler](LeaseCheckingCrawler). This new crawler will:
  - Remove leases that are past their expiration time.
  - Delete objects containing unleased shares.
  - Discover shares that have been manually added to storage.
  - Discover shares that are present when a storage server is upgraded from
    a pre-leasedb version, and give them "starter leases".
  - Recover from a situation where the leasedb is lost or detectably
    corrupted. This is handled in the same way as upgrading.
  - Detect shares that have unexpectedly disappeared from storage.
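
The crawler behaviour listed above can be sketched as one reconciliation pass between "what is in storage" and "what the leasedb knows". This is only a rough sketch, not the real AccountingCrawler: the two-table schema, the `crawl_once` name, the report format and the 30-day starter lease are all assumptions made for illustration.

```python
import sqlite3

STARTER_LEASE_SECONDS = 30 * 24 * 60 * 60  # assumption: the default 30-day expiry


def make_leasedb():
    """Create an empty in-memory stand-in for the leasedb (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE shares (share_id TEXT PRIMARY KEY)")
    db.execute("CREATE TABLE leases (share_id TEXT, account TEXT, expires REAL)")
    return db


def crawl_once(db, stored, now):
    """Reconcile the leasedb with `stored`, a set of share ids found in storage.

    Removes ids from `stored` when it deletes an unleased share; returns a report.
    """
    report = {"discovered": [], "disappeared": [], "deleted": []}

    # Remove leases that are past their expiration time.
    db.execute("DELETE FROM leases WHERE expires <= ?", (now,))

    known = {row[0] for row in db.execute("SELECT share_id FROM shares")}

    # Shares the leasedb has never seen (manually added, upgrade, or lost db):
    # record them and give each one a starter lease.
    for share_id in sorted(stored - known):
        db.execute("INSERT INTO shares VALUES (?)", (share_id,))
        db.execute(
            "INSERT INTO leases VALUES (?, 'starter', ?)",
            (share_id, now + STARTER_LEASE_SECONDS),
        )
        report["discovered"].append(share_id)

    # Shares the leasedb knows about that have vanished from storage.
    for share_id in sorted(known - stored):
        db.execute("DELETE FROM leases WHERE share_id = ?", (share_id,))
        db.execute("DELETE FROM shares WHERE share_id = ?", (share_id,))
        report["disappeared"].append(share_id)

    # Known shares that are still stored but have no lease left: delete them.
    leased = {row[0] for row in db.execute("SELECT DISTINCT share_id FROM leases")}
    for share_id in sorted((known & stored) - leased):
        stored.discard(share_id)
        db.execute("DELETE FROM shares WHERE share_id = ?", (share_id,))
        report["deleted"].append(share_id)

    db.commit()
    return report
```

Note that running `crawl_once` against a brand-new database and a full share store reports every share as "discovered", which is the same path as upgrading from a pre-leasedb version.
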
## problems
There are a few problems with this:
### database durability, ops burden
- ultimately, cloud-backend uses "not local disk" for storage
- ...but the leasedb is "a thing that should be backed up", yet it isn't
stored in the "not local disk" storage. That is, if we're using an
S3 thing, it would be best to have the lease-db in S3 (or an AWS
database)
- this is "okay" for now, because the lease-db is built to recover
from "zero leases". Basically:
  - if there's no lease for a share, add a "starter" one
  - eventually (after the default-30-days expiry) either we learn
    which clients care about that share (because they renewed
    their leases) or the starter lease expires (and we delete the
    share)
- ...but this means we can't use the lease-db to definitively answer
the question "how much space is Alice using" if our lease-db is
younger than "default-expiry-time".
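
To make that last caveat concrete, here is a small sketch of a usage query that also reports whether its answer can be trusted yet. The row layout `(account, share_id, size)`, the `space_used` name and the 30-day constant are assumptions for illustration; nothing like this exists in the branch as far as these notes say.

```python
DEFAULT_EXPIRY_SECONDS = 30 * 24 * 60 * 60  # assumption: 30-day default lease


def space_used(leases, account, db_created, now):
    """Return (bytes_used, definitive) for `account`.

    `leases` is an iterable of hypothetical (account, share_id, size) rows.
    The answer is only marked definitive once the lease-db is at least one
    default expiry period old; before that, shares the account cares about
    may still be covered by a "starter" lease only, so the total is a lower bound.
    """
    seen = set()
    total = 0
    for owner, share_id, size in leases:
        # Count each share once, even if the account holds several leases on it.
        if owner == account and share_id not in seen:
            seen.add(share_id)
            total += size
    definitive = (now - db_created) >= DEFAULT_EXPIRY_SECONDS
    return total, definitive
```

This assumes clients renew at least once per expiry period; if they don't, their shares get deleted anyway and the number is still consistent with what is stored.
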
### non-async APIs
- the current LeaseDB API is synchronous. This is "sort of fine if
you squint" for a local sqlite database (although still not
correct, because a database read can take an arbitrary amount of
time). Ideally the LeaseDB API should be async.
  - e.g. by using twisted.enterprise.adbapi (or a similar "general-purpose
    Twisted database API" -- is there a better one?)
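
As a rough illustration of what an async LeaseDB could look like, the sketch below pushes each blocking sqlite call onto a worker thread. To keep it runnable with only the standard library it uses `asyncio.to_thread` (Python 3.9+) as a stand-in for Twisted; `adbapi.ConnectionPool.runQuery` follows the same idea (run the query in a thread pool) but returns a Deferred. The `AsyncLeaseDB` class, its methods and the table layout are hypothetical.

```python
import asyncio
import os
import sqlite3
import tempfile


class AsyncLeaseDB:
    """Hypothetical async wrapper: every call runs its sqlite work in a thread."""

    def __init__(self, path):
        self._path = path

    def _run(self, sql, params):
        # Blocking helper. It opens its own connection, so it is safe to call
        # from whichever worker thread it lands on.
        conn = sqlite3.connect(self._path)
        try:
            with conn:  # commits on success, rolls back on error
                return conn.execute(sql, params).fetchall()
        finally:
            conn.close()

    async def execute(self, sql, params=()):
        # The event loop stays free while the query runs elsewhere.
        return await asyncio.to_thread(self._run, sql, params)

    async def add_lease(self, share_id, account, expires):
        await self.execute(
            "INSERT INTO leases VALUES (?, ?, ?)", (share_id, account, expires)
        )

    async def leases_for(self, share_id):
        return await self.execute(
            "SELECT account, expires FROM leases WHERE share_id = ? ORDER BY account",
            (share_id,),
        )


async def demo(path):
    db = AsyncLeaseDB(path)
    await db.execute(
        "CREATE TABLE IF NOT EXISTS leases (share_id TEXT, account TEXT, expires REAL)"
    )
    await db.add_lease("share-1", "anonymous", 1000.0)
    await db.add_lease("share-1", "starter", 2000.0)
    return await db.leases_for("share-1")
```

Something like `asyncio.run(demo(os.path.join(tempfile.mkdtemp(), "leasedb.sqlite")))` exercises it. The point is only the shape of the API: callers always get an awaitable, so swapping sqlite for a remote database later would not change them.
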
### "database as cache"
- currently, the database is completely throw-away
- that may limit future designs (i.e. we can't put anything
"permanent" in the leasedb)
- is this a problem? (if so, is it a problem we *can't* easily fix
later? i.e. if and when we want to add a feature that needs durable
lease-db data?)
- I *think* we decided in the last Nuts&Bolts that treating the database
as "mostly disposable" is okay
## the future
### Remote API Design
- obviously, to support "not yet upgraded" clients, the
"anonymous-storage-FURL" API can't change. That is, it must
implement RIStorageServer.
- but maybe having Account directly implement that isn't great.
- Consider this:
  - we want introducers to go away
  - thus, "tahoe storage servers" need to stay (as "the" smart thing)
  - what if we call these "tahoe servers" instead, and they provide services
  - one of those services is "storage"
  - (another service might be e.g. a "membrane" that provides
    temporary access to a read-cap)
  - (another might be a payment API of some kind, to pay for "storage" or other services)
- ...so I think a better API might be this:
  - Account just provides a "services" API
  - "storage" is one of those services (the only one we provide right now)
  - ...and "storage" implements RIStorageServer
- not much changes, except the shape of the code: during client
setup, we get the "anonymous-storage-FURL" from the "storage"
service of the anonymous Account (instead of it just *being* the
Account directly).
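
A minimal sketch of that code shape, assuming made-up names throughout: `StorageService`, `get_service`, `services` and `anonymous_storage_furl` are illustrative only, and the fURL string is a placeholder rather than anything Foolscap would produce.

```python
class StorageService:
    """Stand-in for the object that would implement RIStorageServer."""

    def __init__(self, account_name):
        self.account_name = account_name

    def furl(self):
        # Placeholder only: a real fURL would be handed out by Foolscap.
        return "pb://placeholder/" + self.account_name + "/storage"


class Account:
    """An Account no longer *is* the storage server; it just offers services."""

    def __init__(self, name):
        self.name = name
        # "storage" is the only service provided right now.
        self._services = {"storage": StorageService(name)}

    def services(self):
        return sorted(self._services)

    def get_service(self, service_name):
        return self._services[service_name]


def anonymous_storage_furl(anonymous_account):
    # Client setup asks the anonymous Account for its "storage" service
    # instead of publishing the Account itself.
    return anonymous_account.get_service("storage").furl()
```

Adding a "membrane" or payment service later would then mean adding another entry to the services mapping, without touching what not-yet-upgraded clients see.
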
### Backing up the Database
- one thing suggested was to just periodically (e.g. every hour) back
up the sqlite database to "whatever storage the backend is
using". That is, a "storage backend" has an API to backup (and
restore) an sqlite file.
- then we can "mostly" still answer the "how much space is Alice using"
question (except for the possibility that shares were added by Alice
after the last database backup)
- ...but you get fast, local queries most of the time for other things
- (I still think we should make the LeaseDB API async even if we're
"always" using sqlite)