
2nd Summit Day 1

Tuesday 08-Nov-2011. Mozilla SF. Video (1.7GB flash .flv, 6 hours)

Attendees (with IRC nicks)

  • Brian Warner (warner)
  • Zooko (zooko)
  • David-Sarah Hopwood (davidsarah)
  • Zancas (zancas)
  • Shawn Willden (divegeek)
  • Zack Weinberg (zwol)
  • Zack Kansler (?)
  • Online: amiller, Dcoder

Agent/Gateway split

- Shawn Willden observed that "tahoe backup" is usually run on a laptop
  (frequently sleeping/offline), whereas he's got some other machine (a
  desktop or home server) with limited CPU which *is* online 24/7, so
  he wants a backup program that quickly dumps the laptop's contents to
  the server, then (slowly/lazily) uploads that data from the server
  into Tahoe
  - extra points for only putting ciphertext on that server
- Brian wants a long-running process (specifically a
  twisted.application.service.Service object) to manage backup jobs,
  storing state in a sqlite db (which directories have been visited,
  time since last backup, ETA). Likewise for renew/repair/rebalance
  jobs. (A rough sketch of this Service idea follows this list.)
  - maybe a web interface to control/monitor these jobs
- vague consensus was to introduce an "Agent" service, distinct from
  the current "Client/Gateway" service. The normal client/gateway
  process will include both (and the co-resident Agent will have local
  access to the IClient object for upload/download). But it will also
  be possible to create one in a separate process (with no
  client/gateway), in which case it speaks WAPI over HTTP (and must be
  configured with a node.url).
- backup and renew/repair/rebalance jobs run in an Agent
- not sure about where the WAPI/WUI lives. One idea was to have the
  Agent provide the WUI, and the C/G provide the WAPI. Another is to
  have the C/G provide most of the webapi but add new webapi to the
  Agent for managing backup/renew/repair/rebalance jobs
- [whiteboard pix](../raw/attachments/Summit2Day1/day1-agent.jpg)
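
A very rough sketch of how the Agent-as-Service idea could look, just to make
the discussion concrete (the class name, the jobs-table schema, and the
polling scheme below are invented for illustration; nothing here was agreed
as an actual design):

```python
# Hypothetical sketch only: BackupAgent and its jobs table are invented for
# illustration; the real Agent design was not settled at the summit.
import sqlite3, time
from twisted.application import service
from twisted.internet import task

class BackupAgent(service.Service):
    """Long-running Agent that tracks backup/renew/repair/rebalance jobs
    and remembers progress across restarts in an sqlite database."""

    def __init__(self, dbfile="agent.sqlite", poll_interval=60):
        self.db = sqlite3.connect(dbfile)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS jobs ("
            " jobid INTEGER PRIMARY KEY,"
            " kind TEXT,"            # 'backup', 'renew', 'repair', ...
            " path TEXT,"            # directory being processed
            " last_finished REAL,"   # unix time of the last successful pass
            " state TEXT)")          # 'idle', 'running', 'error'
        self.db.commit()
        self.poll_interval = poll_interval
        self.loop = task.LoopingCall(self._poll)

    def startService(self):
        service.Service.startService(self)
        self.loop.start(self.poll_interval)

    def stopService(self):
        if self.loop.running:
            self.loop.stop()
        return service.Service.stopService(self)

    def _poll(self):
        # Pick the most-stale job and do one (lazy) increment of work on it,
        # either via a co-resident IClient or over the webapi at node.url.
        row = self.db.execute(
            "SELECT jobid, kind, path FROM jobs"
            " ORDER BY last_finished ASC LIMIT 1").fetchone()
        if row is not None:
            jobid, kind, path = row
            # ... upload/renew/repair work would go here ...
            self.db.execute("UPDATE jobs SET last_finished=?, state='idle'"
                            " WHERE jobid=?", (time.time(), jobid))
            self.db.commit()
```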

server-selection UI

expected vs required vs known, k-of-N, H, points, understandability

- Brian talked about an old #467 explicit-server-selection message and
  his proposed UI to list all known servers in a table, with "use this
  server?" and "require this server?" buttons
- David-Sarah (and Zooko) pointed out that "require?" is a bit harsh
  given our current H= ("Servers Of Happiness") share-placement code
- tradeoffs between clear-and-restrictive vs confusing-but-fails-less
- challenge of identifying reliability of nodes
- Brian says the client/gateway should rely on the user to tell it what
  sort of grid to expect, so it can bail out when those expectations
  aren't met
- Shawn would prefer a scheme where the client measures node reliability
  itself (though in the near term it allows the user to specify node
  reliabilities), sets N to the number of servers currently accepting
  shares, sets H slightly below N (to improve upload availability), and
  computes and uses a k that maximizes performance (or minimizes cost)
  while achieving a target reliability. (A rough calculation sketch
  follows this list.)
- whiteboard: [grid1](../raw/attachments/Summit2Day1/day1-grid1.jpg), [grid2](../raw/attachments/Summit2Day1/day1-grid2.jpg)
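
A rough sketch of the kind of calculation Shawn described, assuming (purely
for illustration) that every server has the same estimated reliability p and
that servers fail independently; the function names and the independence
assumption are mine, not something agreed at the summit:

```python
# Illustrative sketch only: assumes independent server failures and a single
# per-server reliability estimate p, a big simplification of the real problem.
from math import comb

def file_reliability(k, n, p):
    """Probability that at least k of n shares survive, if each share's
    server survives independently with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def choose_parameters(num_accepting_servers, p, target=0.999999):
    """Set N to the number of servers currently accepting shares, H slightly
    below N, and the largest k that still meets the reliability target
    (larger k means less expansion and hence lower cost)."""
    n = num_accepting_servers
    h = max(1, n - 1)                  # "slightly below N"
    k = 1                              # fall back to maximum redundancy
    for candidate in range(n, 0, -1):  # try the cheapest (largest) k first
        if file_reliability(candidate, n, p) >= target:
            k = candidate
            break
    return k, h, n

# e.g. with 10 servers each estimated 95% reliable:
#   choose_parameters(10, 0.95) -> (k, H, N) meeting the 99.9999% target
```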

encrypted git, or revision control on tahoe

- Zack(?) is thinking about revision control on top of Tahoe, will
  present "big crazy idea" later when everyone is there
- Brian mentioned his signed-git-revisionid project (not yet released),
  and how git fetch/push is fast because both sides know the full
  revision graph and can compute the missing objects in one RTT. To get
  this in Tahoe, we must add deep-(verify)-caps and let servers see the
  shape of the directory tree. (A toy sketch of the graph-difference
  idea follows this list.)
- Lunch conversation about Monotone's mutable metadata and the problem
  of transferring it efficiently
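
A toy illustration of the one-RTT point (purely illustrative, not how git,
Tahoe, or the signed-git-revisionid project is actually implemented): once
both sides hold the full revision graph, a single message carrying the other
side's heads is enough to compute exactly which objects they lack.

```python
# Toy sketch, not real git/Tahoe code: 'graph' maps a revision id to the list
# of its parent revision ids, and is known in full to both sides.
def ancestors(graph, heads):
    """All revisions reachable from the given heads."""
    seen, stack = set(), list(heads)
    while stack:
        rev = stack.pop()
        if rev not in seen:
            seen.add(rev)
            stack.extend(graph.get(rev, []))
    return seen

def missing_objects(graph, my_heads, their_heads):
    """Revisions I have that the other side lacks; one message carrying
    their_heads is enough to compute this, hence a single round trip."""
    return ancestors(graph, my_heads) - ancestors(graph, their_heads)
```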

grid management

non-transitive one-at-a-time invitations, transitive clique invitations, Grid Admin star config

- Brian is thinking about grid setup and Accounting, and pondering a
  startup mode where servers issue Invitations to clients
  - pasting Invitation code into a web form is sufficient to get
    connected to grid (combines Introducer functionality with Account
    authorization)
  - probably set up bidirectional connection: when Alice accepts
    Invitation from Bob, both Alice and Bob can use each other's
    storage
- three modes:
  - issue/accept one Invitation per link
  - each node needs one Invitation to join clique, then they get access
    to all storage servers in the clique (and offer service to all
    clients in the clique): grid grows one node at a time
    - issue: can two grids merge? or can you only accept an invitation
      when you aren't already in a grid?
  - managed grid: a central Grid Admin is the only one who can issue
    Invitations. When accepted, Alice can use storage of all members.
- Shawn thinks a Request model is more natural: the server admin (or
  Grid Admin) sends an ambient URL to the new user; they paste it into
  a field that says "Request Access"; this sends a Request to the
  server (probably containing a pubkey); the server records it; later
  the server admin Accepts or Rejects the request.
- Invite and Request are duals, modulo some channel and workflow
  variations (confidential vs authentic, who-sends-first-message)
- Brian will explore how hard/feasible it is to run one workflow on top
  of the other: can a Request be expressed with a note saying "please
  send an Invitation to this public encryption key" sent to the server?
  (A rough message-shape sketch follows this list.)
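
A very rough sketch of what "Request on top of Invitation" might look like on
the wire, just to make the duality concrete; the JSON framing and every field
name below are invented for discussion and are not part of any agreed
protocol:

```python
# Purely hypothetical message shapes for the Invite/Request duality; nothing
# here is an agreed Tahoe-LAFS protocol.
import json

def make_request(pubkey):
    """Client -> server: "please send an Invitation to this public key",
    posted to the ambient request URL the server admin handed out."""
    return json.dumps({"type": "access-request",
                       "pubkey": pubkey})   # e.g. an encoded verifying key

def make_invitation(server_id, account_id):
    """Server (admin) -> client, after Accepting the request: an Invitation
    code combining introduction (how to reach the server) with account
    authorization (permission to store)."""
    return json.dumps({"type": "invitation",
                       "server": server_id,
                       "account": account_id})
```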

#466 new-introducer review

- Brian walked through most of the #466 new-introducer code
  (<https://github.com/warner/tahoe-lafs/tree/466-take7>) with
  David-Sarah and Zooko
- David-Sarah found one critical security bug (signature checking
  failure), lots of good cleanups to recommend, tests to add
- overall it looks good
- Brian will make suggested cleanups and prepare for landing

Beer!

signature consensus!

- over drinks, Brian and David-Sarah and Zooko discussed signature
  options (needed for #466, Accounting, non-Foolscap storage protocol,
  new mutable file formats)
- choices:
  - [python-ed25519](https://github.com/warner/python-ed25519)
    (standalone C extension module)
  - [python-ecdsa](https://github.com/warner/python-ecdsa)
    (standalone pure-Python module)
  - ECDSA from Crypto++ via pycryptopp
  - non-EC DSA (eww)
  - get Ed25519 into Crypto++, then expose in pycryptopp
  - add Ed25519 into pycryptopp (making it more than just a python
    binding to Crypto++, hence nicknamed "pycryptoppp")
  - get ECDSA from pyOpenSSL (we think it isn't exposed)
- evaluation:
  - security: David-Sarah prefers Ed25519, Zooko slightly
    prefers ECDSA (older, more exposure), Brian (who currently has a
    crush on everything 25519) slightly prefers Ed25519. Crypto++'s
    entropy-using signature code includes nonce-safety (entropy is
    hashed with message to mitigate VM-rollback failure). Ed25519
    has deterministic signatures and nonce-safety.
  - speed: requirement is <10s startup with 100 servers (specifically,
    make sure the Announcement sign/verify is small compared to
    connection establishment time). That works out to roughly
    sign+verify < 100ms per announcement. This rules out python-ecdsa
    (sign+verify = 330ms). Both a non-pure-python ECDSA and Ed25519
    will do. A really fast primitive (optimized Ed25519 is around 20µs)
    might enable new applications in the future (key-per-lease,
    key-per-write-request). (A quick timing sketch appears at the end
    of these notes.)
  - pure-python: slight preference for something that could be
    pure-python in the future if PyPy could make it fast enough.
    Seems unlikely in the near-term for any of the options.
  - patents: murky, of course. Redhat/Fedora core currently eschew
    all ECC, might change, might not, too bad. Not a clear
    differentiator between ECDSA and Ed25519. Nobody was willing to
    tolerate non-EC DSA (would need 4kbit keys to feel safe, not
    confident of hitting speed requirements). We can always back it
    out if it proves to be a problem (at the cost of regenerating
    all serverids). Hopefully the scene will settle down before we
    want to use it for data (which would be harder to back out).
  - packaging: biggest differentiator
    - python-ed25519: must build eggs as we did for pycrypto, need to
      get into debian (which has other benefits, but delays tahoe),
      increases build pain marginally.
    - python-ecdsa (too slow, ruled out): pure-python, so no need for
      eggs, but still need to get into debian and increases build pain
    - ECDSA-via-pycryptopp: easy, code is mostly done, needs final
      review and polish, no new dependencies.
    - ed25519-in-Crypto++: probably good idea in long term, but will
      take a while (must convince Crypto++ to change, wait for a
      release, then add bindings to pycryptopp). Must also wait for
      distributions to pick up new Crypto++. Technically no new
      dependencies, but increases the version requirements on an
      external module with a historically slow (1/yr) release cycle.
    - ed25519-in-pycryptopp: a bit weird (pycryptoppp), fairly fast (we
      control pycryptopp), no external delays. No new dependencies.
- **winner: ed25519-in-pycryptopp** (aka pycryptoppp). Ed25519 wins over
  ECDSA with potentially better security and future-coolness-enabling
  speed. Delivering it in pycryptopp means no new dependency and no
  external parties to block.
- future goal is to get python-ed25519 into debian, then switch Tahoe
  to depend on it instead. And/or once Ed25519 gets into Crypto++,
  remove the separate implementation from pycryptopp (i.e. remove one
  "p" from pycryptoppp) and have pycryptopp rely on Crypto++'s version.
- also, get pycryptopp's ECDSA finished off in the ed25519-bearing
  release, just to have it available.
- dev plan:
  - Brian makes new python-ed25519 release with API tweaks
  - Brian makes patch for pycryptopp with most of python-ed25519
  - Zooko makes new pycryptopp release
  - get folks to build eggs of new release
  - Tahoe starts depending on new release
  - land #466 accounting code that uses ed25519
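
For reference, the speed budget above works out to roughly 10s / 100
announcements = 100ms per sign+verify. A quick way to sanity-check a
candidate primitive against that budget, assuming the python-ed25519
package linked above is installed (timings will vary by machine):

```python
# Rough benchmark sketch against the ~100ms per-announcement budget discussed
# under "speed"; assumes warner's python-ed25519 package.
import timeit
import ed25519

sk, vk = ed25519.create_keypair()
msg = b"announcement" * 32           # stand-in for a signed Announcement

def sign_and_verify():
    sig = sk.sign(msg)
    vk.verify(sig, msg)              # raises ed25519.BadSignatureError on failure

n = 100
total = timeit.timeit(sign_and_verify, number=n)
print("sign+verify: %.2f ms each (budget: 100 ms)" % (1000.0 * total / n))
```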