Summit2Day1
warner edited this page 2011-11-12 22:01:01 +00:00
2nd Summit Day 1
Tuesday 08-Nov-2011. Mozilla SF. Video (1.7GB flash .flv, 6 hours)
Attendees (with IRC nicks)
- Brian Warner (warner)
- Zooko (zooko)
- David-Sarah Hopwood (davidsarah)
- Zancas (zancas)
- Shawn Willden (divegeek)
- Zack Weinberg (zwol)
- Zack Kansler (?)
- Online: amiller, Dcoder
Agent/Gateway split
- Shawn Willden observed that "tahoe backup" is usually run on a laptop
(frequently sleeping/offline), whereas he's got some other machine (a
desktop or home server) with limited CPU which *is* online 24/7, so
he wants a backup program that quickly dumps the laptop's contents to
the server, then (slowly/lazily) uploads that data from the server
into Tahoe
- extra points for only putting ciphertext on that server
- Brian wants a long-running process (specifically a
twisted.application.service.Service object) to manage backup jobs,
storing state in a sqlite db (which directories have been visited,
time since last backup, ETA). Likewise for renew/repair/rebalance
jobs.
- maybe web interface to control/monitor these jobs
- vague consensus was to introduce an "Agent" service, distinct from
the current "Client/Gateway" service. The normal client/gateway
process will include both (and the co-resident Agent will have local
access to the IClient object for upload/download). But it will also
be possible to create one in a separate process (with no
client/gateway), in which case it speaks WAPI over HTTP (and must be
configured with a node.url).
- backup and renew/repair/rebalance jobs run in an Agent
- not sure about where the WAPI/WUI lives. One idea was to have the
Agent provide the WUI, and the C/G provide the WAPI. Another is to
have the C/G provide most webapi but add new webapi to the Agent for
managing backup/renew/repair/rebalance jobs
- [whiteboard pix](../raw/attachments/Summit2Day1/day1-agent.jpg)
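
A minimal sketch of the sqlite-backed job state Brian described, assuming a single table of directories with a visited flag and a last-backup timestamp. The class name `BackupJobState`, the table `backup_dirs`, and every method here are hypothetical and not part of Tahoe. The Twisted `Service` wrapper and the actual upload step are left out so the example runs with the standard library alone.

```python
import sqlite3
import time

# Hypothetical schema: one row per directory the backup job knows about.
SCHEMA = """
CREATE TABLE IF NOT EXISTS backup_dirs (
    path TEXT PRIMARY KEY,
    visited INTEGER NOT NULL DEFAULT 0,
    last_backup REAL
);
"""


class BackupJobState:
    """Persistent state for one backup job (illustrative only)."""

    def __init__(self, dbfile=":memory:"):
        self.db = sqlite3.connect(dbfile)
        self.db.executescript(SCHEMA)

    def add_directory(self, path):
        # Adding the same directory twice is a no-op.
        self.db.execute(
            "INSERT OR IGNORE INTO backup_dirs (path) VALUES (?)", (path,))
        self.db.commit()

    def mark_visited(self, path, now=None):
        # Record that the directory was backed up at time `now`.
        now = time.time() if now is None else now
        self.db.execute(
            "UPDATE backup_dirs SET visited = 1, last_backup = ? WHERE path = ?",
            (now, path))
        self.db.commit()

    def pending(self):
        # Directories not yet visited, in a stable order.
        rows = self.db.execute(
            "SELECT path FROM backup_dirs WHERE visited = 0 ORDER BY path")
        return [row[0] for row in rows]

    def progress(self):
        # (directories done, directories known); a crude basis for an ETA.
        done, total = self.db.execute(
            "SELECT COALESCE(SUM(visited), 0), COUNT(*) FROM backup_dirs"
        ).fetchone()
        return done, total
```

Because the state lives in the database rather than in memory, a long-running Agent could be restarted and resume from `pending()`; a renew/repair/rebalance job could presumably keep a similar table.
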
server-selection UI
expected vs required vs known, k-of-N, H, points, understandability
- Brian talked about an old #467 explicit-server-selection message and
his proposed UI to list all known servers in a table, with "use this
server?" and "require this server?" buttons
- David-Sarah (and Zooko) pointed out that "require?" is a bit harsh
given our current H= ("Servers Of Happiness") share-placement code
- tradeoffs between clear-and-restrictive vs confusing-but-fails-less
- challenge of identifying reliability of nodes
- Brian says client/gateway should expect user to teach it what sort of
grid they expect, so it can bail when expectations aren't met
- Shawn would prefer scheme where client measures node reliability itself
(though in the near term allows the user to specify node reliabilities),
sets N to the number of servers currently accepting shares, sets H
slightly below N (to improve upload availability) and computes and
uses a k that maximizes performance (or minimizes cost) while achieving
a target reliability.
- whiteboard: [grid1](../raw/attachments/Summit2Day1/day1-grid1.jpg), [grid2](../raw/attachments/Summit2Day1/day1-grid2.jpg)
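
A rough sketch of the "compute k from a target reliability" part of Shawn's scheme, under assumptions that were not settled at the meeting: one share per server, independent server failures, and a per-server survival probability supplied by the user or measured by the client. The function names are hypothetical, and H is not modelled. A file survives if at least k of the N shares survive, so the largest k that still meets the target gives the lowest expansion (N/k), which is one reading of "minimizes cost".

```python
def prob_at_least(k, reliabilities):
    """Probability that at least k of the independent servers survive."""
    # dist[j] = probability that exactly j of the servers seen so far survive
    dist = [1.0]
    for p in reliabilities:
        nxt = [0.0] * (len(dist) + 1)
        for j, q in enumerate(dist):
            nxt[j] += q * (1.0 - p)   # this server fails
            nxt[j + 1] += q * p       # this server survives
        dist = nxt
    return sum(dist[k:])


def choose_k(reliabilities, target):
    """Largest k whose file-survival probability meets target, else None."""
    n = len(reliabilities)  # N = number of servers currently accepting shares
    for k in range(n, 0, -1):
        if prob_at_least(k, reliabilities) >= target:
            return k
    return None
```

With four servers at 0.9 each, `choose_k([0.9] * 4, 0.99)` picks k=2: 3-of-4 survives with probability of about 0.948, which misses the target, while 2-of-4 reaches about 0.996.
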
encrypted git, or revision control on tahoe
- Zack(?) is thinking about revision control on top of Tahoe, will
present "big crazy idea" later when everyone is there
- Brian mentioned his signed-git-revisionid project (not yet released),
and how git fetch/push is fast because both sides know full revision
graph and can compute missing objects in one RTT. To get this in
Tahoe, we must add deep-(verify)-caps and let servers see shape of
directory tree.
- Lunch conversation about Monotone's mutable metadata and the problem
of transferring it efficiently
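
To illustrate why knowing the full revision graph helps: if each side can see the graph and the other side's heads, the set of objects to send is a plain reachability difference, with no further round trips. This is a simplified sketch, not git's actual negotiation protocol; the graph representation (a dict from object id to parent ids) and the function names are assumptions.

```python
def reachable(graph, heads):
    """All object ids reachable from heads by following parent links."""
    seen = set()
    stack = list(heads)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return seen


def missing_objects(graph, sender_heads, receiver_heads):
    """Objects the sender has that the receiver cannot already reach."""
    return reachable(graph, sender_heads) - reachable(graph, receiver_heads)
```

For a linear history `c1 <- c2 <- c3` where the receiver is at `c1` and the sender is at `c3`, the difference is `{"c2", "c3"}`. A Tahoe server can only do this kind of computation if it can see the shape of the directory tree, which is the point of the deep-(verify)-cap idea above.
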
grid management
non-transitive one-at-a-time invitations, transitive clique invitations, Grid Admin star config
- Brian is thinking about grid setup and Accounting, and pondering a
startup mode where servers issue Invitations to clients
- pasting Invitation code into a web form is sufficient to get
connected to grid (combines Introducer functionality with Account
authorization)
- probably set up bidirectional connection: when Alice accepts
Invitation from Bob, both Alice and Bob can use each other's
storage
- three modes:
  - issue/accept one Invitation per link
  - each node needs one Invitation to join clique, then they get access
    to all storage servers in the clique (and offer service to all
    clients in the clique): grid grows one node at a time
    - issue: can two grids merge? or can you only accept an invitation
      when you aren't already in a grid?
  - managed grid: a central Grid Admin is the only one who can issue
    Invitations. When accepted, Alice can use storage of all members.
- Shawn thinks a Request model is more natural: Server admin (or Grid
Admin) sends ambient URL to new user, they paste it into a field that
says "Request Access", this sends a Request to the server (probably
containing a pubkey), the server records it, then later the server
admin Accepts or Rejects the request.
- Invite and Request are duals, modulo some channel and workflow
variations (confidential vs authentic, who-sends-first-message)
- Brian will explore how hard/feasible it is to run one workflow on top
of the other: can a Request be expressed with a note saying "please
send an Invitation to this public encryption key" sent to the server?
#466 new-introducer review
- Brian walked through most of the #466 new-introducer code
(<https://github.com/warner/tahoe-lafs/tree/466-take7>) with
David-Sarah and Zooko
- David-Sarah found one critical security bug (signature checking
failure), lots of good cleanups to recommend, tests to add
- overall it looks good
- Brian will make suggested cleanups and prepare for landing
Beer!
signature consensus!
- over drinks, Brian and David-Sarah and Zooko discussed signature
options (needed for #466, Accounting, non-Foolscap storage protocol,
new mutable file formats)
- choices:
- [python-ed25519](https://github.com/warner/python-ed25519)
(standalone C extension module)
- [python-ecdsa](https://github.com/warner/python-ecdsa)
(standalone pure-Python module)
- ECDSA from Crypto++ via pycryptopp
- non-EC DSA (eww)
- get Ed25519 into Crypto++, then expose in pycryptopp
- add Ed25519 into pycryptopp (making it more than just a python
binding to Crypto++, hence nicknamed "pycryptoppp")
- get ECDSA from pyOpenSSL (we think it isn't exposed)
- evaluation:
- security: David-Sarah prefers Ed25519, Zooko slightly
prefers ECDSA (older, more exposure), Brian (who currently has a
crush on everything 25519) slightly prefers Ed25519. Crypto++'s
entropy-using signature code includes nonce-safety (entropy is
hashed with message to mitigate VM-rollback failure). Ed25519
has deterministic signatures and nonce-safety.
- speed: requirement is <10s startup with 100 servers (specifically,
make sure the Announcement sign/verify is small compared to
connection establishment time). That's sign+verify<100ms. This
rules out python-ecdsa (sign+verify=330ms). Both a non-pure-python
ECDSA and Ed25519 will do. A really fast primitive (optimized
Ed25519 is like 20us) might enable new applications in the future
(key-per-lease, key-per-write-request).
- pure-python: slight preference for something that could be
pure-python in the future if PyPy could make it fast enough.
Seems unlikely in the near-term for any of the options.
- patents: murky, of course. Red Hat/Fedora Core currently eschew
all ECC, might change, might not, too bad. Not a clear
differentiator between ECDSA and Ed25519. Nobody was willing to
tolerate non-EC DSA (would need 4kbit keys to feel safe, not
confident of hitting speed requirements). We can always back it
out if it proves to be a problem (at the cost of regenerating
all serverids). Hopefully the scene will settle down before we
want to use it for data (which would be harder to back out).
- packaging: biggest differentiator
- python-ed25519: must build eggs as we did for pycrypto, need to
get into debian (which has other benefits, but delays tahoe),
increases build pain marginally.
- python-ecdsa (too slow, ruled out): pure-python, so no need for
eggs, but still need to get into debian and increases build pain
- ECDSA-via-pycryptopp: easy, code is mostly done, needs final
review and polish, no new dependencies.
- ed25519-in-Crypto++: probably good idea in long term, but will
take a while (must convince Crypto++ to change, wait for a
release, then add bindings to pycryptopp). Must also wait for
distributions to pick up new Crypto++. Technically no new
dependencies, but increases the version requirements on an
external module with a historically slow (1/yr) release cycle.
- ed25519-in-pycryptopp: a bit weird (pycryptoppp), fairly fast (we
control pycryptopp), no external delays. No new dependencies.
- **winner: ed25519-in-pycryptopp** (aka pycryptoppp). Ed25519 wins over
ECDSA with potentially better security and future-coolness-enabling
speed. Delivering in pycryptopp means no new dependency and no
external parties to block.
- future goal is to get python-ed25519 into debian, then switch Tahoe
to depend on it instead. And/or once Ed25519 gets into Crypto++,
remove the separate implementation from pycryptopp (i.e. remove one
"p" from pycryptoppp) and have pycryptopp rely on Crypto++'s version.
- also, get pycryptopp's ECDSA finished off in the ed25519-bearing
release, just to have it available.
- dev plan:
- Brian makes new python-ed25519 release with API tweaks
- Brian makes patch for pycryptopp with most of python-ed25519
- Zooko makes new pycryptopp release
- get folks to build eggs of new release
- Tahoe starts depending on new release
- land #466 accounting code that uses ed25519
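
The arithmetic behind the speed requirement in the evaluation above can be written out as a small check. This sketch assumes the announcements are processed one after another, so the 10 s startup budget is simply divided across the 100 servers; the function names are hypothetical, and the timings are the ones quoted in the notes (330 ms for python-ecdsa, roughly 20 us for optimized Ed25519).

```python
def per_announcement_budget_ms(startup_budget_s, num_servers):
    """Time available for one sign+verify if startup must fit the budget."""
    return startup_budget_s * 1000.0 / num_servers


def total_startup_s(sign_verify_ms, num_servers):
    """Total sign+verify time across all servers, in seconds."""
    return sign_verify_ms * num_servers / 1000.0


def fits_budget(sign_verify_ms, startup_budget_s=10.0, num_servers=100):
    """True if one sign+verify fits within its share of the startup budget."""
    return sign_verify_ms < per_announcement_budget_ms(
        startup_budget_s, num_servers)
```

Under these assumptions the per-announcement budget is 100 ms, python-ecdsa would take 33 s in total for 100 servers and fails the check, and a 0.02 ms (20 us) primitive passes with a very wide margin.
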