add-only sets #795
It's a long-term goal, but it would be nice to have append-only
mutable files in Tahoe. Specifically, that means some sort of
mutable slot with a writecap (which can rewrite the contents
arbitrarily), an appendcap (which can add new data, but not read
or modify existing data), and a readcap (which can read all of
the data). Note that the appendcap and readcap are peers:
neither can be derived from the other.
This requires some tricky encryption, and will rely upon the
servers to enforce certain types of message updates. (This means
that a sufficiently large set of colluding servers will be able
to roll back appended messages until someone with the writecap
comes along and flattens them all into the base file.) I've got
some notes on this scheme somewhere, which I'll eventually
transcribe into this ticket. The basic approach is that the
appendcap gets you a signing key, and the server will accept
signed append messages, and put them next to all of the other
messages. The writecap gets you a different signing key, with
which the servers will accept arbitrary rewrite operations. I
also have notes on revocation schemes (where you can create
multiple "subwritecaps" for an object, but revoke them later),
which involves even more sets of signing keys.
This could be integrated with LDMF mutable files, in which the
servers hold a series of append-only revision nodes, forming a
version-history graph (which I still think should be based on
Mercurial's "revlog" format). In this case, writecap holders get
to add revision nodes, but not remove older ones.
oh, and of course, if the appendcap truly doesn't give you the ability to read any data, then this needs a public encryption key (like RSA or El-Gamal, not DSA). Each "append" message would have the data encrypted with a randomly-generated symmetric key, and then the key would be encrypted to the readcap's RSA decryption privkey.
There might be some other sort of "append-and-read-cap", which gives you both the ability to append messages and to read the existing messages (but not to remove anything: that is reserved for the writecap holder). I can imagine use-cases for both. This sort of cap would have a more straight-line derivation: writecap -> append-and-read-cap -> readcap.
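The straight-line derivation writecap -> append-and-read-cap -> readcap can be sketched with one-way hashing. This is purely illustrative (the labels, the bare-SHA256 construction, and the example secret are all made up here; real Tahoe caps encode considerably more structure), but it shows why the chain only runs one way:

```python
import hashlib

def derive(parent_cap: bytes, label: bytes) -> bytes:
    # One-way derivation: child = H(label || parent). Holding the child
    # cap gives you no way to climb back up to the parent.
    return hashlib.sha256(label + b":" + parent_cap).digest()

# Hypothetical root secret, not a real Tahoe cap format.
writecap = b"example-writecap-secret"
append_and_read_cap = derive(writecap, b"append+read")
readcap = derive(append_and_read_cap, b"read")
```

A writecap holder can compute all three, an append-and-read-cap holder can compute only the readcap, and a readcap holder can compute nothing further.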
More notes:
Basically, each append-only-able file would have three keypairs:

* A (As/Av): a signing keypair for the writecap, authorizing
  arbitrary rewrite operations
* B (Bs/Bv): a signing keypair for the appendcap, authorizing
  append messages
* C (Ce/Cd): an encryption keypair for the appended data

The writecap would get you everything, the appendcap gets you
Bs+Ce, and the readcap gets you just the three
verification/decryption keys Av+Bv+Cd. The server must know (and
enforce with) the verification keys Av+Bv.

The server would be responsible for:

* verifying (with Av) the signatures on arbitrary rewrite
  operations from writecap holders
* verifying (with Bv) the signatures on append messages, and
  adding accepted ones to the file (or putting them in a unique
  file in the share directory, or something)
Andy the Appendcap-holder would need to encrypt his message with
Ce, sign it with Bs, then send it to the servers. Rose the
readcap-holder would check all the signatures and then use Cd to
decrypt the append messages. Val the verifycap-holder needs to be
able to check the hashes and signatures on the append messages,
but not decrypt them (so we need encrypt-then-sign, instead of
sign-then-encrypt). And Rob the Repairer needs to be able to make
new shares out of everything, so the body being signed should
contain the share hash tree root. Each append message will be
erasure-coded separately.
Every once in a while, Wally the writecap holder might merge all
the append messages down into the "base object". Until this
happened, a colluding set of servers could discard arbitrary
append messages (meaning the file would not be monotonically
increasing). There might be ways to have each message include the
hash of the previous ciphertext to detect this sort of thing
(forcing the server to discard all-or-nothing, restoring
monotonicity, but introducing race conditions).
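The "each message includes the hash of the previous ciphertext" idea can be sketched as a simple hash chain. Names and the dict layout are invented for illustration; the point is that a verifier can detect any gap or reordering in the middle of the chain (though a server could still withhold the newest tail):

```python
import hashlib

GENESIS = b"\x00" * 32  # placeholder hash for the first message

def extend(messages, ciphertext: bytes):
    # Each new append message commits to the hash of the previous
    # message's ciphertext.
    prev = hashlib.sha256(messages[-1]["body"]).digest() if messages else GENESIS
    messages.append({"prev": prev, "body": ciphertext})

def chain_is_intact(messages) -> bool:
    expected = GENESIS
    for m in messages:
        if m["prev"] != expected:
            return False  # a message was dropped or reordered
        expected = hashlib.sha256(m["body"]).digest()
    return True
```

This is what forces a colluding server toward all-or-nothing discards: omitting a message in the middle breaks every later link.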
Maybe "add-only collection" would be a better model for this,
instead of implying that the object contains a linear sequence
of bytes. In fact, we might say that the objects stored in this
collection are arbitrary strings, with IDs based upon hashing
their contents, and make the append(X) operation be idempotent,
and give readers an unsorted set of these strings. This would
make things like the add-only directory easier to model.
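The add-only collection semantics described above (content-hashed IDs, idempotent add, readers get an unsorted set) fit in a few lines. This is a local sketch only, with no servers or signatures, just to pin down the data model:

```python
import hashlib

class AddOnlySet:
    """Sketch of the add-only collection: entries are arbitrary
    bytestrings whose IDs are the hashes of their contents, so
    add(X) is naturally idempotent."""

    def __init__(self):
        self._entries = {}

    def add(self, value: bytes) -> str:
        eid = hashlib.sha256(value).hexdigest()
        self._entries[eid] = value  # same value, same ID: idempotent
        return eid

    def read(self):
        # Readers get an unsorted set of the strings.
        return set(self._entries.values())
```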
To turn this into a dirnode, the append message would probably
contain a "proposed child name" and child readcap/writecap.
If/when Wally merges down the file, any name conflicts would be
resolved (by appending "(1)", or whatever) then.
Preserving transitive-readonlyness probably requires that both
writecap holders and appendcap holders (Wally and Andy) be able
to get to the same symmetric key (the child-writecap
superencryption key). One notable difference with the normal
mutable-file layout is that this child-writecap key must not
let you derive the read key. Rose the readcap holder can see the
(encrypted) child-writecap column but can't decrypt it. Andy the
appendcap holder can't see the child-writecap column ciphertext,
but if he could, he could decrypt it (since he's supposed to be
able to put child-writecaps somewhere that Wally can read them).
This reveals a collusion attack: Andy and Rose could work
together to learn the child writecaps.
So child-writecaps in append messages must be encrypted to the
writecap holder, and not to anybody else. This suggests we need
another encryption keypair, De/Dd, in which the writecap holder
gets Dd, and everybody else gets De (but only Andy will use it,
because those messages must also be signed by Bs, held only by
Andy and Wally). This can't be a combined sign+encryption key,
which might save space, since Val and Rob need Bv to verify
those append messages later.
Replying to warner:
... or if we embrace the add-only collection abstraction, a hash
of some subset of the existing entries.
This abstraction makes a lot of sense to me. I had been thinking
of add-only directories as being implemented in terms of
append-only files, but actually an add-only capability for a
set of byte strings is directly applicable to all of the use
cases I can think of.
For example:

* to implement the immutable backup scenario, write each backup
  as a set of new files (ideally using immutable directories,
  although you could make do without them). Since they are not
  attached to an existing directory structure, this does not
  require any authority other than the necessary storage quota.
  Then, use an add-only set to record the root caps for each
  backup. If the backups are incremental, make them dependent on
  (by including a hash of) previous backups. The read key for
  the set is kept off-line, or generated from a passphrase.
* to implement a tamper-resistant "log file", use an add-only
  set to represent the set of timestamped log entries (perhaps
  entries at about the same time could be batched for
  efficiency).
* an add-only cap can represent the authority to submit a
  transaction, with eventual consistency. This can represent any
  change to application-level data structures, not just an
  addition. Multiple holders of the add-only cap can submit
  transactions without being able to interfere with each other.
The set abstraction also has the advantage of not appearing to give
consistency properties that are actually unimplementable. When you
read the set, you get some subset of its entries. Additions to the
set are only eventually seen by readers. You can make an addition
dependent on some of the existing entries, in which case anyone who
sees that entry is guaranteed to also see its dependent entries.
This sounds like a robust abstraction that you could imagine
building useful higher-level distributed systems on top of.
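The "anyone who sees an entry is guaranteed to also see its dependent entries" property can be enforced reader-side: an entry declares the IDs it depends on, and the reader withholds any entry whose dependencies are missing. The entry layout here is an assumption for illustration:

```python
import hashlib

def entry_id(value: bytes) -> bytes:
    # Entries are identified by the hash of their contents.
    return hashlib.sha256(value).digest()

def visible(entries):
    """Surface an entry only once every entry it declares as a
    dependency (by ID) is also present. entries is a list of
    {'value': bytes, 'deps': [ids]} dicts (hypothetical layout)."""
    present = {entry_id(e["value"]) for e in entries}
    return [e for e in entries
            if all(dep in present for dep in e["deps"])]
```

A server that withholds a full backup entry thereby also hides every incremental backup that depends on it, which is exactly the all-or-nothing behavior we want.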
Can anyone think of important use cases where the add-only set
semantics are not sufficient?
Tagging issues relevant to new cap protocol design.
See also #796 (write-only backup caps). I'm not sure, but I think this ticket and #796 are subtly different use cases that can hopefully be addressed with a single mechanism.
See also the notes about append-only caps in wiki/NewCapDesign.
Nikita Borisov wrote on tahoe-dev:
This sounds like a very good fit to the add-only collection approach.
Here's my rendition of our discussion of add-only sets at the Tahoe-LAFS Summit today. (As usual, I altered and embellished this story significantly while writing it down, and people who were present to witness the original discussion are invited to chime in.)
An add-only cap doesn't have to also be a write-only cap. It might be good for some use cases that you can give someone a cap that lets them read the whole set, and add elements into the set, without letting them remove elements or change previously-added elements. It might be good in some other use cases to have an "add-only&write-only" cap, which allows you to add elements into the set but doesn't let you read elements of the set, nor remove nor change previously-added elements. We agreed to focus on the former case for now, because it is easier to design and implement a solution to it. (See #796 for discussion of write-only caps.)
We agreed to forget about erasure-coding, which makes an already-confusing problem (how to implement add-only sets without allowing a few malicious servers, adders, or set-repairers to perform rollback attack or selection attack), into a very-confusing problem that exceeded my brain's ability to grapple with it.
So, for now, assume that add-only sets don't use erasure-coding at all.
Now, the basic design we came up with is like this. I'll explain it in multiple passes of successive refinement of the design.
=== FIRST PASS: DESIGN "0"
An authorized adder (someone who holds an add-cap) can generate "elements", which are bytestrings that can be added into the set. (I mispronounced "elements" as "elephants" at one point, and from that point forward the design was expressed in terms of a circus act involving elephants.)

Elephants have an identity as well as a value (bytestring), so calling {{{aos.add_elephant()}}} twice with the same value results in {{{aos}}} containing two elephants, not one, even though each elephant has the same value (say, the bytestring with one hundred 0xFF bytes in it). {{{aos.add_elephant()}}} generates a random 256-bit nonce to make this elephant different from any other elephant with the same value. I call this "putting a tag on the elephant's ear" — a "tagged elephant" is a value plus a nonce. Even if two elephants are identical twins, they can be distinguished by the unique nonce written on their ear-tags.

{{{aos.add_elephant()}}} then puts a digital signature on the tagged-elephant (using the add-only-cap, which contains an Ed25519 private key), and sends a copy of the tagged-elephant to every one of {{{N}}} different servers. Putting a digital signature on a tagged-elephant is called "wrapping a net around it".

A reader downloads all the tagged-elephants from all the servers, checks all the signatures, takes the union of the results, and returns the resulting set of elephants.
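Design "0" can be sketched as follows. To keep the sketch stdlib-only, an HMAC over a shared key stands in for the Ed25519 signature; in the real design only the adder holds the private key and readers verify with the public key. All names here are invented:

```python
import hashlib, hmac, os

# Stand-in for the Ed25519 private key inside the add-cap.
ADD_KEY = b"stand-in add-cap signing key"

def add_elephant(value: bytes) -> dict:
    nonce = os.urandom(32)  # the tag written on the elephant's ear
    sig = hmac.new(ADD_KEY, nonce + value, hashlib.sha256).digest()
    return {"nonce": nonce, "value": value, "sig": sig}  # net wrapped on

def read(server_copies) -> set:
    # Check every signature, then take the union across all servers.
    result = set()
    for copy in server_copies:
        for te in copy:
            expect = hmac.new(ADD_KEY, te["nonce"] + te["value"],
                              hashlib.sha256).digest()
            if hmac.compare_digest(te["sig"], expect):
                result.add((te["nonce"], te["value"]))
    return result
```

Two adds of the same value yield two distinct elephants (distinct nonces), and a forged signature is simply dropped from the union.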
Design "0" relies on ''at least one'' of the servers that you reach to save you from rollback or selection attacks. Such a server does this by knowing, and honestly serving up to you, a fresh and complete set of tagged-elephants. “Rollback” is serving you a version of the set that existed at some previous time, so the honest server giving you a copy of the most recent set protects you from rollback attack. “Selection” is omitting some elephants from the set, so the honest server giving you a complete copy of the set protects you from selection attack.
=== SECOND PASS: DESIGN "1"
We can extend Design "0" to make it harder for malicious servers to perform selection attacks on readers, even when the reader doesn't reach an honest server who has a complete copy of the most recent set.
The unnecessary vulnerability in Design "0" is that each tagged-elephant is signed independently of the other tagged-elephants, so malicious servers can deliver some tagged-elephants to a reader and withhold other tagged-elephants, and the reader will accept the resulting set, thus falling for a selection attack. To reduce this vulnerability, adders will sign all of the ''current'' tagged-elephants along with their new tagged-elephant with a single signature. More precisely, let the "identity" of a tagged-elephant be the secure hash of the tagged-elephant (i.e. the secure hash of the nonce concatenated with the value). The signature on a new tagged-elephant covers the identity of that tagged-elephant, concatenated with the identities of all extant tagged-elephants, under a single signature. In circus terms, you add the new tagged-elephant into a pile of tagged-elephants and throw a net over the entire pile, including the new tagged-elephant.
Now, malicious servers can't omit any of the older tagged-elephants without also omitting the new tagged-elephant. Readers will not accept the new tagged-elephant unless they also have a copy of all of the other tagged-elephants that were signed with the same signature. This limits the servers' options for selection attacks.
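The Design "1" signature rule can be sketched like this (again with HMAC standing in for Ed25519 so the example is stdlib-only; sorting the extant identities into a canonical order is my own assumption, since the prose only says "concatenated"):

```python
import hashlib, hmac

ADD_KEY = b"stand-in add-cap signing key"  # HMAC stands in for Ed25519

def identity(nonce: bytes, value: bytes) -> bytes:
    # The identity of a tagged-elephant: hash of nonce || value.
    return hashlib.sha256(nonce + value).digest()

def sign_addition(new_id: bytes, extant_ids) -> bytes:
    # One signature covers the new identity plus every extant identity.
    payload = new_id + b"".join(sorted(extant_ids))
    return hmac.new(ADD_KEY, payload, hashlib.sha256).digest()

def reader_accepts(new_id: bytes, delivered_ids, sig: bytes) -> bool:
    # Reject unless every identity the signature covers actually arrived.
    payload = new_id + b"".join(sorted(delivered_ids))
    return hmac.compare_digest(
        sig, hmac.new(ADD_KEY, payload, hashlib.sha256).digest())
```

If the servers withhold any tagged-elephant covered by the newest signature, verification fails and the reader refuses the whole delivery.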
=== THIRD PASS: DESIGN "2"
We can refine Design "1" to make it cleaner and more CPU-efficient. This will also lay the groundwork for an efficient network protocol.
The unnecessary "dirtiness" in Design "1" is that the digital signatures on older tagged-elephants become extraneous once you add a new digital signature. We have a mass of tagged-elephants, we throw a net over the whole mass, then later when we add a new tagged-elephant to the pile, we throw a new net on top of the new (slightly larger) pile. Now the ''underlying'' net has become redundant: once you've verified the signature of the outermost net, there is no need to check the signature of the inner net. In fact, if one implementation checks the signature of the inner net and another implementation does not check it, then a malicious adder colluding with a malicious server could cause the implementations to differ in their results, by putting an invalid net (an invalid signature) topped by a new tagged-elephant with a valid net. (Daira was the one who noticed that issue.)
To make this cleaner and more efficient, we will never put a net around a net, and instead we'll keep each tagged-elephant in a box. When you want to add a new tagged-elephant to a set, you rip off and throw away any extant nets, then you ''put the new tagged-elephant in a box which is nailed on top of the previous topmost box''. Then you wrap a net around the new topmost box. "Nailing" box Y on top of box X means taking the secure hash of box X and appending that to box Y (before signing box Y). A "box" is a tagged-elephant concatenated with any number of "nails", each of which is the secure hash of a previous box.
(Note that you can sometimes have two or more boxes precariously perched at the top of a stack, when two adders have simultaneously added a box before each saw the other's new box. That's okay — the next time an adder adds a box on top of this stack, he'll nail his new box to ''each'' of the previous topmost boxes.)
Boxes are a lot more CPU-efficient than nets, and more importantly nobody (neither readers, adders, nor servers) needs to revisit a lower-down box in order to add a new top-most box. Once you nail box Y on top of box X, then you can later add box Z just by taking the hash of box Y, without revisiting box X.
Note that we need two different secure hashes here: one is the identity of a tagged-elephant, which is the secure hash of: the nonce concatenated with the value. The other is the hash of the box, which is the secure hash of: the identity of a tagged-elephant concatenated with the hashes of any previous boxes. We need the identity of a tagged-elephant for finding out whether a certain tagged-elephant already exists in a stack (regardless of what position it occupies within that stack), and we need the hash of the box for efficiently verifying that all the tagged-elephants in a stack were approved by an authorized adder.
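The box-and-nail structure is just a hash DAG; a minimal sketch (layout invented here: a box is the 32-byte elephant identity followed by 32-byte nails):

```python
import hashlib

def box(elephant_id: bytes, nails) -> bytes:
    # A box holds one tagged-elephant identity (32 bytes) followed by
    # "nails": the hashes of the previously-topmost box(es).
    return elephant_id + b"".join(nails)

def box_hash(serialized: bytes) -> bytes:
    # The hash the *next* box will nail onto; a signature on the topmost
    # box therefore covers the whole stack transitively.
    return hashlib.sha256(serialized).digest()
```

Adding a new topmost box needs only the hashes of the current top(s), never a revisit of anything lower down, and two simultaneous tops are merged by nailing the next box to both.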
This also leads to the efficient network protocol: an adder can remember (cache) the Directed Acyclic Graph of boxes which a given server previously told the adder about. When the adder wants to add a new tagged-elephant or a set of new tagged-elephants to that server, he can send just the boxes which would be ''new'' to that server, assuming that the server hasn't learned anything new since the last time they talked. Readers can do likewise, remembering what each server previously told them about, and asking the server to just tell them about things that are not already covered by the topmost box(es) that the reader already knows about.
=== CONCLUSION
Okay, that's it! I think Design "2" is a good one. It has good security against rollback or selection attacks by malicious servers (assuming some kind of whitelisting of servers! Which is ticket #467 and is not yet implemented.) And, it doesn't go ''too'' far over the top in terms of complexity; it seems more intuitive to me than (my vague memories of) previous attempts to design add-only sets for LAFS.
(By the way, there are a few other possible [query:status=!closed&keywords=~rollback&order=priority ways to strengthen defenses against rollback attack], which we've previously considered in the context of mutable files, but they probably also apply to add-only sets.)
I'm excited about this design being feasible, because I think add-only sets could be a critical building block in valuable use-cases such as secure logging, secure email, secure backup, and more.
While #796 is my particular use case, this seems to be the canonical thread for the base implementation, so I'll comment here. I was also looking for write-only caps (in particular append-only directories where you can only append files, not change, move/rename, or remove them), for a write-only backup mechanism where no single server has the ability to remove any backups.
In my case, the threat model would be an attacker gaining (full or partial) access to a server that is being automatically backed up with Duplicity, intending to irrevocably wipe all data he can. A write-only cap would make the attacker unable to remove existing backups; only I personally would be able to do so, using a writecap that is not stored on any server.
I should also note that there'd be a potential confidentiality issue with non-read write-only caps (that I expect at least zooko to have already thought of): if you have a cap that is supposedly intended to prevent any discovery of data (including a directory listing), how would you reject an attempt at overwriting an entry (i.e. a conflict) without revealing that the original entry exists, thus causing an information leak and confidentiality breach?
Either way, I very strongly +1 this write-only-cap suggestion - and I believe 'appendcap' would be an apt name for it :)
We're chatting about use cases during Tesla Coils & Corpses. These use cases imply slightly different semantics, which I indicate by the capability names in the environment variables:

* syslog case: {{{tahoe syslog-collector $APPEND_ONLY_CAP}}} and {{{tahoe get-records $READ_RECORD_CAP}}} (note this is not a byte-oriented read)
* backup case: {{{tahoe backup}}} - similar to syslog, except the records refer to immutable directories. Also, maybe ordering is less important?
* drop inbox: {{{ls}}}, {{{tar}}}, {{{cp}}}, etc...

I've missed many spoken use cases while writing this.

edit: Minor markup changes to list delimiters and dropped some parentheses.
There was a really good discussion about this in a Tesla Coils & Corpses meeting, and I wrote pretty good notes from that meeting here: [//pipermail/tahoe-dev/2014-August/009141.html]
I needed something like this to prevent ransomware from dumping/corrupting my backups.
Not wanting to let the perfect be the enemy of the good, I wrote a terrible tool that does the job with existing data types.
Now obviously it's slow, and using raw URIs on the command line isn't the most secure thing in the world, but it only stores the write cap for the end of the linked list, so any data recorded prior to an intruder getting in to where they can log writecaps is effectively immutable. Store the first data node's URI as your immutable recovery start point.
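The linked-list trick can be sketched in a few lines (this is an illustration of the idea only, not the tool itself): each node embeds the hash of the previous node, and only the newest node ever needs a writecap, so everything behind it is effectively frozen once the intruder arrives:

```python
import hashlib

GENESIS = b"\x00" * 32  # marker hash for the first node

def append_node(log, data: bytes):
    # Each node = hash-of-previous-node (32 bytes) + payload. Older
    # nodes are never rewritten; only the tail needs write authority.
    prev = hashlib.sha256(log[-1]).digest() if log else GENESIS
    log.append(prev + data)

def verify(log) -> bool:
    # Walk from the recovery start point; any rewritten node breaks
    # every later link.
    prev = GENESIS
    for node in log:
        if node[:32] != prev:
            return False
        prev = hashlib.sha256(node).digest()
    return True
```

Storing the first node's URI off-line gives you the immutable recovery start point: replaying the chain from there detects any tampering with recorded history.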