anonymous client mode #1010
Reference: tahoe-lafs/trac#1010
For the anonymous network use case (such as I2P and Tor), we want to use only 127.0.0.1 as the local (loopback) address. Right now Tahoe discovers all local addresses through various strategies and discloses them to (at least) the introducer.
For I2P we have introduced the configuration option anonymize_local_addresses (which we are considering renaming to tub.anonymize) to disable this lookup.
Example of configuration in tahoe.cfg:
Snippet showing how this is used in node.py:
Attachment 1010_anonymize_local_addresses.diff (1259 bytes) added
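The original configuration example and node.py snippet did not survive in this copy of the ticket. As a rough illustration only, the behaviour described above might look like the following sketch. The helper name advertised_addresses, the placement of the option in the [node] section, and the example addresses are assumptions, not taken from the attached diff.

```python
import configparser

# Hypothetical tahoe.cfg fragment; the [node] section placement is an assumption.
EXAMPLE_CFG = """
[node]
anonymize_local_addresses = true
"""


def advertised_addresses(cfg_text, discovered):
    """Return the addresses a node would advertise (illustrative helper).

    cfg_text   -- contents of a tahoe.cfg-style file
    discovered -- addresses found by local address discovery
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    anonymize = parser.getboolean(
        "node", "anonymize_local_addresses", fallback=False
    )
    if anonymize:
        # Skip disclosure of discovered addresses; keep only loopback.
        return ["127.0.0.1"]
    return list(discovered)
```

With the flag unset, the discovered list is returned unchanged, which matches the "right now" behaviour described in the ticket.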
What is the motivation for using all local addresses normally?
It seems like a great idea to have a tub.anonymize flag. It would be fantastic to have Tahoe throw exceptions like confetti if that option is set and a few key (tub.address) settings aren't configured. Anything less may lead to PII leakage.
There are a number of sanitization issues that need to be carefully handled. Having a single bit to say we want to try to handle those issues is probably a good start.
Replying to ioerror:
+1 for having a single config bit to indicate that the user wants anonymous operation.
However, "tub.*" options apply to foolscap tubs, and the single config bit may need to change behaviour other than in foolscap.
Also needs a doc patch.
Unsetting review-needed since it needs a doc patch before going back to the review-needed status.
I think that "tub.location = " (setting it to an empty string) is even better. tub.location is the right thing to set here: everything sent to the introducer will derive from what the Tub concludes, and tub.location is the way to override that automatically-figure-out-my-own-addresses behavior.
Perhaps a "nodeanonymous" flag would be useful as a statement of policy, and implemented as a check in various places: if "anonymous=true", then we scan the tub's location just before sending it to the introducer, and throw an exception if it isn't empty? And if we add other places that reveal identifiable information in the future, we also guard those with the "if not anonymous" check?
davidsarah: the motivation for including 127.0.0.1 in the list-of-addresses is to allow two nodes on the same machine to establish a fast (loopback) connection to each other. I use this all the time in test scenarios, and in grids in which the helper runs on the same node as something else (generally an introducer or a storage server). There's half an argument to remove it, but I think that most of those cases are handled better by having people publish an explicit tub.location that doesn't include it.
Replying to warner:
I meant, what is lost by only having 127.0.0.1 in the list of addresses, all the time?
What's the next step on this ticket? Brian answer David-Sarah's question from comment:8? Duck write docs as requested in comment:377644?
Brian: please answer David-Sarah's question from comment:8.
Adding keyword "tor" since the same issues probably apply to tor users as apply to i2p users. (Indeed, I suspect all "tor" and "i2p" tags should probably just be converted to "privacy" tags.)
Attachment 1010-anonymize-local-addresses.patch (2695 bytes) added
Allow for local address anonymization
Brian's suggestion from comment:377647 has been taken; if "tub.location = " (an empty string), then the local address discovery is replaced by just using 127.0.0.1 as address; this replaces the previously suggested anonymize_local_addresses = true option.
While an anonymous policy flag as suggested by ioerror in comment:377642 and by Brian in comment:377646 would be a good idea, I consider that material for another ticket.
In addition to this, a unit test has been written. Once #1301 is implemented the trial patch decorator can be used instead.
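For reference, the policy flag left for another ticket (scan the tub's location just before sending it to the introducer, and raise if it isn't empty) could be sketched roughly as below. The exception class and function name are hypothetical and do not come from any patch on this ticket.

```python
class AnonymityViolation(Exception):
    """Hypothetical error for an announcement that would reveal a location."""


def check_announcement(location, anonymous):
    """Guard to run just before a location is sent to the introducer.

    location  -- the location string the tub concluded (may be empty)
    anonymous -- value of the whole-config policy flag
    """
    if anonymous and location.strip():
        raise AnonymityViolation(
            "refusing to announce location %r in anonymous mode" % (location,)
        )
    return location
```

Other code paths that reveal identifying information could call the same kind of guard, which is the "if not anonymous" check suggested earlier in the thread.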
This is the Ticket of the Week in Tahoe-LAFS Weekly News edition 4: http://tahoe-lafs.org/~zooko/TWN4.html
If a node is only a client, then perhaps it should default to not disclosing addresses. ( I realize that in theory a client with a public address could get incoming connections from a server without a routable address, but I consider servers without routable addresses to be buggy.)
Title changed from "Only use 127.0.0.1 as local address" to "use only 127.0.0.1 as local address"
I'm putting this into the 1.9 Milestone because duck has done the work we asked of him and it would feel good to therefore include his patch in 1.9. (review-needed!)
Reviewed patch and looks good to me.
whoops, I really lost track of this one.
DS> I meant, what is lost by only having 127.0.0.1 in the list of
DS> addresses, all the time?
Um, if everyone in the grid only publishes 127.0.0.1, then how will
distant nodes ever connect to each other? Since we don't have UPnP or
NAT traversal, we need everyone (well, N-1) to have+publish a public IP
address.
This patch needs docs: in addition to making sure people can
successfully use this feature, we need a place to answer the user
confusion that's likely to occur when someone assumes that
"tub.location=" should behave the same way as "#tub.location=". (I'm not
as sure that ["tub.location=" == anonymous] is the best UI for this, at
least not as sure as I was 15 months ago. The whole-config "anonymous"
flag feels like an important addition.)
gdt's observation in comment:377655 is a good one. Ideally, pure-clients
should be able to hang out behind NAT and not admit to having a real
address. (I think the FURL location-hint format will tolerate this,
but I haven't actually tested it). We've always been on the fence about
whether Tahoe is a client-server system or a P2P system. Having clients
announce their addresses makes it more P2Pish.
I've had topology problems (servers behind NAT) which made me glad that
it's possible for servers to connect to clients too. To actually enable
this, I had to make my "clients" pretend to be servers (but with
storage.readonly=true): otherwise the servers wouldn't hear about the
client and wouldn't try to connect. Nodes only actually publish their
storage FURLs to the Introducer if they're configured as servers (see
init_storage() in client.py). But they'll reveal their FURLs
(along with their IP address) in any reference that passes over the
wire.
So anyways, I'm ok with this patch if it includes a paragraph in
docs/configuration.rst (in the section on tub.location) explaining what
happens when you use "tub.location=" and why you might want to do that.
If we find that it's hard to explain this feature in there, then maybe
it's not a good feature to add.
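To make the "tub.location=" versus "#tub.location=" confusion concrete, here is a small sketch using Python's standard configparser. It is not Tahoe's own config code, and the function name location_state is made up for illustration. It only shows that a commented-out option is absent while an option with nothing after the equals sign is present with an empty value.

```python
import configparser


def location_state(cfg_text):
    """Classify tub.location in a tahoe.cfg-style text (illustrative only)."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    value = parser.get("node", "tub.location", fallback=None)
    if value is None:
        # Option missing, or present only as a "#..." comment line.
        return "absent"
    if value.strip() == "":
        # "tub.location =" with nothing after the equals sign.
        return "empty"
    return "explicit"
```

Under the patch discussed here, "absent" would mean auto-detection and "empty" would mean advertising only 127.0.0.1, which is exactly the distinction the docs would need to explain.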
Replying to warner:
Yes, I realized that was a silly question (for nodes in general) but forgot to unask it. I suppose I was thinking only of clients, or more precisely non-servers.
gdt's suggestion of only including 127.0.0.1 in the list of addresses for non-servers makes sense to me, especially given this:
Actually, in that case we might as well not advertise any addresses. That's a fairly clear hint that we don't want other people trying to contact us :)
Replying to warner:
I wasn't sure if that would break anything, but yes.
How does this interact with #1086? Do we still want servers not to try to connect to clients by default?
Attachment 1010-docs.darcs.patch (35828 bytes) added
docs/configuration.rst: document 'tub.location =' for hiding local IP addresses. refs #1010
Attachment fix-1010.darcs.patch (34807 bytes) added
node.py: implement 'tub.location =' for hiding local IP addresses. fixes #1010
Attachment test-1010.darcs.patch (35548 bytes) added
test_node.py: test that 'tub.location =' hides local IP addresses. This version unpatches on synchronous exceptions, and uses fileutil.write. refs #1010
Let's kick the question of what addresses the client should advertise by default out to the next release. I don't think that this patch conflicts with any decision we would be likely to make about that. It also doesn't conflict with adding a whole-config anonymous flag.
I've recorded the changes as darcs patches, added some docs, and made a couple of minor improvements to the test (see its description). The fix itself hasn't changed and is already reviewed.
Reviewed attachment:1010-docs.darcs.patch and found no glaring errors. Review still needed for attachment:test-1010.darcs.patch.
Attachment cleanup-1010.dpatch (41952 bytes) added
combined some cleanup with the other three patches
Review:
I haven't thought through davidsarah's assertion in comment:377664 that this patch won't make it harder to do what I want (clients listen for connections from servers by default, and anonymous-mode is an explicit flag instead of setting location='') in the future, but I'll take their word for it.
I find it very confusing that location=* in the tahoe.cfg file means "Emit only 127.0.0.1" but location=* in source:src/allmydata/node.py's _setup_tub() means "Discover all local IP addresses and emit them.". This is on top of my slight confusion about the fact that location=None in tahoe.cfg has a different meaning from location=''. Or wait -- does it? Is one of them the same as not having a location entry at all?
Off I go to search for answers in the docs in attachment:cleanup-1010.dpatch. But the fact that I experience this much confusion after glancing at a few lines of the code and the config file is a bad sign.
Okay, I've started reading the docs patch from attachment:cleanup-1010.dpatch:
It helps to allay my confusion because it explicitly says "Note that this is not the same as omitting tub.location.". However, it doesn't help all the way: what does it mean if you omit tub.location? (The answer, I believe, is that it discovers your local IP addresses and advertises them.) Does it mean anything if you say tub.location=None, or is that an error?
I think there are four different use cases here, three of which are currently supported, and our docs should be more explicit about enumerating them.
Also: is there a valid distinction between
Now this patch currently lets the user express option 2 by setting tub.location=, option 1 by setting no tub.location, and option 3 by setting tub.location=207.7.145.194. I'm -1 on this design:
node.anonymous = True, as we discussed above.
tub.location=127.0.0.1.
We discussed one of those alternative designs above, and we said maybe we can change our minds later, but this may be a mistake because
Do we have a plan for how to provide backwards compatibility if we were to change to a different design in a future release? I guess we might need to have a phase where the tahoe node understood both old and new configuration formats, stopped with a fatal error if they were inconsistent, and emitted a warning if the old one existed at all. Then eventually we might go through another cycle like the one we just finished with #1385 where we stop with a fatal error if the old style is present. (By the way, #1385 turned out to be a lot more painful of a patch to integrate and debug than I had anticipated.)
I'm willing to listen to counter-arguments, but at the moment I'm -1 on this design. I haven't finished reviewing the actual patches in attachment:cleanup-1010.dpatch, but I'm going to stop here and focus on #393 instead. I'm sorry I didn't think about the backward-compatibility issues earlier and that I didn't think about the specific configuration format earlier so I would realize I was uncomfortable with it before this late stage. (Also it is too bad I didn't think of the potential identity-revealing weakness earlier so we could have time to think it through before this late stage.)
Once we finish this ticket, we should see if that means ticket #517 can also be closed or if there is further work to do for #517.
#1207 is a closely related ticket which shows that some people (starting with gdt) want to have yet another variation, where unrouteable IP addresses are excluded from the advertised list.
It seems as though you can already achieve exactly the effect of the current patch with "tub.location = 127.0.0.1" -- or a similar effect with "tub.location = unreachable.example.org:0" as the documentation suggests (source:trunk/docs/configuration.rst#overall-node-configuration).
So on reflection, I'm also -1 on including this in v1.9.
(The cleanup patch also has the improvement of not calling iputil.get_local_addresses_async() if its value is going to be discarded, but that's not urgent.)
Attachment 1010-use-only-127.patch (6149 bytes) added
cleanup and refactored against current trunk
I added an updated patch against trunk which uses Brian's last patch.
In the dev meeting zooko suggested an anonymize flag instead of having a blank tub.location. The anonymize flag would ensure that sensitive information like IP addresses is not broadcast. A blank tub.location value could be confused with an auto-detected location. Perhaps keywords like "auto" could be used to indicate an auto-configured address, with that behavior being the default.
I just reviewed attachment:1010-use-only-127.patch (during Weekly Dev Chat).
Thank you for updating this patch to apply to the current trunk! The patch makes sense and is usefully addressing this issue. However, we talked it over at our recent Weekly Dev Chat (notes: //pipermail/tahoe-dev/2013-August/008674.html), and have a few requirements for safety of the configuration:
Let's add a nodeanonymize flag to the tahoe.cfg file. The meaning of this flag is: stop the process and print an error message if any of the configuration options would compromise my identity. There are also probably going to be other meanings of this flag added in other patches (i.e., this flag will probably come to mean also: do not allow any outgoing connections that are not over an anonymous routing layer such as Tor or I2P).
Instead of "tub.location=" (the empty string) meaning to not advertise any location, let tub.location=UNREACHABLE mean that. (This is in order to avoid confusion in the mind of the user about the distinction between tub.location being absent versus it being present with an empty value. See also below, about backward compatibility.)
If tub.location=UNREACHABLE, then pass the special hardcoded value unreachable.example.org:0 to foolscap instead of the empty string. (This is because foolscap currently can't handle the empty string for its connection hints; see http://foolscap.lothar.com/trac/ticket/208 .)
Instead of expressing that the node's IP address should be auto-detected by the absence of tub.location, express it by tub.location being set to AUTODETECT.
Note that there is a third option besides AUTODETECT and UNREACHABLE, and that is to set tub.location to a specific set of IP address+port, DNS name+port, I2P addresses, or Tor (.onion) addresses. I don't know if Tor or I2P users would always do the latter, or if they would sometimes set it to UNREACHABLE.
Therefore, if nodeanonymize is set to True, then:
If there is no tub.location setting (including if tub.location is commented-out), the node will abort on startup. (This is important because people who created their node with an older release of Tahoe-LAFS will have a tahoe.cfg with tub.location commented out. See below about backward-compatibility.)
If tub.location is set to AUTODETECT, the node will abort on startup with an error message.
If tub.location is set to a specific connection-hints value which includes an IP address or domain name, then the node will abort on startup with an error message.
If tub.location is set to UNREACHABLE, the node will start up normally.
If tub.location is set to a specific connection-hints value which contains only I2P and/or Tor (.onion) addresses, the node will start up normally.
tahoe.cfg files generated by the create-client or create-node command should come with tub.location = AUTODETECT instead of a commented out "#tub.location = put your IP address here" (see create_node.py: source:trunk/src/allmydata/scripts/create_node.py?annotate=blame&rev=3ee950f09ed8b7f6cc72a98c26eefe9e02c11d85#L91).
Okay, now what about backward-compatibility?
For backwards compatibility, we still accept the absence of tub.location as meaning to AUTODETECT. But only if the nodeanonymize flag isn't on! Because the nodeanonymize flag makes a setting for tub.location be required.
Maybe in a future release we'll start emitting a warning about the absence of a tub.location setting, but for now, no warning.
I agree with most of this design, but I'm unconvinced of the value of requiring an explicit tub.location = AUTODETECT, rather than keeping that as the default as it is now. The node anonymize flag would still disallow auto-detection; that's independent of whether auto-detection is the default.
Note that if we keep auto-detection as the default, we can still change the comment that is added to a new tahoe.cfg to something like
Thanks for the design-review, daira. I still want to eventually switch to spelling this as tub.location = AUTODETECT, even if it isn't necessary to do so, because:
tub.location = for meaning "set my tub location to the empty string".
So for those reasons, I'd prefer to move ahead with making an explicit AUTODETECT be the preferred way to indicate this configuration (as in comment:377674).
Oh, I see that my specification in comment:377674 omits a case: what if tub.location is set to the empty string? I propose that this is treated as a configuration error (the node stops at startup with a verbose error message about this), regardless of the setting of nodeanonymize.
#1947 was a duplicate of this, and #1947 has lots of good content on it. Please read #1947 if you are working on this ticket.
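The startup rules from comment:377674, plus the empty-string case and the backward-compatibility rule, could be summarised in a sketch like the one below. This is not the code from any branch on this ticket: the function and exception names are hypothetical, the location value is assumed to be a comma-separated list of host:port hints, and "anonymous" hints are recognised only by a .onion or .i2p suffix, which is a simplification.

```python
UNREACHABLE_HINT = "unreachable.example.org:0"  # hardcoded value named in the ticket


class ConfigError(Exception):
    """Hypothetical error that would stop the node at startup."""


def _hint_is_anonymous(hint):
    # Assumption: a hint looks like "host:port"; only the host suffix is checked.
    host = hint.rsplit(":", 1)[0].strip().lower()
    return host.endswith(".onion") or host.endswith(".i2p")


def resolve_tub_location(location, anonymous):
    """Validate tub.location against the anonymity flag.

    location  -- the configured string, or None when tub.location is absent
    anonymous -- value of the anonymity flag
    Returns "AUTODETECT" or the location string to hand to foolscap.
    """
    if location is not None and location.strip() == "":
        # Empty string is an error regardless of the flag.
        raise ConfigError("tub.location is set to the empty string")
    if location is None or location.strip() == "AUTODETECT":
        if anonymous:
            raise ConfigError("auto-detection is not allowed in anonymous mode")
        return "AUTODETECT"  # absence still means auto-detect (backward compat)
    if location.strip() == "UNREACHABLE":
        return UNREACHABLE_HINT
    hints = [h.strip() for h in location.split(",") if h.strip()]
    if not hints:
        raise ConfigError("tub.location contains no connection hints")
    if anonymous and not all(_hint_is_anonymous(h) for h in hints):
        raise ConfigError("tub.location reveals an IP address or domain name")
    return ",".join(hints)
```

Keeping the whole decision in one function makes it easy to cover every rule with a unit test.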
Note that in the modern world, "only use 127.0.0.1" should really be "only use 127.0.0.1 and ::1". I will refrain from updating the ticket-title, but IMHO we should purge v4-only statements.
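As a small illustration of treating IPv4 and IPv6 loopback alike, Python's standard ipaddress module can classify both. The helper names below are made up for this sketch and are not part of Tahoe's iputil.

```python
import ipaddress


def is_loopback(address):
    """True for loopback literals such as 127.0.0.1 or ::1, False otherwise."""
    try:
        return ipaddress.ip_address(address).is_loopback
    except ValueError:
        # Not an IP literal (e.g. a DNS name); treat as non-loopback.
        return False


def only_loopback(addresses):
    """Filter a discovered address list down to loopback addresses."""
    return [a for a in addresses if is_loopback(a)]
```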
Greetings,
My branch implements Zooko's design in comment:377674:
https://github.com/david415/tahoe-lafs/tree/david-ticket1010-unittests
Please let me know what else I can do to get this trac ticket resolved.
For reference, the relevant commits are here:
https://github.com/david415/tahoe-lafs/compare/david-truckee-venv...david-ticket1010-unittests
There is also an earlier commit in the same branch which drafts a Tor-only mode, meaning that the branch can't be directly merged to close this ticket:
https://github.com/david415/tahoe-lafs/commit/d9757d75aebe675ca6114d63673ac597e1198084
Reviewed the commits for this ticket (from the link I posted above).
nodeanonymize for the flag. The commits implement nodetub.anonymize instead. I don't know what the Tahoe and Foolscap conventions are, but I expect that nodetub.* are intended to be passed through to Foolscap, and this flag will not be.
check_anonymity_config():
_startService()) and 429 (in _setup_tub()) confuse me. Why is tub.location being set to empty? Is this being written to the config file?
tahoe.cfg files.
Title changed from "use only 127.0.0.1 as local address" to "anonymous client mode"
OK... I've cleaned up my code here:
https://github.com/david415/tahoe-lafs/tree/david-ticket1010-unittests
Additionally, this latest change uses my Foolscap branch from Foolscap trac ticket 208:
http://foolscap.lothar.com/trac/ticket/208
This ticket now requires code review. Thanks!
Reviewed :)
src/allmydata/node.py:
anonymize.py).
is_err in check_anonymity_config() can be replaced with set unions (joining the expressions with or-s).
tubport = ...) can be moved inside the self.anonymize check.
_unreachable_tub() necessary? AFAICT removing lines 356-359 and changing line 360 to if location != "UNREACHABLE" and not self.anonymize: would have the same effect (apart from the log line, which could be checked inside _setup_tub() instead). This is probably a coding style decision, I will defer to Zooko et al. on this.
src/allmydata/util/anonymize.py:
is_anonymous() still contains code to split a location hint into parts, which is what IMHO should be centralized (outside of anonymize.py). I haven't hunted for where else location hints are split up currently, but per the anonymity roadmap there will be more parts in Tahoe that will need to do so (e.g. per-config client endpoint string parameters).
Overall IMHO this is looking very nice. It can't be directly merged to close this ticket because of the Tor-only content, and I'm not sure whether it can even be cherry-picked (I haven't checked which commits do what).
I would prefer nodeanonymous rather than nodeanonymize, because it has the same spelling in U.S. and British English. Also I think it more clearly conveys that this option is asserting that other options are compatible with anonymity, not changing the behaviour itself.
FYI, http://foolscap.lothar.com/trac/ticket/236 is a plan that dawuud and I came up with for making Foolscap handle Tor/i2p Listeners and connection-hints cleanly. I need to re-read this ticket and see how/if it interacts with the changes we propose over there.
I have cherry-picked the #1010 changes out of dawuud's branch and rebased them onto master:
https://github.com/str4d/tahoe-lafs/tree/1010-anonymous-client-mode
I have intentionally left out commit e03ac001387f8341240e730cd918027c2d111b7d because Foolscap #208 is still undecided.
Other changes:
- AUTO as a flag in tub.location, basically implementing use case 4 from comment:377667. Therefore I have replaced tub.location == AUTODETECT checks with AUTO in tub.location.
- [node]anonymize has been changed to [node]anonymous per comment:377686.
In today's meeting, while sketching out the tor-socks-proxy syntax (/tahoe-lafs/trac/issues/26461#comment:377672), we talked briefly about how the Accounting "client ID" would interact with anonymity. In particular, without other changes, Accounting-enabled Tahoe clients will use the same ed25519 key for all connections. There are (at least) three things that might be linkable, and using Tor/I2P will only remove one of them:
We could change the accounting system to provide a way to use random client-ids (or none at all), but then storage servers can't enforce any of their Accounting things. And even with that, clients who always start at the same rootcap will be linkable by the storage server observing repeated accesses to the same storage index.
The upshot is that now I'm wondering if a tahoe.cfg flag named anonymous might be better spelled pseudonymous, or ip-anonymous, or network-anonymous, or something that makes a slightly weaker claim. Or, should we say that anonymous=true also requires that you've added some other (as-yet undefined) tahoe.cfg flag which disables Accounting client-ids?
For more terminology discussion, see this tweet by Zooko, and this one, and their replies.
IMHO random Accounting client-ids are not necessary. From the I2P perspective, a Tahoe client is going to have a visible known I2P Destination at least for the duration of the process (by default client sessions are transient), and so an Accounting client-id that persists as long as the I2P Destination does is within the existing threat model. For a Tahoe client that is also a server, this will be long-term; for client-only nodes, this will be until restart (unless code was added to intentionally cycle Destinations on a shorter timescale). The Tor side will have similar considerations. I don't see a use case that requires a Tahoe client's I2P activity and Tor activity to be unlinked.
Another consideration raised by this: we need to ensure that if the Tahoe configuration is changed, all identifiers are cycled (as best as possible). That means, if a user is running a "non-anonymous" client, and subsequently configures it to be "anonymous", then Tahoe needs to regenerate:
The shares being accessed is a harder identifier to prevent leaking across, and I'm not sure how it could be done within the Tahoe-LAFS operational model. For comparison: the BitTorrent client Vuze has an I2P plugin, and internally it runs two I2P Destinations - one for "pure" I2P traffic, and one for "mixed" I2P traffic (for any torrents being seeded simultaneously to/from I2P and the clearnet). It explicitly warns the user that once a torrent has been added with a given privacy setting, it should not be changed, because the already-fetched-blocks are a unique fingerprint for incomplete torrents. To change, they recommend deleting and re-adding the torrent. It's not something they can defend against, but they try to at least give the user plenty of warning.
We agreed in today's Nuts & Bolts meeting to bump better Tor/I2P support out to 1.11.0.
Milestone renamed
We should release some basic features as described in this ticket soon so that we can complete Phase 1 of native Tor integration for Tahoe-LAFS as described in warner's roadmap in a comment here: https://tahoe-lafs.org/trac/tahoe-lafs/ticket/517#comment:377678
It seems like progress on this ticket has stalled. What can I do about it?
I don't think we need to implement accounting first. We just need some basic features, and I think it should be really simple to have an anonymize=true option: it should error if another option is set in conflict with the policy, and it should turn off autodetect. Anything else? str4d rebased my old patch. Perhaps I need to change it to use the Foolscap trac ticket 208 feature now that that ticket is closed?
https://foolscap.lothar.com/trac/ticket/208
This is the latest, with upstream/master merged in:
https://github.com/david415/tahoe-lafs/tree/1010-anonymous-client-mode
It seems like we need to fix the above changeset:
fixed here:
https://github.com/david415/tahoe-lafs/tree/1010-anonymous-client-mode.2
In various conversations and tickets Leif and Zooko both point out that our language is misleading because in this case the word "anonymous" doesn't mean unlinkable identity from the storage server's perspective at all... but merely means that our origin IP is hidden via the network transport. This is arguably not anonymity at all. It is important to make this distinction I think.
Question: Do any of you have suggestions for how to make this explicitly clear to the user? Should we change the name of the configuration option to something else?
How about mask-origin-ip?
moving most tickets from 1.12 to 1.13 so we can release 1.12 with magic-folders
I think we're ready to add this flag, and then a tahoe create-client CLI argument to turn it on from the very beginning. So we need to make some decisions. I'm going to propose the following; please let me know what you think.
- tahoe create-client --anonymous or tahoe create-node --anonymous causes [node] anonymous = true to be written to tahoe.cfg
- With [node] anonymous = true, any of the following problems will cause tahoe start to throw an exception before any network traffic has occurred:
  - [node] tub.location = contains any tcp: hints
  - [node] tub.location = is empty or missing, since that means AUTO, which means a tcp: hint with automatically-detected addresses
  - [connections] lacks a tcp = tor line, since otherwise introducer and server connections could use raw TCP connections
There are a few other things we might consider adding, but I'm inclined to not include them:
- require tub.location hostnames (for any type of hint) to end in .onion or .i2p
- require tub.socks_port to point at a local host (maybe limit it to 127.0.0.1 and localhost, or maybe to any RFC1918 address)
- if [storage] enabled = false and [helper] enabled = false (i.e. we're a pure client), then require tub.port= (empty), to forbid the main tub from listening at all
I'm tentatively pulling this into the 1.12 milestone, because I think we're close, and it'd be awesome to include proper (client-side) Tor/I2P support, and I think this flag is a necessary part of that.
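The three startup checks proposed in the list above could be sketched roughly as follows. This is only an illustration, not the patch under discussion: it uses Python 3's standard configparser, the helper name check_anonymity_config and the PrivacyError exception are hypothetical, and it assumes the flag lives in a [node] section, that the tcp handler is set in a [connections] section, and that tub.location hints are comma-separated.

```python
from configparser import ConfigParser


class PrivacyError(Exception):
    """Hypothetical exception: tahoe.cfg conflicts with the anonymous flag."""


def check_anonymity_config(cfg_text):
    """Raise PrivacyError if the config conflicts with anonymous = true.

    Hypothetical helper, meant to run before any network traffic occurs.
    """
    cfg = ConfigParser()
    cfg.read_string(cfg_text)

    # Nothing to enforce unless the flag is switched on.
    if not cfg.getboolean("node", "anonymous", fallback=False):
        return

    # Check 1 and 2: tub.location must be present and free of tcp: hints.
    location = cfg.get("node", "tub.location", fallback="").strip()
    if not location:
        raise PrivacyError("tub.location is empty or missing, which means AUTO")
    for hint in (h.strip() for h in location.split(",")):
        if hint == "AUTO" or hint.startswith("tcp:"):
            raise PrivacyError("tub.location contains %r" % hint)

    # Check 3: outbound TCP must be routed through Tor.
    tcp_handler = cfg.get("connections", "tcp", fallback="").strip().lower()
    if tcp_handler != "tor":
        raise PrivacyError("[connections] lacks a tcp = tor line")
```

Calling check_anonymity_config() on the text of tahoe.cfg returns silently when the settings are consistent and raises otherwise, which matches the "throw an exception before any network traffic" behaviour described in the proposal.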
From today's devchat, folks seemed ok with my proposal, and with omitting the other three items (constraints on tub.location hostnames, tub.socks_port, and forbidding pure-clients from listening).
However Zooko (and others) pointed out that "anonymous" is not the best name for this flag (it's inaccurate, imprecise, and carries negative connotations for a lot of folks outside our community).
private, or private-ip seems better:
Switching to a term that makes it clear that we're specifically protecting the IP address means that we don't need to include #2384 in its scope (randomized TubIDs).
Do people prefer private = true, or private-ip = true? Or something else?
My pal George Tankersley suggested a great idea for this today:
(the default value of reveal-IP-address= is True, when left unspecified)
That is specific (Tor/I2P are only about not revealing your IP address), non-negatively-connotative, and encourages the obvious constructive question of "why the heck isn't false the default?" (which then begins the conversation about performance consequences of Tor/I2P connections and the additional install/run-time dependencies).
https://github.com/tahoe-lafs/tahoe-lafs/pull/326 adds this safety flag, and docs/tests.
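As a rough illustration of the "defaults to True when left unspecified" semantics, here is a minimal sketch using Python 3's standard configparser. It is not the code from the pull request: the function name reveals_ip_address is hypothetical, and placing the option in a [node] section is an assumption made for the example.

```python
from configparser import ConfigParser


def reveals_ip_address(cfg_text):
    """Return the effective reveal-IP-address setting (hypothetical helper).

    A missing option, or a missing [node] section, falls back to True.
    """
    cfg = ConfigParser()
    cfg.read_string(cfg_text)
    return cfg.getboolean("node", "reveal-IP-address", fallback=True)
```

With this shape, only an explicit reveal-IP-address = false turns the protection on, so an existing tahoe.cfg that never mentions the option keeps its current behaviour.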
If the syntax is ok with everyone, the next step will be to add a tahoe create-client / create-node CLI argument which sets this flag. Maybe --no-reveal-IP-address? Or --reveal-IP-address=false? --hide-IP-address?
In d47fc0f/trunk:
I think we're converging on --hide-ip. It's not as provocative as --no-reveal-ip or --reveal-ip=false, but I think it's simpler, and just as accurate.
--hide-ip patch in https://github.com/tahoe-lafs/tahoe-lafs/pull/330
In d0da17a/trunk:
Ok, with the landing of --hide-ip, I think we can close this one: we've implemented pretty much everything we've talked about in this ticket.
In 5a195e2/trunk: