implement distributed introduction, remove Introducer as a single point of failure #68
Reference: tahoe-lafs/trac#68
I am quite sure you are aware of the problem of an introducer [...] crash bringing everything down. I read your roadmap.txt but I didn't find anything specific to address this. May I suggest using introducers.furl [...] where multiple entries can be used and the information is updated to all introducers at the same time when a peer makes an update.
Also, upon adding a new introducer, there should be a way to discover all the info currently on the existing introducer. I think I am making this part sound more trivial than it is.
Thanks
Lu
I would like to make a fully decentralized introduction scheme, such as the one I had in Mnet. Basically, every node would be an Introducer. This doesn't scale up in terms of number of nodes in the network unless we add some cleverness to it, but currently Tahoe networks are neither capable of scaling up to more than 100 nodes nor required to scale up to more than 100 nodes. (See UseCases.)
I will add to source:roadmap.txt about this issue.
I updated source:roadmap.txt . I'm tempted to think we should go directly to connection management v4 and not stop at v3.
Sam Stoller mentioned that it would be cool if peer nodes were discoverable through Bonjour. I agree!
This would be nice, but I think it's a lower priority than the connection management. I think we'll hit scaling problems earlier because of the number of connections held open by client nodes (windows boxes with minimal memory and python-vs-windows limitations) than because of the number of connections held open by a central Introducer (which will be running on a well-provisioned unix box, with plenty of memory and bandwidth, in a professionally-run colo facility). We can also introduce multiple central Introducers without too much effort, which would make them even more available.
Also note that an Introducer failure will prevent new clients from seeing the mesh, but will not prevent already-running clients from continuing to use each other, so such a failure is somewhat graceful.
I'm also thinking that relay is higher priority than distributed introduction.
Also, I'm thinking that we may want to provide for some sort of private mesh in the future, which will mean creating some sort of "membership badge" credentials, which would need to be checked at connection establishment (or lease-request) time, and we might want to at least lay out some requirements for that before building the distributed introduction scheme.
That said, if we choose to build a single global mesh, I very much like the gossip approach to learning about other nodes, and zeroconf/Bonjour would also be pretty slick (although I can only see it being useful when there is already a tahoe node on your local LAN), so I would like to see those implemented sooner or later.
-Brian
Zooko and I hashed out a good scheme to do this while I was in Boulder this week. Here's the plan:
internal setup and emits a FURL to email/IM/paste to your friend
accepts this FURL.
to be merged.
mesh is fully connected
With this approach, there is no single introducer. In addition, it enables
the following interesting properties:
pet name path from themselves to every other node in the mesh.
and options to impose individual quotas, or cut them off entirely
The vdrive server is still an outstanding question: until we get distributed
dirnodes (#115), each dirnode will still be attached to a single host, which
needs to be visible to anyone who's interested in reading the directory. So
our first release that removes the Introducer will probably retain the vdrive
server, and we'll have to figure out a reasonable UI that handles this.
In one of our current designs, the API for the PersonalIntroducer held by
each node on each other node (not necessarily reified as a distinct
Foolscap-Referenceable object, but that would be an easy implementation)
would have the following API:
and RIPersonalStorageServer would have the same API as the current RIStorageServer, with allocate_buckets, etc.

tell_me_about_peers would use the provided list to filter out all peers that the asker already knows about, and would then go to all of the remaining peers with a please_meet message to produce new RIPersonalIntroducer facets for the asker, then return a list of these facets. This is the place where Horton will go: until we get that, each node that does an introduction gets to take advantage of the facet that ought to be reserved for the asker (i.e. the introductee is vulnerable to the introducer). With Horton, the same attack exists, but the two nodes will see different identifiers for the MitM, so that if Bob ever comes to learn about Carol through a different path, he will perceive her as being different from the pseudoCarol that Alice gave him.
With a maximally transparent Horton built in to Foolscap, the
tell_me_about_peers method just returns a list of Alice's existing
RIPersonalIntroducer proxies, and Foolscap will do the Horton work to
transform them into Bob-oriented proxies. Also, the please_meet method would
move into the customized Stub class (where it would behave much the same
way).
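The introduction step described in this comment can be sketched roughly as follows. These are hypothetical signatures for illustration only; the real design would hand out Foolscap facets rather than plain callables:

```python
def tell_me_about_peers(asker_known_ids, my_peer_ids, make_facet_for_asker):
    """Sketch of the introduction step: filter out the peers the asker
    already knows about, then produce a new facet for each remaining
    peer and return the list of facets."""
    already_known = set(asker_known_ids)
    new_peers = [p for p in my_peer_ids if p not in already_known]
    # in the real scheme, this is where the please_meet message would go
    # out to each remaining peer to mint an RIPersonalIntroducer facet
    return [make_facet_for_asker(p) for p in new_peers]
```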
This is an important feature, but I don't think we are going to get it done in the next six weeks, so I'm putting it in Milestone 1.0.
Here is a simple scheme for decentralized introduction. It probably scales up at least as well as the rest of our current network architecture does — i.e. it is scalable enough for now.
First, implement #271 — "subscriber-only introducer client". DONE
Second, make announcement idempotent in the introducer — i.e. make it so that if a node announces itself when it is already in the introducer's set of announced nodes, the introducer ignores the announcement. (This makes sense anyway: the introducer doesn't need to inform any subscribers about the re-announcement, since the clients will already have heard the earlier announcement from the introducer, and the only time a node would announce itself redundantly would be if that node were buggy.) DONE
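The idempotent-announcement behavior described in this step can be sketched like this (illustrative names, not the actual Tahoe-LAFS introducer code):

```python
class Introducer:
    """Sketch of an introducer that ignores redundant announcements."""

    def __init__(self):
        self._announced = set()   # FURLs we have already relayed
        self._subscribers = []    # callables fed new announcements

    def subscribe(self, callback):
        self._subscribers.append(callback)
        # a new subscriber immediately hears everything known so far
        for furl in self._announced:
            callback(furl)

    def announce(self, furl):
        if furl in self._announced:
            return False          # idempotent: drop re-announcements
        self._announced.add(furl)
        for cb in self._subscribers:
            cb(furl)
        return True
```

Announcing the same FURL twice notifies subscribers only once, which is also what makes the later introducer-to-introducer relaying safe against announcement loops.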
Third, make "introducers" a class of publishable, subscribable thing like storage server and read-only storage server and upload helper (as per #271).
Fourth, make all publishers — introducer clients that send announcements — send their announcements to an evenly distributed subset of the introducers, namely the "Chord fingers": the introducer halfway around the circle from the publisher, plus the one a quarter of the way around the circle, and so on.
Fifth, tell each introducer to subscribe to the introducer-announcements of a small set of other introducers — again choosing the Chord fingers — and whenever the introducer hears announcements from the introducers that it subscribes to, it announces those announcements itself, just as if it had heard them from a client. (Of course, it still ignores any announcements of nodes which it has previously announced, as above.)
Now the load of handling introductions is evenly spread among all introducers, and there is no Single Point of Failure / Single Point of Load.
Each introducer receives log(N) redundant announcements of each new node, where N is the total number of introducers in the system.
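The "Chord fingers" selection above can be sketched as a hypothetical helper, assuming the introducers are arranged at integer positions on a ring of ring_size slots:

```python
def chord_fingers(my_index, ring_size):
    """Return the ring indices of this node's Chord fingers: the node
    halfway around the circle, a quarter of the way, an eighth, and so
    on down to the immediate neighbor -- roughly log2(ring_size)
    targets in total."""
    fingers = []
    step = ring_size // 2
    while step >= 1:
        idx = (my_index + step) % ring_size
        if idx != my_index and idx not in fingers:
            fingers.append(idx)
        step //= 2
    return fingers
```

For example, with 8 introducers, the publisher at position 0 would send its announcements to introducers 4, 2, and 1 — which is where the log(N) redundancy figure below comes from.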
See #295 for how to add access control for the authority to act as a server and the authority to act as a client on top of distributed introduction.
This review of Tahoe-LAFS on arstechnica.com reminds me that, while issue #68 seems relatively non-urgent to me (because I know how little the grid relies on the introducer and how easy it is to replicate introducers), it would be much better if we could simply say "the grid is fully decentralized"; then introductory articles like this one could optimize out a whole paragraph describing the introducer.
http://allmydata.org/pipermail/tahoe-dev/2009-August/002509.html
Also, fixing this ticket would be fun. Someone should do it. :-)
To get started on this, see source:src/allmydata/introducer/client.py and source:src/allmydata/introducer/server.py. Each of those files is fairly small and you should be able to read through them both and understand the current implementation. See also source:src/allmydata/introducer/interfaces.py which defines the interfaces between the components.
Attachment DualInroducerScenario1.jpeg (76784 bytes) added
Dual Introducer Scenario1 (26/5/10)
Attachment DualInroducerScenario1-Modified.png (99503 bytes) added
Dual Introducer Scenario 1 Modified (27/5/10)
Snapshot A: Client1 and Client2 are connected to Introducers X and Y, but Client3 is only connected to Introducer Y.
Snapshot B: Introducer Y goes down; Client4 joins, configured to talk to Introducers X and Y. Client3 has no knowledge of Client4, and vice versa.
Snapshot C: Introducer X goes down and Y comes back up, so all clients come to know about each other.
The main target of this scenario is to enable clients to talk to multiple introducers.
Attachment client(can-subscribe-to-multi-introducer-backward-compat).dpatch (5205 bytes) added
Given a file "introducers" in the client basedir, each line containing a single introducer_furl, this patch subscribes to all of them while keeping backward compatibility
Backward compatibility is maintained by:
Note this patch does not update client's webui with all connected introducers.
Faruq: glad to see this patch! Okay here are my comments.
What does this comment mean? Do you mean keep it in order not to break any reference to it?
It can't be equal to '\n' after a.split('\n'). Maybe change this to:

Now this code needs tests. Let's save this code aside, write a unit test which turns red, and then put this patch back into place and see if it turns the unit test green. So the unit test could, for example, populate the "introducers" file with two introducers, then instantiate the Client object (from source:src/allmydata/client.py), then invoke some method of that Client object which it will handle correctly only if it knows about both of the introducers.

Oh, I've got to go to lunch. I'll look at this more later!
Thanks for corrections. Regarding the reference, that's my intent, not to break any reference to it. If this code is fine, I'd like to add another patch that changes web/root.py and web/welcome.xhtml to show the connected introducers etc.
Attachment connected_to_introducers.png (30184 bytes) added
Client's welcome page shows a list of connected introducers.
Attachment client(can-show-connected-introducers-in-welcome-page).dpatch (5883 bytes) added
Serving the connection status of multiple introducers, still backward compatible
Attachment root(can-show-connected-introducers-in-welcome-page).dpatch (1070 bytes) added
Attachment welcome(can-show-connected-introducers-in-welcome-page).dpatch (1385 bytes) added
These patches (probably one patch would be better) fetch the connection status of multiple introducers in a somewhat crude way. Tested by enabling and disabling introducers. These patches are also backward compatible, not breaking any reference to the old connected_to_introducer(), but new code should call connected_to_introducers(), which also supplies the status of the single introducer.
Nice work! Next, please write a unit test of these patches. One unit test should verify that the client learns about a server when that server is announced to one introducer, and also when that server is announced to the other introducer. The unit test should use a "mock IntroducerClient class" to test the code that your patch changed in source:src/allmydata/client.py@4193#L173. The idea is that the code in source:src/allmydata/client.py thinks that it is instantiating an instance of IntroducerClient (source:src/allmydata/introducer/client.py@3931#L13), but actually the test code has set it up so that when the code under test instantiates IntroducerClient(), it instead gets an instance of the mock introducer client.

You can accomplish this using the Python mock library's mock.patch decorator. You can copy the way we use mock.patch in other places in our tests if you like to learn by code copying (I like to learn that way).

http://www.voidspace.org.uk/python/mock/
Attachment test_multi_introducers.py (1209 bytes) added
Demo test file that checks whether the number of introducer_clients is the same as the number of introducer furls found in the "introducers" config file
Nice work! Now that there is a unit test we can start thinking about actually committing these patches to trunk.
This test would notice if the code under test failed to read the .tahoe/introducers config file correctly or failed to create an IntroducerClient for each one, right?

Now can you write a test (or extend the test you already wrote) to notice if the code under test failed to subscribe to all of the introducers that it knew about? For example, maybe the test would configure two introducers in the .tahoe/introducers file, mock.patch() the IntroducerClient class, then instantiate the src/allmydata/client.py Client class, then check that two mock IntroducerClients got created and that each of them had their .subscribe_to() method called.

After that, I can't think of any way that your patch to allmydata/client.py would have a bug which would not be caught by these tests. Can you?
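The test shape suggested here might look roughly as follows. The classes are minimal stand-ins so the sketch is self-contained; a real test would patch allmydata.introducer.client.IntroducerClient and instantiate the real Client:

```python
from unittest import mock

# Minimal stand-ins so this sketch is self-contained; not the real
# Tahoe-LAFS classes.
class IntroducerClient:
    def __init__(self, furl):
        self.furl = furl

    def subscribe_to(self, service_name):
        pass  # the real class subscribes via the introducer's FURL

class Client:
    """Creates one IntroducerClient per FURL and subscribes to each."""
    def __init__(self, introducer_furls):
        self.introducer_clients = [IntroducerClient(f)
                                   for f in introducer_furls]
        for ic in self.introducer_clients:
            ic.subscribe_to("storage")

def test_client_subscribes_to_all_introducers():
    furls = ["pb://one@example/intro", "pb://two@example/intro"]
    with mock.patch(__name__ + ".IntroducerClient") as MockIC:
        Client(furls)
    # one IntroducerClient was created per configured FURL ...
    assert MockIC.call_count == 2
    # ... and each instance had subscribe_to() called
    assert MockIC.return_value.subscribe_to.call_count == 2
```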
Replying to writefaruq:
Instead of doing this, please search the codebase for any other reference to the self.introducer_furl attribute and change that code to reference the new self.introducer_furls attribute instead. Note also that any such code will have unit tests that will turn red if your patch which removes self.introducer_furl breaks that code, so run the unit tests after you have removed self.introducer_furl and after you have searched the codebase for other code that uses introducer_furl.

Likewise in an earlier comment you mentioned:
This is not the sort of "backward compatibility" that we want. If you are adding a new feature in the code or changing a feature in the code then instead of leaving the old feature in place in the code in case anyone is calling it, we prefer to find all callers and update them.
On the other hand the things that you said about backward compatibility of the tahoe.cfg file is the sort of "backward compatibility" that we want. That has to do with users who might be using an older version of Tahoe-LAFS and then upgrade to a newer version which has your patch. We want the behavior of the new version to be some good behavior that they expected even if they do not make any change to their config files.
allmydata.client.Client's introducer_furl is called from allmydata.web.root.Root for fetching the list of introducer furls. But that can be replaced by new code that is tested by test_root.py.

self.introducer_furl is also called from various testing modules, e.g. test/common.py (line 471). I'm not sure if they need to be patched at this moment.

Replying to writefaruq:
As I mentioned on IRC, I want you to do "test-driven development" on this part. Step 1 is to remove the attribute introducer_furl from the allmydata.client.Client class. Step 2 is to run the complete (current) test suite and see which tests, if any, go red. Step 3 is to think about the places that you know of in the code that refer to the old, now-removed attribute, and think about whether the tests that are currently red are the right tests to exercise those places in the code. If they are not the right way to test that code (they test that code only "by accident", in some sense, or you think it is a bad way to test that code for some reason), then write a new test that tests that code. Now for the important point in "test-driven development": you are not allowed to fix the bad code which refers to the now-deleted introducer_furl attribute until you have a red test which you think is a good test for that code! Step 4: fix the code. :-)

Attachment test_root.py (1009 bytes) added
corrected test for checking the use of introducer_furl by root.py
Looks good! Except, heh heh heh. Isn't this test testing that data_introducer_furl() queries the client object's .introducer_furl method? Maybe you should now change the test to say that if the data_introducer_furl() method queries the client object's .introducer_furl method then it fails the test, but if it queries the client object's .introducer_furls attribute instead then it passes the test?

Then run it and confirm that it fails the test.
Then fix it!
:-)
Attachment enable_client_with_multi_introducer.dpatch (10264 bytes) added
Revised patch for client.py web/root.py web/welcome.xhtml
I have some questions about how decentralized (gossip-based) introduction is supposed to work. Faruq (and everyone who cares about decentralized introduction!) please tell me if my assumptions are wrong.
Assumption 1: there will be a flat text file in your ".tahoe" base dir named "introducers" containing a list of introducer furls that the node will read at start-up.
Assumption 2: whenever the node learns about new introducers it will write the furl of that new introducer into the file.
Assumption 3: if there is no "introducers" file at startup then it will instead look into the .tahoe/tahoe.cfg file to find the "introducer.furl" entry (which is how introducer was configured up until Tahoe-LAFS v1.7.0), and if it finds it then it will write it into the ".tahoe/introducers" file and use it.
Assumption 4: if there is an "introducers" file at startup then it will not look into the .tahoe/tahoe.cfg file to find the "introducer.furl" entry, and any entry which is in there will be ignored.
Question 1: is this what you are trying to implement, Faruq?
Question 2: is this what people want to use in Tahoe-LAFS v1.8?
Regards,
Zooko
Attachment test-run-after-client_py-web-root_py-welcome_xhtml-patched.log (81805 bytes) added
Test results after applying the previous enable-client-* patch
Assumption 1 is implemented and tested.
Regarding assumption 2 and later part of 3:
Assumption 4 was not considered before.
Review needed for GSoC mid-term evaluations.
Faruq: hey, we're making progress! Maybe we could even finish assumption 1, the latter part of 3, and 4, plus Terrell's comment that it should warn if it is ignoring an old setting:
http://tahoe-lafs.org/pipermail/tahoe-dev/2010-July/004636.html
If we finished that set of behaviors, including tests (which I think you have already done a pretty good job of) and docs, then we could commit it to trunk and people could start using it even before we implement assumption 2. What do you think?
Combining assumptions 1, 3, and 4 with Terrell's comment, the following strategy can be coded into the Client.
Step 1: Try to load "basedir/introducers".
Step 2A: If "basedir/introducers" is found: a) load introducer furls from this file; b) warn if there is any introducer_furl entry in tahoe.cfg.
Step 2B: If no "basedir/introducers" is found: a) create "basedir/introducers"; b) write the introducer_furl entry from tahoe.cfg to this file.
If this is fine, I can proceed to implement this strategy.
Replying to writefaruq:
For an existing basedir, 2B b) would cause the introducer_furl to be written to basedir/introducers on the first run, and then 2A b) would cause a warning on subsequent runs. The warning seems unnecessary in this case, since there's no reason to believe the user was confused about the config settings; they were changed automatically.

Replying to davidsarah (comment:39):
That's a good point, but how could we do better? I don't think it is a good idea to automatically edit the tahoe.cfg file (to delete the old introducer.furl). Currently Tahoe-LAFS never edits that file -- it is for humans to edit only. I think it should still be a warning because we don't want the human to look into the tahoe.cfg file, see the introducer.furl there, and think that they have now seen the introducer config. We could suppress the warning in the case that tahoe.cfg's introducer.furl and the "introducers" file are the exact same thing (i.e. there is only one entry in "introducers" and it is this one).
Any other ideas?
Faruq: your strategy in comment:361284 sounds perfect to me. Except for the open question about whether or how to indicate warnings to the user, then the only other outstanding issue is that this change needs docs.
All of the following docs need to be updated to accept this into trunk:
I think you are close to getting this first working version completely implemented, doc'ed, tested, and ready for inclusion in trunk.
Replying to zooko (comment:40):
I think we should do this.
I've drafted the following text. Please correct me!
For configuration.txt:
If a Tahoe grid has multiple introducers, each introducer's FURL must be placed in "BASEDIR/introducers" file. Each line of this file contains exactly one FURL entry. Any FURL entry found in tahoe.cfg will be copied to that file.
For architecture.txt:
By deploying multiple introducers in a Tahoe grid, the above SPoF challenge can be overcome. In that case, if one introducer fails, clients are still able to get announcements about new servers from the remaining introducers. This is our first step towards implementing a fully distributed introduction.
For future releases, we have plans to enhance our distributed introduction, allowing any server to tell a new client about all the others.
For running.html:
To use multiple introducers, write all introducers' FURLs in "BASEDIR/introducers" file, one FURL per line.
Faruq:
Great! Please go ahead and take my suggestions below then write documentation patches like these and attach a darcs patch to this ticket for just the documentation patches.
The current plan is to finish the strategy from comment:361284, except that for
change it to:
(This is as described in my comment:40 and davidsarah's comment:42.)
Also about your docs: consider that once your patches land in trunk then configuring the "introducers" file will be the preferred way to do it and the "introducer.furl" entry in tahoe.cfg will be supported only for backward-compatibility reasons and will not be recommended to new users. So the documentation should describe the "introducers" file as the way to configure it and mention the "introducer.furl" entry in tahoe.cfg only when explaining that such an entry, if it exists, will be automatically written into the "introducers" file.
Replying to writefaruq:
Don't say "If" here, just say that this is the way to configure any introducers (regardless of if it is one or more). It is necessary to mention the automatic copying of the FURL entry from tahoe.cfg so that readers of configuration.txt will have a complete understanding and understand the backward-compatibility implications.
Also, please call it "Tahoe-LAFS" instead of "Tahoe" in docs. (For one thing, I don't want to have a name collision with http://sourceforge.net/projects/tahoe/ . For another thing, I think of "LAFS" as the protocol and the data formats and specification, and "Tahoe-LAFS" as the current Python implementation.)
Nice!
Again, edit running.html so that the "BASEDIR/introducers" is the only method of configuring introducers. It is not necessary to mention the automatic copying of introducer.furl from tahoe.cfg in running.html.
Please for each patch that you submit write a descriptive patch name and description like these ones: changeset:8ba536319689ec8e, changeset:1de4d2c594ee64c8, changeset:d0706d27ea2624b5, changeset:63b28d707b12202f, changeset:c18b934c6a8442f8, changeset:7cadb49b88c03209, changeset:be6139dad72cdf49.
Okay, good work on this! I'm hoping that by the time I have to write a mid-term review for Google (which I guess I have to do by Friday), that I will be able to say that you've completed a working subset of your summer goal.
Please post the doc patch as a darcs patch and I will review it right away. Now, what about test patches? You've already posted attachment:test_root.py and attachment:test_multi_introducers.py . Are those the complete set of tests for the "comment:361284" strategy?
Oh no, looking at them I see that attachment:test_root.py is asking the code-under-test to look at the old .introducer_furl attribute. That is not right; it should instead be requiring the code-under-test to not look at the old .introducer_furl attribute and instead to look only at the .introducer_furls attribute.
I see that attachment:test_multi_introducers.py is requiring the code-under-test to have 1 introducer for the "introducer.furl" entry in tahoe.cfg plus however many are in the "introducers" file. But what "introducers" file is used for this test? When this test code runs it will be inside a temporary directory (named "_trial_temp") which will not already have any "introducers" file present.
Let's make the test code provide an introducers file to the code-under-test, something like this:
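A sketch of such a test, using made-up FURLs and a local helper standing in for the real allmydata.client.Client lookup:

```python
import os
import tempfile
import unittest

def read_introducer_furls(basedir):
    # Stand-in for the code-under-test: the Client reading basedir/introducers.
    path = os.path.join(basedir, "introducers")
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

class MultiIntroducerTest(unittest.TestCase):
    def test_discovers_all_furls(self):
        # Provide an "introducers" file with two (made-up) FURLs.
        basedir = tempfile.mkdtemp()
        furl1 = "pb://111@example.net:1234/introducer"
        furl2 = "pb://222@example.net:1234/introducer"
        with open(os.path.join(basedir, "introducers"), "w") as f:
            f.write(furl1 + "\n" + furl2 + "\n")
        # In the real test this would instantiate allmydata.client.Client
        # on basedir and inspect its .introducer_furls attribute.
        self.assertEqual(read_introducer_furls(basedir), [furl1, furl2])
```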
That test would be testing that the Client discovers the two furls in the "introducers" file. Then we also need the following tests of the "comment:361284" strategy: create a Client object, and then check that it has an introducer client object for the furl entry from the tahoe.cfg file, and then check that a new "basedir/introducers" file has been created with that furl in it.
Attachment multiple-introducers-changes-in-architecture-configuration-running.dpatch (13563 bytes) added
doc changes for multiple introducers
I have kept the multiple introducers config file name as usual, but "introducers.cfg" could be another alternative. Another question: should this file initially be generated for the user, like tahoe.cfg?
To implement the modified comment:361284 strategy, I restructured the code in Client's init_introducer_clients like this:
But should the warning be sent somewhere else? Which one should be called, self.log() or log.msg()?
Attachment test_root.2.py (877 bytes) added
corrected test for checking the use of introducer_furls by root.py (multiple introducer version)
This test counts the number of furls loaded by the Client and checks whether that equals the response of the query made in root.py. Tested with 0-2 introducers (in the cfg file) and found working.
Replying to writefaruq:
You should use self.log() for logging (if the object in question subclasses from some class so that it has a self.log() method; in this case it does, because Client's parent class Node defines a log() method).
I wonder if there is a better way to communicate to the user than just logging a message. Not sure.
I'm really not sure that I agree with Brian's comment in http://tahoe-lafs.org/pipermail/tahoe-dev/2010-July/004663.html . The way Brian proposed and Faruq agreed to do it means that there are "two ways to do it"--you can either edit your tahoe.cfg's introducer.furl or you can edit your introducer.furls file. Users who see one of them may assume that it is the only one and then be surprised when they get different behavior than they expected (due to the existence of the other one). I guess I'm too sleepy to go into detail right now, but I want Faruq to know that I looked at this ticket tonight. :-)
See also my reply to Brian on tahoe-dev:
http://tahoe-lafs.org/pipermail/tahoe-dev/2010-July/004713.html
Replying to [zooko]comment:49:
Yes. But for displaying a warning to the user, I would print >>sys.stderr. (For tests, sys.stderr can be captured; see the existing tests in source:src/allmydata/test/test_runner.py .)
Replying to [davidsarah]comment:52:
That works for cli scripts, but for the Tahoe-LAFS node itself (unless it is launched with tahoe run or a possible future tahoe start --nodaemon), where would lines written to stderr go? I would hope that they would be logged, but it is possible they would be silently dropped.
Replying to [zooko]comment:53:
Good point. But the config files are only read at startup, so perhaps tahoe start could read and parse them just in order to display any warnings, before launching the node.
(I realize this doesn't guarantee that the contents of the files haven't changed between when tahoe start reads them and when the node does, but that would be very unusual.)
Alternatively, a solution to #71 ("client node probably started") might allow the node to communicate messages to the runner process at startup.
Attachment test_introducers_cfg.py (1122 bytes) added
Check if a new "introducers" cfg file can be created and tahoe.cfg's introducer_furl can be written in this file
Unsetting review-needed. This patch is not ready to be reviewed and then applied to trunk. However, it would probably be a good help and encouragement to Faruq if anyone would look at his code, docs, or comments and give him your thoughts. :-)
Attachment test_multi_introducers.2.py (640 bytes) added
Check if Client's number of introducer_clients equals the number of furls in the "introducers" file
attachment:test_multi_introducers.2.py looks like a good test of whether the allmydata.client.Client correctly reads all of the entries from the "introducers" config file. Please run pyflakes on it (you can just run python setup.py flakes) and fix any warnings that pyflakes reports.
Re: attachment:test_introducers_cfg.py, please add a docstring to the test_introducer_clients_count() method saying what this test is looking for in the behavior of the code under test. The comment that comes with the attachment on trac says:
Check if a new "introducers" cfg file can be created and tahoe.cfg's introducer_furl can be written in this file
But of course a file can be created! I guess from looking at the code and the name of test_introducer_clients_count() that it is intended to do something like this:
The basedir variable is unnecessary; remove it and replace os.path.join(basedir, "tahoe.cfg") with just "tahoe.cfg". The line at the end that reads MULTI_INTRODUCERS_CFG doesn't do anything; remove it.
Otherwise this looks like a good test.
Attachment test_introducers_cfg.2.py (1045 bytes) added
code refined by pyflakes
Attachment test_multi_introducers.3.py (544 bytes) added
code refined by pyflakes
Attachment test_root.3.py (850 bytes) added
code refined by pyflakes
Faruq:
Please merge all the tests into one file named test_multi_introducer.py.
Here is a branch to hold your work:
http://tahoe-lafs.org/trac/tahoe-lafs/browser/ticket68-multi-introducer
Here is a view of the buildbot which shows the history of builds of your branch (only showing the Supported Builders):
http://tahoe-lafs.org/buildbot/waterfall?builder=hardy-amd64&builder=windows&builder=Kyle+OpenBSD-4.6+amd64&builder=Arthur+lenny+c7+32bit&builder=David+A.+OpenSolaris+i386&builder=Ruben+Fedora&builder=Eugen+lenny-amd64&builder=Zooko+zomp+Mac-amd64+10.6+py2.6&builder=tarballs&branch=ticket68-multi-introducer
Please attach your most recent patches to this ticket and I will apply them to that branch and then trigger the buildbot to run the tests on all of our buildslaves.
Attachment test_multi_introducers.4.py (3871 bytes) added
Merged all tests
Attachment multiple-introducer-client-side-002.dpatch (5410 bytes) added
multi-introducers doc patch
The last three files: attachment:multiple-introducer-client-side-001.dpatch attachment:multiple-introducer-client-side-002.dpatch attachment:test_multi_introducers.4.py (patch sending failed for some unknown reason) should be applied/added to test repo.
Okay I applied the two patches and I copied attachment:test_multi_introducers.4.py into src/allmydata/test/test_multi_introducers.py . Then I ran these tests with this command:
The output from that command ended with this message:
Have you tried this yourself? I would have expected you to get the same error.
Attachment multiple-introducer-client-side-001.dpatch (11357 bytes) added
Client side code changes combined together, fixed warn_flag error
This error can be avoided by undoing the last patch attachment:multiple-introducer-client-side-001.dpatch and applying the latest one. I've replaced it with the correct version now.
Faruq: now that we've started storing your patches in this branch: source:ticket68-multi-introducer, there is no longer a good way to undo the old patches. So would you please provide a patch which gets added on top of the patches that are already in your branch? One way to do this would be to get a new repo from your branch, like this:
Then cd into the ticket68-multi-introducer repository and change the code therein to make the tests pass. But do not use darcs unrecord, darcs obliterate, or darcs amend-record in that repository, because those commands work by removing patches from the repository, and we can't (or don't want to) remove patches from the repository http://tahoe-lafs.org/source/tahoe-lafs/ticket68-multi-introducer on the server.
Okay, I merged trunk (which is currently 1.8.0rc1) into the source:ticket68-multi-introducer branch and ran a full build: here are the results. Then I applied your three patches from comment:361304 and ran a full build again: here are the results.
Replying to writefaruq:
I can't undo the last patch attachment:multiple-introducer-client-side-001.dpatch because, as described in comment:361307, we are going to maintain a history of all patches on source:ticket68-multi-introducer. For example, here is the history of such patches: http://tahoe-lafs.org/trac/tahoe-lafs/log/ticket68-multi-introducer/ and the one that you attached as the last attachment:multiple-introducer-client-side-001.dpatch I have now applied to that branch as [20100801142304-e2516-411e80c14e29287e8d9ce700e7b359e23fb45105].
Attachment multiple-introducer-client-side-001-x1.dpatch (2034 bytes) added
Fixed warn_flag error
Faruq: did you run the tests after you fixed the warn_flag error? If you did, what do you think of the results? If you did not, please run the tests and paste the results in here.
My note in comment:361305 tells you how to run the tests.
I've tested after applying this patch. Test result is at here: http://pastebin.com/1Ac3b6Jk A summary is given below.
Okay, good, now also please run more of the other tests to see if your patches broke anything else.
Attachment multiple-introducer-client-side-001-x2.dpatch (5056 bytes) added
tweaks to pass the full-tests
Just for reference, here is a hyperlink that shows you the most recent results of building the source:ticket68-multi-introducer branch on all of our Supported Builder: buildbot link
Faruq: I committed your latest patches and triggered the buildbot to test them. Use the buildbot link to see the results (I committed them just now, so look for the builds that started at 22:41:21 PDT on 2010-08-11).
You can see the patches that are on the branch here: http://tahoe-lafs.org/trac/tahoe-lafs/log/ticket68-multi-introducer/
The builds haven't finished yet so I don't know whether all the tests passed on all platforms, but I'm going to sleep now. :-)
Okay, as you can see from the buildbot link that shows Supported Builders testing this branch, the tests pass on the buildbot. Adding the review-needed tag to this ticket.
I added a question about multiple introducers to the FAQ wiki page.
So after closing this ticket, please edit the FAQ page.
Full source available at
http://tahoe-lafs.org/source/tahoe-lafs/ticket68-multi-introducer/
The final GSoC code is here
http://code.google.com/p/google-summer-of-code-2010-tahoe-lafs/downloads/detail?name=MOFaruque_Sarker.tar.gz&can=2&q=#makechanges
Some hints to use it
A second file, "BASEDIR/introducers", configures introducers. It is necessary to
write all FURL entries into this file. Each line in this file contains exactly
one FURL entry. For backward compatibility reasons, any "introducer.furl"
entry found in tahoe.cfg file will automatically be copied into this file. Keeping
any FURL entry in tahoe.cfg file is not recommended for new users.
Edit BASEDIR/introducers and add FURLs for each introducer. Of course you need to run them before you get a FURL.
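For illustration, a BASEDIR/introducers file listing two introducers would look like this (both FURLs here are made up):

```
pb://wfpeoemucit6vvmx6zgvsaxwulkzdeoa@introducer1.example.net:56018/introducer
pb://rstlbxvvjbqpwwrmcvpyll34zyvmfolu@introducer2.example.net:44801/introducer
```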
Play with them as you like.
Attachment ticket68-multi-introducer.tar.gz (1309681 bytes) added
A snapshot of working repository
I've installed the snapshot on 2 systems. Started an introducer on both systems. Started a storage node on both systems with both furls. I can see the storage nodes appearing in the web interface of both introducers and the storage node web interface contains both introducers.
I shut down one system; the web interface still shows the off-line system as active for both introducer and storage.
Trying to create a new directory causes the node to contact the off-line system and keeps busy with that (no time-out?). Request stays in "active operations" list, even after stopping the request.
Hm. Myckel: Could you please reproduce this and then about 10 seconds after you shutdown one storage server, click the "Report an Incident" button on the welcome page. Then again when you attempt to mkdir, please click the "Report an Incident" button a few seconds after you've done so.
Each time you click "Report an Incident" it creates a file in logs/incidents. Please attach those files to this ticket.
Faruq: we should write a unit test of this workflow: create two introducers, create a storage server pointing at both introducers, create a storage client pointing at both introducers, shut down one of the servers, and then initiate an operation in the storage client, such as mkdir (which is what Myckel did manually) or any other operation that uses storage servers.
Attachment incident-2010-10-31-082948-tx5qoxy.flog.bz2 (7846 bytes) added
First incident report (after shutdown, before making dir)
Attachment incident-2010-10-31-083037-4o3degq.flog.bz2 (8570 bytes) added
2nd incident log (after mkdir)
Replying to zooko:
Ok, files are attached. I hope they are useful, because after making the incident report I noticed that the storage server recovered. This might also not be related to the multiple introducer situation, because I've had it also happening when trying with volunteer grid (one storage node went off-line, I couldn't do anything any more, until restarting my storage node).
I've restarted the storage node and introducer that I shutdown. Took a few minutes before the other storage node and introducer noticed the new storage node and introducer.
Is there some heartbeat or small time out in place?
Replying to [Myckel]comment:76:
Wait, what? I'm confused. You created two introducers and two storage nodes, right? And then were you using one of the storage nodes to also be a gateway (== a storage client)? And then did you shut down the other one by running tahoe stop $BASEDIR on it?
Replying to Myckel:
Yes, the node retries opening a connection to each peer periodically, in an exponential back-off pattern (until it has backed off to trying only once per hour, at which point it keeps trying at that rate indefinitely).
So if the peer was down for 5 minutes then it might take up to 5 minutes after it is brought back up before you reconnect to it.
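A rough illustration of that back-off pattern (the numbers are illustrative; foolscap's real Reconnector also adds randomized jitter, omitted here for determinism):

```python
def next_reconnect_delay(previous_delay, factor=2.0, max_delay=3600.0):
    """Double the retry delay after each failed connection attempt,
    capping at one hour; after the cap is reached, retries continue
    at that rate indefinitely."""
    return min(previous_delay * factor, max_delay)

# A peer that stays down sees retry intervals growing toward the cap:
delays = []
d = 1.0  # initial delay in seconds (illustrative)
for _ in range(15):
    d = next_reconnect_delay(d)
    delays.append(d)
# delays grow 2, 4, 8, ... and then stay pinned at 3600 seconds
```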
Replying to [zooko]comment:78:
Guess I was not so clear. This is what I did:
2 computers:
Computer 1:
Run both an introducer and storage client (access it through the web interface).
Computer 2:
Run both an introducer and a storage client (can access it through the web interface, but don't bother with that).
Both introducers see both storage clients. Both storage clients say they are connected to the introducers. All fine so far.
Then I shutdown system 2, so NO tahoe stop $BASEDIR (I could also plug the power or do a hard reset).
Then on system 1 I try to make a dir through the web interface, and then everything stays busy while it tries to contact the storage node/client on system 2.
Faruq: as per comment:361320, we should add a test for this case. Removing the review-needed tag and adding the test-needed flag.
There's been some discussion of this ticket on the mailing list here and here and in the Tahoe-LAFS Weekly News.
Out of time for v1.9.0! But anyone who loves this, please jump in. There's no time like the present! Do some manual testing of Faruq's patch, write a new patch, write unit tests, etc. :-)
I've been using this patch with the ones in #1007 and #1010 (and foolscap tickets 150 and 151) on I2P with v1.8.3 and so far there haven't been any issues with the functionality.
It seems, however, that comments aren't allowed in $TAHOENODE/introducers. At least # doesn't work as a comment character. Having the ability to add comments would be a very welcome addition.
Just to give a heads up: most of the 18 storage nodes on our smallish grid on I2P have been using the multiple introducer patch since late November and things are still working well for us.
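A sketch of what a comment-aware parser for the introducers file could look like (hypothetical; the patch under discussion does not do this, which is what killyourtv is asking for):

```python
def parse_introducers(text):
    """Parse introducers-file text: one FURL per line, treating '#' as a
    comment character and skipping blank lines. Pb-style FURLs contain no
    '#', so stripping everything after '#' is safe."""
    furls = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            furls.append(line)
    return furls
```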
Also one of our users made some modifications that add colors to the introducer list as can be seen at http://i.imgur.com/aPbaY.png. After I refactor the patch for the current git revision I'll add it to this ticket.
killyourtv: cool! Thank you for the note. If I recall correctly, Faruq's patch didn't have a thorough unit test.
I've noticed several good contributions to Tahoe-LAFS that are blocked on not having unit tests. I think a lot of people know how to write Python code but aren't sure what we expect in terms of testing, or don't know how to use trial's features to test results that are deferred until a subsequent event. I've been thinking that having a "unit test tutorial" party could be fun, where everyone who has a patch for Tahoe-LAFS that needs tests comes to the IRC channel and we pick one and walk through how to write tests for it...
For anyone who wants to contribute to this ticket, the patches are available through darcs from this repo https://tahoe-lafs.org/trac/tahoe-lafs/browser/ticket68-multi-introducer , i.e.
darcs get --lazy https://tahoe-lafs.org/source/tahoe-lafs/ticket68-multi-introducer
killyourtv probably has them available in another form (unified diff?). It would be cool to port the darcs repo to be a git branch. If you do that, please add a comment to this ticket pointing to the git branch.
In case it's of use: https://github.com/kytvi2p/tahoe-lafs.
I made two branches, one for what I think should be close to 1.8.3 (it's not tagged) and one for 1.9.x (current git).
The patchset has been refactored to apply on top of the current build at https://github.com/kytvi2p/tahoe-lafs/tree/68-multi
Although everything seems to work, the unit tests are (unfortunately) still broken.
I've been working on restructuring the new IntroducerClient so that we can implement multi-introducer grids without losing announcement deduplication logic in the client. My work so far is here: https://github.com/lebek/tahoe-lafs/compare/master...68-multi-introducer
The way configuration works is a new [client]introducer.furls option which takes multiple values (whitespace or line break separated). If both [client]introducer.furl and [client]introducer.furls are set, the values are appended.
All introducer tests are passing at the moment, so theoretically this might work already. I'm still working on new tests specific to the multi-introducer setting. I also still need to make announcements idempotent in the introducer. Finally, I'll import Faruq's patches to the WUI and documentation; they shouldn't require much modification.
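The append behavior lebek describes could be sketched like this (a simplified stand-in, not the actual branch code):

```python
def combined_furls(introducer_furl=None, introducer_furls=None):
    """Combine the new multi-valued introducer.furls option (values
    separated by whitespace or line breaks) with a legacy single-valued
    introducer.furl, appending the legacy value if both are set."""
    furls = introducer_furls.split() if introducer_furls else []
    if introducer_furl and introducer_furl not in furls:
        furls.append(introducer_furl)
    return furls
```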
This isn't ready for Tahoe-LAFS v1.10, but as [//pipermail/tahoe-dev/2012-November/007867.html recently discussed], we've decided we'd like to try integrating it into trunk ASAP! Lebek, or anyone else who wants to help, please see that mailing list discussion and reply on tahoe-dev, or this ticket, or join us at the next Weekly Dev Chat.
Removing obsolete reference to vdrive servers in the Description.
I'd like to get this into trunk ASAP! So it can get thoroughly tested out for Tahoe-LAFS v1.11. If I understand correctly, lebek's notes at comment:361336 and [//pipermail/tahoe-dev/2012-November/007867.html our discussion] from a weekly dev chat are telling us what next steps to take.
#1402 was a duplicate, and there was a patch attached to it by socrates:
attachment:relay.py:ticket:1402
Replying to killyourtv:
david415 and I began updating this patch to work with post-1.10 versions of tahoe:
https://github.com/leif/tahoe-lafs/commits/ticket68 (tests do not pass yet, but it is connecting to multiple introducers).
Hopefully we'll have a cleaned up patch soon.
I'm cross-posting this comment to #68 and #467.
Here is a squashed commit of the multi-introducer and introducerless patches on top of the current master:
https://github.com/leif/tahoe-lafs/compare/master...introless-multiintro-squashed
And here is a 3-way merge combining the history of both feature branches with master in such a way that git log and git blame can still find the original commits: https://github.com/leif/tahoe-lafs/compare/master...introless-multiintro-with-history (creating this was a git adventure; I ended up doing the 3-way merge using -s ours and then doing another squash merge followed by git commit --amend).
I'm going to write more tests before submitting a pull request with one of these. But, if anyone wants to review or test it now I'd appreciate it!
Here is the latest introducerless/multi-introducer patch: https://github.com/leif/tahoe-lafs/commit/1ae5aaecbb68f13019b6bc2ba4632bb4a5623aaa (that is a squash merge on top of two other commits which will hopefully land on master soon).
It should perhaps have some more tests, but testing/review/feedback would be welcomed.
I gave some feedback, although it's a huge diff and probably needs more eyes on it.
I'm optimistically putting this in the 1.10.3 milestone; it may well get booted out to 1.11.
Here is the new version, after addressing daira's comments: https://github.com/leif/tahoe-lafs/commit/8fc8cd9151d4dc4c041867bac98aefff6a105729
I think this is nearly ready to merge, so more review and/or testing would be appreciated.
The one thing remaining that I think needs to be done is to add some tests to test_web.
Here is the latest introless-multiintro branch (with full history) with a few more commits since the squashed commit in my previous comment.
I posted a comment about my next steps for this branch on ticket #467.
Out of time for 1.10.3.
Milestone renamed
moving most tickets from 1.12 to 1.13 so we can release 1.12 with magic-folders
i've got this dev branch where i added init_introducer_clients:
https://github.com/david415/tahoe-lafs/tree/68.multi_intro.0
in the above dev branch i've gotten all the unit tests to pass... so i opened this pull-request here:
https://github.com/tahoe-lafs/tahoe-lafs/pull/338
please review
In 3b24e7e/trunk:
In d802135/trunk:
In 2e3ec41/trunk:
Ok, at long last, this ticket is done. We didn't implement the cool "gossip" approach, or the limited-flood thing, or the invitation thing. But nodes can now be configured with zero/one/many introducer FURLs (via a combination of tahoe.cfg introducer.furl= and the new NODEDIR/private/introducers.yaml), and servers will announce themselves to all introducers, and clients will merge announcements from all introducers.
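A sketch of that merge (the YAML file is shown here already parsed into a dict mapping petnames to {"furl": ...} entries; the "default" petname for the legacy tahoe.cfg entry is an assumption for illustration):

```python
def merge_introducer_config(tahoe_cfg_furl, introducers_yaml):
    """Combine a legacy introducer.furl= from tahoe.cfg with the entries
    from NODEDIR/private/introducers.yaml (already parsed), returning a
    dict mapping petname -> introducer FURL."""
    furls = {}
    if tahoe_cfg_furl:
        furls["default"] = tahoe_cfg_furl  # assumed petname for the legacy entry
    for petname, entry in (introducers_yaml.get("introducers") or {}).items():
        furls[petname] = entry["furl"]
    return furls
```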