UCWE when mapupdate gives up too early, then server errors require replacement servers #893
Reference: tahoe-lafs/trac#893
The Incidents that were reported in #877 (by Zooko, referring to
mutable write errors experienced by nejucomo) indicate a thorny
problem that is distinct from both #877 (caused by a reentrancy
error) and #540 (caused by a logic bug that affects small grids
where a publish wraps around the peer circle). Here's the setup:
be updated. Ideally, the shares are concentrated at the beginning
of the permuted peerlist, on the first N servers.
heuristic: if we've seen enough shares, and we've also seen a
span of contiguous servers (in permuted order) that tell us they
do not have a share, then we stop the search. The size of this
span was intended to be N+epsilon (where 'epsilon' is a tradeoff
between performance and safety, and is set to k). Unfortunately
1.5.0 had a bug, and the span size was set to k+epsilon instead.
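
The stopping heuristic described above can be illustrated with a minimal sketch. This is not the Tahoe-LAFS implementation: the function name, the list-of-counts representation, and the reading of "enough shares" as "at least k shares in total" are all assumptions made for illustration. The example layout (8 shares on 5 servers) is invented to resemble the incident, whose real sharemap shape is uncertain.

```python
from typing import Optional, Sequence


def mapupdate_may_stop(share_counts: Sequence[Optional[int]], k: int, gap: int) -> bool:
    """Hypothetical boundary check, not the real ServermapUpdater code.

    share_counts[i] is the answer from the i-th server in permuted order:
    None means no answer yet, 0 means "I have no share", n > 0 means n shares.
    """
    shares_seen = 0
    empty_run = 0
    for count in share_counts:
        if count is None:
            # Assumption: an unanswered server interrupts the contiguous run.
            empty_run = 0
            continue
        if count > 0:
            shares_seen += count
            empty_run = 0
        else:
            empty_run += 1
            # Assumption: "enough shares" means at least k shares in total.
            if shares_seen >= k and empty_run >= gap:
                return True
    return False


# Invented layout: 8 shares on 5 servers, followed by 3 empty servers.
incident_like = [2, 2, 2, 1, 1, 0, 0, 0]
print(mapupdate_may_stop(incident_like, k=3, gap=3))   # short span: stops early
print(mapupdate_may_stop(incident_like, k=3, gap=13))  # N+epsilon span: keeps going
```

With a span of only 3, the search stops before it can hear from any server further along the permuted list, which is the failure mode this ticket describes.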
In the Incident:
the responses had returned, because it found a boundary early.
I'm not entirely sure how the sharemap was shaped, but it looks
like it stopped with a span of 3 servers ("found our boundary,
11000"), where k=3 and N=10, and returned a sharemap with 8
shares in it, some doubled up (I think there were 5 servers
involved)
(from w6o6, with two shares, but was ignored
Unfortunately, two of these (both owned by secorp) experienced
"Permission Denied" errors when attempting to write out the new
shares, suggesting a configuration error (maybe the tahoe node
process is owned by the wrong user)
picks suffer from the same error, so the fallover process repeats
a few times.
w6o6, and sends it a new share, thinking that w6o6 has no shares (because the
servermap was not updated to include w6o6's late response).
shares that the test vector did not expect, causing a UCWE error
happened, even though UCWE was raised.
The biggest problem with this failure is that it is persistent. We
don't record any information that would tell a subsequent operation
to look further for existing shares, so exactly the same thing will
happen the next time we try to modify or repair the directory.
If secorp's servers weren't throwing errors, then I think the
condition would eventually fix itself: new shares would be placed on
his servers, bridging the span of servers without shares, and then
later mapupdate calls would keep going until they'd really seen all
of the shares.
Recently (changeset:eb1868628465a243, 08-Dec-2009) I fixed mapupdate to use N+epsilon
instead of k+epsilon. But the incident report suggests that it
stopped with a span of only 3 no-share servers. Looking more closely
at the code, I think it only waits for a span of epsilon (not
k+epsilon or N+epsilon), and that changeset:eb1868628465a243 changed something
different. I don't know if the thing that was changed would have
prevented this issue or not. It's possible that this is a
manifestation of #547 (mapupdate triggers on a false boundary), or
#549, or one of the other problems described in #546.
In general, we need to query more servers. But even if we increase
the span size or epsilon or whatever, there will always be a weird
situation that could be handled better if we queried more servers.
We'd like to have something more adaptive: if the code hits UCWE
because it didn't try hard enough, then it should try harder next
time.
How should we deal with this? We need something to persist from one
operation on a given mutable filenode to the next, some sort of hint
that says "Hey, last time we were surprised, so next time you should
look further". Or something that tells us that we learned about
shares on servers X+Y+Z, and so the next time we do a mapupdate, we
shouldn't consider it complete until we've gotten responses from
those servers (in addition to any others that we might decide to
query).
The most natural place to keep this state would be on the mutable
filenode instance. This would help with UCWE that occurs inside a
modify() call, because the same filenode is used for each retry, but
in general filenodes are pretty short-lived. We don't want to keep
the mutable filenode around in RAM forever. Maybe an LRU cache that
keeps filenodes around for a few minutes, so that users who
experience UCWE and retry the operation can benefit from recent
history.
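
One way to picture the "remember what surprised us" idea is a small, time-limited LRU cache. The sketch below is a variant of the suggestion above: instead of keeping whole filenodes alive, it keeps only a per-file set of servers that the next mapupdate should wait for. The class name, its methods, and the choice of key are hypothetical; nothing like this is claimed to exist in Tahoe-LAFS.

```python
import time
from collections import OrderedDict


class MapupdateHintCache:
    """Hypothetical LRU cache of "also wait for these servers" hints."""

    def __init__(self, max_entries=128, ttl_seconds=300.0, clock=time.monotonic):
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = OrderedDict()  # key -> (expiry_time, frozenset of server ids)

    def must_hear_from(self, storage_index):
        """Return the servers a mapupdate should wait for (empty if none known)."""
        entry = self._entries.get(storage_index)
        if entry is None:
            return frozenset()
        expiry, servers = entry
        if self._clock() >= expiry:
            # The hint is stale: forget it.
            del self._entries[storage_index]
            return frozenset()
        self._entries.move_to_end(storage_index)  # mark as recently used
        return servers

    def remember(self, storage_index, server_ids):
        """Record servers that surprised us, merging with any unexpired hint."""
        merged = self.must_hear_from(storage_index) | frozenset(server_ids)
        self._entries[storage_index] = (self._clock() + self._ttl, merged)
        self._entries.move_to_end(storage_index)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the least recently used


cache = MapupdateHintCache(ttl_seconds=300.0)
cache.remember("example-storage-index", ["w6o6"])
print(sorted(cache.must_hear_from("example-storage-index")))
```

Because the hints live only in the client's RAM, they help a user who retries within a few minutes but do nothing for a different client or a restarted node.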
A storage protocol that included "where-are-the-other-shares" hints
(#599) would help: this would improve the reliability of mapupdate,
since the persistent information would be kept on the storage
servers, next to the shares.
A publish process which rebalanced the shares (#232, or
#661/#543/#864) might help, by filling in the gaps, except that here
the gap was caused by a batch of servers all suffering from the same
configuration problem.
The right answer probably lies in having UCWE trigger an
immediate repair, and having repair fill in the gaps. But it'd be
nice if there were a way to stash some information on the shares
before the gaps that let later operations know that they should look
past the gap.
Looking at the changeset:eb1868628465a243 change more closely, I think it could be improved. The change increases self.num_peers_to_query, but that's only actually used by MODE_READ. The finish-criteria code for MODE_WRITE (ServermapUpdater._check_for_done) should not let the process finish unless we've received answers from at least self.num_peers_to_query, just like MODE_READ (or if we've run out of servers to query).
The gap size was always defined to be epsilon, but the minimum num_peers_to_query value was intended to make sure we'd looked far enough.
I don't know if this is really a 1.6.0 or critical issue, but since #877 from whence it spawned was both, I guess I should set it to the same. Feel free to downgrade or defer it. I don't think we can find a good long-term solution in the 1.6.0 timeframe, but maybe the num_peers_to_query change could fit in.
Maybe the uploader should have been unsurprised to find shares on w6o6. "Surprises" are when a server told you something during the mapupdate and then told you something different during the next phase. If you didn't ask a server during the mapupdate, then you shouldn't be surprised what he says in the next phase.
I guess this is because I think of surprises as indicating UncoordinatedWriteError, rather than indicating "the world of storage servers are in a stranger state than you assumed".
What do you think of that, Brian? If you already decided to upload version V+1 of your file, and then during the process you find some previously unused storage servers which are holding shares <= V then you should just unsurprisedly resend your write request updated to say "Yes I know you have version U, and please replace it with this version V+1.". Now if you find previously unused servers that have a version > V+1 or a version == V and with a different root hash then you should stop (but not with UncoordinatedWriteError).
I think extending the finish criteria for MODE_WRITE (ServermapUpdater._check_for_done) would be a good improvement (it would have solved nejucomo's problem) and is probably feasible for Tahoe-LAFS v1.6.
hm. Yeah, I suppose that if we were writing out version V+1, then it would be safe to overwrite any version (<V) or (=V,=oldroothash) that we saw. This would require either a clever test vector or some code to receive the "test failed, so no write" response and send out a new request with a new test vector. Not trivial, but not too hard.
I think that discovering a version (=V+1,!=newroothash) or (>V+1) should fire UCWE. That sounds to me like the exact definition of UCWE. Why do you think this should use a different error?
I'm not sure what it should do if it sees (=V,!=oldroothash). This indicates that a UCWE occurred in the past, and now there are two equal-version contenders for the file. Ideally the earlier write should have noticed this and performed an immediate repair. For the current publish, it should stop, but I'll agree that UCWE might not be the most accurate way to report it. Dunno.
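
The cases discussed in this exchange can be collected into one decision function. This is only a sketch of the rules as stated in the comments above, not code from Tahoe-LAFS: the function name and the string labels are hypothetical, and the "already-current" case (a server that already holds exactly the version being written) is an added assumption that the thread does not discuss.

```python
def classify_existing_share(seen_version, seen_roothash,
                            old_version, old_roothash, new_roothash):
    """Decide what to do about a share found on a server that the mapupdate missed.

    We are replacing version V (old_version, old_roothash) with version V+1
    (new_roothash). Returns one of the hypothetical labels:
    "overwrite", "stop", "already-current", "ucwe".
    """
    new_version = old_version + 1
    if seen_version < old_version:
        return "overwrite"            # (<V): safe to replace
    if seen_version == old_version:
        if seen_roothash == old_roothash:
            return "overwrite"        # (=V,=oldroothash): safe to replace
        return "stop"                 # (=V,!=oldroothash): evidence of an earlier UCWE
    if seen_version == new_version and seen_roothash == new_roothash:
        return "already-current"      # assumption: nothing to do for this share
    return "ucwe"                     # (=V+1,!=newroothash) or (>V+1)


# Writing version 8 over version 7; a late server turns out to hold version 5.
print(classify_existing_share(5, b"r5", 7, b"r7", b"r8"))
```

Which exception the "stop" case should raise is left open here, as it is in the discussion.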
Hm, the server doesn't currently support an operation that will apply a test-and-set if the share is present or else just apply the set if the share is absent, does it? source:src/allmydata/storage/server.py?rev=4118#L475 Oh look! It does! "# compare the vectors against an empty share, in which all reads return empty strings" So the current test vector should already work for this case, right? The client just needs to send a test vector saying if the server has a version less than the client's version then overwrite.
Nejucomo expostulated: "UncoordinatedWriteError! I'm the only person in the universe who knows this write cap!"
And he was right -- there was never an uncoordinated write event. The client can't tell for sure, but if you never saw a server claim that it had version V and then claim that it had version V+u, then you should perhaps blame your problem on random network problems or bugs sooner than on UncoordinatedWriteError. Could we have a related exception named something like MultipleVersionsExist of which UncoordinatedWriteError is the subclass that we use when we know that either some other client just now wrote when we were writing or else some server is doing a "roll-forward" in order to make it look like some other client has done so?
This is the same case as server's V > client's V, right? So having the client send a testv which was < V should handle both cases.
So there are three proposals for code changes in this ticket.
The finish-criteria code for MODE_WRITE (ServermapUpdater._check_for_done) should not let the process finish unless we've received answers from at least self.num_peers_to_query, just like MODE_READ (or if we've run out of servers to query).
MultipleVersionsExisted and raise that instead of UncoordinatedWriteError if you didn't see actual evidence of a server changing its store between your mapupdate and your write. I earlier proposed that UncoordinatedWriteError could be a subtype of MultipleVersionsExisted, but I'm not sure whether it should be a subtype or just a separate type. Either way is fine with me.
This might be related to #899, newly reported by Kyle Markley and Andrej Falout.
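
Two of the proposals summarized above, the stricter MODE_WRITE finish criterion and the separate exception type, lend themselves to a short sketch. Everything here is illustrative: the classes are local stand-ins rather than the real Tahoe-LAFS exceptions, the two helper functions and their parameters are hypothetical, and the subclass arrangement is just one of the two options the comment leaves open.

```python
class MultipleVersionsExisted(Exception):
    """Stand-in: unexpected shares were found, with no proof of a concurrent writer."""


class UncoordinatedWriteError(MultipleVersionsExisted):
    """Stand-in: a server's state changed between our mapupdate and our write."""


def write_mapupdate_may_finish(found_boundary, answers_received,
                               num_peers_to_query, servers_left_to_query):
    """Hypothetical finish criterion for MODE_WRITE.

    Finish only when a boundary was found AND at least num_peers_to_query
    servers have answered, or when there are no servers left to query.
    """
    if servers_left_to_query == 0:
        return True
    return found_boundary and answers_received >= num_peers_to_query


def error_for_unexpected_share(server_changed_since_mapupdate):
    """Pick the exception type according to the evidence actually observed."""
    if server_changed_since_mapupdate:
        return UncoordinatedWriteError
    return MultipleVersionsExisted


# A boundary after only 5 answers is not enough when 13 answers are required.
print(write_mapupdate_may_finish(True, 5, 13, servers_left_to_query=10))
# A server that was never asked during mapupdate gives no evidence of a race.
print(error_for_unexpected_share(False).__name__)
```

With the subclass arrangement, existing callers that catch the broader type would also catch UncoordinatedWriteError; making the two types unrelated would force callers to handle each explicitly.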
Not sure if we can fix this for 1.6.1, but it's definitely a candidate.
We're bumping this out of v1.6.1 just because we're not 100% sure that proposals 2 and 3 from comment:375412 are actually easy and safe.
It's really bothering me that mutable file upload and download behavior is so finicky, buggy, inefficient, hard to understand, different from immutable file upload and download behavior, etc. So I'm putting a bunch of tickets into the "1.8" Milestone. I am not, however, at this time, volunteering to work on these tickets, so it might be a mistake to put them into the 1.8 Milestone, but I really hope that someone else will volunteer or that I will decide to do it myself. :-)