intermittent allmydata.test.test_introducer.NonV1Server.test_failure failure #2896

Open
opened 2017-07-26 15:26:59 +00:00 by exarkun · 4 comments
exarkun commented 2017-07-26 15:26:59 +00:00
Owner
Traceback (most recent call last):
Failure: twisted.trial.util.DirtyReactorAggregateError: Reactor was unclean.
[DelayedCalls](wiki/DelayedCalls): (set twisted.internet.base.DelayedCall.debug = True to debug)
<DelayedCall 0x10a5a0bd8 [240.085453033s] called=0 cancelled=0 Broker.keepaliveTimerFired()>
<DelayedCall 0x10a5a04d0 [240.08481288s] called=0 cancelled=0 Broker.keepaliveTimerFired()>

and

Traceback (most recent call last):
Failure: twisted.trial.util.DirtyReactorAggregateError: Reactor was unclean.
Selectables:
<Negotiation #0 on 55074>
<<class 'twisted.internet.tcp.Client'> to ('127.0.0.1', 55074) at 10a3a4b50>

ie, a leaked TCP client and a couple delayed calls.

``` Traceback (most recent call last): Failure: twisted.trial.util.DirtyReactorAggregateError: Reactor was unclean. [DelayedCalls](wiki/DelayedCalls): (set twisted.internet.base.DelayedCall.debug = True to debug) <DelayedCall 0x10a5a0bd8 [240.085453033s] called=0 cancelled=0 Broker.keepaliveTimerFired()> <DelayedCall 0x10a5a04d0 [240.08481288s] called=0 cancelled=0 Broker.keepaliveTimerFired()> ``` and ``` Traceback (most recent call last): Failure: twisted.trial.util.DirtyReactorAggregateError: Reactor was unclean. Selectables: <Negotiation #0 on 55074> <<class 'twisted.internet.tcp.Client'> to ('127.0.0.1', 55074) at 10a3a4b50> ``` ie, a leaked TCP client and a couple delayed calls.
tahoe-lafs added the
code
normal
defect
1.12.1
labels 2017-07-26 15:26:59 +00:00
tahoe-lafs added this to the undecided milestone 2017-07-26 15:26:59 +00:00
exarkun commented 2017-07-26 15:47:18 +00:00
Author
Owner

This was observed on the OS X travis-ci job. It may be that this test can only realistically fail on OS X due to distinctive ordering of TCP events on that platform.

This was observed on the OS X travis-ci job. It may be that this test can only realistically fail on OS X due to distinctive ordering of TCP events on that platform.
exarkun commented 2017-07-26 17:38:08 +00:00
Author
Owner

It looks like Tub.stopService doesn't actually wait for its (Foolscap) Broker connections to close. Instead, it closes them and waits about a reactor tick ("eventually"). Evidently this is good enough most of the time but it's not a guarantee the connections will have been closed in time.

Fortunately, Broker already has an API for being notified when its connection is actually lost. Unfortunately, for some reason, Tub.stopService explicitly disables that functionality when disconnecting its Brokers! I don't know why it does this.

It looks like `Tub.stopService` doesn't actually wait for its (Foolscap) `Broker` connections to close. Instead, it closes them and waits about a reactor tick ("eventually"). Evidently this is good enough most of the time but it's not a guarantee the connections will have been closed in time. Fortunately, `Broker` already has an API for being notified when its connection is actually lost. Unfortunately, for some reason, `Tub.stopService` explicitly disables that functionality when disconnecting its `Broker`s! I don't know why it does this.
exarkun commented 2017-07-26 20:14:39 +00:00
Author
Owner

I think this will be fixed by a Foolscap change, https://github.com/warner/foolscap/pull/38

I think this will be fixed by a Foolscap change, <https://github.com/warner/foolscap/pull/38>
warner commented 2017-07-27 07:15:59 +00:00
Author
Owner

Foolscap-0.12.7 has been released, with that change.

Foolscap-0.12.7 has been released, with that change.
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: tahoe-lafs/trac-2024-07-25#2896
No description provided.