add tcpdump data to viz tool #1269

Open
opened 2010-11-24 00:00:11 +00:00 by warner · 0 comments
warner commented 2010-11-24 00:00:11 +00:00
Owner

as mentioned in [comment:121264](/tahoe-lafs/trac-2024-07-25/issues/1170#issuecomment-121264) in the context of the not-yet-landed #1200 viz tool:

> These visualization tools are a lot of fun. One direction to explore
> is to record some packet timings (with tcpdump) and add it as an extra
> row: that would show us how much latency/load Foolscap is spending
> before it delivers a message response to the application.

The idea would be to start a tcpdump process just before starting
a download, then run a tool over the output to extract just the relevant
packets (actually you'd want a tool that starts by asking the tahoe
client for a list of its connections, to get the port numbers, then runs
tcpdump itself with the right filter arguments). You'd store some
condensed form of the output (maybe a pickled list of timestamps) in a
directory where web/status.py could find it. Then status.py
would serve packet timestamps in the same JSON bundle as the other
download events (in particular the tx/rx of data-block requests). These
packet timestamps would then be shown on the same chart as the
application-level requests.
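The condensing step above might look something like this minimal Python sketch. It assumes `tcpdump -tt -n` text output (epoch timestamps, numeric addresses); the helper names and the regex are illustrative, not from any existing Tahoe tool:

```python
import pickle
import re

# Matches a `tcpdump -tt -n` line such as:
#   1290556811.123456 IP 10.0.0.1.4567 > 10.0.0.2.8098: Flags [P.] ...
LINE_RE = re.compile(r"^(\d+\.\d+) IP \S+\.(\d+) > \S+\.(\d+):")

def extract_timestamps(lines, ports):
    """Return epoch timestamps of packets touching any port in `ports`
    (the port numbers the tool learned by asking the tahoe client for
    its connection list)."""
    stamps = []
    for line in lines:
        m = LINE_RE.match(line)
        if m and (int(m.group(2)) in ports or int(m.group(3)) in ports):
            stamps.append(float(m.group(1)))
    return stamps

def save_trace(stamps, path):
    # The condensed form the ticket suggests: a pickled list of
    # timestamps, dropped where web/status.py could later find it.
    with open(path, "wb") as f:
        pickle.dump(stamps, f)
```

status.py could then unpickle this list and fold it into the JSON bundle alongside the download events.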

(Another thought is to have the tcpdump process publish its data over
HTTP, and put a box on the viz page to paste in the URL of that process,
so it can fetch the data itself. This requires a browser that supports
[CORS](http://www.w3.org/TR/cors/) (also see
[here](https://developer.mozilla.org/en/http_access_control)), but
support dates back to Firefox 3.5 and maybe IE7.)
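A sketch of that publishing side, using only the stdlib HTTP server: the viz page could fetch the timestamps cross-origin as long as the response carries the CORS header. The class and function names here are illustrative:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class TraceHandler(BaseHTTPRequestHandler):
    timestamps = []  # filled in by the capture loop

    def do_GET(self):
        body = json.dumps(self.timestamps).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        # Without this header, a CORS-enforcing browser refuses to hand
        # the response to the viz page's cross-origin fetch.
        self.send_header("Access-Control-Allow-Origin", "*")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the sketch quiet

def publish(timestamps, port=0):
    """Serve `timestamps` as JSON; port=0 picks a free ephemeral port."""
    TraceHandler.timestamps = timestamps
    return HTTPServer(("127.0.0.1", port), TraceHandler)
```

The viz page would then issue an XMLHttpRequest against the pasted URL and merge the returned list into its chart data.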

The goal would be to eyeball how much overhead is coming from Foolscap
and the network layer. Even though the data inside the SSL connections
would be opaque to tcpdump, all we really care about is the
timing. It should also be possible to see how multiple small messages
are combined into a single packet (Nagle), and maybe how a small message
gets stalled behind some other large messages (head-of-line blocking).
Contention between parallel requests to multiple servers might also show
up here.

It would be great to be able to do this on the server side as well, and
get a sense for how the delay is divided between the outbound network
trip, the server's internal processing, and the return network trip. Of
course, this assumes synchronized clocks, but perhaps the
tcpdump-running tool could exchange a couple of packets with timestamps
before the download starts, a sort of cheap stripped-down NTP, and apply
the offset to the resulting packet trace.
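That "cheap stripped-down NTP" could be as simple as one timestamped round trip: assume the two network legs are symmetric, so the server's reply was stamped roughly at the midpoint of the round trip. A minimal sketch (function names are mine, not from any Tahoe API):

```python
import time

def estimate_offset(server_clock):
    """Estimate (server clock - client clock) from one round trip.
    `server_clock()` performs the exchange and returns the server's
    notion of "now"."""
    t0 = time.time()     # client send time
    t1 = server_clock()  # server's timestamp, taken mid-flight
    t2 = time.time()     # client receive time
    # Symmetric-path assumption: the server stamped t1 at about the
    # midpoint between t0 and t2.
    return t1 - (t0 + t2) / 2.0

def align_trace(server_stamps, offset):
    """Map server-side packet timestamps onto the client's clock."""
    return [t - offset for t in server_stamps]
```

Applying `align_trace` to the server-side tcpdump output would let both traces share one time axis on the chart, splitting the total delay into outbound trip, server processing, and return trip.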

tahoe-lafs added the code-encoding, major, enhancement, and 1.8.0 labels 2010-11-24 00:00:11 +00:00
tahoe-lafs added this to the undecided milestone 2010-11-24 00:00:11 +00:00
Reference: tahoe-lafs/trac-2024-07-25#1269