[Netalyzr] upstream buffering test

Nicholas Weaver nweaver at ICSI.Berkeley.EDU
Mon Mar 5 09:05:23 PST 2012


On Mar 4, 2012, at 5:25 PM, Jim Gettys wrote:

> On 03/04/2012 07:34 PM, Vern Paxson wrote:
>> [user Brian dropped from thread for now]
>> 
>>> Can it not be made to more accurately reflect the actual situation then?
>>> ....
>>> Ultimately, though, Netalyzr is making people think they have a
>>> bufferbloat problem when they don't, if I'm understanding all of this
>>> correctly.
>>> 
>>> I just wonder whether the test could be made to better reflect the
>>> bloatedness of the buffer in question, so that people are not chasing ghosts.
>> Well, he asks a good question.  In particular, if AQM actually is deployed
>> a non-trivial proportion of the time, then it really behooves us to try
>> to detect that rather than playing into Jim Gettys' pet gripe.
> 
> I've never seen Netalyzr give me false positives; false negatives, yes,
> but not false positives.
> 
> However, I'd sure like a test to tell whether the ISP is doing AQM or not:
> Van doesn't trust the one paper I've seen on the topic (Dischinger et
> al., IIRC).  And at some point, we sure would like home routers running
> an AQM too (Dave Taht has hacked something together using RED at the
> moment, but CoDel is *much* more interesting...)

I think this sort of test would need to be very different from the current test.

The current test is a "slam the net with UDP and see what happens" approach, but an AQM-based test needs to be more subtle:


For base traffic, it would need multiple TCP streams (probably 3 or 4, following the Ookla speedtest model, where 3-4 streams are enough to saturate >20 Mbps for almost everyone regardless of OS initial window size issues), with each TCP stream carrying an internal timing channel in its data stream.
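A minimal sketch of what I mean by that timing channel (not Netalyzr code; the chunk size, record format, and function names are all assumptions for illustration): each stream stamps every chunk with its send time, and the far end tracks how much later each chunk arrives relative to the first one, which cancels the clock offset between the two hosts and leaves the queueing-delay growth.

import socket, struct, threading, time

CHUNK = 8192                   # bytes per timestamped record (assumed)
STAMP = struct.Struct("!d")    # 8-byte send timestamp, network byte order

def send_stream(host, port, duration=10.0):
    # One saturating TCP stream; every CHUNK begins with its send time.
    s = socket.create_connection((host, port))
    pad = b"\x00" * (CHUNK - STAMP.size)
    end = time.monotonic() + duration
    while time.monotonic() < end:
        s.sendall(STAMP.pack(time.time()) + pad)
    s.close()

def run_base_traffic(host, port, streams=4):
    # Ookla-style parallelism: 3-4 streams to saturate a >20 Mbps link.
    ts = [threading.Thread(target=send_stream, args=(host, port))
          for _ in range(streams)]
    for t in ts: t.start()
    for t in ts: t.join()

def in_channel_stretch(recv_times, send_times):
    # Per-chunk (arrival - send) minus the first chunk's value: the
    # clock offset cancels, leaving queueing-delay growth in seconds.
    deltas = [r - s for r, s in zip(recv_times, send_times)]
    return [d - deltas[0] for d in deltas]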


And for background measurement, we'd need a separate UDP ping plus a low-rate (~10 pps) in-channel TCP ping.
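The UDP side could look something like this (again a hypothetical sketch; it assumes a simple echo server on the far end): probes carry sequence numbers and get matched back to their send times, so a reply delayed by a deep queue still counts as latency rather than being miscounted as loss.

import select, socket, struct, time

def udp_ping(host, port, rate_pps=10, duration=10.0):
    # Low-rate UDP probe: seq-numbered packets sent at ~rate_pps and
    # matched back to their send times on return.
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setblocking(False)
    sent_at, rtts = {}, {}
    interval = 1.0 / rate_pps
    start = time.monotonic()
    seq = 0
    while time.monotonic() < start + duration + 2.0:   # 2 s drain tail
        now = time.monotonic()
        if now < start + duration and now >= start + seq * interval:
            s.sendto(struct.pack("!I", seq), (host, port))
            sent_at[seq] = now
            seq += 1
        if select.select([s], [], [], 0.01)[0]:
            data, _ = s.recvfrom(64)
            k = struct.unpack("!I", data[:4])[0]
            if k in sent_at and k not in rtts:
                rtts[k] = time.monotonic() - sent_at[k]
    loss = 1.0 - len(rtts) / max(seq, 1)
    return rtts, loss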

If it's a big buffer, we'd see a big timing stretch on the ping packets.

If it's just a "small buffer", we should see the UDP and TCP pings stretch out to the buffer size, but we may still see loss in the UDP ping or delays in the TCP ping due to packet loss.

If it's a multiple-queue buffer, we shouldn't see any latency increase in the UDP and TCP pings, but we should see the in-channel timing stretch.
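Pulling those three cases together as a toy decision rule (the thresholds here are illustrative guesses, not measured values):

def classify_buffer(ping_stretch_ms, in_channel_stretch_ms, ping_loss,
                    big_ms=400.0, small_ms=50.0):
    # Toy classifier for the three cases above; thresholds are guesses.
    if ping_stretch_ms >= big_ms:
        return "big buffer"       # pings stretch far out under load
    if ping_stretch_ms >= small_ms or ping_loss > 0.01:
        return "small buffer"     # pings stretch to the buffer size,
                                  # with UDP loss / TCP delays as corroboration
    if in_channel_stretch_ms >= small_ms:
        return "multiple queues"  # only the loaded flows see the delay
    return "inconclusive"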


I'm not sure whether we'd generate enough traffic or run long enough to detect RED or similar, since the test would need to run long enough to see a lot of drops in the TCP flows to be sure that the low-rate TCP flow isn't just getting lucky and avoiding losses.  But we could keep the old UDP test as well: if the UDP test measures a substantially larger buffer than the TCP test does, that would suggest active queue management of some sort.
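That comparison could be as simple as the following (the 2x ratio is an arbitrary placeholder, not a measured threshold):

def suggests_aqm(udp_buffer_ms, tcp_buffer_ms, ratio=2.0):
    # If the old "slam with UDP" test fills far more buffer than the
    # TCP-based test ever sees, something is actively managing the
    # queue for the TCP flows.  The ratio is an arbitrary guess.
    return udp_buffer_ms > ratio * tcp_buffer_ms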

OTOH, maintaining the old UDP test would add about 30 seconds to the execution time, roughly 25% of the current total.




