[Netalyzr] Error reported on test...

Dave Taht dave.taht at gmail.com
Mon Apr 14 10:48:40 PDT 2014

On Mon, Apr 14, 2014 at 9:10 AM, Jim Gettys <jg at freedesktop.org> wrote:
> On Mon, Apr 14, 2014 at 12:03 PM, Nicholas Weaver
> <nweaver at icsi.berkeley.edu> wrote:
>> I responded just to Jim and not the group.
>> But I should add some on the latter comment by Jim, as I didn't answer in
>> as much detail and it's probably of interest to the general group as well,
>> and starting a discussion here would be good.

I am inclined to let you independently come up with ideas and not say anything,
in the interest of you folks finding newer and more creative analysis methods.

But, oh, well, here goes...

Please take a look at netperf-wrapper's test and analysis suite.

While it's sprouted a gui and a ton of new tests and plots, it's still
only a tool a professional could love. Do something simpler, please. :)
Suggestions for better plotting mechanisms and tests are highly desired.

>> On Apr 9, 2014, at 12:23 PM, Jim Gettys <jg at freedesktop.org> wrote:
>> > Any progress on adding a test for whether flow queuing is enabled?
>> > There are commercial routers out there shipping fq_codel, and it would be
>> > interesting to be able to watch it deploy.

I'd be happy to have any data on what extent sfq, drr, and sqf are deployed.

free.fr (the biggest fq_codel deployment I know of) deployed sfq 7+
years back, and it still runs on most of their older CPE that can't do
fq_codel. Over 5 million users in total.

sfq is not only fq, but also has a fairly low packet limit (127 packets) by default.
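For reference, a sketch of how such an sfq deployment might be configured with tc (the interface name and the 10-second perturbation interval are illustrative assumptions here, not free.fr's actual settings):

```shell
# Sketch only: sfq with its small 127-packet queue and a 10s hash
# perturbation; "eth0" is a placeholder for the upstream interface.
tc qdisc add dev eth0 root sfq perturb 10 limit 127
```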

I'd love to know if the existing Netalyzr tests of traffic from their AS
showed anything interesting, be that measured buffer size or other behaviors.

SFQ had a problem in that the permutation of the hash happened every 10
seconds, which would reorder and frequently reset big tcp flows (which
was part of its hidden advantages). You can clearly see this SFQ
behavior in longer-term flow data.

Free's fq_codel-enabled Revolution v6 box started to deploy in August
2012, optimized on the upload side only.

Sadly, on the download side I expect to see little in most cases except
policing.

Perhaps by developing a new test that probes for fq techniques, older
data could be re-examined as well.

>> This has been on our todo list for a while, but we haven't implemented it
>> yet, and we are looking for feedback before we start coding.
>> Our thoughts are as follows:
>> Create a "load traffic" that is 3 TCP streams.  3 seems to be a good
>> consensus load value for bandwidth testing (its what speedtest.net uses).

I assume this is a test that does upload + measurement, and then download
+ measurement, not all 3 at the same time?

Scaling up tcp is a function of the RTT and bandwidth to the test
server, as well as other factors, like the tcp variant itself.
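To make that concrete, here's a toy back-of-the-envelope model (a sketch, not a measurement: it assumes pure slow start with an initial window of 10 segments, and ignores loss, pacing, and the receive window) of how long one tcp flow takes to first cover the bandwidth-delay product:

```python
import math

def slow_start_time(rate_mbit, rtt_ms, mss=1448, init_cwnd=10):
    """Seconds of slow start needed before cwnd first covers the BDP:
    cwnd doubles once per RTT starting from init_cwnd segments."""
    bdp_bytes = rate_mbit * 1e6 / 8 * rtt_ms / 1e3
    bdp_segments = bdp_bytes / mss
    if bdp_segments <= init_cwnd:
        return 0.0
    rounds = math.ceil(math.log2(bdp_segments / init_cwnd))
    return rounds * rtt_ms / 1e3

print(slow_start_time(20, 65))  # 20 Mbit at a 65ms baseline RTT
print(slow_start_time(20, 15))  # same rate, 1/4 the RTT
```

By this naive model the 15ms path ramps much faster than the 65ms one; the point above is exactly that real modems, CMTSes, and tcp stacks don't follow the naive model.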

RRUL does all three tests at the same time, and tosses in
classification as well.

For example, in this test series it took 15 seconds at a 65ms baseline
RTT to get the worst-case behavior out of a cable modem and CMTS using
the rrul test.


At a 15ms RTT you'd think it'd be faster to peak, right? No... roughly
the same cable modem, same service, a different tcp, 1/4 the RTT.


There are a ton of other plots and data from that test series that you
can browse through and reparse with netperf-wrapper's gui.


>> In parallel, use our UDP ping test (the one we use just before to generate
>> the quiescent latency) and use that to determine the RTT.

I have seen qos systems that blithely prioritize all udp traffic over tcp traffic.

So I might argue for an additional ping-like flow over tcp...

I'd also prefer a voip-like flow (independent isochronous streams at a
20ms interval), rather than "ping".
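A minimal sketch of such a voip-like probe (all names here are mine; a loopback echo thread stands in for a remote reflector, and a real test would send and receive asynchronously so a late echo doesn't stall the 20ms cadence):

```python
import socket, struct, threading, time

def _echo(sock, n):
    # Echo each datagram straight back to its sender.
    for _ in range(n):
        data, addr = sock.recvfrom(64)
        sock.sendto(data, addr)

def isochronous_probe(n=50, interval=0.02):
    """Send one timestamped UDP packet every `interval` seconds (a crude
    isochronous voip-like stream) and return the per-packet RTTs."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    srv.bind(("127.0.0.1", 0))  # stand-in reflector on an ephemeral port
    addr = srv.getsockname()
    threading.Thread(target=_echo, args=(srv, n), daemon=True).start()

    cli = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    cli.settimeout(2.0)
    rtts = []
    for seq in range(n):
        cli.sendto(struct.pack("!Id", seq, time.monotonic()), addr)
        data, _ = cli.recvfrom(64)
        _, sent = struct.unpack("!Id", data)
        rtts.append(time.monotonic() - sent)
        time.sleep(interval)
    return rtts

if __name__ == "__main__":
    samples = isochronous_probe(n=10)
    print(len(samples), max(samples))
```

Under load, the interesting signal is how much those per-packet RTTs inflate relative to the quiescent baseline.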

>> In cases where there is no smart queuing, this should get the same results
>> as before, as 3 streams should start quickly enough that the 5 second rampup
>> should leave things in steady state.

For TCP, this depends on the RTT, bandwidth, and other factors, per the above.

>> But if there is smart queueing, the UDP packets should (hopefully) be
>> given enough priority to not see queuing delays.  Since UDP, rather than
>> TCP, is used for the interactive tasks most affected by buffering issues, if
>> the UDP is not given priority in the queueing delay, then this isn't an
>> effective solution for the problem.

The reasoning is slightly flawed, but I can live with it as a simple
test. We test both udp and icmp at the same time, with different
classifications, in the rrul suite.

Plug - a simple test for e2e diffserv preservation (in both directions)
would be nice.
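As a sketch of what the sending half of such a test could look like (the code points and the use of IP_TOS are standard; the helper names are mine, and the receive side, which would check whether the mark survived end to end, is omitted):

```python
import socket

# A few common DSCP code points; the IPv4 TOS byte carries DSCP << 2.
DSCP = {"CS0": 0, "CS1": 8, "AF11": 10, "CS5": 40, "EF": 46}

def tos_byte(dscp_name):
    """TOS byte for a named DSCP code point (ECN bits left at 0)."""
    return DSCP[dscp_name] << 2

def marked_udp_socket(dscp_name):
    """UDP socket whose outgoing packets carry the given DSCP marking
    (Linux/BSD IP_TOS; the far end checks whether the mark survived)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos_byte(dscp_name))
    return s

print(hex(tos_byte("EF")))  # EF (46) in the upper six bits of the byte
```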

I ran out of time, funding, and inspiration to get "twd" off the ground.


> This will work. Doesn't matter whether you use a UDP ping test or not; you
> could just as well arrange a ping operation on a separate TCP flow that you
> are not driving to saturation. if the bottleneck link identifies it as a
> separate flow and implements flow queuing, the observed time should not
> significantly increase (or will only increase to the remaining uncontrolled
> buffering at the bottleneck, a common situation since drivers may have
> buffering out of control of a flow queuing algorithm).
> Dave, any other suggestions?

Well, moving forward, Netalyzr detecting the codel portion of the
algorithm would be nice - seeing the drop rate increase along a rough
invsqrt interval would be a good detection mechanism. You can see
codel's drop behavior vs tail drop here:


I have had plenty of users saying that codel wasn't working because
they were detecting 500ms worth of buffering....
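For reference, the invsqrt pattern mentioned above falls out of codel's control law: in the dropping state, the k-th drop is scheduled interval/sqrt(k) after the previous one, so the drop rate climbs along an inverse-sqrt curve. A small sketch (the 100ms interval and drop count are illustrative defaults, not a claim about any particular deployment):

```python
import math

def codel_drop_times(interval=0.1, drops=8):
    """Successive drop times under CoDel's control law: while in the
    dropping state, the gap before the k-th drop is interval/sqrt(k),
    so the gaps shrink and the drop rate rises."""
    t, times = 0.0, []
    for count in range(1, drops + 1):
        t += interval / math.sqrt(count)
        times.append(t)
    return times

print([round(x, 3) for x in codel_drop_times()])
```

Spotting that shrinking-gap drop signature in a packet trace is one plausible way a tool like Netalyzr could distinguish codel from plain tail drop.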

>                         - Jim

Dave Täht

NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
