From vern at icir.org Sun Mar 4 16:34:35 2012
From: vern at icir.org (Vern Paxson)
Date: Sun, 04 Mar 2012 16:34:35 -0800
Subject: [Netalyzr] upstream buffering test
In-Reply-To: <4F4B8CD8.9080401@interlinx.bc.ca> (Mon, 27 Feb 2012 09:02:00 EST).
Message-ID: <20120305003435.13E402C4002@rock.ICSI.Berkeley.EDU>

[user Brian dropped from thread for now]

> Can it not be made to more accurately reflect the actual situation then?
> ....
> Ultimately though Netalyzr is making people think they have a
> bufferbloat problem when they don't, if I'm understanding all of this
> correctly.
>
> I just wonder if the test cannot be made to better reflect the
> bloatedness of the buffer in question so that people are not chasing ghosts.

Well, he asks a good question. In particular, if AQM actually is deployed
a non-trivial proportion of the time, then it really behooves us to try
to detect that rather than playing into Jim Gettys' pet gripe.

What about trying a second flow (ideally, TCP, though that's messy
to measure) concurrent with the test to see if it's somehow unaffected?
I'm not sure whether that would fit with his particular setup (depends on
the nature of the QoS), but would be interesting to try, and he certainly
sounds willing to test it for us.

Vern

From jg at freedesktop.org Sun Mar 4 17:25:59 2012
From: jg at freedesktop.org (Jim Gettys)
Date: Sun, 04 Mar 2012 20:25:59 -0500
Subject: [Netalyzr] upstream buffering test
In-Reply-To: <20120305003435.13E402C4002@rock.ICSI.Berkeley.EDU>
References: <20120305003435.13E402C4002@rock.ICSI.Berkeley.EDU>
Message-ID: <4F541627.5000105@freedesktop.org>

On 03/04/2012 07:34 PM, Vern Paxson wrote:
> [user Brian dropped from thread for now]
>
>> Can it not be made to more accurately reflect the actual situation then?
>> ....
>> Ultimately though Netalyzr is making people think they have a
>> bufferbloat problem when they don't, if I'm understanding all of this
>> correctly.
>>
>> I just wonder if the test cannot be made to better reflect the
>> bloatedness of the buffer in question so that people are not chasing ghosts.
> Well, he asks a good question. In particular, if AQM actually is deployed
> a non-trivial proportion of the time, then it really behooves us to try
> to detect that rather than playing into Jim Gettys' pet gripe.

I've never seen netalyzr give me false positives; false negatives, yes,
but not false positives.

However, I'd sure like a test to tell if the ISP is doing AQM or not:
Van doesn't trust the one paper I've seen on the topic (Dischinger et
al., IIRC). And at some point, we sure would like home routers running
an AQM too (Dave Taht has a hack together using RED at the moment, but
CoDel is *much* more interesting...)

I do want to understand better what OpenWrt is actually doing when its
QoS scripts are invoked. But I've had a cold this week, and haven't
gone and looked yet. I know Dave has done major surgery to them in his
recent CeroWrt work.

- Jim

>
> What about trying a second flow (ideally, TCP, though that's messy
> to measure) concurrent with the test to see if it's somehow unaffected?
> I'm not sure whether that would fit with his particular setup (depends on
> the nature of the QoS), but would be interesting to try, and he certainly
> sounds willing to test it for us.
>
> Vern
> _______________________________________________
> Netalyzr mailing list
> Netalyzr at mailman.ICSI.Berkeley.EDU
> http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/netalyzr

From nweaver at ICSI.Berkeley.EDU Mon Mar 5 09:05:23 2012
From: nweaver at ICSI.Berkeley.EDU (Nicholas Weaver)
Date: Mon, 5 Mar 2012 09:05:23 -0800
Subject: [Netalyzr] upstream buffering test
In-Reply-To: <4F541627.5000105@freedesktop.org>
References: <20120305003435.13E402C4002@rock.ICSI.Berkeley.EDU> <4F541627.5000105@freedesktop.org>
Message-ID: <03B7747A-1CAA-46E9-B91D-FF06F2604030@ICSI.Berkeley.EDU>

On Mar 4, 2012, at 5:25 PM, Jim Gettys wrote:

> On 03/04/2012 07:34 PM, Vern Paxson wrote:
>> [user Brian dropped from thread for now]
>>
>>> Can it not be made to more accurately reflect the actual situation then?
>>> ....
>>> Ultimately though Netalyzr is making people think they have a
>>> bufferbloat problem when they don't, if I'm understanding all of this
>>> correctly.
>>>
>>> I just wonder if the test cannot be made to better reflect the
>>> bloatedness of the buffer in question so that people are not chasing ghosts.
>> Well, he asks a good question. In particular, if AQM actually is deployed
>> a non-trivial proportion of the time, then it really behooves us to try
>> to detect that rather than playing into Jim Gettys' pet gripe.
>
> I've never seen netalyzr give me false positives; false negatives, yes,
> but not false positives.
>
> However, I'd sure like a test to tell if the ISP is doing AQM or not:
> Van doesn't trust the one paper I've seen on the topic (Dischinger et
> al., IIRC). And at some point, we sure would like home routers running
> an AQM too (Dave Taht has a hack together using RED at the moment, but
> CoDel is *much* more interesting...)

I think this sort of test would need to be very different than the current test.
The current test is a "Slam the net with UDP and see what happens", but an
AQM based test needs to be more subtle:

For base traffic, it would need to be multiple TCP streams (probably 3 or 4,
based on the Ookla speedtest model that 3-4 is good for almost all to
saturate >20 Mbps regardless of OS initial window size issues), with the TCP
stream having an internal timing channel in the data stream.

And for background measurement we'd need a separate UDP ping plus a low-rate
(~10 pps), in-channel TCP ping.

If it's a big buffer, we'd see the big timing stretch on the ping packets.

If it's just a "small buffer", we should see the UDP and TCP pings stretch
out to the buffer size, but may still see loss in the UDP ping or delays in
the TCP ping due to packet loss.

If it's a multiple-queue buffer, we shouldn't see any latency increase in
the UDP and TCP pings, but should see the in-channel timing stretch.

I'm not sure if we'd get enough traffic or run long enough to detect RED or
similar, since we'd need to run long enough to see a lot of drops in the TCP
flows to be sure that the low-rate TCP flow is not just being lucky and
avoiding losses.

But we could maintain the old UDP test as well and if the UDP test measures
substantially higher than the TCP test, this would suggest active queue
management of some sort.

OTOH, maintaining the old UDP test would add 30 seconds to the execution
time, which is currently 25% or so.

From cory at codeware.com Fri Mar 23 08:10:02 2012
From: cory at codeware.com (Cory Riddell)
Date: Fri, 23 Mar 2012 10:10:02 -0500
Subject: [Netalyzr] What is TCP Connection Setup Latency?
Message-ID: <4F6C924A.7030001@codeware.com>

We have been having weird problems with Windows remote desktop sessions
(several times per day the connection is lost but the client is able to
automatically reconnect). So I ran your tool and found a very high TCP
connection setup latency.
The network latency was good (I think it was 54 ms) but the TCP connection
setup latency is often well over 1000 ms with 3400 ms being the highest
value I've seen.

I was guessing the TCP connection setup latency should be roughly 3x the
network latency because it takes three packets (for ack, syn-ack, syn).
But today I'm getting 50ms for the network latency and 54 seconds for
the TCP connection setup latency, so my theory is obviously incorrect.

What is TCP connection setup latency?

Cory

From christian at icir.org Fri Mar 23 16:13:51 2012
From: christian at icir.org (Christian Kreibich)
Date: Fri, 23 Mar 2012 16:13:51 -0700
Subject: [Netalyzr] What is TCP Connection Setup Latency?
In-Reply-To: <4F6C924A.7030001@codeware.com>
References: <4F6C924A.7030001@codeware.com>
Message-ID: <4F6D03AF.9060701@icir.org>

Hi Cory,

On 03/23/2012 08:10 AM, Cory Riddell wrote:
> We have been having weird problems with Windows remote desktop sessions
> (several times per day the connection is lost but the client is able to
> automatically reconnect). So I ran your tool and found a very high TCP
> connection setup latency. The network latency was good (I think it was
> 54 ms) but the TCP connection setup latency is often well over 1000 ms
> with 3400 ms being the highest value I've seen.

Mhmm ... not good!

> I was guessing the TCP connection setup latency should be roughly 3x the
> network latency because it takes three packets (for ack, syn-ack, syn).
> But today I'm getting 50ms for the network latency and 54 seconds for
> the TCP connection setup latency, so my theory is obviously incorrect.

54 seconds, ouch. Unfortunately your theory is right on, because ...

> What is TCP connection setup latency?

... we indeed measure the time it takes to complete a TCP handshake: at
the beginning of a test session, we make a sequence of TCP connection
attempts to the backend server the applet uses to conduct its tests, and
record the handshake completion times. We try this on a number of ports.
The value reported in the summary is then the average of those completion
times.

If you take a look at the client-side transcript and search for the
strings tcpSetupLatency and tcpFirstSetupLatency (we store the first
connection attempt's time separately from subsequent ones), you can
inspect the results (in ms).

I'm not seeing unusual latencies to our backend servers. If you send me
the ID of your session offline, we'll take a closer look. Perhaps a
horrible outlier is distorting the results.

(It seems we recently messed up the content type when delivering the
client-side transcript -- it may not display immediately in the browser.
Save it to disk and open it in an editor. It's just a text file. We'll
fix this shortly.)

Best,
Christian

From nweaver at ICSI.Berkeley.EDU Fri Mar 23 23:16:55 2012
From: nweaver at ICSI.Berkeley.EDU (Nicholas Weaver)
Date: Sat, 24 Mar 2012 06:16:55 +0000
Subject: [Netalyzr] What is TCP Connection Setup Latency?
In-Reply-To: <4F6D03AF.9060701@icir.org>
References: <4F6C924A.7030001@codeware.com> <4F6D03AF.9060701@icir.org>
Message-ID: <1666AACF-C4DB-45B3-8C04-43B167169AE8@icsi.berkeley.edu>

On Mar 23, 2012, at 11:13 PM, Christian Kreibich wrote:
>
>> I was guessing the TCP connection setup latency should be roughly 3x the
>> network latency because it takes three packets (for ack, syn-ack, syn).
>> But today I'm getting 50ms for the network latency and 54 seconds for
>> the TCP connection setup latency, so my theory is obviously incorrect.

That said, TCP setup latency should, in most systems, be equivalent to the
estimated RTT, since we are recording the time it takes the Connect() call
to complete, which usually returns when the host receives the SYN-ACK.

When it is substantially longer, it suggests something is slowing down TCP
handshakes without slowing down the network (e.g., a process which checks
or prompts for new connections).
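
The measurement described in the two replies above can be sketched roughly
as follows. This is a minimal Python illustration, not Netalyzr's client
code: the function names tcp_setup_latency_ms and summarize_setup_latency
are hypothetical, and whether the reported average includes the first
attempt is an assumption (the thread only says the first attempt is stored
separately). It times a blocking connect, which normally returns once the
SYN-ACK arrives, so the result should be close to one round trip.

```python
import socket
import time


def tcp_setup_latency_ms(host, port, timeout=5.0):
    """Time one blocking TCP connect, in milliseconds (hypothetical helper).

    The connect normally completes when the SYN-ACK is received, so this
    approximates one RTT plus any local delay added to the handshake.
    """
    start = time.perf_counter()
    sock = socket.create_connection((host, port), timeout=timeout)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    sock.close()
    return elapsed_ms


def summarize_setup_latency(host, ports, timeout=5.0):
    """Probe each port once and summarize the handshake times.

    Assumption: the mean covers all attempts; the first attempt is also
    reported on its own, mirroring the separate first-attempt value
    mentioned in the thread.
    """
    samples = [tcp_setup_latency_ms(host, p, timeout) for p in ports]
    rest = samples[1:]
    return {
        "first_ms": samples[0],
        "mean_ms": sum(samples) / len(samples),
        "rest_mean_ms": sum(rest) / len(rest) if rest else None,
        "max_ms": max(samples),
    }
```

Comparing mean_ms against a separately measured RTT gives the check
suggested above: a setup latency far above the RTT points at something
delaying handshakes on the host or path rather than at a slow network, and
max_ms makes a single outlier easy to spot.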
From cory at codeware.com Mon Mar 26 06:34:53 2012
From: cory at codeware.com (Cory Riddell)
Date: Mon, 26 Mar 2012 08:34:53 -0500
Subject: [Netalyzr] What is TCP Connection Setup Latency?
In-Reply-To: <4F6D03AF.9060701@icir.org>
References: <4F6C924A.7030001@codeware.com> <4F6D03AF.9060701@icir.org>
Message-ID: <4F70707D.5000109@codeware.com>

Christian,

On 3/23/2012 6:13 PM, Christian Kreibich wrote:
> Hi Cory,
>
> On 03/23/2012 08:10 AM, Cory Riddell wrote:
> I was guessing the TCP connection setup latency should be roughly 3x the
> network latency because it takes three packets (for ack, syn-ack, syn).
> But today I'm getting 50ms for the network latency and 54 seconds for
> the TCP connection setup latency, so my theory is obviously incorrect.

This was a typo. I was getting 50ms for network latency and 54ms for the
TCP connection setup latency. In other words, it seemed to be working
correctly, but since the connection setup latency wasn't 3x network
latency, I was curious what it was measuring. Nicholas' answer supplies
more of the details.

Today things seem to be back to normal, so I'll chalk it up to my ISP
having problems. They have put a monitor on our line for the next week, so
hopefully that helps us find a problem.

Thanks for a very interesting tool. I imagine it's providing a treasure
trove of data.

Cory

From bzeeb-lists at lists.zabbadoz.net Mon Mar 26 07:21:54 2012
From: bzeeb-lists at lists.zabbadoz.net (Bjoern A. Zeeb)
Date: Mon, 26 Mar 2012 14:21:54 +0000 (UTC)
Subject: [Netalyzr] IPv6 Frag issue reported - false positive?
Message-ID:

Hi,

we are getting results from here on a temporary conference network that
report IPv6 Frag blocking as one of the users here had tested as well:

http://n3.netalyzr.icsi.berkeley.edu/restore/id=ae81b058-26597-5a57e9a6-381d-45df-8fcd/rd

Using different tools, I cannot reproduce the problems. I wonder what that
test is doing in detail and almost certainly assume it's a false-positive.
Could you give me some more details?

/bz

--
Bjoern A. Zeeb You have to have visions!
Stop bit received. Insert coin for new address family.

From nweaver at ICSI.Berkeley.EDU Tue Mar 27 08:30:36 2012
From: nweaver at ICSI.Berkeley.EDU (Nicholas Weaver)
Date: Tue, 27 Mar 2012 08:30:36 -0700
Subject: [Netalyzr] IPv6 Frag issue reported - false positive?
In-Reply-To:
References:
Message-ID:

We had a problem yesterday with one of our services crashing which is
causing this false report.

New sessions today should hopefully not have this problem. Could you rerun
Netalyzr and see if it works?

On Mar 26, 2012, at 7:21 AM, Bjoern A. Zeeb wrote:

> Hi,
>
> we are getting results from here on a temporary conference network that report IPv6 Frag blocking as one of the users here had tested as well:
>
> http://n3.netalyzr.icsi.berkeley.edu/restore/id=ae81b058-26597-5a57e9a6-381d-45df-8fcd/rd
>
> Using different tools, I cannot reproduce the problems. I wonder what that test is doing in detail and almost certainly assume it's a false-positive.
>
> Could you give me some more details?
>
> /bz
>
> --
> Bjoern A. Zeeb You have to have visions!
> Stop bit received. Insert coin for new address family.

From bzeeb-lists at lists.zabbadoz.net Tue Mar 27 08:36:18 2012
From: bzeeb-lists at lists.zabbadoz.net (Bjoern A. Zeeb)
Date: Tue, 27 Mar 2012 15:36:18 +0000 (UTC)
Subject: [Netalyzr] IPv6 Frag issue reported - false positive?
In-Reply-To:
References:
Message-ID:

On Tue, 27 Mar 2012, Nicholas Weaver wrote:

> We had a problem yesterday with one of our services crashing which is causing this false report.
>
> New sessions today should hopefully not have this problem. Could you rerun Netalyzr and see if it works?
It's been the issue you had been involved with later that day; it just
seems the mail to you was stuck for about a day as well.

Thanks for the help!

/bz

PS: and yes I have re-run and the only issue is that Apple's tick-tack
device in my machines seems to run slightly fast as you think;-)

--
Bjoern A. Zeeb You have to have visions!
Stop bit received. Insert coin for new address family.