[Tmrg] Traffic Generators (Harpoon and Tmix)
Sally Floyd
sallyfloyd at mac.com
Sun Dec 9 18:21:39 PST 2007
On Dec 6, 2007, at 8:23 PM, Lachlan Andrew wrote:
> On 04/12/2007, Constantine Dovrolis <dovrolis at cc.gatech.edu> wrote:
>> folks, my apologies for jumping into the discussion
>
> Not at all. Thanks for your input!
>
>> 1. I want to loudly agree with Sally that we should be
>> considering non-greedy TCP flows with heavy-tailed size
>> distribution, and
>> 2. we should be asking whether these non-greedy TCP flows
>> are generated by an open-loop flow arrival process or by a
>> closed-loop process that takes user thinking times (and
>> perhaps limited patience) into account.
>
> Just to clarify, in point 2, are you suggesting that there are
> idle/think times both within and between flows?
>
> I agree entirely that these are all important effects, which should be
> included in "version 2" of the test suite. I have several reasons for
> supporting the simpler models for the initial "version 1". I'd be
> interested in your thoughts on each. My strongest concerns are points
> 2(ii) and 4.
>
> 1. We agreed at the meeting that the load would be "open loop". That
> allows us to specify the offered load in a protocol-independent way.
> If the traffic is entirely closed-loop then the load depends on the
> protocols, making comparisons difficult. (Being open-loop does not
> preclude modelling the think-time between arrivals within a session.)
Closed-loop models are just as protocol-independent as open-loop models,
I would say. The overall transfer time depends on the protocol used in
either case.
> 2. We need to ask what cost/benefit we get from the more complex
> models.
> (i) For some of our tests, this traffic is "cross traffic" which we're
> not measuring. In these tests, the results of Hohn, Veitch, and Abry
> (e.g., "The impact of the flow arrival process in Internet traffic")
> suggest that structure in the flow arrival process doesn't greatly
> affect the packet level traffic.
> (ii) For cases where we're going to measure the performance of
> non-greedy flows, we need to define metrics for their performance
> which reflect the non-greediness. I don't think such measures are
> obvious. We can't use connection completion times, average rates, ...
Even if I was only looking at metrics about the behavior of long-lived
flows, I would prefer for the "background traffic" to have user think
times within TCP connections. This is more realistic, and increases the
burstiness of the aggregate traffic in a way that affects all of the
competing traffic.
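As a rough illustration of what "user think times within TCP connections" could look like in a generator, here is a minimal Python sketch. Everything in it is an assumption made for illustration, not something taken from Harpoon or Tmix: the function name `connection_schedule`, the Pareto shape, the minimum burst size, and the exponentially distributed think times are all hypothetical choices.

```python
import random

def connection_schedule(rng, n_bursts, pareto_shape=1.2,
                        min_burst_bytes=1000, mean_think_s=5.0):
    """Return [(start_time_s, burst_bytes), ...] for one connection.

    Assumptions (hypothetical, for illustration only):
      - burst sizes are heavy-tailed (Pareto with the given shape);
      - think times between bursts are exponential with the given mean;
      - transfer time is ignored, so only think times advance the clock.
    """
    schedule = []
    t = 0.0
    for _ in range(n_bursts):
        # paretovariate() returns values >= 1, so sizes are >= min_burst_bytes
        size = int(min_burst_bytes * rng.paretovariate(pareto_shape))
        schedule.append((t, size))
        # idle (think) period before the next burst on the same connection
        t += rng.expovariate(1.0 / mean_think_s)
    return schedule

rng = random.Random(1)
for start, size in connection_schedule(rng, 5):
    print(f"{start:8.2f} s  {size} bytes")
```

A greedy long-lived flow would correspond to a single unbounded burst starting at time 0; the idle gaps are what make the aggregate of many such connections burstier.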
> 3. These tests are not intended to be exhaustive. As I said before
> the meeting, I'd rather the meeting result in one or two
> clearly-defined tests than a complete first draft of a test suite
> where none of the tests is specified well enough to allow comparisons.
I think we can do a complete first draft of a test suite. But I agree
that these tests are definitely not intended to be exhaustive.
> 4. I'm afraid of models with too many parameters which have to be
> estimated. I was under the impression that many studies have found
> distributions of *connection* sizes, but many fewer (if any) have
> studied the sizes of "bursts" within a connection. Will it matter if
> we get the sizes wrong?
We will get it even more wrong if we don't include user think times
within connections.
One of the good areas for future work is for researchers to say
"by the way, these results are quite sensitive to parameter X", or
"these results are not at all sensitive to parameter Y". It is
unavoidable, I think, that we will have to learn these things as we
go along.
> Another point related to parameter estimation is that I'm worried by
> the approach we agreed on of assuming that the file-size distribution
> is independent of the load, so that the load is simply proportional to
> the session arrival rate. It seems likely to me that higher load
> occurs when there is a brief influx of longer connections (say some
> BitTorrent users start up), rather than a brief rise in the session
> arrival rate. Could this have as big an impact as the choice of
> whether new "bursts" start their own connections or not?
I agree that this is a key concern. There are two ways to go:
(1) models where the total load requested in a user session is
independent of the level of congestion; and
(2) models where the total load requested in a user session is
explicitly dependent on the level of congestion.
I assume that the world is like (2). As far as I know, more traffic
generators are based on model (1). We could make an arbitrary
attempt at model (2), or we could use model (1) and explicitly ask
researchers to give us model (2) for traffic generation for the future.
Either one sounds ok to me.
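To make the distinction between models (1) and (2) concrete, here is a minimal Python sketch. It is only one arbitrary way to write model (2): the function name `session_bytes`, the `congestion` value in [0, 1], and the rule that a user gives up after each object with probability equal to the congestion level are all hypothetical, chosen just to show the load requested per session shrinking as congestion rises.

```python
import random

def session_bytes(rng, congestion=None, max_objects=20,
                  pareto_shape=1.5, min_bytes=1000):
    """Total bytes requested in one user session.

    congestion=None      -> model (1): requested load ignores congestion.
    congestion in [0, 1] -> model (2): after each object the user abandons
                            the session with probability `congestion`
                            (a hypothetical "limited patience" rule).
    """
    total = 0
    for _ in range(max_objects):
        # heavy-tailed object size, always >= min_bytes
        total += int(min_bytes * rng.paretovariate(pareto_shape))
        if congestion is not None and rng.random() < congestion:
            break  # user gives up; the rest of the session is never requested
    return total

def mean_session_bytes(congestion, n=2000, seed=7):
    """Sample mean of session_bytes over n sessions."""
    rng = random.Random(seed)
    return sum(session_bytes(rng, congestion) for _ in range(n)) / n

for level in (None, 0.1, 0.5):
    print(level, round(mean_session_bytes(level)))
```

Under model (1) the offered load is just the session arrival rate times a fixed mean session size, so it can be set by the arrival rate alone. Under model (2) the mean session size itself moves with congestion, so the offered load is no longer a simple multiple of the arrival rate.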
- Sally
http://www.icir.org/floyd/