[Bro-Dev] early performance comparisons of CAF-based run loop
Robin Sommer
robin at icir.org
Fri Apr 14 06:32:11 PDT 2017
Nice, thanks for doing these measurements! I haven't looked at the
code yet, but some quick thoughts on your results and some of the
other comments in this thread, and then some suggested next steps at
the end.
- Agree that overall your numbers suggest that all these mechanisms
are fine performance-wise, assuming we keep the optimization to batch
packets between polls/selects to avoid the
one-system-call-per-packet overhead.
- I don't think we should spend any more time on improving the old
communication code. We're getting close to retiring that now and a
number of its issues (like selects in the child process) will just
go away with that. Let's focus on the new setting where Broker/CAF
will be doing all communication.
- Regarding optimizing for different use cases: I would prefer
avoiding having lots of knobs to configure the specifics of the
loop. We have these magic values in the current I/O loop where
nobody knows how to pick them because it's hard to understand their
impact; and where folks have played with them, it was always hard to
conclude much about them beyond any specific setting. What we could
try instead is a loop that adjusts itself based on load patterns: if
the load is heavy on packets, build larger batches to process
between polls; if input comes from lots of different sources, increase
the polling; etc. Any heuristic here would need to stay quite simple
(otherwise we'd again end up not being able to predict much), but I
think that'd be worth a try.
- Gilbert's point on high-performance IPC is a good one. I don't think
we want to switch to direct memory access as our main model for the
time being at least, but it does pose the question if/how we can
integrate packet sources that either don't need or don't support
select/poll. (Which, in a nod to history, accounts for some of the
complexities of the current loop because many years ago some pcaps
didn't support select.)
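
To make the self-adjusting loop idea above a bit more concrete, here is a
minimal sketch in Python. It is not Bro or CAF code; the names
(AdaptiveBatcher, run_loop), the doubling/halving rule, and the bounds of 1
and 256 are all assumptions picked for illustration. The only point is that
one simple rule could replace a set of hand-tuned knobs: grow the batch while
packets keep filling it, and shrink it whenever other inputs are waiting.

```python
class AdaptiveBatcher:
    """Hypothetical helper: picks how many packets to process between polls."""

    def __init__(self, min_batch=1, max_batch=256):
        # The bounds are illustrative assumptions, not measured values.
        self.min_batch = min_batch
        self.max_batch = max_batch
        self.batch = min_batch

    def update(self, packets_processed, other_sources_ready):
        if other_sources_ready > 0:
            # Other inputs were waiting: shrink the batch so we poll more often.
            self.batch = max(self.batch // 2, self.min_batch)
        elif packets_processed >= self.batch:
            # The last batch was filled completely and nothing else was
            # waiting: the load is packet-heavy, so build larger batches.
            self.batch = min(self.batch * 2, self.max_batch)
        # Otherwise (underfilled batch, nothing else ready) keep the size.
        return self.batch


def run_loop(packet_source, poll_others, handle_packet, handle_other,
             batcher, rounds):
    """Process up to batcher.batch packets, then poll the other sources once."""
    for _ in range(rounds):
        processed = 0
        while processed < batcher.batch:
            pkt = packet_source()      # returns None when no packet is available
            if pkt is None:
                break
            handle_packet(pkt)
            processed += 1
        ready = poll_others()          # stands in for a single poll/select call
        for event in ready:
            handle_other(event)
        batcher.update(processed, len(ready))
```

Note that the packet source in this sketch is drained by calling it directly
rather than by waiting on a file descriptor, so a source without select/poll
support could in principle plug in the same way; whether that holds up in
practice would need testing.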
In terms of next steps, we need to see if these results hold across
different OSs, and also with live traffic. The two questions are (1)
does the new loop function on all platforms with both low- and
high-volume live traffic (presumably it will but that needs double
checking, given the history of weird OS-specific effects); and (2)
does performance match the measurements shown so far? If we can
confirm that on at least Linux and FreeBSD for, say, the two most
recent major releases of each and also consider common alternative
capturing solutions (pfring, netmap, afnet?), I'd be pretty
comfortable switching.
Robin
--
Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin