[Bro-Dev] early performance comparisons of CAF-based run loop

Siwek, Jon jsiwek at illinois.edu
Fri Apr 14 10:32:35 PDT 2017


> On Apr 14, 2017, at 8:32 AM, Robin Sommer <robin at icir.org> wrote:
> 
> - I don't think we should spend time anymore on improving the old
>  communication code. We're getting close to retiring that now and a
>  number of its issues (like selects in the child process) will just
>  go away with that. Let's focus on the new setting where Broker/CAF
>  will be doing all communication.

If people are hitting the hard limit of 1024 FDs in the old communication code’s select(), that would indeed go away with the change to Broker. But I think the way Broker is integrated into the parent’s main loop still relies on a select(), and the number of FDs it monitors scales with the number of peers. That is, large Bro clusters may still hit critical errors even with Broker as the communication system; this time the problem would just manifest in the main loop.
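
To illustrate the distinction: select() can only handle descriptor values below FD_SETSIZE (commonly 1024), while poll() takes a list of descriptors and has no such cap. Below is a minimal Python sketch of a poll()-based readiness check whose cost scales with the number of peers but not with a fixed FD ceiling. The function name `wait_for_ready_peers` and its parameters are hypothetical; this is not Bro or Broker code, and `select.poll` is assumed to be available (it is not on Windows).

```python
import select
import socket


def wait_for_ready_peers(peer_socks, timeout_ms=0):
    """Return the peer sockets that are readable, using poll() instead of select().

    Hypothetical helper for illustration only.
    """
    poller = select.poll()
    by_fd = {}
    for sock in peer_socks:
        # One registration per peer; no FD_SETSIZE ceiling applies here.
        poller.register(sock.fileno(), select.POLLIN)
        by_fd[sock.fileno()] = sock
    ready = poller.poll(timeout_ms)
    return [by_fd[fd] for fd, events in ready if events & select.POLLIN]
```

A caller would pass one socket per peer and service whatever comes back; an empty list means nothing was readable within the timeout.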

Just mentioning it in case you hadn’t accounted for the real fix also requiring the CAF-based loop to be fully realized, in addition to Broker. I’m less certain about the timeline for finishing the CAF-based loop than for a temporary stopgap that patches out the select() calls. (I also don’t have a sense of the frequency/urgency of the problem.)

> - Regarding optimizing for different use cases: I would prefer
>  avoiding having lots of knobs to configure the specifics of the
>  loop. We have these magic values in the current I/O loop where
>  nobody knows how to pick them because it's hard to understand their
>  impact; and where folks have played with them, it was always hard
>  to conclude much about them beyond any specific setting. What we could
>  try instead is a loop that adjusts itself based on load patterns: if
>  the load is heavy on packets, build larger batches to process
>  between polls; if input comes from lots of different sources, increase
>  the polling; etc.

That seems like a Good Idea.
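
One way such self-adjustment could look, as a rough Python sketch: grow the packet batch when a full batch was processed and no other source had input, and shrink it (which means polling more often) when other sources were active. The function `adapt_batch_size`, its parameters, and the doubling/halving policy are all assumptions made for illustration, not anything taken from Bro or CAF.

```python
def adapt_batch_size(batch_size, packets_handled, io_events,
                     min_batch=1, max_batch=1024):
    """Return the packet batch size to use for the next loop iteration.

    Hypothetical policy: double on packet-heavy load, halve when other
    input sources had events, otherwise leave the size unchanged.
    """
    if packets_handled >= batch_size and io_events == 0:
        # The batch was filled and nothing else was waiting: packet-heavy load.
        return min(batch_size * 2, max_batch)
    if io_events > 0:
        # Other sources had input: shrink the batch so polls happen more often.
        return max(batch_size // 2, min_batch)
    return batch_size
```

The point of a rule like this is that the only tunables left are the bounds, which are much easier to reason about than per-setting magic values.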

>  it does pose the question if/how we can
>  integrate packet sources that either don't need or don't support
>  select/poll

I think that’s just a matter of making sure the main loop “spins” at an appropriate frequency, which might change dynamically depending on load-pattern optimizations, as per the idea above.

Maybe you could even think of reading an offline pcap file as a source that doesn’t need select/poll. Strictly speaking, regular files also don’t “support” select(), at least not with the same intent (non-blocking I/O); it just happens to work fine in the current run loop implementation.

Since I’ve been able to get the CAF-based loop working on offline pcap files, it may be fair to say that other packet sources that don’t require or support polling should also be possible to integrate. (That loop does not poll the FD of the open file, since doing so didn’t work anyway with CAF’s epoll-based multiplexer on Linux.)
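
As a sketch of the general shape (not the CAF-based implementation): non-pollable sources are simply drained a little on every pass, and the loop only blocks in the multiplexer when no such source has data left. Everything named here (`run_loop`, `spin_sources`, `handle`, the 50 ms idle timeout) is hypothetical, and Python's `selectors` module stands in for whatever multiplexer is actually used.

```python
import selectors
import socket

_DONE = object()  # sentinel marking an exhausted non-pollable source


def run_loop(selector, spin_sources, handle, max_iterations=1000):
    """Service pollable FDs via `selector` and non-pollable sources by spinning."""
    spin_iters = [iter(src) for src in spin_sources]
    for _ in range(max_iterations):
        # Take one item from each non-pollable source that still has data.
        still_active = []
        for it in spin_iters:
            item = next(it, _DONE)
            if item is not _DONE:
                handle("spin", item)
                still_active.append(it)
        spin_iters = still_active

        # Never block while a spinning source is active; otherwise wait briefly.
        timeout = 0 if spin_iters else 0.05
        events = selector.select(timeout)
        for key, _mask in events:
            handle("fd", key.fileobj.recv(4096))

        if not spin_iters and not events:
            break  # nothing left to do in this sketch
```

Here an offline pcap reader would be one entry in `spin_sources` (any iterable of packets), while peer sockets are registered with the selector as usual.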

- Jon


