[Bro-Dev] Performance Enhancements
gc355804 at ohio.edu
Thu Oct 5 21:10:35 PDT 2017
Not sure about the content of the BroCon talk ... but a few years back, I did a bit of work on this. There was a plugin here:
that allowed me to profile the execution of various bro scripts and figure out what was eating the most time. It also added a pretty braindead mechanism to expose script variables through a REST interface, which I wrapped in an HTML5 UI to get some real-time statistics ... though I have no idea where that code went.
I also threw together this:
which was intended to benchmark Bro on a specific platform, the idea being to get results that were relatively consistent. It could make some pretty pictures, which was kind of neat ... but I'd probably do things a lot differently if I had it to do over again :)
I'll note that one of the challenges with profiling is that there are the bro scripts, and then there is the bro engine. The optimizations that make sense for the scripting layer are completely different from the ones that make sense for the engine: turning off / turning on / tweaking different scripts can have a huge impact on Bro's relative performance, depending on how often those script fragments are executed. Thus, one way to look at speeding things up might be to take a look at the scripts that are run most often and see about ways to accelerate core pieces of them ... possibly by moving pieces of those scripts to builtins (as C methods).
If I had to guess at one engine-related thing that would've sped things up when I was profiling this stuff back in the day, it'd probably be rebuilding the memory allocation strategy / management. From what I remember, Bro does do some malloc / free in the data path, which hurts quite a bit when one is trying to make things go fast. It also means that the selection of a memory allocator and NUMA / per-node memory management is going to be important. That's probably not going to qualify as something *small*, though ...
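To make the malloc / free point concrete, here is a minimal sketch of one common alternative: a fixed-size object pool that does a single allocation up front and then hands out slots from a free list. This is not Bro code; the names (pool, pool_init, pool_get, pool_put, pool_destroy) are hypothetical, and it assumes single-threaded use and objects that need no stricter alignment than a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed-size object pool: one malloc up front,
 * then O(1) get/put from a free list on the hot path. */
typedef struct pool_node { struct pool_node *next; } pool_node;

typedef struct {
    void      *slab;       /* single backing allocation */
    pool_node *free_list;  /* singly linked list of free slots */
} pool;

/* Returns 0 on success, -1 on bad arguments or allocation failure. */
int pool_init(pool *p, size_t obj_size, size_t count)
{
    const size_t align = sizeof(void *);

    if (count == 0)
        return -1;
    /* Each slot must be able to hold a free-list link. */
    if (obj_size < sizeof(pool_node))
        obj_size = sizeof(pool_node);
    /* Round the slot size up so every slot is pointer-aligned. */
    obj_size = (obj_size + align - 1) / align * align;

    p->slab = malloc(obj_size * count);
    if (p->slab == NULL)
        return -1;

    /* Thread every slot onto the free list. */
    p->free_list = NULL;
    for (size_t i = 0; i < count; i++) {
        pool_node *n = (pool_node *)((char *)p->slab + i * obj_size);
        n->next = p->free_list;
        p->free_list = n;
    }
    return 0;
}

/* Pops a slot; returns NULL when the pool is exhausted. */
void *pool_get(pool *p)
{
    pool_node *n = p->free_list;
    if (n == NULL)
        return NULL;
    p->free_list = n->next;
    return n;
}

/* Pushes a slot obtained from pool_get back onto the free list. */
void pool_put(pool *p, void *obj)
{
    assert(obj != NULL);
    pool_node *n = obj;
    n->next = p->free_list;
    p->free_list = n;
}

/* Releases the backing allocation; outstanding slots become invalid. */
void pool_destroy(pool *p)
{
    free(p->slab);
    p->slab = NULL;
    p->free_list = NULL;
}
```

The trade-off is the usual one: the pool never touches the general-purpose allocator per object, but it has a fixed capacity and only serves one object size, so the caller has to decide what happens when pool_get returns NULL.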
On a related note, a fun experiment is always to try running bro with a different allocator and see what happens ...
Another thing that (I found) got me a few percentage points for more-or-less free was profile-guided optimization: I ran bro first with profiling enabled against a representative data set, then rebuilt it against the profile I collected. Of course, your mileage may vary ...
Anyway, hope something in there was useful.
From: bro-dev-bounces at bro.org <bro-dev-bounces at bro.org> on behalf of Jim Mellander <jmellander at lbl.gov>
Sent: Thursday, October 5, 2017 3:45:21 PM
To: bro-dev at bro.org
Subject: [Bro-Dev] Performance Enhancements
One item of particular interest to me from Brocon was this tidbit from Packetsled's lightning talk:
"Optimizing core loops (like net_run() ) with preprocessor branch prediction macros likely() and unlikely() for ~3% speedup. We optimize for maximum load."
After conversing with Leo Linsky of Packetsled, I wanted to initiate a conversation about easy performance improvements that may be within fairly easy reach:
1. Obviously, branch prediction, as mentioned above. 3% speedup for (almost) free is nothing to sneeze at.
2. Profiling bro to identify other hot spots that could benefit from optimization.
3. Best practices for compiling Bro (compiler options, etc.)
4. Data structure revisit (hash functions, perhaps?)
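For readers who have not seen the macros from item 1, a minimal sketch of how likely() / unlikely() are commonly defined and used follows. It assumes GCC or Clang, where __builtin_expect is available, and falls back to no-ops elsewhere. The count_valid function and its packet-length array are hypothetical stand-ins for a hot loop, not anything taken from net_run().

```c
#include <stddef.h>

/* Branch hints; __builtin_expect is a GCC/Clang builtin.
 * On other compilers the macros expand to the plain condition. */
#if defined(__GNUC__) || defined(__clang__)
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define likely(x)   (x)
#define unlikely(x) (x)
#endif

/* Hypothetical hot loop: count packets with a nonzero length.
 * Zero-length (malformed) packets are assumed to be rare under load. */
size_t count_valid(const size_t *pkt_len, size_t n)
{
    size_t valid = 0;
    for (size_t i = 0; i < n; i++) {
        if (unlikely(pkt_len[i] == 0))
            continue;  /* rare path, hinted as not taken */
        valid++;
    }
    return valid;
}
```

The hints only change how the compiler lays out the code, not what it computes, so a wrong guess costs performance rather than correctness. Any gain depends on the workload and should be measured.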
Perhaps the Bro core team is working on some, all, or a lot more in this area. It might be nice to get the Bro community involved too. Is anyone else interested?