<div dir="ltr"><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">I particularly like the idea of an allocation pool in which per-packet information can be stored and then reused by the next packet.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">There are also probably some optimizations of frequent operations that could prove useful now that we're in a 64-bit world. The one's complement checksum calculation in net_util.cc is one that comes to mind, especially since it effectively works a byte at a time (and only with even byte counts). Since this is done per packet on all TCP payload, optimizing it seems reasonable. Here's a discussion of doing the checksum calculation in 64-bit arithmetic: <a href="https://locklessinc.com/articles/tcp_checksum/">https://locklessinc.com/articles/tcp_checksum/</a>. The same website also has an x64 allocator that is claimed to be faster than tcmalloc; see <a href="https://locklessinc.com/benchmarks_allocator.shtml">https://locklessinc.com/benchmarks_allocator.shtml</a> (note: I haven't tried anything from this source, but I find it interesting).</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">I'm guessing there are a number of such "small" optimizations that could provide significant performance gains.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Take care,</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Jim</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" 
style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Oct 6, 2017 at 7:26 AM, Azoff, Justin S <span dir="ltr"><<a href="mailto:jazoff@illinois.edu" target="_blank">jazoff@illinois.edu</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class=""><br>
> On Oct 6, 2017, at 12:10 AM, Clark, Gilbert <<a href="mailto:gc355804@ohio.edu">gc355804@ohio.edu</a>> wrote:<br>
><br>
> I'll note that one of the challenges with profiling is that there are the bro scripts, and then there is the bro engine. The scripting layer has a completely different set of optimizations that might make sense than the engine does: turning off / turning on / tweaking different scripts can have a huge impact on Bro's relative performance depending on the frequency with which those script fragments are executed. Thus, one way to look at speeding things up might be to take a look at the scripts that are run most often and seeing about ways to accelerate core pieces of them ... possibly by moving pieces of those scripts to builtins (as C methods).<br>
><br>
<br>
</span>Re: scripts, I have some code I put together to do arbitrary benchmarks of templated bro scripts. I need to clean it up and publish it, but I found some interesting things. Function calls are relatively slow, so things like<br>
<br>
ip in Site::local_nets<br>
<br>
is faster than calling<br>
<br>
Site::is_local_addr(ip);<br>
<br>
Inlining short functions could speed things up a bit.<br>
<br>
I also found that things like<br>
<br>
port == 22/tcp || port == 3389/tcp<br>
<br>
is faster than checking if port in {22/tcp,3389/tcp}, up to about 10 ports. Having the hash class fall back to a linear search when the table contains only a few items could speed things up there. Things like 'likely_server_ports' have 1 or 2 ports in most cases.<br>
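<br>
A minimal C++ sketch of that fallback idea, not Bro's actual hash class: the name HybridSet, the uint32_t key type, and the default limit of 10 (taken from the rough crossover mentioned above) are all assumptions for illustration. Small sets are scanned linearly and only migrate to a hash table once they grow past the limit.<br>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical set that scans a small vector linearly and only
// switches to a hash table once it grows past linear_limit.
class HybridSet {
public:
    explicit HybridSet(std::size_t linear_limit = 10)
        : linear_limit_(linear_limit) {}

    void insert(uint32_t key) {
        if (contains(key))
            return;  // already present, nothing to do
        if (!hashed_.empty()) {
            hashed_.insert(key);
            return;
        }
        small_.push_back(key);
        if (small_.size() > linear_limit_) {
            // Too many items for a linear scan: migrate to the hash table.
            hashed_.insert(small_.begin(), small_.end());
            small_.clear();
        }
    }

    bool contains(uint32_t key) const {
        if (!hashed_.empty())
            return hashed_.count(key) != 0;
        return std::find(small_.begin(), small_.end(), key) != small_.end();
    }

    bool using_hash() const { return !hashed_.empty(); }

private:
    std::size_t linear_limit_;
    std::vector<uint32_t> small_;          // used while the set is small
    std::unordered_set<uint32_t> hashed_;  // used once the set has grown
};
```

With one or two entries, as in the 'likely_server_ports' case, a lookup never touches the hash function. Whether 10 is the right limit would have to be measured against the real key types.<br>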
<span class=""><br>
<br>
> If I had to guess at one engine-related thing that would've sped things up when I was profiling this stuff back in the day, it'd probably be rebuilding the memory allocation strategy / management. From what I remember, Bro does do some malloc / free in the data path, which hurts quite a bit when one is trying to make things go fast. It also means that the selection of a memory allocator and NUMA / per-node memory management is going to be important. That's probably not going to qualify as something *small*, though ...<br>
<br>
</span>Ah! This reminds me of something I was thinking about a few weeks ago. I'm not sure to what extent bro uses memory allocation pools/interning for common immutable data structures, such as port objects or small strings. There's no reason bro should be mallocing/freeing memory to create port objects when there are only 65536 times 2 (or 3?) possible port objects, but bro does things like<br>
<br>
tcp_hdr->Assign(0, new PortVal(ntohs(tp->th_sport), TRANSPORT_TCP));<br>
tcp_hdr->Assign(1, new PortVal(ntohs(tp->th_dport), TRANSPORT_TCP));<br>
<br>
for every packet, as well as allocating a ton of TYPE_COUNT vals for things like packet sizes and header lengths, which will almost always be between 0 and 64k.<br>
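<br>
A rough C++ sketch of what interning ports could look like. InternedPort, Proto, and PortInterner are hypothetical names, not Bro classes, and the sketch ignores whatever reference counting or ownership rules the real Val objects follow. It builds every (protocol, port) pair once and hands out pointers to the shared instances.<br>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration; these are not Bro's types.
enum class Proto : uint8_t { TCP = 0, UDP = 1, ICMP = 2 };

struct InternedPort {
    uint16_t number;
    Proto proto;
};

// Builds all 3 * 65536 port objects once; lookups never allocate.
class PortInterner {
public:
    PortInterner() : table_(kProtoCount * kPortCount) {
        for (std::size_t p = 0; p < kProtoCount; ++p)
            for (std::size_t n = 0; n < kPortCount; ++n)
                table_[p * kPortCount + n] =
                    InternedPort{static_cast<uint16_t>(n), static_cast<Proto>(p)};
    }

    // Returns the shared, immutable instance for this port and protocol.
    const InternedPort* get(uint16_t number, Proto proto) const {
        return &table_[static_cast<std::size_t>(proto) * kPortCount + number];
    }

private:
    static constexpr std::size_t kProtoCount = 3;
    static constexpr std::size_t kPortCount = 65536;
    std::vector<InternedPort> table_;
};
```

Two requests for the same port and protocol return the same pointer, so the per-packet path does an array index instead of a new/delete pair. The same approach would fit small count values in the 0 to 64k range.<br>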
<br>
For things that can't be interned, like IPv6 addresses, an allocation pool could speed things up. Instead of freeing things like IPAddr objects, they could just be returned to a pool, and then when a new IPAddr object is needed, an already initialized object could be grabbed from the pool and 'refreshed' with the new value.<br>
<br>
<a href="https://golang.org/pkg/sync/#Pool" rel="noreferrer" target="_blank">https://golang.org/pkg/sync/#<wbr>Pool</a><br>
<br>
The Go sync.Pool documentation linked above describes that sort of thing.<br>
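<br>
The same idea as a small C++ sketch. ObjectPool and Addr128 are hypothetical names standing in for a pool and an IPAddr-like object. The pool is single-threaded and unbounded, which a real implementation would probably need to revisit.<br>

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an address object that cannot be interned.
struct Addr128 {
    std::array<uint8_t, 16> bytes{};
};

// Minimal free-list pool: released objects are kept and handed out
// again instead of being freed. Not thread-safe.
template <typename T>
class ObjectPool {
public:
    std::unique_ptr<T> acquire() {
        if (free_.empty())
            return std::unique_ptr<T>(new T());  // pool is empty: allocate
        std::unique_ptr<T> obj = std::move(free_.back());
        free_.pop_back();
        return obj;  // caller overwrites the stale contents
    }

    void release(std::unique_ptr<T> obj) {
        if (obj)
            free_.push_back(std::move(obj));
    }

    std::size_t idle() const { return free_.size(); }

private:
    std::vector<std::unique_ptr<T>> free_;
};
```

A caller would acquire() an object, overwrite its bytes with the new address, and release() it when done. After warm-up, the steady state does no heap allocation for these objects.<br>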
<span class=""><br>
> On a related note, a fun experiment is always to try running bro with a different allocator and seeing what happens ...<br>
<br>
</span>I recently noticed our boxes were using jemalloc instead of tcmalloc. Switching that caused malloc to drop a few places down in 'perf top' output.<br>
<br>
<br>
—<br>
<span class="HOEnZb"><font color="#888888">Justin Azoff<br>
<br>
<br>
</font></span></blockquote></div><br></div>
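<br>
The wider-arithmetic checksum idea raised at the top of this message can be sketched in C++ roughly as follows. This is neither the code in net_util.cc nor the code from the linked article, and the function names are made up for illustration. It relies on the property that the one's complement sum can be accumulated in wide words and folded down to 16 bits at the end, and it also handles odd byte counts.<br>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Fold a wide accumulator down to 16 bits with end-around carry.
static uint16_t fold_to_16(uint64_t sum) {
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return static_cast<uint16_t>(sum);
}

// Baseline: add one 16-bit big-endian word per iteration.
uint16_t checksum_by_16(const uint8_t* p, std::size_t len) {
    uint64_t sum = 0;
    for (; len >= 2; p += 2, len -= 2)
        sum += (static_cast<uint32_t>(p[0]) << 8) | p[1];
    if (len == 1)
        sum += static_cast<uint32_t>(p[0]) << 8;  // odd tail, zero padded
    return static_cast<uint16_t>(~fold_to_16(sum));
}

// Wider variant: add one 32-bit big-endian word per iteration into a
// 64-bit accumulator, which cannot overflow for any realistic packet,
// so carries are only folded once at the end.
uint16_t checksum_by_32(const uint8_t* p, std::size_t len) {
    uint64_t sum = 0;
    for (; len >= 4; p += 4, len -= 4)
        sum += (static_cast<uint32_t>(p[0]) << 24) |
               (static_cast<uint32_t>(p[1]) << 16) |
               (static_cast<uint32_t>(p[2]) << 8) | p[3];
    if (len >= 2) {
        sum += (static_cast<uint32_t>(p[0]) << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len == 1)
        sum += static_cast<uint32_t>(p[0]) << 8;  // odd tail, zero padded
    return static_cast<uint16_t>(~fold_to_16(sum));
}
```

For the example bytes 00 01 f2 03 f4 f5 f6 f7 from RFC 1071, both functions return 0x220d. The explicit shifts keep the sketch independent of host byte order. A tuned version would more likely load whole 64-bit words and byte-swap once at the end.<br>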