[Bro] raw bytes question

Thu Apr 16 12:36:24 PDT 2009

On Thu, Apr 16, 2009 at 13:44 -0500, Martin Holste wrote:

> This raises a question that I've been wondering since poring over the 1.4
> manual regarding how well Bro greps packets.  Specifically, the manual says
> that signatures are off by default and that the grepping is per-packet with
> no stream reassembly capabilities. 

Uh, does the manual really say that? Can you point me to where you
found these statements? 

The signature engine is not really "off by default". Rather (like most
functionality in Bro), it's only activated on demand, when your
configuration actually defines signatures. It's true that we
don't ship with many pre-built signatures[1]. But DPD, for example,
uses the ones in policy/sigs/dpd.bro, and they are activated once you
turn on DPD by loading dpd.bro.

Likewise, pattern matching *is* usually done stream-wise, not on
packets. More precisely, whenever Bro has reassembly enabled for a
particular connection, the pattern matching is performed after
reassembly. Pattern matching proceeds on individual packets only if
Bro does not reassemble a connection. Generally, you can tell Bro
pretty precisely which connections you want it to reassemble; by
default, it reassembles the *beginning* of all TCP connections, and
it then keeps the reassembler enabled for those for which it has
found a suitable application-layer protocol analyzer.
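
To illustrate why the distinction matters, here is a toy sketch in
Python. It is not Bro's implementation: the function names and the
example payload are made up for illustration, and it assumes the
segments arrive in order with no retransmissions, which a real
reassembler cannot assume.

```python
import re

def match_per_packet(pattern, packets):
    """Return True if any single packet payload matches on its own."""
    regex = re.compile(pattern)
    return any(regex.search(payload) for payload in packets)

def match_on_stream(pattern, packets):
    """Return True if the pattern matches the concatenated payload.

    Assumption: packets are already in sequence order, so plain
    concatenation stands in for TCP stream reassembly.
    """
    regex = re.compile(pattern)
    return regex.search(b"".join(packets)) is not None

# Hypothetical request whose interesting part straddles two segments.
packets = [b"GET /etc/pas", b"swd HTTP/1.0\r\n"]
pattern = rb"/etc/passwd"

per_packet_hit = match_per_packet(pattern, packets)
stream_hit = match_on_stream(pattern, packets)
print(per_packet_hit, stream_hit)
```

The per-packet matcher misses the pattern because neither segment
contains it completely, while the stream matcher finds it.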

For more details (including options to control matching), please see
this blog posting:

        http://blog.icir.org/2008/06/bro-signature-engine.html

>  It also appears that there's no particularly fancy pattern matching
>  engine under the hood, indicating that matching on full snaplengths
>  for many signatures produces high load. 

Likewise, I'm wondering where you got the impression that there's no
"fancy engine" (or what you'd consider a fancy one to look like :-).
There's a paper describing the internals of Bro's approach in more
detail if you are curious:

       http://www.icir.org/robin/papers/ccs03.ps

The paper also discusses various trade-offs in signature matching as
well as the difficulty of fairly comparing multiple engines against
each other. 

>  I haven't measured this myself, so I'm wondering if this is the
>  case.  Does anyone have any statistical (or anecdotal) evidence as
>  to how many sigs can run under a subnet with mostly web client
>  traffic?

The only systematic measurements I'm aware of are actually those in
the older CCS paper mentioned above. Most people seem to use Bro's
engine with only a small number of signatures, as it's usually
deployed as *support* for script-level analysis rather than as the
primary detection tool by itself. I remember one specific case in
which someone used a large number of signatures and had some
performance trouble initially; that, however, was solvable by tuning
the engine's options a bit.

Hope this helps,

Robin

[1] Ignoring the ancient ones converted from Snort which aren't
really useful anymore.

-- 
Robin Sommer * Phone +1 (510) 666-2886 * robin at icir.org 
ICSI/LBNL    * Fax   +1 (510) 666-2956 *   www.icir.org