[Bro] raw bytes question

Martin Holste mcholste at gmail.com
Thu Apr 16 16:00:35 PDT 2009


Robin,

Thanks for the quick reply.  The "off by default" comment comes from section
7.6.1 of the user manual, which states "Signature matching is off by
default."  I understand that Bro's emphasis (and therefore its distinction
from its competition) is that it relies as little as possible on signature
matching.  My concern as a newcomer to Bro is that signature matching is
de-emphasized to the point that its performance could suffer.

For stream reassembly, I worded my question poorly.  The blog post you
mentioned (which was what I was thinking of when I wrote the questions)
states that reassembly is only done on the first 1K of streams.  So, I
(perhaps unreasonably) do not consider that reassembly because I am very
regularly interested in the 1K-2K range of a stream.

I read the CCS paper (though it's rather old!) and I think I now have a much
better idea of what the internal sig matching engine uses, namely DFA (or at
least that's what it used to use).  I'm wondering how this compares with the
Aho-Corasick NFA implementation of simple (non-regexp) string matching a la
Snort, both in performance and memory consumption.  I'd also be interested
in comparisons on CPU cache efficiency.
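
To make the comparison concrete, here is a minimal sketch of the
failure-link form of Aho-Corasick for fixed byte strings.  It is an
illustration only, not Bro's or Snort's actual code; the class name, the
feed() method and the sample patterns are all made up for this example.
The matcher stores only the trie edges plus one failure link per state,
which keeps memory low but can cost several failure steps on a single
input byte, whereas a fully expanded DFA does exactly one table lookup per
byte at the price of a larger transition table.

```python
from collections import deque


class AhoCorasick:
    """Multi-pattern matcher over bytes (hypothetical illustration)."""

    def __init__(self, patterns):
        self.goto = [{}]   # per-state trie edges: byte value -> next state
        self.fail = [0]    # per-state failure link
        self.out = [[]]    # per-state list of patterns that end here
        for pattern in patterns:
            self._add(pattern)
        self._build_failure_links()

    def _add(self, pattern):
        state = 0
        for byte in pattern:
            nxt = self.goto[state].get(byte)
            if nxt is None:
                nxt = len(self.goto)
                self.goto.append({})
                self.fail.append(0)
                self.out.append([])
                self.goto[state][byte] = nxt
            state = nxt
        self.out[state].append(pattern)

    def _build_failure_links(self):
        # Breadth-first, so shallower states are finished before deeper ones.
        queue = deque(self.goto[0].values())
        while queue:
            state = queue.popleft()
            for byte, nxt in self.goto[state].items():
                queue.append(nxt)
                f = self.fail[state]
                while f and byte not in self.goto[f]:
                    f = self.fail[f]
                self.fail[nxt] = self.goto[f].get(byte, 0)
                # Inherit patterns that end at the failure target.
                self.out[nxt] = self.out[nxt] + self.out[self.fail[nxt]]

    def feed(self, data, state=0):
        """Scan one chunk; return (patterns found, state to resume from)."""
        hits = []
        for byte in data:
            while state and byte not in self.goto[state]:
                state = self.fail[state]
            state = self.goto[state].get(byte, 0)
            hits.extend(self.out[state])
        return hits, state


matcher = AhoCorasick([b"cmd.exe", b"/etc/passwd", b"exe"])
chunk1, chunk2 = b"GET /scripts/cmd.", b"exe HTTP/1.0"

# Stream-wise: carry the automaton state across the chunk boundary.
hits1, state = matcher.feed(chunk1)
hits2, state = matcher.feed(chunk2, state)
stream_hits = hits1 + hits2

# Packet-wise: restart from the initial state for every chunk.
packet_hits = matcher.feed(chunk1)[0] + matcher.feed(chunk2)[0]
```

With the state carried over, the pattern split across the two chunks is
found; with a reset per chunk, only the short pattern that lies entirely
inside the second chunk is.  That is the same stream-versus-packet
difference I was asking about above.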

Thanks,

Martin

On Thu, Apr 16, 2009 at 2:36 PM, Robin Sommer <robin at icir.org> wrote:

>
> On Thu, Apr 16, 2009 at 13:44 -0500, Martin Holste wrote:
>
> > This raises a question that I've been wondering about since poring over
> > the 1.4 manual regarding how well Bro greps packets.  Specifically, the
> > manual says that signatures are off by default and that the grepping is
> > per-packet with no stream reassembly capabilities.
>
> Uh, does the manual really say that? Can you point me to where you
> found these statements?
>
> The signature engine is not really "off by default". Rather (like most
> functionality in Bro), it's only activated on demand when your
> configuration actually defines any signatures. It's true that we
> don't ship with many pre-built signatures[1]. But DPD for example
> uses those in policy/sigs/dpd.bro, and they are activated once you
> turn on DPD by loading dpd.bro.
>
> Likewise, pattern matching *is* usually done stream-wise, not on
> packets. More precisely, whenever Bro has reassembly enabled for a
> particular connection, the pattern matching is performed after
> reassembly. Only if Bro does not reassemble a connection does
> pattern matching proceed on packets. Generally, you can tell Bro
> pretty precisely which connections you want it to reassemble; by
> default, it reassembles the *beginning* of all TCP connections, and
> it then keeps the reassembler enabled for those for which it has
> found a suitable application-layer protocol analyzer.
>
> For more details (including options to control matching), please see
> this blog posting:
>
>        http://blog.icir.org/2008/06/bro-signature-engine.html
>
> >  It also appears that there's no particularly fancy pattern matching
> >  engine under the hood, indicating that matching on full snaplengths
> >  for many signatures produces high load.
>
> Likewise, I'm wondering where you got the impression that there's no
> "fancy engine" (or what you'd consider a fancy one to look like :-).
> There's a paper describing the internals of Bro's approach in more
> detail if you are curious:
>
>       http://www.icir.org/robin/papers/ccs03.ps
>
> The paper also discusses various trade-offs in signature matching as
> well as the difficulty of fairly comparing multiple engines against
> each other.
>
> >  I haven't measured this myself, so I'm wondering if this is the
> >  case.  Does anyone have any statistical (or anecdotal) evidence as
> >  to how many sigs can run under a subnet with mostly web client
> >  traffic?
>
> The only systematic measurements I'm aware of are actually those in
> the older CCS paper mentioned above. Most people seem to use Bro's
> engine with only a small number of signatures, as it's usually
> deployed as *support* for script-level analysis rather than as the
> primary detection tool by itself. I remember one specific case in
> which someone used a large number of signatures and had some
> performance trouble initially; that however was solvable by tuning
> the engine's options a bit.
>
> Hope this helps,
>
> Robin
>
> [1] Ignoring the ancient ones converted from Snort which aren't
> really useful anymore.
>
> --
> Robin Sommer * Phone +1 (510) 666-2886 * robin at icir.org
> ICSI/LBNL    * Fax   +1 (510) 666-2956 *   www.icir.org
>