[Bro] crash with std::bad_alloc

Robin Sommer robin at icir.org
Thu Nov 6 23:10:35 PST 2008

On Thu, Nov 06, 2008 at 12:23 +0100, you wrote:

> Memory: total=3126520K total_adj=3116888K malloced: 2878549K

Yeah, that's a lot ... 

> ".*byte_seq1.*byte_seq2.*byte_seq3.*"

I'm guessing that these are indeed the problem, assuming there's no
leak somewhere.  Having lots of such patterns is essentially the
worst case for a DFA-based pattern matcher (recall that Bro
internally combines many of these into *one* regexp, which makes
the number of states explode).
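
A toy sketch of that explosion, not Bro's actual matcher: it assumes
each pattern has the form .*s1.*s2.*s3.* with single, distinct symbols
standing in for the byte sequences, and that the combined automaton
has to track how far every pattern has progressed. The names
make_patterns and combined_dfa_states are made up for illustration.

```python
from collections import deque


def make_patterns(n, components=3):
    """Build n toy patterns, each a list of distinct symbols (i, j)."""
    return [[(i, j) for j in range(components)] for i in range(n)]


def combined_dfa_states(patterns):
    """Count reachable states when all patterns are matched by one automaton.

    A state is a tuple holding, per pattern, how many of its components
    have been seen so far (in order).
    """
    alphabet = {sym for p in patterns for sym in p}
    start = tuple(0 for _ in patterns)
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for sym in alphabet:
            # A pattern advances only if sym is the component it waits for.
            nxt = tuple(
                pos + 1 if pos < len(p) and p[pos] == sym else pos
                for p, pos in zip(patterns, state)
            )
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen)


if __name__ == "__main__":
    for n in (1, 2, 5):
        print(n, combined_dfa_states(make_patterns(n)))
```

Under these assumptions n patterns with three components each give
4^n states: 4 for one pattern, 16 for two, 1024 for five. Real byte
sequences and overlapping patterns behave differently in detail, but
the growth is the point.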

Three things you could try:

(a) there is a tuning option for the signature engine which tells
Bro how many regexps to combine internally into Big Ones. It's
called sig_max_group_size and the default is 50. It might help to
reduce this quite a bit (e.g., 10 or 20). 

(b) you could split each signature into several, one for each
component of the regexp (byte_seq1, byte_seq2, ...), and then either
chain these signatures with requires_signature conditions, or
raise an event for each one individually and correlate the matches
on the script-level to find out when all have matched. Both
approaches have the disadvantage that they don't consider the order
in which the subpatterns appear.

(c) this one is kind of scary. :) There's a configure option
--expire-dfa-states which enables some internal code to limit the
size of the DFAs Bro builds (by expiring less frequently used states
and recalculating them later if necessary). Enabling this has quite
a performance impact on the matching process, but even worse is
the fact that this option has most likely not been used by anybody
for >5 years ... I'd almost bet it's broken in some way but you can
still give it a try ...
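
For the correlation variant of (b), the bookkeeping could look roughly
like the sketch below. It is written in Python only to show the logic;
in Bro this would live in a script-level event handler. The class
name, the connection ids and the component names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical component names, one per split-out signature.
REQUIRED = frozenset({"byte_seq1", "byte_seq2", "byte_seq3"})


class MatchCorrelator:
    """Tracks per connection which sub-signatures have matched."""

    def __init__(self, required=REQUIRED):
        self.required = frozenset(required)
        self.seen = defaultdict(set)  # connection id -> components seen

    def on_match(self, conn_id, component):
        """Record one match; return True exactly once, when the last
        missing component for this connection arrives."""
        if component not in self.required:
            return False
        matched = self.seen[conn_id]
        if component in matched:
            return False  # duplicate match, nothing new
        matched.add(component)
        return matched == self.required


if __name__ == "__main__":
    c = MatchCorrelator()
    for comp in ("byte_seq3", "byte_seq1", "byte_seq2"):
        print(comp, c.on_match("conn-1", comp))
```

Note that the sketch fires for the order byte_seq3, byte_seq1,
byte_seq2 as well, which is exactly the loss of ordering mentioned
under (b).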


Robin Sommer * Phone +1 (510) 666-2886 * robin at icir.org 
ICSI/LBNL    * Fax   +1 (510) 666-2956 *   www.icir.org
