[Bro] Data truncated in signature_match and misc questions

Lorenzo Cavallaro sullivan at cs.ucsb.edu
Thu May 29 19:10:08 PDT 2008

Hi all,

   I've been playing with Bro a little and ran into some issues. I
   don't know whether these are bugs or things I'm not doing the
   proper way. Maybe you guys can help me out :-)

   1. Basically, I'm trying to do something (apparently :-)) very
      simple: match any stream that's carrying a sequence of bytes
      of length X.  For simplicity, let's say that I just want to match
      any stream which contains at least AAAA.
      Stream reassembly is very important for me, but I suppose Bro
      takes care of it when matching against signatures.

      I'm aware the data argument passed to the signature_match event
      handler should contain the part of the data that matched... and
      that's where things got weird (I would have preferred to rely
      on signature_match events rather than digging into the

      Consider this signature:

         signature test-AAAA {
            event "sig-AAAA"
            payload /.*AAAA/
         }

      and this policy file for the signature_match event handler:

         @load signatures
         #@load print-filter

         event signature_match(state: signature_state, msg: string, data: string)
            {
            # Report which signature fired and what Bro handed over as data.
            print fmt("[+] signature_match(%s) called", msg);
            print fmt("payload length: %d", byte_len(data));
            print fmt("payload (first 400 bytes): %s", sub_bytes(data, 0, 400));
            }

      The output I got is the following:

      [+] signature_match(sig-AAAA) called
      payload length: 153
      payload (first 400 bytes): HTTP/1.1 404 Not Found^M^JDate: Wed, 28
      May 2008 22:07:28 GMT^M^JServer: Apache^M^JContent-Length:
      270^M^JConnection: close^M^JContent-Type: text/html...

      I don't see any AAAA in there... even though that's of course the
      payload which triggered the signature (as tcpdump shows as well --
      not included here).

      The point is that I'd like to extract any matching pattern from the
      payload which triggered the signature. Once the pattern is extracted
      I'd have to iterate over each element of the string and do
      something with it.

      This was a dead end to me (but I'm surely missing some point,

      I also tried with a payload of /.*A{4}/ and /.*[A]{4}/ as I wanted
      to check whether the metacharacters {} worked properly or not.  It
      turned out they are ok here (signatures) but they don't work, for
      instance, with gsub.

   2. Does tcp_contents reassemble flows (I don't think so)? I'd use
      tcp_contents right away, but I just want to be sure I don't end
      up with a split matching payload (e.g., AA in one TCP segment
      and the next AA in the following one). That's why I wanted to go
      with the signature thing, as this should be taken care of
      automatically by Bro. If the signature approach doesn't work out,
      tho, I'll have to reassemble packets myself, but that seems like
      a waste of time as Bro surely does it (or not?).

   3. I'm not able to see packets that are generated by the same host
      Bro is running on. Is this normal behavior (performance tuning)?
      If so, is there a way to disable it just for testing purposes?

      I double-checked that the filters were right, of course :-). I ran
      Bro with -f 'tcp' (I'm not concerned about UDP right now, even tho
      I'll consider it later on). Also, I played with capture_filters
      and restrict_filters variables either by refining or redefining

      Just to be sure, I loaded print-filter to re-check that the
      capture filter was indeed the one I intended. It was (tcp).
      Still, I'm not able to get traffic that's sent by the same host
      Bro is running on (I've a very basic configuration: only one
      interface, eth0, and localnets is set properly with just one
      local net addr, having just one physical net device).

   4. Regexes behave weirdly. It seems that the {} notation, especially
      when used in conjunction with [^], sometimes works and sometimes
      doesn't.  For instance, it doesn't work with gsub (if I didn't
      screw anything up, of course). Any ideas? For instance, something
      like:

         local tmp = gsub(payload, /[^A]{4}/, " ");

      doesn't work while the {} metachars worked for signature matching.
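
   To make item 4 concrete, here is a small sketch in plain Python
   (not Bro) of the behaviour I'd expect from the gsub call above. The
   sample payload and the helper name replace_non_a_runs are made up
   for illustration, and Python's re.sub is only assumed to be a
   reasonable stand-in for what gsub is meant to do:

```python
import re

# Hypothetical sample payload, made up for illustration only.
payload = "xyzwAAAAbc"


def replace_non_a_runs(text: str) -> str:
    """Replace every run of exactly four non-'A' characters with one space."""
    return re.sub(r"[^A]{4}", " ", text)


# "xyzw" is four non-'A' characters, so it collapses to a single space;
# "AAAA" is left alone, and "bc" is too short to match {4}.
print(repr(replace_non_a_runs(payload)))
```

   That is, I'd expect each group of four non-A bytes to be replaced
   by one space, with everything else left untouched.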

   I know, lots of questions :-)

TIA, bye
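
P.S. To show what I mean by a split payload in question 2, here is a
rough sketch in plain Python (not Bro). The segment contents and the
function names are hypothetical, and it assumes segments arrive in
order with no retransmissions, which is exactly the part I'd rather
leave to Bro's reassembler:

```python
import re

PATTERN = re.compile(rb"AAAA")
OVERLAP = 3  # length of the pattern minus one, for this fixed-length pattern


def match_per_segment(segments):
    """Search each segment on its own; misses matches that span two segments."""
    return any(PATTERN.search(seg) for seg in segments)


def match_with_overlap(segments):
    """Carry the last OVERLAP bytes over so a match spanning a boundary is seen."""
    tail = b""
    for seg in segments:
        buf = tail + seg
        if PATTERN.search(buf):
            return True
        tail = buf[-OVERLAP:]
    return False


# Hypothetical in-order segments: "AA" ends the first, "AA" starts the second.
segments = [b"GET /xxAA", b"AAyy HTTP/1.1\r\n"]

print(match_per_segment(segments))
print(match_with_overlap(segments))
```

The per-segment search misses the match, while the version that keeps
a small tail between segments finds it. That bookkeeping is what I'd
have to write by hand if tcp_contents doesn't give me reassembled data.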

