From tarupp at fnal.gov  Thu Apr 16 09:13:22 2009
From: tarupp at fnal.gov (Tim Rupp)
Date: Thu, 16 Apr 2009 11:13:22 -0500
Subject: [Bro] raw bytes question
Message-ID: <49E75922.1080000@fnal.gov>

Hi list,

Is there an event I can hook that would allow me to do a regex on the
raw bytes of a packet if I knew the hex pattern of the bytes I want to
match?

-Tim


From robin at icir.org  Thu Apr 16 09:52:02 2009
From: robin at icir.org (Robin Sommer)
Date: Thu, 16 Apr 2009 09:52:02 -0700
Subject: [Bro] raw bytes question
In-Reply-To: <49E75922.1080000@fnal.gov>
References: <49E75922.1080000@fnal.gov>
Message-ID: <20090416165202.GF90672@icir.org>


On Thu, Apr 16, 2009 at 11:13 -0500, Tim Rupp wrote:

> Is there an event I can hook that would allow me to do a regex on the
> raw bytes of a packet if I knew the hex pattern of the bytes I want to
> match?

Per the mail I sent you earlier, that looks like a task for a
signature.
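
For readers who want to see the underlying idea outside of Bro, here is a
minimal Python sketch of matching a known hex pattern against raw payload
bytes. It is not Bro signature syntax; the helper name hex_to_regex, the hex
string, and the payload are made up for illustration.

```python
import re

def hex_to_regex(hex_pattern):
    """Compile a hex string such as 'de ad be ef' into a bytes regex."""
    raw = bytes.fromhex(hex_pattern)      # whitespace between bytes is allowed
    return re.compile(re.escape(raw))     # escape so every byte matches literally

# Hypothetical payload with the pattern starting at byte offset 2.
payload = b"\x00\x01\xde\xad\xbe\xef\x02"
match = hex_to_regex("de ad be ef").search(payload)
offset = match.start() if match else -1
```

Escaping the bytes matters because some byte values (0x2e is ".", for
example) are regex metacharacters.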

Robin

-- 
Robin Sommer * Phone +1 (510) 666-2886 * robin at icir.org 
ICSI/LBNL    * Fax   +1 (510) 666-2956 *   www.icir.org


From hall.692 at osu.edu  Thu Apr 16 10:19:07 2009
From: hall.692 at osu.edu (Seth Hall)
Date: Thu, 16 Apr 2009 13:19:07 -0400
Subject: [Bro] raw bytes question
In-Reply-To: <49E75922.1080000@fnal.gov>
References: <49E75922.1080000@fnal.gov>
Message-ID: <8E5A96C9-F787-44DA-8D73-4D50132A8E90@osu.edu>

Hi Tim,

On Apr 16, 2009, at 12:13 PM, Tim Rupp wrote:

> Is there an event I can hook that would allow me to do a regex on the
> raw bytes of a packet if I knew the hex pattern of the bytes I want to
> match?


If you want an example of working with signatures and policy script, I  
went ahead and added a script for detecting SSN leakage that works by  
having a signature that is subsequently handled in policy script.  It  
uses a list of known US SSNs for your organization and filters out  
false positives by using that list.  We've caught quite a few minor  
violations with this script since we started running it.

Here's the policy script:
   http://github.com/sethhall/bro_scripts/blob/819d078ad9cf59d9f594f2682fcd6d3c8b89d6ad/ssn-exposure.bro

The corresponding signature definition file is here:
   http://github.com/sethhall/bro_scripts/blob/819d078ad9cf59d9f594f2682fcd6d3c8b89d6ad/ssn.sig

Let me know if you have any problems understanding what's happening  
between the signature definition and the policy script.  That simple  
interaction is a little muddied by the rest of the script.

   .Seth

---
Seth Hall
Network Security - Office of the CIO
The Ohio State University
Phone: 614-292-9721


From mcholste at gmail.com  Thu Apr 16 11:44:01 2009
From: mcholste at gmail.com (Martin Holste)
Date: Thu, 16 Apr 2009 13:44:01 -0500
Subject: [Bro] raw bytes question
In-Reply-To: <8E5A96C9-F787-44DA-8D73-4D50132A8E90@osu.edu>
References: <49E75922.1080000@fnal.gov>
	<8E5A96C9-F787-44DA-8D73-4D50132A8E90@osu.edu>
Message-ID: <df9074860904161144v9c73fcbyf0203cdf1703fb71@mail.gmail.com>

This raises a question that I've been wondering about since poring over the
1.4 manual regarding how well Bro greps packets.  Specifically, the manual says
that signatures are off by default and that the grepping is per-packet with
no stream reassembly capabilities.  It also appears that there's no
particularly fancy pattern matching engine under the hood, indicating that
matching on full snaplengths for many signatures produces high load.  I
haven't measured this myself, so I'm wondering if this is the case.  Does
anyone have any statistical (or anecdotal) evidence as to how many sigs can
run under a subnet with mostly web client traffic?

Thanks,

Martin


From robin at icir.org  Thu Apr 16 12:36:24 2009
From: robin at icir.org (Robin Sommer)
Date: Thu, 16 Apr 2009 12:36:24 -0700
Subject: [Bro] raw bytes question
In-Reply-To: <df9074860904161144v9c73fcbyf0203cdf1703fb71@mail.gmail.com>
References: <49E75922.1080000@fnal.gov>
	<8E5A96C9-F787-44DA-8D73-4D50132A8E90@osu.edu>
	<df9074860904161144v9c73fcbyf0203cdf1703fb71@mail.gmail.com>
Message-ID: <20090416193624.GI90672@icir.org>


On Thu, Apr 16, 2009 at 13:44 -0500, Martin Holste wrote:

> This raises a question that I've been wondering about since poring over the
> 1.4 manual regarding how well Bro greps packets.  Specifically, the manual says
> that signatures are off by default and that the grepping is per-packet with
> no stream reassembly capabilities. 

Uh, does the manual really say that? Can you point me to where you
found these statements? 

The signature engine is not really "off by default". Rather (like most
functionality in Bro), it's only activated on demand when your
configuration actually defines any signatures. It's true that we
don't ship with many pre-built signatures[1]. But DPD for example
uses those in policy/sigs/dpd.sig, and they are activated once you
turn on DPD by loading dpd.bro.

Likewise, pattern matching *is* usually done stream-wise, not on
packets. More precisely, whenever Bro has reassembly enabled for a
particular connection, the pattern matching is performed after
reassembly. Only if Bro does not reassemble a connection does
pattern matching proceed on individual packets. Generally, you can tell Bro
pretty precisely which connections you want it to reassemble; by
default, it reassembles the *beginning* of all TCP connections, and
it then keeps the reassembler enabled for those for which it has
found a suitable application-layer protocol analyzer. 
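
To illustrate why the distinction matters, here is a small Python sketch
(not Bro code) in which a byte pattern straddles two TCP segments. Matching
each packet on its own misses it, while matching the reassembled stream
finds it. The segments, the pattern, and the limit parameter (standing in
for "reassemble only the beginning") are assumptions for illustration.

```python
import re

PATTERN = re.compile(rb"\xde\xad\xbe\xef")

# Hypothetical in-order TCP segments; the pattern is split across them.
segments = [b"GET /\xde\xad", b"\xbe\xef HTTP/1.0\r\n"]

def match_per_packet(segs):
    """Look at every segment in isolation."""
    return any(PATTERN.search(s) for s in segs)

def match_reassembled(segs, limit=1024):
    """Concatenate the segments and match on the first `limit` bytes."""
    stream = b"".join(segs)[:limit]
    return PATTERN.search(stream) is not None

per_packet = match_per_packet(segments)
reassembled = match_reassembled(segments)
truncated = match_reassembled(segments, limit=6)
```

Here per_packet is False and reassembled is True. With limit=6 the match is
lost again, which is the trade-off of reassembling only the start of a
connection.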

For more details (including options to control matching), please see
this blog posting:

        http://blog.icir.org/2008/06/bro-signature-engine.html

>  It also appears that there's no particularly fancy pattern matching
>  engine under the hood, indicating that matching on full snaplengths
>  for many signatures produces high load. 

Likewise, I'm wondering where you got the impression that there's no
"fancy engine" (or what you'd consider a fancy one to look like :-).
There's a paper describing the internals of Bro's approach in more
detail if you are curious:

       http://www.icir.org/robin/papers/ccs03.ps
       
The paper also discusses various trade-offs in signature matching as
well as the difficulty of fairly comparing multiple engines against
each other. 

>  I haven't measured this myself, so I'm wondering if this is the
>  case.  Does anyone have any statistical (or anecdotal) evidence as
>  to how many sigs can run under a subnet with mostly web client
>  traffic?

The only systematic measurements I'm aware of are actually those in
the older CCS paper mentioned above. Most people seem to use Bro's
engine with only a small number of signatures, as it's usually
deployed as *support* for script-level analysis rather than as the
primary detection tool by itself. I remember one specific case in
which someone used a large number of signatures and had some
performance trouble initially; that however was solvable by tuning
the engine's options a bit.

Hope this helps,

Robin

[1] Ignoring the ancient ones converted from Snort which aren't
really useful anymore.

-- 
Robin Sommer * Phone +1 (510) 666-2886 * robin at icir.org 
ICSI/LBNL    * Fax   +1 (510) 666-2956 *   www.icir.org


From Tyler.Schoenke at colorado.edu  Thu Apr 16 15:01:31 2009
From: Tyler.Schoenke at colorado.edu (Tyler T. Schoenke)
Date: Thu, 16 Apr 2009 16:01:31 -0600
Subject: [Bro] Adapting packet filter in stand-alone cluster
Message-ID: <49E7AABB.8010202@colorado.edu>

I am getting started with Bro, and am using Robin's 1.4 stand-alone 
cluster branch.  I was trying to detect some IRC traffic using DPD, but 
realized that it was being filtered out.  The Workshop 2009 materials
mention adapting the packet filter by adding -f "tcp".  I tried
that, tested it on my pcap file, and it worked.  How do I enable/disable 
the -f "tcp" option in the cluster configuration?

Tyler

--
Tyler Schoenke
IT Security Office
University of Colorado - Boulder


From mcholste at gmail.com  Thu Apr 16 16:00:35 2009
From: mcholste at gmail.com (Martin Holste)
Date: Thu, 16 Apr 2009 18:00:35 -0500
Subject: [Bro] raw bytes question
In-Reply-To: <20090416193624.GI90672@icir.org>
References: <49E75922.1080000@fnal.gov>
	<8E5A96C9-F787-44DA-8D73-4D50132A8E90@osu.edu>
	<df9074860904161144v9c73fcbyf0203cdf1703fb71@mail.gmail.com>
	<20090416193624.GI90672@icir.org>
Message-ID: <df9074860904161600j6b5a121p71a2aa97b97224ea@mail.gmail.com>

Robin,

Thanks for the quick reply.  The "off by default" comment comes from section
7.6.1 of the user manual which states "Signature matching is off by
default."  I understand that Bro's emphasis (and therefore distinction from
its competition) is that it relies as little as possible on signature
matching.  So much so that my concern as a newcomer to Bro is that signature
matching is de-emphasized enough that it could suffer in performance.

For stream reassembly, I worded my question poorly.  The blog post you
mentioned (which was what I was thinking of when I wrote the questions)
states that reassembly is only done on the first 1K of streams.  So, I
(perhaps unreasonably) do not consider that reassembly because I am very
regularly interested in the 1K-2K range of a stream.

I read the CCS paper (though it's rather old!) and I think I now have a much
better idea of what the internal sig matching engine uses, namely a DFA (or at
least that's what it used to use).  I'm wondering how this compares with the
Aho-Corasick NFA implementation of simple (non-regexp) string matching a la
Snort, both in performance and memory consumption.  I'd also be interested
in comparisons on CPU cache efficiency.

Thanks,

Martin


From hall.692 at osu.edu  Thu Apr 16 19:39:42 2009
From: hall.692 at osu.edu (Seth Hall)
Date: Thu, 16 Apr 2009 22:39:42 -0400
Subject: [Bro] Adapting packet filter in stand-alone cluster
In-Reply-To: <49E7AABB.8010202@colorado.edu>
References: <49E7AABB.8010202@colorado.edu>
Message-ID: <FBFFA685-8D09-4E1F-A7C2-710497F91CDF@osu.edu>


On Apr 16, 2009, at 6:01 PM, Tyler T. Schoenke wrote:

>   How do I enable/disable the -f "tcp" option in the cluster  
> configuration?

You can do it from your policy script.

In policy/local/local.bro (assuming you're using everything as it  
ships)...

redef capture_filters = { ["all-ip-packets"] = "ip or ip6" };

   .Seth

---
Seth Hall
Network Security - Office of the CIO
The Ohio State University
Phone: 614-292-9721


From robin at icir.org  Fri Apr 17 09:46:13 2009
From: robin at icir.org (Robin Sommer)
Date: Fri, 17 Apr 2009 09:46:13 -0700
Subject: [Bro] raw bytes question
In-Reply-To: <df9074860904161600j6b5a121p71a2aa97b97224ea@mail.gmail.com>
References: <49E75922.1080000@fnal.gov>
	<8E5A96C9-F787-44DA-8D73-4D50132A8E90@osu.edu>
	<df9074860904161144v9c73fcbyf0203cdf1703fb71@mail.gmail.com>
	<20090416193624.GI90672@icir.org>
	<df9074860904161600j6b5a121p71a2aa97b97224ea@mail.gmail.com>
Message-ID: <20090417164613.GD47247@icir.org>


On Thu, Apr 16, 2009 at 18:00 -0500, Martin Holste wrote:

> Thanks for the quick reply.  The "off by default" comment comes from section
> 7.6.1 of the user manual which states "Signature matching is off by
> default." 

I see. That paragraph is actually not referring to the signature
engine itself but to the set of
Snort-converted-and-further-augmented signatures that were shipped
as part of the Bro-Lite environment (which is technically still
there but hasn't been maintained for years and will be removed
soon.) But I see how that can be confusing; the text doesn't really
make that distinction clear.

> states that reassembly is only done on the first 1K of streams.  So, I
> (perhaps unreasonably) do not consider that reassembly because I am very
> regularly interested in the 1K-2K range of a stream.

Well, I'd call it "reassembly of the first 1K". As I wrote in the
mail and in the blog posting, that's all configurable. Different
people require different trade-offs.

> least that's what it used to use).  I'm wondering how this compares with the
> Aho-Corasick NFA implementation of simple (non-regexp) string matching a la
> Snort, both in performance and memory consumption. 

The paper actually compares with Snort, though with the Snort of
2003. I can't comment on any recent versions. 

>  I'd also be interested in comparisons on CPU cache efficiency.

That is an interesting question indeed.

Robin

-- 
Robin Sommer * Phone +1 (510) 666-2886 * robin at icir.org 
ICSI/LBNL    * Fax   +1 (510) 666-2956 *   www.icir.org


From gurvindersinghdahiya at gmail.com  Thu Apr 23 03:32:33 2009
From: gurvindersinghdahiya at gmail.com (Gurvinder Singh Dahiya)
Date: Thu, 23 Apr 2009 06:32:33 -0400
Subject: [Bro] How to calculate RTT
Message-ID: <66da81d60904230332k1789e6f8r835b509a4c2edde4@mail.gmail.com>

Hi,

I am new to the Bro IDS and am working on a project on TCP latency behavior,
with the goal of learning about the communicating peer. I tried to implement
my algorithm in Bro, but I am stuck on calculating the RTT of a connection. I
tried starting a timer in conn.bro in the new_connection event and then
computing the RTT in the connection_established event, but that did not work.
Can anybody point me in the right direction?

I would appreciate any help.

- Gurvinder Singh

From martin.arlitt at ucalgary.ca  Thu Apr 23 11:58:42 2009
From: martin.arlitt at ucalgary.ca (Martin Arlitt)
Date: Thu, 23 Apr 2009 12:58:42 -0600
Subject: [Bro] How to calculate RTT
In-Reply-To: <66da81d60904230332k1789e6f8r835b509a4c2edde4@mail.gmail.com>
References: <66da81d60904230332k1789e6f8r835b509a4c2edde4@mail.gmail.com>
Message-ID: <49F0BA62.4000205@ucalgary.ca>

Hi Gurvinder,

My colleagues and I examined characteristics like RTT back in 2005.  Our
scripts are available from:

http://www.bro-ids.org/bro-contrib/network-analysis/akm-imc05/

Please note that these scripts will not run on current versions of Bro,
but you should be able to estimate RTT in a similar manner.  The
particular issues I can think of are: these scripts were developed on an
earlier version of Bro that used ALERT to generate messages, while
current versions of Bro use NOTICE; and you may need to explicitly set
"redef use_connection_compressor=F;".  The README file at the above
location contains references to the papers we wrote using the
data collected with these scripts.  You may find those useful as well.
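
As a language-neutral illustration, one common way to estimate RTT
passively is to use the timestamps of the TCP three-way handshake as seen
at the monitoring point. The Python sketch below shows only the arithmetic;
it is not taken from the scripts above, and the function name and
timestamps are made up.

```python
def handshake_rtts(t_syn, t_synack, t_ack):
    """Split the handshake time seen at a monitor into its two legs.

    All arguments are packet timestamps in seconds, as observed by a
    monitor sitting somewhere between originator and responder.
    """
    to_responder = t_synack - t_syn    # monitor <-> responder round trip
    to_originator = t_ack - t_synack   # monitor <-> originator round trip
    return to_responder, to_originator, to_responder + to_originator

# Hypothetical timestamps: SYN, SYN-ACK, and final ACK of one connection.
resp_leg, orig_leg, total = handshake_rtts(10.0, 10.25, 10.5)
```

Each leg includes the end host's processing delay, so these values are
upper bounds on the pure network round-trip time.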

Martin
