[Bro] Questions about Bro Capabilities

Reed Porada rporada at ll.mit.edu
Thu Oct 4 08:03:07 PDT 2007

On Oct 3, 2007, at 12:51 PM, Robin Sommer wrote:

> On Wed, Oct 03, 2007 at 08:26 -0700, Nicholas Weaver wrote:
>> For offline processing, do a two-pass approach.  In the first pass,
>> you use Bro to find the TG flows based on the higher-level attributes,
>> and write out the flow IDs.  For the second pass, only capture the
>> flows which don't correspond.
> Yeah, that was my thought too.  (This is an offline scheme, isn't it?)

Yes, this is an offline scheme at this point.

> If I understood your approach correctly, you depend on
> application-layer analysis to find "your" traffic. In that case,
> doing it in a single pass would likely miss packets because you
> might only be able to take the decision some way into the stream.

For HTTP, yes, we depend on the application layer to validate the  
session, as we otherwise have no good way to validate individual  

> At the same time it also sounds like you're always cutting out
> complete flows rather than just individual packets.  So, a two-pass,
> flow-based approach sounds indeed reasonable.

I believe that, in general, entire flows will be cut out, since if a
single packet is off it is hard to validate the rest as being ours.
However, we would still like to make an educated guess as to the
culprit packet if possible.
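
A minimal sketch of the two-pass, flow-based idea discussed above, in
Python rather than Bro script.  The Packet record, the flow_key helper,
and the hand-built "flagged" set are hypothetical stand-ins for whatever
a first Bro pass would actually write out; nothing here is Bro's API.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class Packet(NamedTuple):
    # Hypothetical, simplified packet record (not a Bro or pcap type).
    ts: float
    src: str
    sport: int
    dst: str
    dport: int


def flow_key(p: Packet) -> Tuple:
    """Direction-independent key so both halves of a connection match."""
    a = (p.src, p.sport)
    b = (p.dst, p.dport)
    return (a, b) if a <= b else (b, a)


def second_pass(packets: Iterable[Packet], flagged: Set[Tuple]) -> List[Packet]:
    """Keep only packets whose flow was NOT flagged in the first pass."""
    return [p for p in packets if flow_key(p) not in flagged]


packets = [
    Packet(1.0, "10.0.0.1", 1234, "10.0.0.2", 80),
    Packet(1.1, "10.0.0.2", 80, "10.0.0.1", 1234),  # reply, same flow
    Packet(1.2, "10.0.0.3", 5555, "10.0.0.2", 80),  # unrelated flow
]

# Stand-in for pass 1: assume application-layer analysis flagged the
# first connection as "ours".
flagged = {flow_key(packets[0])}

remaining = second_pass(packets, flagged)
```

Because the decision is made per flow, packets seen before the
application layer could decide anything are still removed in the second
pass, which a single pass could not guarantee.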

> Does this make any sense?

In general I understand what you and Nick have proposed, but I do not
know how to get the flow IDs out.  Are the http_request_stream$id's
unique?

One thing a co-worker suggested after looking at the output is that we
have a timestamp, src ip/port, and dst ip/port, which within a pcap is
generally sufficient for identifying a packet.  My guess as to why you
have not suggested this option is that the network_time() used in the
output does not relate to the stream.  Is there any way to get it to
correlate more closely with the stream?

I am also curious how to interpret the output from http-body.  What
does each printout from an http_entity_data event represent?  Is it a
new packet, or an update to the stream that could be the sum of an
arbitrary number of packets?
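
To make the timestamp plus src/dst ip/port idea concrete, here is a
small Python sketch (not Bro script).  The tuple layout and the sample
values are made up for illustration.  It only shows that an exact lookup
works when the logged time is a packet's own capture timestamp; whether
the time printed in the output is such a timestamp is exactly the open
question here.

```python
from collections import defaultdict


def index_packets(packets):
    """Map (ts, src, sport, dst, dport) to the packet positions in the trace."""
    index = defaultdict(list)
    for i, (ts, src, sport, dst, dport) in enumerate(packets):
        index[(ts, src, sport, dst, dport)].append(i)
    return index


# Hypothetical packets: (timestamp, src ip, src port, dst ip, dst port).
packets = [
    (100.000001, "10.0.0.1", 1234, "10.0.0.2", 80),
    (100.000250, "10.0.0.1", 1234, "10.0.0.2", 80),
    (100.000300, "10.0.0.2", 80, "10.0.0.1", 1234),
]
index = index_packets(packets)

# A logged time that equals a capture timestamp identifies one packet.
exact = index.get((100.000250, "10.0.0.1", 1234, "10.0.0.2", 80), [])

# A logged time taken later (e.g. when an event fired) matches nothing.
late = index.get((100.000400, "10.0.0.1", 1234, "10.0.0.2", 80), [])
```

If the logged times turn out not to line up with packet timestamps,
matching on the endpoints alone falls back to identifying the flow
rather than a single packet.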

Thanks again for your time and help,

