[Bro] File Scanning Capability

Seth Hall seth at icir.org
Mon Mar 21 11:49:13 PDT 2011

On Mar 21, 2011, at 2:16 PM, Will wrote:

> I will without a doubt eventually incorporate "http-ext-identified-files.sig" instead of what I am currently using, but I am having trouble determining where to integrate the logic for handling each file type. As it currently works, I am saving off every pdf and word doc, which would be unnecessary if I used bro to call the external tools and evaluate the results. 

That won't actually work quite right.  The http-ext-identified-files.sig file uses special signature keywords that the http analyzer provides to detect file types.  It's not directly applicable to SMTP/MIME transfers.

> Current logic (this method calls for the external tools to be run against the directory by cron and are independent of Bro):
>         hot_attachment_dump_fh = open( hot_attachment_dumpname );
>         write_file(hot_attachment_dump_fh, data);
>         close(hot_attachment_dump_fh);

In what event are you currently running this code?

> The scan for office docs would be similar, but use 'OfficeMalScanner' instead of pdfid.py and pdf-parser.py. If I get this to work, I would like to do something very similar with http files.

Makes sense.

> How can I call the external tools?  Is this the right place to be doing this?

You can't currently do this in a way that would be feasible on live traffic.  The problem is that the call to the external tool would block Bro and cause it to start dropping packets.  There is a "when" statement that can help build asynchronous function calls, though: the stack state is saved and used again when the function call returns.  I don't know whether the system() function (I think this is what you're looking for to run external programs) can be used with the when statement, though.

If you are only looking to run this on tracefiles for now, though, you can certainly just use the system function to call your external tool.  It takes a single argument (a string) that is the command line you'd like to run.  There is also a function named str_shell_escape for defanging data, if you need to take something off the line and use it in the command line.  You do need to make sure that the data defanged with str_shell_escape is placed within double-quotes.
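
As a rough sketch of the tracefile approach, something along these lines should work.  The function name scan_pdf and the assumption that pdfid.py is on the PATH are only for illustration, and the file is assumed to have been written to disk already.  The only Bro functions used are system, fmt and str_shell_escape.

```bro
# Hypothetical helper: run pdfid.py against a file already saved to disk.
# Only suitable for tracefiles, since system() blocks Bro while the tool runs.
function scan_pdf(fname: string)
    {
    # Defang the file name, then place it inside double quotes
    # in the command line that gets handed to the shell.
    local cmd = fmt("pdfid.py \"%s\"", str_shell_escape(fname));
    system(cmd);
    }
```

You would call this right after the close() in your current code, passing in hot_attachment_dumpname.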

> I would be surprised if this capability doesn't already exist and suppose I might be going about this all wrong. I would just prefer to incorporate the file scans in Bro vice running them completely independently. If I wasn't clear or am completely out in left field feel free to be honest. I won't be offended.

Nope, not out in left field at all and personally I'm a bit ashamed I never wrote a mime-ext.bro script that was a bit more capable like the http-ext script.  I'm going to be rewriting the mime.bro script for the next release though and it will definitely have file extraction and identification capabilities built into it.  However, we are going to be working toward a much more generalized notion of files for some future release of Bro.  I've worked a bit on how that may proceed, but unfortunately we definitely won't be anywhere close to ready with that for the next release.


Seth Hall
International Computer Science Institute
(Bro) because everyone has a network
