[Bro] File Scanning Capability

Will baxterw3232 at gmail.com
Mon Mar 21 11:16:39 PDT 2011

Hello again,

I was hoping to get some guidance on how to best use Bro to process email
files. My end goal is to strip out inbound email attachments, identify the
file type, then run a distinct set of external tools against them. Each file
type would have a different set or order of tools.
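
The "identify the type, then pick a tool chain" step could look roughly like the Python sketch below. It is only an illustration under assumptions: the function names `identify_type` and `tools_for` are hypothetical, the check looks only at leading magic bytes (PDF and legacy OLE2 Office files), and the tool lists simply reuse tool names from this message in an arbitrary order.

```python
# Magic numbers: PDF files normally start with "%PDF"; legacy Office
# documents (.doc, .xls, .ppt) start with the OLE2 container signature.
# Checking only the first bytes is a simplification.
PDF_MAGIC = b"%PDF"
OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")

# Hypothetical per-type tool chains: names only, order is illustrative.
TOOLCHAINS = {
    "pdf": ["clamscan", "pdfid.py", "ssdeep"],
    "office": ["clamscan", "OfficeMalScanner", "ssdeep"],
}


def identify_type(data):
    """Return "pdf", "office", or "unknown" based on the leading bytes."""
    if data.startswith(PDF_MAGIC):
        return "pdf"
    if data.startswith(OLE2_MAGIC):
        return "office"
    return "unknown"


def tools_for(data):
    """Return the ordered list of tool names for this attachment."""
    return TOOLCHAINS.get(identify_type(data), [])
```

Keeping the mapping in a plain dictionary means a new file type, or a different tool order, is a one-line change.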

I will without a doubt eventually incorporate
instead of what I am currently using, but I am having trouble determining
where to integrate the logic for handling each file type. As it currently
works, I am saving off every PDF and Word document, which would be unnecessary if
I used Bro to call the external tools and evaluate the results.

Current logic (with this method, the external tools are run against the
directory by cron, independently of Bro):
       # If the hot flag is set, dump the MIME-decoded attachment to its
       # own file for analysis.
       if ( session$entity_is_hot )
           {
           if ( session$entity_filename == hot_pdf_attachment_filenames )
               # Build the filename out of MD5, length and filename.
               hot_attachment_dumpname = fmt("dumped_pdf_files\/%s:%d:%s",
                   session$content_hash, length, session$entity_filename);
           if ( session$entity_filename == hot_word_attachment_filenames )
               hot_attachment_dumpname = fmt("dumped_doc_files\/%s:%d:%s",
                   session$content_hash, length, session$entity_filename);
           # Get a raw file handle (note open() instead of open_log_file()),
           # write the data out, and be sure to close the fh.
           hot_attachment_dump_fh = open(hot_attachment_dumpname);
           write_file(hot_attachment_dump_fh, data);
           close(hot_attachment_dump_fh);
           }
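
On the cron side, a script has to recover the hash, length and original filename from the dump name built above. A minimal Python sketch follows, assuming the `<md5>:<length>:<filename>` layout from the `fmt()` call; the helper names `parse_dump_name` and `pending_dumps` are hypothetical.

```python
import os


def parse_dump_name(name):
    """Split "<md5>:<length>:<filename>" into its three parts.

    maxsplit=2 keeps any ':' inside the original filename intact.
    """
    content_hash, length, filename = name.split(":", 2)
    return content_hash, int(length), filename


def pending_dumps(directory):
    """Yield (path, md5, length, filename) for each dumped attachment."""
    for name in sorted(os.listdir(directory)):
        try:
            content_hash, length, filename = parse_dump_name(name)
        except ValueError:
            continue  # not in the expected naming scheme; skip it
        yield os.path.join(directory, name), content_hash, length, filename
```

Because the hash is part of the name, the cron job can skip re-hashing each file before a hash lookup.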

What I would like to be able to do:

if ( session$entity_filename == hot_pdf_attachment_filenames )
    hot_attachment_dumpname = fmt("dumped_pdf_files\/%d:%s", length,
    hot_attachment_dump_fh = open( hot_attachment_dumpname );
    write_file(hot_attachment_dump_fh, data);
    scan_pdf_file(file)  #call the external tools

    # scan_pdf_file would include something like this:
    scanpdf.py (which would include clamscan, pdfid.py, cymruMHR,
ssdeep, etc.). The PDF Python script can pass the results back to Bro for

    if ( result == bad )
        delete file, carry on or log results somewhere then delete file
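
The scan-then-decide step in the outline above could be sketched in Python as follows. Everything here is an assumption for illustration: the function names, the injectable `runner`, the caller-supplied `is_bad` policy, and the choice to log the verdict and then delete the file in every case. It does not show how a result would be handed back to Bro.

```python
import os
import subprocess


def run_tool(command):
    """Run one external command and return its exit status."""
    return subprocess.run(command, capture_output=True).returncode


def scan_file(path, commands, is_bad, runner=run_tool):
    """Run each command against path; return True if any result is bad.

    commands: list of argument lists, e.g. [["clamscan"]]
    is_bad:   callable(command, exit_status) -> bool (site-specific policy)
    """
    bad = False
    for command in commands:
        status = runner(command + [path])
        if is_bad(command, status):
            bad = True
    return bad


def scan_and_clean(path, commands, is_bad, log, runner=run_tool):
    """Scan the file, record the verdict, then delete the dumped file."""
    bad = scan_file(path, commands, is_bad, runner)
    log.append((path, "bad" if bad else "clean"))
    os.remove(path)
    return bad
```

For example, a policy for clamscan could treat exit status 1 as bad, since clamscan documents that status as "virus(es) found"; other tools would need their own rule, most likely based on parsed output rather than exit status.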

The scan for Office documents would be similar, but would use 'OfficeMalScanner'
instead of pdfid.py and pdf-parser.py. If I get this to work, I would like
to do something very similar with HTTP files.

How can I call the external tools?  Is this the right place to be doing

I read in Robin's 'Advanced Scripting' presentation from the 2009 workshop
about injecting external information but am still confused how to do the

I would be surprised if this capability doesn't already exist, and I suppose I
might be going about this all wrong. I would just prefer to incorporate the
file scans in Bro rather than run them completely independently. If I wasn't
clear or am completely out in left field, feel free to be honest. I won't be

Thanks in advance!
