[Bro] File Scanning Capability

Will baxterw3232 at gmail.com
Mon Mar 21 13:34:13 PDT 2011


Thanks for that example Jim!

That gives me a bunch of other ideas. The best thing about this method
would be near-real-time scanning and notification, rather than running a
cron'd script at a given interval.
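
Here is a rough sketch of what I have in mind on the Python side. It is only
an illustration: the names scan_dumped_file, scanner_argv and log_path are
made up for this example, and the scanner command is whatever external tool
gets plugged in. The point is to start the scanner as soon as a file shows up,
not wait for it, and append its output to a logfile that can be watched for
alerts.

```python
import subprocess
import sys


def scan_dumped_file(path, scanner_argv, log_path):
    """Start an external scanner on `path` without waiting for it to finish.

    scanner_argv is the scanner command as a list (hypothetical; for
    example the interpreter plus a scanner script). The scanner's stdout
    and stderr are appended to the file at log_path.
    """
    log = open(log_path, "ab")
    try:
        # Passing a list (no shell) means the filename needs no shell escaping.
        proc = subprocess.Popen(
            scanner_argv + [path],
            stdout=log,
            stderr=subprocess.STDOUT,
        )
    finally:
        # The child process holds its own copy of the descriptor.
        log.close()
    # The caller can poll() or wait() on this later to collect the exit code.
    return proc
```

Presumably this would be called from inside an event handler like the
remote_check_URL one in Jim's code, with Bro sending the dumped filename in
the event. Since nothing here waits on the scanner, the event loop keeps
running while the scan is in progress.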

In your code below, what, if anything, are you asking Bro to do with the
returned value?

           # If the category signals a block
           bro_conn.send("stomper_block",seqno)

>     return
>
> #Main program - Initialize and call event loop
>
> # Setup the connection to bro
> bro_conn = broccoli.Connection("127.0.0.1:47758")
>
> # Event loop
> bro_event_loop(bro_conn)


Will

On Mon, Mar 21, 2011 at 4:05 PM, Jim Mellander <jmellander at lbl.gov> wrote:

> Hi Will:
>
> Seems like you would probably want to use the python broccoli bindings
> to send an event to a python client, here's what I'm doing with my
> "stomper" code, which looks up urls on the fly in a malware database:
>
> # In your bro startup script
> @load listen-clear
>
> redef Remote::destinations += {
>        ["remote_stomper"] = [ $host=127.0.0.1, $events = /remote_check_URL/,
>                               $connect=F, $ssl=F ]
> ...
>
> #within bro policy
>
> # Here we send to the broccoli client for checking/processing
> event remote_check_URL(++stomper_seqno, c, is_orig, host, uri, ts);
>
>
> .....................
>
> On the python side, the relevant sections from the python code, which
> is running as a daemon accepting events from bro and acting on them:
>
> #! /usr/bin/env python
> #
>
> import broccoli
> import sqlite3
> import random
> import sys
> import re
> import select   # for select loop
> import time     # for the polling fallback
>
>
> # Bro event loop
> def bro_event_loop(bro_conn):
>    try:
>        bro_conn_fd=bro_conn_get_fd(bro_conn)
>        while True:
>            # select() needs sequences; (fd) is just an int, so use lists
>            select.select([bro_conn_fd],[],[bro_conn_fd])
>            bro_conn.processInput()
>    except Exception:
>        # Fall back to polling if the select loop fails
>        while True:
>            bro_conn.processInput()
>            time.sleep(.1)
>
> @broccoli.event
> def remote_check_URL(seqno, host, uri):
>    # Receive a URL from bro, and send a return signal back
>    #  if it should be blocked.
>    category = check_database(host,uri)
>    if category:
>        if check_category(category):
>            # If the category signals a block
>            bro_conn.send("stomper_block",seqno)
>    return
>
> #Main program - Initialize and call event loop
>
> # Setup the connection to bro
> bro_conn = broccoli.Connection("127.0.0.1:47758")
>
> # Event loop
> bro_event_loop(bro_conn)
> # Everything under this is never executed.
> sys.exit(0)
>
>
>
> Hope this will help you kick the can down the road a bit....
>
>
>
>
>
> On Mon, Mar 21, 2011 at 12:44 PM, Will <baxterw3232 at gmail.com> wrote:
> >
> >
> > On Mon, Mar 21, 2011 at 2:49 PM, Seth Hall <seth at icir.org> wrote:
> >>
> >> On Mar 21, 2011, at 2:16 PM, Will wrote:
> >>
> >> > I will without a doubt eventually incorporate
> >> > "http-ext-identified-files.sig" instead of what I am currently using, but I
> >> > am having trouble determining where to integrate the logic for handling each
> >> > file type. As it currently works, I am saving off every pdf and word doc,
> >> > which would be unnecessary if I used bro to call the external tools and
> >> > evaluate the results.
> >>
> >> That won't actually work quite right.  The http-ext-identified-files.sig
> >> file uses special signature keywords that the http analyzer provides to
> >> detect file types.  It's not directly applicable to SMTP/MIME transfers.
> >>
> > Understandable. Since there are so many different types, it would be
> > worthwhile to create a signature file for SMTP/MIME. I would be happy
> > to share it when I get it done.
> >
> >>
> >> > Current logic (this method calls for the external tools to be run
> >> > against the directory by cron and are independent of Bro):
> >> >         hot_attachment_dump_fh = open( hot_attachment_dumpname );
> >> >         write_file(hot_attachment_dump_fh, data);
> >> >         close(hot_attachment_dump_fh);
> >>
> >> In what event are you currently running this code?
> >
> > Here is the entire event:
> >
> > event mime_entity_data(c: connection, length: count, data: string)
> >        {
> >        local session = get_session(c, T);
> >
> >        #md5 hashing is now a builtin function, so just call it and dump the
> >        #result into the content_hash field.
> >        #that field in the info struct was already there, just had to add
> >        #this to fill it.
> >        session$content_hash = md5_hash(data);
> >
> >        #log the first 256 bytes of the attachment and the MD5 hash.
> >        mime_log_msg(session, "data", fmt("%d: %s", length, sub_bytes(data, 0, 256)));
> >        mime_log_msg(session, "all data", fmt("MD5: %s", session$content_hash));
> >
> >        #if the hot flag is set then we dump the MIME-decoded attachment to
> >        #its own file for analysis
> >        if( session$entity_is_hot )
> >         {
> >         if ( session$entity_filename == hot_pdf_attachment_filenames )
> >              {
> >              #build the filename out of MD5, length and filename
> >              hot_attachment_dumpname = fmt("dumped_pdf_files\/%s:%d:%s",
> >                  session$content_hash, length, session$entity_filename);
> >              }
> >         if ( session$entity_filename == hot_word_attachment_filenames )
> >              {
> >              hot_attachment_dumpname = fmt("dumped_doc_files\/%s:%d:%s",
> >                  session$content_hash, length, session$entity_filename);
> >              }
> >
> >         #get a raw filehandle, notice open() instead of open_log_file(),
> >         #write the data out, and be sure to close the fh
> >         hot_attachment_dump_fh = open( hot_attachment_dumpname );
> >         write_file(hot_attachment_dump_fh, data);
> >         close(hot_attachment_dump_fh);
> >
> >         #log stuff to the hot logfile as well
> >         mime_log_hot_msg(session, "hot data", fmt("%d: %s", length,
> >             sub_bytes(data, 0, 256)));
> >         mime_log_hot_msg(session, "hot data", fmt("File dumped: %s MD5: %s",
> >             session$entity_filename, session$content_hash));
> >         }
> >        }
> >
> > I attached the modified mime.bro in case anyone wants to see the rest
> > of it.
> >
> >> > The scan for office docs would be similar, but use 'OfficeMalScanner'
> >> > instead of pdfid.py and pdf-parser.py. If I get this to work, I would like
> >> > to do something very similar with http files.
> >>
> >> Makes sense.
> >>
> >> > How can I call the external tools?  Is this the right place to be doing
> >> > this?
> >>
> >> You can't currently do this in a way that would be feasible on live
> >> traffic.  The problem is that the call to the external tool would block Bro
> >> and cause it to start dropping packets.  There is a "when" statement that
> >> can help build asynchronous function calls though, so that the stack state
> >> will be saved and used again when the function call returns.  I don't know
> >> if the system() function (I think this is what you're looking for to run
> >> external programs) can be used with the when statement though.
> >
> > I suppose the short answer is yes. I was looking for something like the
> > system() call, for instance by modifying the PyBroccoli example below:
> > PyBroccoli Example:
> > @event
> > def pong(src_time, dst_time):
> >     print "pong event: time=%f/%f s" % \
> >        (dst_time - src_time, current_time() - src_time)
> > bc = Connection("127.0.0.1:47758")
> > bc.send("ping", time(current_time()))
> >
> > To:
> >
> > import os
> > @event   # fired when a pdf file has been dumped
> > def pass_pdf(file):
> >       os.system("pdf_scan.py -f dumped_file.pdf > tempdir")
> >
> > With what you mentioned taken into account, we can't ask bro to wait on the
> > results, but maybe we could dump the results to a logfile for alerting?
> >
> >>
> >> If you are looking to run this on tracefiles for now though, you can
> >> certainly just use the system function to call your external tool.  It takes
> >> a single argument (a string) that is the command line you'd like to run.
> >> There is a function for defanging data if you need to do that too (taking
> >> something off the line and using it in the command line) named
> >> str_shell_escape.  You do need to make sure that the data that is defanged
> >> with str_shell_escape is placed within double-quotes.
> >>
> >> > I would be surprised if this capability doesn't already exist and
> >> > suppose I might be going about this all wrong. I would just prefer to
> >> > incorporate the file scans in Bro vice running them completely
> >> > independently. If I wasn't clear or am completely out in left field feel
> >> > free to be honest. I won't be offended.
> >>
> >> Nope, not out in left field at all and personally I'm a bit ashamed I
> >> never wrote a mime-ext.bro script that was a bit more capable like the
> >> http-ext script.  I'm going to be rewriting the mime.bro script for the next
> >> release though and it will definitely have file extraction and
> >> identification capabilities built into it.  However, we are going to be
> >> working toward a much more generalized notion of files for some future
> >> release of Bro.  I've worked a bit on how that may proceed, but
> >> unfortunately we definitely won't be anywhere close to ready with that for
> >> the next release.
> >
> >
> > <sarcasm>
> > Maybe you should charge "more" for Bro...
> > </sarcasm>
> >
> > No, you all are doing a great job on this project. I just wish I could do
> > more to help.
> >
> >>
> >>  .Seth
> >>
> >> --
> >> Seth Hall
> >> International Computer Science Institute
> >> (Bro) because everyone has a network
> >> http://www.bro-ids.org/
> >>
> >
> > Will
> >