[Bro] File Scanning Capability

Will baxterw3232 at gmail.com
Mon Mar 21 14:09:35 PDT 2011


Understood. Thanks again for the info.

Will

On Mon, Mar 21, 2011 at 5:03 PM, Jim Mellander <jmellander at lbl.gov> wrote:

> Hi Will,
>
> When bro receives the event, it will raise a notice that will execute
> a custom host-pair-drop-connectivity script that drops the
> source/destination host pair for a short period to interrupt the
> connection in realtime.
>
> seqno is used by bro to keep track of which request it sent, so that
> the event can identify the request that was made.  This is in a table
> whose entries expire rapidly (the timeout > the expected response time
> of the python program)
>
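The expiring seqno table described above can be sketched in Python. This only illustrates the idea and is not Jim's bro policy code: the `PendingRequests` class, its method names, and the injectable `clock` are all hypothetical names chosen for this sketch.

```python
import time


class PendingRequests:
    """Map sequence numbers to request context; entries expire after `timeout` seconds."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock      # injectable so expiry can be exercised without sleeping
        self.entries = {}       # seqno -> (deadline, context)
        self.seqno = 0

    def add(self, context):
        """Record an outgoing request and return its new sequence number."""
        self._expire()
        self.seqno += 1
        self.entries[self.seqno] = (self.clock() + self.timeout, context)
        return self.seqno

    def pop(self, seqno):
        """Return the context for a reply, or None if unknown or expired."""
        self._expire()
        entry = self.entries.pop(seqno, None)
        return entry[1] if entry else None

    def _expire(self):
        """Drop every entry whose deadline has passed."""
        now = self.clock()
        expired = [k for k, (deadline, _) in self.entries.items() if deadline <= now]
        for key in expired:
            del self.entries[key]
```

The timeout should exceed the expected response time of the client, as noted above, so that a late reply is ignored rather than matched to the wrong request.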
> BTW:
>
> I believe there was a bug in my earlier code (I put it down half-baked a
> while ago, and haven't picked it up in a while) - the broccoli event
> should have the same number of arguments as the bro event that sends
> it, and vice versa.
>
>
>
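The argument-count rule can be illustrated without broccoli itself. In the sketch below, the `event` decorator and `dispatch` function are hypothetical stand-ins written for this example (they are not the broccoli API); the handler takes six parameters to match the six values passed by the bro-side `remote_check_URL` event.

```python
import inspect

HANDLERS = {}   # event name -> handler function


def event(func):
    """Hypothetical stand-in for a broccoli-style event decorator."""
    HANDLERS[func.__name__] = func
    return func


def dispatch(name, *args):
    """Call the handler only if the sender's argument count matches."""
    handler = HANDLERS[name]
    expected = len(inspect.signature(handler).parameters)
    if len(args) != expected:
        raise TypeError("%s: sender passed %d args, handler takes %d"
                        % (name, len(args), expected))
    return handler(*args)


@event
def remote_check_URL(seqno, c, is_orig, host, uri, ts):
    # Six parameters, matching the six values the bro side sends.
    return (seqno, host, uri)
```

Calling `dispatch("remote_check_URL", 1, "example.com", "/x")` with only three values raises `TypeError` in this sketch, which is the kind of mismatch described above.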
> On Mon, Mar 21, 2011 at 1:34 PM, Will <baxterw3232 at gmail.com> wrote:
> > Thanks for that example Jim!
> >
> > That gives me a bunch of other ideas. The best thing about using this
> > method would be near real-time scanning and notifications vice running
> > a cron'd script at a given interval.
> >
> > In your code below, what are you asking bro to do, if anything with the
> > returned value?
> >
> >            # If the category signals a block
> >            bro_conn.send("stomper_block",seqno)
> >
> > Will
> >
> > On Mon, Mar 21, 2011 at 4:05 PM, Jim Mellander <jmellander at lbl.gov> wrote:
> >>
> >> Hi Will:
> >>
> >> Seems like you would probably want to use the python broccoli bindings
> >> to send an event to a python client. Here's what I'm doing with my
> >> "stomper" code, which looks up urls on the fly in a malware database:
> >>
> >> # In your bro startup script
> >> @load listen-clear
> >>
> >> redef Remote::destinations += {
> >>        ["remote_stomper"] = [ $host=127.0.0.1, $events=/remote_check_URL/,
> >>                               $connect=F, $ssl=F ]
> >> ...
> >>
> >> #within bro policy
> >>
> >> # Here we send to the broccoli client for checking/processing
> >> event remote_check_URL(++stomper_seqno, c, is_orig, host, uri, ts);
> >>
> >>
> >> .....................
> >>
> >> On the python side, the relevant sections from the python code, which
> >> is running as a daemon accepting events from bro and acting on them:
> >>
> >> #! /usr/bin/env python
> >> #
> >>
> >> import broccoli
> >> import sqlite3
> >> import random
> >> import sys
> >> import re
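> >> import select   # for select loop
> >> import time     # for sleep in the polling fallback
> >>
> >>
> >> # Bro event loop
> >> def bro_event_loop(bro_conn):
> >>    try:
> >>        # bro_conn_get_fd is not defined in this excerpt; if the call
> >>        # fails, the except branch below falls back to polling
> >>        bro_conn_fd=bro_conn_get_fd(bro_conn)
> >>        while True:
> >>            # select() takes sequences of descriptors, not a bare fd;
> >>            # wait for readable data only
> >>            select.select([bro_conn_fd],[],[bro_conn_fd])
> >>            bro_conn.processInput()
> >>    except:
> >>        while True:
> >>            bro_conn.processInput()
> >>            time.sleep(.1)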
> >>
> >> @broccoli.event
> >> def remote_check_URL(seqno, host, uri):
> >>    # Receive a URL from bro, and send a return signal back
> >>    #  if it should be blocked.
> >>    category = check_database(host,uri)
> >>    if category:
> >>        if check_category(category):
> >>            # If the category signals a block
> >>            bro_conn.send("stomper_block",seqno)
> >>    return
> >>
> >> #Main program - Initialize and call event loop
> >>
> >> # Setup the connection to bro
> >> bro_conn = broccoli.Connection("127.0.0.1:47758")
> >>
> >> # Event loop
> >> bro_event_loop(bro_conn)
> >> # Everything under this is never executed.
> >> sys.exit(0)
> >>
> >>
> >>
> >> Hope this will help you kick the can down the road a bit....
> >>
> >>
> >>
> >>
> >>
> >> On Mon, Mar 21, 2011 at 12:44 PM, Will <baxterw3232 at gmail.com> wrote:
> >> >
> >> >
> >> > On Mon, Mar 21, 2011 at 2:49 PM, Seth Hall <seth at icir.org> wrote:
> >> >>
> >> >> On Mar 21, 2011, at 2:16 PM, Will wrote:
> >> >>
> >> >> > I will without a doubt eventually incorporate
> >> >> > "http-ext-identified-files.sig" instead of what I am currently
> >> >> > using, but I am having trouble determining where to integrate the
> >> >> > logic for handling each file type. As it currently works, I am
> >> >> > saving off every pdf and word doc, which would be unnecessary if I
> >> >> > used bro to call the external tools and evaluate the results.
> >> >>
> >> >> That won't actually work quite right.  The
> >> >> http-ext-identified-files.sig file uses special signature keywords
> >> >> that the http analyzer provides to detect file types.  It's not
> >> >> directly applicable to SMTP/MIME transfers.
> >> >>
> >> > Understandable. Since there are so many different types, it would be
> >> > worthwhile to create a signature file for SMTP/MIME. I would be happy
> >> > to share it when I get it done.
> >> >
> >> >>
> >> >> > Current logic (this method calls for the external tools to be run
> >> >> > against the directory by cron and are independent of Bro):
> >> >> >         hot_attachment_dump_fh = open( hot_attachment_dumpname );
> >> >> >         write_file(hot_attachment_dump_fh, data);
> >> >> >         close(hot_attachment_dump_fh);
> >> >>
> >> >> In what event are you currently running this code?
> >> >
> >> > Here is the entire event:
> >> >
> >> > event mime_entity_data(c: connection, length: count, data: string)
> >> >        {
> >> >        local session = get_session(c, T);
> >> >
> >> >        # md5 hashing is now a builtin function, so just call it and
> >> >        # dump the result into the content_hash field. That field in
> >> >        # the info struct was already there; this just fills it.
> >> >        session$content_hash = md5_hash(data);
> >> >
> >> >        # log the first 256 bytes of the attachment and the MD5 hash
> >> >        mime_log_msg(session, "data",
> >> >                fmt("%d: %s", length, sub_bytes(data, 0, 256)));
> >> >        mime_log_msg(session, "all data",
> >> >                fmt("MD5: %s", session$content_hash));
> >> >
> >> >        # if the hot flag is set then we dump the MIME-decoded
> >> >        # attachment to its own file for analysis
> >> >        if ( session$entity_is_hot )
> >> >         {
> >> >         if ( session$entity_filename == hot_pdf_attachment_filenames )
> >> >              {
> >> >              # build the filename out of MD5, length and filename
> >> >              hot_attachment_dumpname =
> >> >                fmt("dumped_pdf_files\/%s:%d:%s",
> >> >                    session$content_hash, length, session$entity_filename);
> >> >              }
> >> >         if ( session$entity_filename == hot_word_attachment_filenames )
> >> >              {
> >> >              hot_attachment_dumpname =
> >> >                fmt("dumped_doc_files\/%s:%d:%s",
> >> >                    session$content_hash, length, session$entity_filename);
> >> >              }
> >> >
> >> >         # get a raw filehandle (notice open() instead of
> >> >         # open_log_file()), write the data out, and be sure to
> >> >         # close the fh
> >> >         hot_attachment_dump_fh = open( hot_attachment_dumpname );
> >> >         write_file(hot_attachment_dump_fh, data);
> >> >         close(hot_attachment_dump_fh);
> >> >
> >> >         # log stuff to the hot logfile as well
> >> >         mime_log_hot_msg(session, "hot data",
> >> >                fmt("%d: %s", length, sub_bytes(data, 0, 256)));
> >> >         mime_log_hot_msg(session, "hot data",
> >> >                fmt("File dumped: %s MD5: %s",
> >> >                    session$entity_filename, session$content_hash));
> >> >         }
> >> >        }
> >> >
> >> > I attached the modified mime.bro in case anyone wants to see the
> >> > rest of it.
> >> >
> >> >> > The scan for office docs would be similar, but use
> >> >> > 'OfficeMalScanner' instead of pdfid.py and pdf-parser.py. If I get
> >> >> > this to work, I would like to do something very similar with http
> >> >> > files.
> >> >>
> >> >> Makes sense.
> >> >>
> >> >> > How can I call the external tools?  Is this the right place to be
> >> >> > doing this?
> >> >>
> >> >> You can't currently do this in a way that would be feasible on live
> >> >> traffic.  The problem is that the call to the external tool would
> >> >> block Bro and cause it to start dropping packets.  There is a "when"
> >> >> statement that can help build asynchronous function calls, though:
> >> >> the stack state is saved and used again when the function call
> >> >> returns.  I don't know if the system() function (I think this is what
> >> >> you're looking for to run external programs) can be used with the
> >> >> when statement, though.
> >> >
> >> > I suppose the short answer is yes. I was looking for something like
> >> > the system() call. Like modifying the PyBroccoli Example from below:
> >> > PyBroccoli Example:
> >> > @event
> >> > def pong(src_time, dst_time):
> >> >     print "pong event: time=%f/%f s" % \
> >> >        (dst_time - src_time, current_time() - src_time)
> >> > bc = Connection("127.0.0.1:47758")
> >> > bc.send("ping", time(current_time()))
> >> >
> >> > To:
> >> >
> >> > @event (event == dumped pdf file)
> >> > def pass_pdf(file):
> >> >       system(pdf_scan.py -f dumped_file.pdf > tempdir)
> >> >
> >> > With what you mentioned taken into account, we can't ask bro to wait
> >> > on the results, but maybe we could dump the results to a logfile for
> >> > alerting?
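One way to avoid waiting on the scanner is to start it as a child process and collect the result later from the event loop, appending it to a logfile. The sketch below is a minimal illustration under stated assumptions: `ScanJobs`, `start`, and `reap` are hypothetical names, the scanner command line is supplied by the caller, and the scanner's output is assumed to be short enough to fit in a pipe buffer.

```python
import subprocess
import sys


class ScanJobs:
    """Run external scanners in the background and log their results."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.running = []   # list of (path, Popen) for scans still in flight

    def start(self, argv, path):
        """Launch the scanner on `path` without waiting for it to finish."""
        proc = subprocess.Popen(argv + [path], stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT, text=True)
        self.running.append((path, proc))

    def reap(self):
        """Log finished scans and return how many completed.

        Intended to be called periodically from the event loop.
        Assumes short scanner output; large output would need a
        temporary file instead of a pipe.
        """
        still_running = []
        finished = 0
        with open(self.log_path, "a") as log:
            for path, proc in self.running:
                if proc.poll() is None:
                    still_running.append((path, proc))
                    continue
                output = proc.stdout.read().strip()
                proc.stdout.close()
                log.write("%s rc=%d %s\n" % (path, proc.returncode, output))
                finished += 1
        self.running = still_running
        return finished
```

A polling loop like the fallback branch in the event loop quoted further down could call `reap()` on each iteration, so the scan never blocks event processing.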
> >> >
> >> >>
> >> >> If you are looking to run this on tracefiles for now though, you can
> >> >> certainly just use the system function to call your external tool.
> >> >> It takes a single argument (a string) that is the command line you'd
> >> >> like to run.  There is a function for defanging data if you need to
> >> >> do that too (taking something off the line and using it in the
> >> >> command line) named str_shell_escape.  You do need to make sure that
> >> >> the data that is defanged with str_shell_escape is placed within
> >> >> double-quotes.
> >> >>
> >> >> > I would be surprised if this capability doesn't already exist and
> >> >> > suppose I might be going about this all wrong. I would just prefer
> >> >> > to incorporate the file scans in Bro vice running them completely
> >> >> > independently. If I wasn't clear or am completely out in left field
> >> >> > feel free to be honest. I won't be offended.
> >> >>
> >> >> Nope, not out in left field at all and personally I'm a bit ashamed I
> >> >> never wrote a mime-ext.bro script that was a bit more capable like
> >> >> the http-ext script.  I'm going to be rewriting the mime.bro script
> >> >> for the next release though and it will definitely have file
> >> >> extraction and identification capabilities built into it.  However,
> >> >> we are going to be working toward a much more generalized notion of
> >> >> files for some future release of Bro.  I've worked a bit on how that
> >> >> may proceed, but unfortunately we definitely won't be anywhere close
> >> >> to ready with that for the next release.
> >> >
> >> >
> >> > <sarcasm>
> >> > Maybe you should charge "more" for Bro...
> >> > </sarcasm>
> >> >
> >> > No, you all are doing a great job on this project. I just wish I could
> >> > do more to help.
> >> >
> >> >>
> >> >>  .Seth
> >> >>
> >> >> --
> >> >> Seth Hall
> >> >> International Computer Science Institute
> >> >> (Bro) because everyone has a network
> >> >> http://www.bro-ids.org/
> >> >>
> >> >
> >> > Will
> >> >
> >> > _______________________________________________
> >> > Bro mailing list
> >> > bro at bro-ids.org
> >> > http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/bro
> >> >
> >
> >
>