[Bro] File Scanning Capability

Jim Mellander jmellander at lbl.gov
Mon Mar 21 14:03:24 PDT 2011


Hi Will,

When bro receives the stomper_block event, it raises a notice that
executes a custom host-pair-drop-connectivity script, which drops the
source/destination host pair for a short period to interrupt the
connection in real time.

seqno is used by bro to keep track of which request it sent, so that
the returned event can identify the request that was made.  The seqno
is kept in a table whose entries expire rapidly (the timeout only needs
to exceed the expected response time of the python program).
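
A minimal sketch of that bookkeeping, written in Python purely as an
analogy (it is not the Bro policy code, which is not shown in this
thread).  The class name PendingRequests, its methods, and the
injectable clock are all hypothetical names chosen for illustration:

```python
import time


class PendingRequests:
    """Map sequence numbers to request context, forgetting stale entries."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout   # seconds an entry stays valid
        self.clock = clock       # injectable so the sketch can be tested
        self.entries = {}        # seqno -> (created_at, context)
        self.seqno = 0

    def add(self, context):
        """Record an outgoing request and return its sequence number."""
        self.seqno += 1
        self.entries[self.seqno] = (self.clock(), context)
        return self.seqno

    def expire(self):
        """Drop every entry older than the timeout."""
        now = self.clock()
        stale = [s for s, (created, _) in self.entries.items()
                 if now - created > self.timeout]
        for s in stale:
            del self.entries[s]

    def claim(self, seqno):
        """Return the context of an answered request, or None if it is
        unknown or has already expired."""
        self.expire()
        _, context = self.entries.pop(seqno, (None, None))
        return context
```

The timeout only has to be longer than the slowest expected answer; a
reply that arrives after its entry has expired is simply ignored.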

BTW:

I believe there was a bug in my earlier code, quoted below (I put it
down half-baked a while ago, and haven't picked it up in a while) - the
broccoli event should have the same number of arguments as the bro
event that sends it, and vice versa.
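
A small sketch of what matching argument counts looks like on the
python side.  The argument names come from the bro-side call quoted
below; the @broccoli.event decorator is left out so the sketch runs
without broccoli installed, and the helper arity_matches is a
hypothetical name.  How the bindings represent the connection argument
c is not covered in this thread, so the handler just ignores it:

```python
import inspect

# Argument names taken from the Bro-side call quoted in this thread:
#   event remote_check_URL(++stomper_seqno, c, is_orig, host, uri, ts);
BRO_SIDE_ARGS = ("seqno", "c", "is_orig", "host", "uri", "ts")


def remote_check_URL(seqno, c, is_orig, host, uri, ts):
    """Handler with one parameter per argument the Bro side sends."""
    return seqno, host, uri


def arity_matches(handler, expected_args):
    """True if handler takes exactly as many parameters as Bro sends."""
    return len(inspect.signature(handler).parameters) == len(expected_args)
```

A check like arity_matches(remote_check_URL, BRO_SIDE_ARGS) at startup
would catch the mismatch in the three-argument handler quoted below.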



On Mon, Mar 21, 2011 at 1:34 PM, Will <baxterw3232 at gmail.com> wrote:
> Thanks for that example Jim!
>
> That gives me a bunch of other ideas. The best thing about using this method
> would be near real-time scanning and notifications vice running a cron'd
> script at a given interval.
>
> In your code below, what are you asking bro to do, if anything, with the
> returned value?
>
>            # If the category signals a block
>            bro_conn.send("stomper_block",seqno)
>>
>>     return
>>
>> #Main program - Initialize and call event loop
>>
>> # Setup the connection to bro
>> bro_conn = broccoli.Connection("127.0.0.1:47758")
>>
>> # Event loop
>> bro_event_loop(bro_conn)
>
> Will
>
> On Mon, Mar 21, 2011 at 4:05 PM, Jim Mellander <jmellander at lbl.gov> wrote:
>>
>> Hi Will:
>>
>> Seems like you would probably want to use the python broccoli bindings
>> to send an event to a python client.  Here's what I'm doing with my
>> "stomper" code, which looks up urls on the fly in a malware database:
>>
>> # In your bro startup script
>> @load listen-clear
>>
>> redef Remote::destinations += {
>>        ["remote_stomper"] = [ $host=127.0.0.1, $events = /remote_check_URL/,
>>                               $connect=F, $ssl=F ]
>> ...
>>
>> #within bro policy
>>
>> # Here we send to the broccoli client for checking/processing
>> event remote_check_URL(++stomper_seqno, c, is_orig, host, uri, ts);
>>
>>
>> .....................
>>
>> On the python side, the relevant sections from the python code, which
>> is running as a daemon accepting events from bro and acting on them:
>>
>> #! /usr/bin/env python
>> #
>>
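>> import broccoli
>> import sqlite3
>> import random
>> import sys
>> import re
>> import select   # for select loop
>> import time     # for the polling fallback
>>
>>
>> # Bro event loop
>> def bro_event_loop(bro_conn):
>>    try:
>>        # bro_conn_get_fd is assumed to be defined elsewhere in the full script
>>        bro_conn_fd=bro_conn_get_fd(bro_conn)
>>        while True:
>>            # select() takes sequences; (fd) without a comma is not a tuple.
>>            # Wait for readability only: a socket is nearly always
>>            # writable, which would turn this into a busy loop.
>>            select.select([bro_conn_fd],[],[bro_conn_fd])
>>            bro_conn.processInput()
>>    except:
>>        # Fall back to polling if the descriptor cannot be used
>>        while True:
>>            bro_conn.processInput()
>>            time.sleep(.1)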
>>
>> @broccoli.event
>> def remote_check_URL(seqno, host, uri):
>>    # Receive a URL from bro, and send a return signal back
>>    #  if it should be blocked.
>>    category = check_database(host,uri)
>>    if category:
>>        if check_category(category):
>>            # If the category signals a block
>>            bro_conn.send("stomper_block",seqno)
>>    return
>>
>> #Main program - Initialize and call event loop
>>
>> # Setup the connection to bro
>> bro_conn = broccoli.Connection("127.0.0.1:47758")
>>
>> # Event loop
>> bro_event_loop(bro_conn)
>> # Everything under this is never executed.
>> sys.exit(0)
>>
>>
>>
>> Hope this will help you kick the can down the road a bit....
>>
>>
>>
>>
>>
>> On Mon, Mar 21, 2011 at 12:44 PM, Will <baxterw3232 at gmail.com> wrote:
>> >
>> >
>> > On Mon, Mar 21, 2011 at 2:49 PM, Seth Hall <seth at icir.org> wrote:
>> >>
>> >> On Mar 21, 2011, at 2:16 PM, Will wrote:
>> >>
>> >> > I will without a doubt eventually incorporate
>> >> > "http-ext-identified-files.sig" instead of what I am currently using,
>> >> > but I
>> >> > am having trouble determining where to integrate the logic for
>> >> > handling each
>> >> > file type. As it currently works, I am saving off every pdf and word
>> >> > doc,
>> >> > which would be unnecessary if I used bro to call the external tools
>> >> > and
>> >> > evaluate the results.
>> >>
>> >> That won't actually work quite right.  The
>> >> http-ext-identified-files.sig file uses special signature keywords
>> >> that the http analyzer provides to detect file types.  It's not
>> >> directly applicable to SMTP/MIME transfers.
>> >>
>> > Understandable. Since there are so many different file types, it would
>> > be worthwhile to create a signature file for SMTP/MIME. I would be
>> > happy to share it when I get it done.
>> >
>> >>
>> >> > Current logic (this method calls for the external tools to be run
>> >> > against the directory by cron and are independent of Bro):
>> >> >         hot_attachment_dump_fh = open( hot_attachment_dumpname );
>> >> >         write_file(hot_attachment_dump_fh, data);
>> >> >         close(hot_attachment_dump_fh);
>> >>
>> >> In what event are you currently running this code?
>> >
>> > Here is the entire event:
>> >
>> > event mime_entity_data(c: connection, length: count, data: string)
>> >        {
>> >        local session = get_session(c, T);
>> >
>> >        # md5 hashing is now a builtin function, so just call it and dump
>> >        # the result into the content_hash field. That field in the info
>> >        # struct was already there, just had to add this to fill it.
>> >        session$content_hash = md5_hash(data);
>> >
>> >        # log the first 256 bytes of the attachment and the MD5 hash.
>> >        mime_log_msg(session, "data",
>> >                     fmt("%d: %s", length, sub_bytes(data, 0, 256)));
>> >        mime_log_msg(session, "all data",
>> >                     fmt("MD5: %s", session$content_hash));
>> >
>> >        # if the hot flag is set then we dump the MIME-decoded attachment
>> >        # to its own file for analysis
>> >        if ( session$entity_is_hot )
>> >         {
>> >         if ( session$entity_filename == hot_pdf_attachment_filenames )
>> >              {
>> >              # build the filename out of MD5, length and filename
>> >              hot_attachment_dumpname = fmt("dumped_pdf_files\/%s:%d:%s",
>> >                   session$content_hash, length, session$entity_filename);
>> >              }
>> >         if ( session$entity_filename == hot_word_attachment_filenames )
>> >              {
>> >              hot_attachment_dumpname = fmt("dumped_doc_files\/%s:%d:%s",
>> >                   session$content_hash, length, session$entity_filename);
>> >              }
>> >
>> >         # get a raw filehandle (notice open() instead of open_log_file()),
>> >         # write the data out, and be sure to close the fh
>> >         hot_attachment_dump_fh = open( hot_attachment_dumpname );
>> >         write_file(hot_attachment_dump_fh, data);
>> >         close(hot_attachment_dump_fh);
>> >
>> >         # log stuff to the hot logfile as well
>> >         mime_log_hot_msg(session, "hot data",
>> >                          fmt("%d: %s", length, sub_bytes(data, 0, 256)));
>> >         mime_log_hot_msg(session, "hot data",
>> >                          fmt("File dumped: %s MD5: %s",
>> >                              session$entity_filename, session$content_hash));
>> >         }
>> >        }
>> >
>> > I attached the modified mime.bro in case anyone wants to see the rest
>> > of it.
>> >
>> >> > The scan for office docs would be similar, but use
>> >> > 'OfficeMalScanner' instead of pdfid.py and pdf-parser.py. If I get
>> >> > this to work, I would like to do something very similar with http
>> >> > files.
>> >>
>> >> Makes sense.
>> >>
>> >> > How can I call the external tools?  Is this the right place to be
>> >> > doing
>> >> > this?
>> >>
>> >> You can't currently do this in a way that would be feasible on live
>> >> traffic.  The problem is that the call to the external tool would
>> >> block Bro and cause it to start dropping packets.  There is a "when"
>> >> statement that can help build asynchronous function calls, though:
>> >> the stack state is saved and used again when the function call
>> >> returns.  I don't know if the system() function (I think this is what
>> >> you're looking for to run external programs) can be used with the
>> >> when statement, though.
>> >
>> > I suppose the short answer is yes. I was looking for something like the
>> > system() call. Like modifying the PyBroccoli Example from below:
>> > PyBroccoli Example:
>> > @event
>> > def pong(src_time, dst_time):
>> >     print "pong event: time=%f/%f s" % \
>> >        (dst_time - src_time, current_time() - src_time)
>> > bc = Connection("127.0.0.1:47758")
>> > bc.send("ping", time(current_time()))
>> >
>> > To:
>> >
>> > @event (event == dumped pdf file)
>> > def pass_pdf(file):
>> >       system(pdf_scan.py -f dumped_file.pdf > tempdir)
>> >
>> > With what you mentioned taken into account, we can't ask bro to wait on
>> > the
>> > results, but maybe we could dump the results to a logfile for alerting?
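
A minimal sketch of the logfile idea in Will's pseudocode above, run
from the python client rather than from bro itself.  The function name
start_scan and the log path are hypothetical; pdf_scan.py and its -f
flag are simply carried over from the pseudocode.  The scanner is
started without waiting for it, so the event loop is not blocked, and
its output is appended to a logfile that something else can alert on:

```python
import subprocess


def start_scan(scanner_cmd, dumped_file, log_path):
    """Launch the scanner without waiting; append its output to log_path.

    scanner_cmd is a list such as ["pdf_scan.py", "-f"] (hypothetical).
    """
    log = open(log_path, "ab")
    try:
        # Passing an argument list (no shell) means the file name taken
        # off the wire is never interpreted by a shell.
        proc = subprocess.Popen(list(scanner_cmd) + [dumped_file],
                                stdout=log, stderr=subprocess.STDOUT)
    finally:
        # The child holds its own copy of the descriptor.
        log.close()
    return proc
```

A handler would call something like
start_scan(["pdf_scan.py", "-f"], "dumped_file.pdf", "scan.log") and
return right away; the returned process object can be polled later to
reap the child.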
>> >
>> >>
>> >> If you are looking to run this on tracefiles for now though, you can
>> >> certainly just use the system function to call your external tool.  It
>> >> takes
>> >> a single argument (a string) that is the command line you'd like to
>> >> run.
>> >>  There is a function for defanging data if you need to do that too
>> >> (taking
>> >> something off the line and using it in the command line) named
>> >> str_shell_escape.  You do need to make sure that the data that is
>> >> defanged
>> >> with str_shell_escape is placed within double-quotes.
>> >>
>> >> > I would be surprised if this capability doesn't already exist and
>> >> > suppose I might be going about this all wrong. I would just prefer to
>> >> > incorporate the file scans in Bro vice running them completely
>> >> > independently. If I wasn't clear or am completely out in left field
>> >> > feel
>> >> > free to be honest. I won't be offended.
>> >>
>> >> Nope, not out in left field at all and personally I'm a bit ashamed I
>> >> never wrote a mime-ext.bro script that was a bit more capable like the
>> >> http-ext script.  I'm going to be rewriting the mime.bro script for the
>> >> next
>> >> release though and it will definitely have file extraction and
>> >> identification capabilities built into it.  However, we are going to be
>> >> working toward a much more generalized notion of files for some future
>> >> release of Bro.  I've worked a bit on how that may proceed, but
>> >> unfortunately we definitely won't be anywhere close to ready with that
>> >> for
>> >> the next release.
>> >
>> >
>> > <sarcasm>
>> > Maybe you should charge "more" for Bro...
>> > </sarcasm>
>> >
>> > No, you all are doing a great job on this project. I just wish I could
>> > do
>> > more to help.
>> >
>> >>
>> >>  .Seth
>> >>
>> >> --
>> >> Seth Hall
>> >> International Computer Science Institute
>> >> (Bro) because everyone has a network
>> >> http://www.bro-ids.org/
>> >>
>> >
>> > Will
>> >
>> > _______________________________________________
>> > Bro mailing list
>> > bro at bro-ids.org
>> > http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/bro
>> >
>
>



