From dopheide at ncsa.illinois.edu Fri Mar 4 14:29:40 2011
From: dopheide at ncsa.illinois.edu (Dop)
Date: Fri, 04 Mar 2011 16:29:40 -0600
Subject: [Bro] multiple workers per cluster node
Message-ID:

Hopefully quick question. How would you go about configuring Bro cluster nodes to each run dual clients (one per input interface)?

I.e., all of my systems have input sources on eth4 and eth5. Instead of bonding those together and running a single Bro thread on bond0, I'd rather have two. Something is getting super confused when I try to do it:

For each worker I have this:

    [nids-21a]
    type=worker
    host=10.142.148.21
    interface=eth4

    [nids-21b]
    type=worker
    host=10.142.148.21
    interface=eth5

    [BroControl] > start
    starting manager ...
    starting proxy-1 ...
    starting nids-21a ...
    starting nids-21b ...
    starting nids-22a ...
    starting nids-22b ...
    starting nids-23a ...
    starting nids-23b ...
    starting nids-24a ...
    starting nids-24b ...
    (nids-22a still initializing)
    (nids-21b still initializing)
    (nids-23b still initializing)
    (nids-21a still initializing)

What's strange is that it seems to fail unevenly. Fails totally on 21, partially on 22 and 23, but works on 24. It's always the same nodes failing.

Thanks,
-Dop

From jones at tacc.utexas.edu Fri Mar 4 15:52:23 2011
From: jones at tacc.utexas.edu (William Jones)
Date: Fri, 4 Mar 2011 17:52:23 -0600
Subject: [Bro] multiple workers per cluster node
In-Reply-To:
References:
Message-ID:

Instead of:

    [nids-21a]
    type=worker
    host=10.142.148.21
    interface=eth4

    [nids-21b]
    type=worker
    host=10.142.148.21
    interface=eth5

Try:

    [nids-21]
    type=worker
    host=10.142.148.21
    interface=eth4 -Ieth5

If your node has multiple cores, you can write a pcap filter to split the IP space into 2, 4, or 8 parts and run 2, 4, or 8 instances on the node. This setup allows one Bro instance to see both sides of the same flow and lets you take advantage of all the CPUs on a node.
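
A rough sketch of the filter-splitting idea, in Python. It only generates the filter strings, one per instance. The function name split_filters and the choice of bucketing on the last octet of each address are assumptions for illustration, not anything taken from BroControl, and how a filter gets attached to a given worker is left out because it depends on the setup. Adding the source and destination octets keeps the expression symmetric, so both directions of a flow fall into the same bucket.

```python
def split_filters(n):
    """Return n pcap filter strings that partition IPv4 traffic into n buckets."""
    if n not in (2, 4, 8):
        raise ValueError("n must be 2, 4, or 8")
    mask = n - 1
    # ip[15] is the last octet of the IPv4 source address and ip[19] the last
    # octet of the destination address. Their sum is the same in both
    # directions, so a request and its reply map to the same bucket.
    return ["ip and ((ip[15] + ip[19]) & %d) = %d" % (mask, k) for k in range(n)]


for expr in split_filters(4):
    print(expr)
```

These expressions are untested against live traffic and ignore non-IPv4 packets, so treat them as a starting point only.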
Bill Jones

_______________________________________________
Bro mailing list
bro at bro-ids.org
http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/bro

From JAzoff at uamail.albany.edu Fri Mar 4 16:50:23 2011
From: JAzoff at uamail.albany.edu (Justin Azoff)
Date: Fri, 4 Mar 2011 19:50:23 -0500
Subject: [Bro] multiple workers per cluster node
In-Reply-To:
References:
Message-ID: <20110305005023.GD3604@datacomm.albany.edu>

On Fri, Mar 04, 2011 at 05:29:40PM -0500, Dop wrote:
> Hopefully quick question. How would you go about configuring Bro cluster
> nodes to each run dual clients (one per input interface)?
> ...
> What's strange is that it seems to fail unevenly. Fails totally on 21,
> partially on 22 and 23, but works on 24. It's always the same nodes
> failing.

This should work fine, I run 4 workers on one machine without any issues.

It sounds like maybe you have some filesystem issues preventing bro from starting.

What do you have in /usr/local/bro/spool/ for each of the failing nodes? Is there anything in the stdout or stderr logs?

/usr/local/bro/spool/debug.log may also have useful info.

I would focus on the machine that it starts partially on.

--
-- Justin Azoff
-- Network Security & Performance Analyst

From seth at icir.org Fri Mar 4 17:27:26 2011
From: seth at icir.org (Seth Hall)
Date: Fri, 4 Mar 2011 20:27:26 -0500
Subject: [Bro] multiple workers per cluster node
In-Reply-To:
References:
Message-ID: <08FB7564-C708-44FF-B1A2-BE1E33064551@icir.org>

On Mar 4, 2011, at 5:29 PM, Dop wrote:

> What's strange is that it seems to fail unevenly. Fails totally on 21,
> partially on 22 and 23, but works on 24. It's always the same nodes
> failing.

Do you get a crash message when a worker fails?

.Seth

--
Seth Hall
International Computer Science Institute
(Bro) because everyone has a network
http://www.bro-ids.org/

From dopheide at ncsa.illinois.edu Fri Mar 4 20:50:50 2011
From: dopheide at ncsa.illinois.edu (Dop)
Date: Fri, 04 Mar 2011 22:50:50 -0600
Subject: [Bro] multiple workers per cluster node
In-Reply-To: <20110305005023.GD3604@datacomm.albany.edu>
Message-ID:

Thanks everyone for the replies and suggestions. Apparently I just forgot to run 'install' after changing the node config, which is embarrassing, but I still find it interesting that they all reacted differently.

For future reference, all of the instances that fail show:

    /usr/local/bro/share/bro/broctl/cluster-worker.remote.bro, line 14 (BroCtl::workers[WORKER]): run-time error, no such index
    /usr/local/bro/share/bro/broctl/cluster-worker.remote.bro, line 13 ($host=BroCtl::manager$ip, $p=BroCtl::manager$p, $events=Remote::manager_events, $connect=T, $sync=F, $retry=1.0 min, $class=BroCtl::workers[WORKER]$tag): run-time error, uninitialized list value
    /usr/local/bro/share/broctl/scripts/run-bro: line 73: 27140 Segmentation fault (core dumped) nohup $tmpbro $@

-Dop

From jones at tacc.utexas.edu Fri Mar 4 22:16:16 2011
From: jones at tacc.utexas.edu (William Jones)
Date: Sat, 5 Mar 2011 00:16:16 -0600
Subject: [Bro] multiple workers per cluster node
In-Reply-To:
References:
Message-ID:

Try naming the workers:

    [worker-1]
    [worker-2]
    ...

Bill Jones

From robin at icir.org Mon Mar 7 13:27:22 2011
From: robin at icir.org (Robin Sommer)
Date: Mon, 7 Mar 2011 13:27:22 -0800
Subject: [Bro] Bro 1.5.3 release now available
Message-ID: <20110307212722.GA54021@icir.org>

Bro release 1.5.3 is now available from

    ftp://bro-ids.org/bro-1.5.3.tar.gz

and

    http://www.bro-ids.org/download/bro-1.5.3.tar.gz

This version is a maintenance release with a few refinements and fixes, see below. The next major release will be 1.6, which we are actively working on.

Robin

--------- cut -------------------------------------------------------

1.5.3 Thu Mar 3 08:55:11 PST 2011

- Removing aux/broctl/policy/cluster-addrs.hot.bro from the
  distribution. The script is no longer needed and could in fact break
  an installation because it redefines an old variable that has gone
  away. (Robin Sommer)

- Smarter way to increase the communication module's pipe's socket
  buffer size, resulting in a value closer to the allowed maximum.
  (Craig Leres)

- BroControl now also maintains links from the log archive to the
  current set of logs when running in standalone mode. (Robin Sommer)

- Bug fix for a file descriptor leak in the remote communication
  module. (Scott Campbell)

- Bug fix for BroControl to now activate trace-summary's sampling in
  cluster mode, but not anymore in standalone mode. (Robin Sommer)

- Broccoli updates:

    * Accept empty strings ("") as values in the configuration file.
      (Craig Leres)

    * Support for specifying a separate host key for SSL-enabled
      operation, with documentation update. (Craig Leres)

From seth at icir.org Wed Mar 16 10:57:58 2011
From: seth at icir.org (Seth Hall)
Date: Wed, 16 Mar 2011 13:57:58 -0400
Subject: [Bro] question about printing timestamps
Message-ID: <5DDE5234-F543-4F5D-9763-14FB3FCC0893@icir.org>

I received a question privately about formatting timestamps in a human readable manner the other day and I thought I'd answer the question a bit more publicly.

To format "time" values as human readable, you can use either %D or %T in calls to fmt(). Like this:

    fmt("%D", network_time());

Hopefully this helps someone. We'll try to make sure that things like this will be documented for the next release.
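
Outside of Bro, for instance when post-processing a log that carries raw epoch timestamps, the same kind of conversion can be sketched in Python. The function name human_time and the layout string are assumptions chosen for illustration; this is not a claim about the exact text that %D or %T produce.

```python
from datetime import datetime, timezone


def human_time(epoch):
    """Render seconds since the Unix epoch as a readable UTC timestamp."""
    # The layout below is an illustrative choice, not Bro's own format.
    moment = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d-%H:%M:%S.%f")


print(human_time(1300297273.0))
```

The sketch pins the result to UTC so that it does not depend on the local timezone of the machine doing the conversion.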

.Seth

--
Seth Hall
International Computer Science Institute
(Bro) because everyone has a network
http://www.bro-ids.org/

From baxterw3232 at gmail.com Wed Mar 16 11:17:03 2011
From: baxterw3232 at gmail.com (Will)
Date: Wed, 16 Mar 2011 14:17:03 -0400
Subject: [Bro] Incorporating dns_reponse in dns_request
Message-ID:

Hello All,

Below is my event for dns_request in my site specific dns.bro policy. It currently creates a notice.log entry (and eventually an email alert) when any internal host does a look up for a domain in our hostile_domain_list.

Example: '172.x.x.x queried 'very.bad.org' @ 2011-03-16-12:41:13.560817003 (EST)'

The only thing missing from this is the returned IP address, if one was returned.

Current Function (with zone transfer logic removed for brevity):

    event dns_request(c: connection, msg: dns_msg, query: string, qtype: count, qclass: count)
        {
        local id = c$id;
        local orig = id$orig_h;
        local resp = id$resp_h;
        local session = lookup_DNS_session(c, msg$id);
        local anno = DNS_query_annotation(c, msg, query, qtype, F);
        local report = fmt("%.06f #%d %s", network_time(), session$id, c$id$orig_h);
        local q: string;

        if ( orig in okay_to_lookup_sensitive_hosts )
            return;

        if ( logging )
            print dns_log, fmt("%s", report);

        # Check to see if this is a host or MX lookup for a designated hostile domain.
        local subq = second_level_domain(query);
        if ( check_domain_list &&
             (query_types[qtype] == "A" || query_types[qtype] == "MX") &&
             subq in hostile_domain_list )
            {
            if( subq in hostile_domain_list[subq] ||
                third_level_domain(query) in hostile_domain_list[subq] )
                NOTICE([$note=SensitiveDNS_Lookup, $conn=c,
                        $msg=fmt("%s queried '%s' @ %T (EST)", id$orig_h, query, network_time())]);
            }

        session$pending_queries[msg$id] = anno;
        session$last_active = network_time();
        }

I have tried to incorporate code from some of the other functions like creating a local drr variable and passing that to the function, but haven't had any luck.
Something like this:

    function insert_name(c: connection, msg: dns_msg, ans: dns_answer, a: addr)
        {
        local drr: dns_response_record;

So, I guess my question is, is there a way to evaluate a DNS query along with its corresponding response and return an IP address in this same event?

I assume this may be a 'no' if each is handled completely independently.

Thanks for listening...err...reading!

Will

From hartley.87 at osu.edu Wed Mar 16 11:27:21 2011
From: hartley.87 at osu.edu (Hartley, Christopher J.)
Date: Wed, 16 Mar 2011 18:27:21 +0000
Subject: [Bro] question about printing timestamps
In-Reply-To: <5DDE5234-F543-4F5D-9763-14FB3FCC0893@icir.org>
References: <5DDE5234-F543-4F5D-9763-14FB3FCC0893@icir.org>
Message-ID: <9FCD6A42-E627-4C74-B387-BA9E6DD54A7E@osu.edu>

fwiw, an equivalent to strptime would be very helpful. I guess it gets into the philosophy of what Bro should do and what should be a part of a log management solution.

The case where this came up was in smtp Received: headers. I'd like Bro to determine whether a message is a retransmission based on comparing network_time() to the date in the header -- mind this is irritating because those headers can be more or less arbitrary ...

    Stamp = From-domain By-domain Opt-info ";" FWS date-time
          ; where "date-time" is as defined in [32]
          ; but the "obs-" forms, especially two-digit
          ; years, are prohibited in SMTP and MUST NOT be used.

That, from the RFC (2821), should make it pretty easy to find and strptime the format, although it may also need to be able to try several candidate formats....

The more I look at it, the more comfortable I am with it:

RFC 2822, 3.3. Date and Time Specification
...
    date-time   = [ day-of-week "," ] date FWS time [CFWS]
    day-of-week = ([FWS] day-name) / obs-day-of-week
    day-name    = "Mon" / "Tue" / "Wed" / "Thu" /
                  "Fri" / "Sat" / "Sun"
    date        = day month year
    year        = 4*DIGIT / obs-year
    month       = (FWS month-name FWS) / obs-month
    month-name  = "Jan" / "Feb" / "Mar" / "Apr" /
                  "May" / "Jun" / "Jul" / "Aug" /
                  "Sep" / "Oct" / "Nov" / "Dec"
    day         = ([FWS] 1*2DIGIT) / obs-day
    time        = time-of-day FWS zone
    time-of-day = hour ":" minute [ ":" second ]
    hour        = 2DIGIT / obs-hour
    minute      = 2DIGIT / obs-minute
    second      = 2DIGIT / obs-second
    zone        = (( "+" / "-" ) 4DIGIT) / obs-zone

So yeah, a strptime() would be pretty helpful. I haven't spent enough time to grok the Bro policy script parser to see how hard it would be to add...

Oh, why do I want to check for retransmissions? Our silly mail server tries very hard to deliver spam, retrying frequently for ~48 hrs. Hopefully not a common problem! But there are likely other uses.

At this point I'm waiting for someone to respond, "Actually, there is a strptime..."

Chris

-------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mailman.ICSI.Berkeley.EDU/pipermail/bro/attachments/20110316/b4a6fde4/attachment.html

From seth at icir.org Wed Mar 16 11:33:21 2011
From: seth at icir.org (Seth Hall)
Date: Wed, 16 Mar 2011 14:33:21 -0400
Subject: [Bro] question about printing timestamps
In-Reply-To: <9FCD6A42-E627-4C74-B387-BA9E6DD54A7E@osu.edu>
References: <5DDE5234-F543-4F5D-9763-14FB3FCC0893@icir.org> <9FCD6A42-E627-4C74-B387-BA9E6DD54A7E@osu.edu>
Message-ID:

On Mar 16, 2011, at 2:27 PM, Hartley, Christopher J. wrote:

> At this point I'm waiting for someone to respond, "Actually, there is a strptime..."

Heh, I wish I could tell you that. File a ticket and we'll see if we can do a strptime built in function that basically just wraps the C function. It seems like it should be fairly straightforward. No promises on the next release, but if it's filed we'll certainly consider it.

.Seth

--
Seth Hall
International Computer Science Institute
(Bro) because everyone has a network
http://www.bro-ids.org/

From seth at icir.org Thu Mar 17 12:17:35 2011
From: seth at icir.org (Seth Hall)
Date: Thu, 17 Mar 2011 15:17:35 -0400
Subject: [Bro] Incorporating dns_reponse in dns_request
In-Reply-To:
References:
Message-ID: <1F29C938-56F1-481E-9C8F-E6439C3F61BA@icir.org>

On Mar 16, 2011, at 2:17 PM, Will wrote:

> So, I guess my question is, is there a way to evaluate a DNS query along with its corresponding response and return an IP address in this same event?
>
> I assume this may be a 'no' if each is handled completely independent.

You're right, each is handled independently. If you check my github repository, there is a dns-ext.bro script[1], but it has memory trouble on live traffic. If you still want to test it though, it does what you want by tying the request and response(s) together.

You could write code like this if you load the dns-ext script:

    event dns_ext(id: conn_id, di: dns_ext_session_info)
        {
        local subq = second_level_domain(di$query);
        if ( check_domain_list &&
             (query_types[qtype] == "A" || query_types[qtype] == "MX") &&
             subq in hostile_domain_list )
            {
            if( subq in hostile_domain_list[subq] ||
                third_level_domain(di$query) in hostile_domain_list[subq] )
                NOTICE([$note=SensitiveDNS_Lookup, $conn=c,
                        $msg=fmt("%s queried '%s' @ %T (EST) and the responses were: %s",
                                 id$orig_h, query, network_time(), di$replies)]);
            }
        }

This is very similar to one of the techniques we'll likely have in the next release for extension of shipped scripts, but that code above should do everything you were looking to do in the script you emailed.

Have fun!

.Seth

1. https://github.com/sethhall/bro_scripts/blob/master/testing/dns-ext.bro

--
Seth Hall
International Computer Science Institute
(Bro) because everyone has a network
http://www.bro-ids.org/

From seth at icir.org Thu Mar 17 12:25:13 2011
From: seth at icir.org (Seth Hall)
Date: Thu, 17 Mar 2011 15:25:13 -0400
Subject: [Bro] Incorporating dns_reponse in dns_request
In-Reply-To: <1F29C938-56F1-481E-9C8F-E6439C3F61BA@icir.org>
References: <1F29C938-56F1-481E-9C8F-E6439C3F61BA@icir.org>
Message-ID:

On Mar 17, 2011, at 3:17 PM, Seth Hall wrote:

> if ( check_domain_list && (query_types[qtype] == "A" || query_types[qtype] == "MX") && subq in hostile_domain_list )

Oops, almost complete. The above should be...

    if ( check_domain_list && (query_types[di$qtype] == "A" || query_types[di$qtype] == "MX") && subq in hostile_domain_list )

.Seth

--
Seth Hall
International Computer Science Institute
(Bro) because everyone has a network
http://www.bro-ids.org/

From baxterw3232 at gmail.com Thu Mar 17 12:31:38 2011
From: baxterw3232 at gmail.com (Will)
Date: Thu, 17 Mar 2011 15:31:38 -0400
Subject: [Bro] Incorporating dns_reponse in dns_request
In-Reply-To:
References: <1F29C938-56F1-481E-9C8F-E6439C3F61BA@icir.org>
Message-ID:

Thanks Seth.

I will let you know how it works for us. I can understand how it could strain system resources attempting to correlate the events in real time.

It will also be interesting to see how it will react on a domain that is sinkhole'd. Our spam appliance, for example, will retry the query 8-10 times until it gives up on domains we have sunk.

Will

From seth at icir.org Thu Mar 17 12:40:46 2011
From: seth at icir.org (Seth Hall)
Date: Thu, 17 Mar 2011 15:40:46 -0400
Subject: [Bro] Incorporating dns_reponse in dns_request
In-Reply-To:
References: <1F29C938-56F1-481E-9C8F-E6439C3F61BA@icir.org>
Message-ID: <05D2833D-83CE-4792-82FB-2FD13C2D9A84@icir.org>

On Mar 17, 2011, at 3:31 PM, Will wrote:

> I will let you know how it works for us.
> I can understand how it could strain system resources attempting to correlate the events in real time.

The memory problem is a programming error on my part. It's been on my list to fix for a really long time. :)

.Seth

--
Seth Hall
International Computer Science Institute
(Bro) because everyone has a network
http://www.bro-ids.org/

From baxterw3232 at gmail.com Mon Mar 21 11:16:39 2011
From: baxterw3232 at gmail.com (Will)
Date: Mon, 21 Mar 2011 14:16:39 -0400
Subject: [Bro] File Scanning Capability
Message-ID:

Hello again,

I was hoping to get some guidance on how to best use Bro to process email files. My end goal is to strip out inbound email attachments, identify the file type, then run a distinct set of external tools against them. Each file type would have a different set or order of tools.

I will without a doubt eventually incorporate "http-ext-identified-files.sig" instead of what I am currently using, but I am having trouble determining where to integrate the logic for handling each file type. As it currently works, I am saving off every pdf and word doc, which would be unnecessary if I used bro to call the external tools and evaluate the results.

Current logic (this method calls for the external tools to be run against the directory by cron and are independent of Bro):

    # if the hot flag is set then we dump the MIME-decoded attachment to its own file for analysis
    if( session$entity_is_hot )
        {
        if ( session$entity_filename == hot_pdf_attachment_filenames )
            {
            # build the filename out of MD5, length and filename
            hot_attachment_dumpname = fmt("dumped_pdf_files\/%s:%d:%s",
                session$content_hash, length, session$entity_filename);
            }
        if ( session$entity_filename == hot_word_attachment_filenames )
            {
            hot_attachment_dumpname = fmt("dumped_doc_files\/%s:%d:%s",
                session$content_hash, length, session$entity_filename);
            }
        # get a raw filehandle, notice open() instead of open_log_file(),
        # write the data out, and be sure to close the fh
        hot_attachment_dump_fh = open( hot_attachment_dumpname );
        write_file(hot_attachment_dump_fh, data);
        close(hot_attachment_dump_fh);
        }

What I would like to be able to do:

    if ( session$entity_filename == hot_pdf_attachment_filenames )
        {
        hot_attachment_dumpname = fmt("dumped_pdf_files\/%d:%s", length, session$entity_filename);
        hot_attachment_dump_fh = open( hot_attachment_dumpname );
        write_file(hot_attachment_dump_fh, data);
        scan_pdf_file(file)    # call the external tools
        # scan_pdf_file would include something like this: scanpdf.py
        # (which would include clamscan, pdfid.py, cymruMHR, ssdeep...etc)
        # The pdf python script can pass the results back to bro for handling.
        if ( result == bad )
            { alert }
        else
            { delete file, carry on or log results somewhere then delete file }

The scan for office docs would be similar, but use 'OfficeMalScanner' instead of pdfid.py and pdf-parser.py. If I get this to work, I would like to do something very similar with http files.

How can I call the external tools? Is this the right place to be doing this? I read in Robin's 'Advanced Scripting' presentation from the 2009 workshop about injecting external information but am still confused how to do the alternative.
I would be surprised if this capability doesn't already exist and suppose I might be going about this all wrong. I would just prefer to incorporate the file scans in Bro vice running them completely independently. If I wasn't clear or am completely out in left field feel free to be honest. I won't be offended.

Thanks in advance!

Will

From seth at icir.org Mon Mar 21 11:49:13 2011
From: seth at icir.org (Seth Hall)
Date: Mon, 21 Mar 2011 14:49:13 -0400
Subject: [Bro] File Scanning Capability
In-Reply-To:
References:
Message-ID: <0A4B2FE3-9CB6-4398-91AA-177619767CA2@icir.org>

On Mar 21, 2011, at 2:16 PM, Will wrote:

> I will without a doubt eventually incorporate "http-ext-identified-files.sig" instead of what I am currently using, but I am having trouble determining where to integrate the logic for handling each file type. As it currently works, I am saving off every pdf and word doc, which would be unnecessary if I used bro to call the external tools and evaluate the results.

That won't actually work quite right. The http-ext-identified-files.sig file uses special signature keywords that the http analyzer provides to detect file types. It's not directly applicable to SMTP/MIME transfers.

> Current logic (this method calls for the external tools to be run against the directory by cron and are independent of Bro):
> hot_attachment_dump_fh = open( hot_attachment_dumpname );
> write_file(hot_attachment_dump_fh, data);
> close(hot_attachment_dump_fh);

In what event are you currently running this code?

> The scan for office docs would be similar, but use 'OfficeMalScanner' instead of pdfid.py and pdf-parser.py. If I get this to work, I would like to do something very similar with http files.

Makes sense.

> How can I call the external tools? Is this the right place to be doing this?

You can't currently do this in a way that would be feasible on live traffic. The problem is that the call to the external tool would block Bro and cause it to start dropping packets. There is a "when" statement that can help build asynchronous function calls though, so that the stack state will be saved and used again when the function call returns. I don't know if the system() function (I think this is what you're looking for to run external programs) can be used with the when statement though.

If you are looking to run this on tracefiles for now though, you can certainly just use the system function to call your external tool. It takes a single argument (a string) that is the command line you'd like to run. There is also a function named str_shell_escape for defanging data if you need to do that (taking something off the line and using it in the command line). You do need to make sure that the data that is defanged with str_shell_escape is placed within double-quotes.

> I would be surprised if this capability doesn't already exist and suppose I might be going about this all wrong. I would just prefer to incorporate the file scans in Bro vice running them completely independently. If I wasn't clear or am completely out in left field feel free to be honest. I won't be offended.

Nope, not out in left field at all and personally I'm a bit ashamed I never wrote a mime-ext.bro script that was a bit more capable like the http-ext script. I'm going to be rewriting the mime.bro script for the next release though and it will definitely have file extraction and identification capabilities built into it. However, we are going to be working toward a much more generalized notion of files for some future release of Bro. I've worked a bit on how that may proceed, but unfortunately we definitely won't be anywhere close to ready with that for the next release.
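
For the tracefile or cron route, the same quoting concern shows up on the script side. A minimal Python sketch, assuming the scanner is an ordinary command-line program: run_tool and quoted_for_shell are hypothetical helper names, and the Python interpreter stands in for a real scanner so the example is self-contained. Passing an argument list avoids a shell altogether; quoting is only needed if the command really has to be assembled as a single string.

```python
import shlex
import subprocess
import sys


def quoted_for_shell(value):
    """Quote one untrusted value so a POSIX shell treats it as a single word."""
    return shlex.quote(value)


def run_tool(argv, timeout=60):
    """Run an external tool without a shell and return (exit code, stdout)."""
    # argv is a list, so no shell parses it and file names taken from
    # network traffic cannot inject extra commands.
    done = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True, timeout=timeout)
    return done.returncode, done.stdout


if __name__ == "__main__":
    # Stand-in for a real scanner: the interpreter echoes its argument back.
    name = "dumped_pdf_files/abc:123:evil name; rm -rf.pdf"
    code, out = run_tool([sys.executable, "-c",
                          "import sys; print(sys.argv[1])", name])
    print(code, out.strip())
    print("shell form:", quoted_for_shell(name))
```

The timeout keeps a hung scanner from stalling the whole batch; what to do with the exit code and output (alert, log, delete the file) is left to the caller.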

.Seth

--
Seth Hall
International Computer Science Institute
(Bro) because everyone has a network
http://www.bro-ids.org/

From baxterw3232 at gmail.com Mon Mar 21 12:44:01 2011
From: baxterw3232 at gmail.com (Will)
Date: Mon, 21 Mar 2011 15:44:01 -0400
Subject: [Bro] File Scanning Capability
In-Reply-To: <0A4B2FE3-9CB6-4398-91AA-177619767CA2@icir.org>
References: <0A4B2FE3-9CB6-4398-91AA-177619767CA2@icir.org>
Message-ID:

On Mon, Mar 21, 2011 at 2:49 PM, Seth Hall wrote:

> On Mar 21, 2011, at 2:16 PM, Will wrote:
>
> > I will without a doubt eventually incorporate
> > "http-ext-identified-files.sig" instead of what I am currently using, but I
> > am having trouble determining where to integrate the logic for handling each
> > file type. As it currently works, I am saving off every pdf and word doc,
> > which would be unnecessary if I used bro to call the external tools and
> > evaluate the results.
>
> That won't actually work quite right. The http-ext-identified-files.sig
> file uses special signature keywords that the http analyzer provides to
> detect file types. It's not directly applicable to SMTP/MIME transfers.

Understandable. Being that there are so many different types it would be beneficial enough to create a signature file for SMTP/MIME. I would be happy to share it when I get it done.

> > Current logic (this method calls for the external tools to be run against
> > the directory by cron and are independent of Bro):
> > hot_attachment_dump_fh = open( hot_attachment_dumpname );
> > write_file(hot_attachment_dump_fh, data);
> > close(hot_attachment_dump_fh);
>
> In what event are you currently running using this code?

Here is the entire event:

    event mime_entity_data(c: connection, length: count, data: string)
        {
        local session = get_session(c, T);

        # md5 hashing is now a builtin function, so just call it and dump the result into the content_hash field
        # that field in the info struct was already there, just had to add this to fill it.
        session$content_hash = md5_hash(data);

        # log the first 256 bytes of the attachment and the MD5 hash.
        mime_log_msg(session, "data", fmt("%d: %s", length, sub_bytes(data, 0, 256)));
        mime_log_msg(session, "all data", fmt("MD5: %s", session$content_hash));

        # if the hot flag is set then we dump the MIME-decoded attachment to its own file for analysis
        if( session$entity_is_hot )
            {
            if ( session$entity_filename == hot_pdf_attachment_filenames )
                {
                # build the filename out of MD5, length and filename
                hot_attachment_dumpname = fmt("dumped_pdf_files\/%s:%d:%s",
                    session$content_hash, length, session$entity_filename);
                }
            if ( session$entity_filename == hot_word_attachment_filenames )
                {
                hot_attachment_dumpname = fmt("dumped_doc_files\/%s:%d:%s",
                    session$content_hash, length, session$entity_filename);
                }
            # get a raw filehandle, notice open() instead of open_log_file(),
            # write the data out, and be sure to close the fh
            hot_attachment_dump_fh = open( hot_attachment_dumpname );
            write_file(hot_attachment_dump_fh, data);
            close(hot_attachment_dump_fh);

            # log stuff to the hot logfile as well
            mime_log_hot_msg(session, "hot data", fmt("%d: %s", length, sub_bytes(data, 0, 256)));
            mime_log_hot_msg(session, "hot data", fmt("File dumped: %s MD5: %s",
                session$entity_filename, session$content_hash));
            }
        }

I attached the modified mime.bro in case anyone wants to see the rest of it.

> > The scan for office docs would be similar, but use 'OfficeMalScanner'
> > instead of pdfid.py and pdf-parser.py. If I get this to work, I would like
> > to do something very similar with http files.
>
> Makes sense.
>
> > How can I call the external tools? Is this the right place to be doing
> > this?
>
> You can't currently do this in a way that would be feasible on live
> traffic. The problem is that the call to the external tool would block Bro
> and cause it to start dropping packets. There is a "when" statement that
> can help build asynchronous function calls though.
So that the stack state > will be saved and used again when the function call returns. I don't know > if the system() (I think this is what you're looking for to run external > programs) function can be used with the when statement though. I suppose the short answer is yes. I was looking for something like the system() call. Like modifying the PyBroccoli Example from below: PyBroccoli Example: @event def pong(src_time, dst_time): print "pong event: time=%f/%f s" % \ (dst_time - src_time, current_time() - src_time) bc = Connection("127.0.0.1:47758") bc.send("ping", time(current_time())) To: @event (event == dumped pdf file) def pass_pdf(file): system(pdf_scan.py -f dumped_file.pdf > tempdir) With what you mentioned taken into account, we can't ask bro to wait on the results, but maybe we could dump the results to a logfile for alerting? > If you are looking to run this on tracefiles for now though, you can > certainly just use the system function to call your external tool. It takes > a single argument (a string) that is the command line you'd like to run. > There is a function for defanging data if you need to do that too (taking > something off the line and using it in the command line) named > str_shell_escape. You do need to make sure that the data that is defanged > with str_shell_escape is placed within double-quotes. > > > I would be surprised if this capability doesn't already exist and suppose > I might be going about this all wrong. I would just prefer to incorporate > the file scans in Bro vice running them completely independently. If I > wasn't clear or am completely out in left field feel free to be honest. I > won't be offended. > > Nope, not out in left field at all and personally I'm a bit ashamed I never > wrote a mime-ext.bro script that was a bit more capable like the http-ext > script. 
I'm going to be rewriting the mime.bro script for the next release > though and it will definitely have file extraction and identification > capabilities built into it. However, we are going to be working toward a > much more generalized notion of files for some future release of Bro. I've > worked a bit on how that may proceed, but unfortunately we definitely won't > be anywhere close to ready with that for the next release. > Maybe you should charge "more" for Bro... No, you all are doing a great job on this project. I just wish I could do more to help. > .Seth > > -- > Seth Hall > International Computer Science Institute > (Bro) because everyone has a network > http://www.bro-ids.org/ > > Will -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.ICSI.Berkeley.EDU/pipermail/bro/attachments/20110321/24f6c184/attachment.html -------------- next part -------------- A non-text attachment was scrubbed... Name: mime.bro Type: application/octet-stream Size: 11934 bytes Desc: not available Url : http://mailman.ICSI.Berkeley.EDU/pipermail/bro/attachments/20110321/24f6c184/attachment.obj From jmellander at lbl.gov Mon Mar 21 13:05:18 2011 From: jmellander at lbl.gov (Jim Mellander) Date: Mon, 21 Mar 2011 13:05:18 -0700 Subject: [Bro] File Scanning Capability In-Reply-To: References: <0A4B2FE3-9CB6-4398-91AA-177619767CA2@icir.org> Message-ID: Hi Will: Seems like you would probably want to use the python broccoli bindings to send an event to a python client, here's what I'm doing with my "stomper" code, which looks up urls on the fly in a malware database: # In your bro startup script @load listen-clear redef Remote::destinations += { ["remote_stomper"] = [ $host=127.0.0.1, $events = /remote_check_URL/, $connect=F, $ssl=F ] ... #within bro policy # Here we send to the broccoli client for checking/processing event remote_check_URL(++stomper_seqno, c, is_orig, host, uri, ts); ..................... 
On the python side, the relevant sections from the python code, which
is running as a daemon accepting events from bro and acting on them:

#! /usr/bin/env python
#

import broccoli
import sqlite3
import random
import sys
import re
import select   # for select loop
import time     # for the polling fallback

# Bro event loop
def bro_event_loop(bro_conn):
    try:
        # bro_conn_get_fd() is a helper that is not part of this excerpt.
        bro_conn_fd = bro_conn_get_fd(bro_conn)
        while True:
            # select() takes sequences; wait until the descriptor is
            # readable or in an error state. Waiting on writability as
            # well would return immediately and spin.
            select.select([bro_conn_fd], [], [bro_conn_fd])
            bro_conn.processInput()
    except:
        # No usable descriptor: fall back to polling.
        while True:
            bro_conn.processInput()
            time.sleep(.1)

@broccoli.event
def remote_check_URL(seqno, host, uri):
    # Receive a URL from bro, and send a return signal back
    # if it should be blocked.
    # check_database() and check_category() are defined elsewhere
    # in the full program.
    category = check_database(host, uri)
    if category:
        if check_category(category):
            # If the category signals a block
            bro_conn.send("stomper_block", seqno)
    return

# Main program - Initialize and call event loop

# Setup the connection to bro
bro_conn = broccoli.Connection("127.0.0.1:47758")

# Event loop
bro_event_loop(bro_conn)
# Everything under this is never executed.
sys.exit(0)

Hope this will help you kick the can down the road a bit....

From seth at icir.org Mon Mar 21 13:22:10 2011
From: seth at icir.org (Seth Hall)
Date: Mon, 21 Mar 2011 16:22:10 -0400
Subject: [Bro] File Scanning Capability
In-Reply-To: <0A4B2FE3-9CB6-4398-91AA-177619767CA2@icir.org>
References: <0A4B2FE3-9CB6-4398-91AA-177619767CA2@icir.org>
Message-ID: <6C55E96E-4372-418F-95B3-CE2552CC59AD@icir.org>

On Mar 21, 2011, at 2:49 PM, Seth Hall wrote:

> On Mar 21, 2011, at 2:16 PM, Will wrote:
>
>> I will without a doubt eventually incorporate
>> "http-ext-identified-files.sig" instead of what I am currently using,
>> but I am having trouble determining where to integrate the logic for
>> handling each file type. As it currently works, I am saving off every
>> pdf and word doc, which would be unnecessary if I used bro to call the
>> external tools and evaluate the results.
>
> That won't actually work quite right. The http-ext-identified-files.sig
> file uses special signature keywords that the http analyzer provides to
> detect file types. It's not directly applicable to SMTP/MIME transfers.

I forgot to mention here that you can do the file detection fully at
the script layer with the identify_data function. It takes a string,
which is the data at the beginning of a file, and a boolean argument.
If the boolean is true, it means you want the mime type (from
libmagic), otherwise it returns the description of the file (again,
from libmagic).
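For anyone doing the check on the Python side of a Broccoli client
rather than in Bro itself, here is a rough analogue of what
identify_data returns when its boolean is T. It is only a sketch and a
stand-in for libmagic, not a replacement: the function name sniff_mime,
the two-entry signature table, and the "application/x-ole-storage"
label are all assumptions made for illustration, and a real deployment
would more likely call libmagic directly.

```python
# Minimal magic-number sniffing on the first bytes of a file.
# The table covers only the two document families discussed in this
# thread (PDF and legacy MS Office); both labels are illustrative.
_SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    # OLE2 compound-file header used by legacy Word/Excel/PowerPoint.
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "application/x-ole-storage"),
]


def sniff_mime(data, default="application/octet-stream"):
    """Return a MIME type guess based on the leading bytes of `data`."""
    head = bytes(data[:1024])
    for magic, mime in _SIGNATURES:
        # Only an exact prefix match counts; files with junk before the
        # header are reported as `default` by this simple check.
        if head.startswith(magic):
            return mime
    return default
```

Calling sniff_mime() on the first chunk seen for an attachment would
let a client decide whether the file is worth dumping at all, instead
of keying on the filename alone.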
.Seth

--
Seth Hall
International Computer Science Institute
(Bro) because everyone has a network
http://www.bro-ids.org/

From scampbell at lbl.gov Mon Mar 21 13:27:50 2011
From: scampbell at lbl.gov (Scott Campbell)
Date: Mon, 21 Mar 2011 15:27:50 -0500
Subject: [Bro] File Scanning Capability
In-Reply-To:
References: <0A4B2FE3-9CB6-4398-91AA-177619767CA2@icir.org>
Message-ID: <4D87B4C6.9090009@lbl.gov>

I implemented a straw man version of what you are doing for html file
transfers - in particular looking at PDF files via the pdfid tool. As
Jim pointed out, it is trivial to do a python->bro event call back via
Broccoli. I will post the code when I get back home - it is more of a
hack, but might prove to be helpful.

cheers,
scott

From baxterw3232 at gmail.com Mon Mar 21 13:34:13 2011
From: baxterw3232 at gmail.com (Will)
Date: Mon, 21 Mar 2011 16:34:13 -0400
Subject: [Bro] File Scanning Capability
In-Reply-To:
References: <0A4B2FE3-9CB6-4398-91AA-177619767CA2@icir.org>
Message-ID:

Thanks for that example Jim! That gives me a bunch of other ideas. The
best thing about using this method would be near real-time scanning and
notifications vice running a cron'd script at a given interval.

In your code below, what are you asking bro to do, if anything, with
the returned value?

>             # If the category signals a block
>             bro_conn.send("stomper_block",seqno)
>     return
>
> #Main program - Initialize and call event loop
>
> # Setup the connection to bro
> bro_conn = broccoli.Connection("127.0.0.1:47758")
>
> # Event loop
> bro_event_loop(bro_conn)

Will

From baxterw3232 at gmail.com Mon Mar 21 13:44:47 2011
From: baxterw3232 at gmail.com (Will)
Date: Mon, 21 Mar 2011 16:44:47 -0400
Subject: [Bro] File Scanning Capability
In-Reply-To: <4D87B4C6.9090009@lbl.gov>
References: <0A4B2FE3-9CB6-4398-91AA-177619767CA2@icir.org> <4D87B4C6.9090009@lbl.gov>
Message-ID:

Thanks Scott. That would be great. I assume you meant 'http' file
transfers?

I have a very limited amount of experience analyzing pdf files, but
understand that there are many characteristics that can be used to
narrow down files that actually need to be analyzed. I am interested in
parsing the results of pdfid.py and, if conditions are met, passing the
results in an alert.
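The "parse pdfid.py output and alert if conditions are met" idea could
look roughly like the following on the Python side. This is a sketch
under stated assumptions, not anyone's actual code from this thread: it
assumes pdfid.py prints one "keyword count" pair per line (optionally
followed by a parenthesised count of obfuscated occurrences), the
SUSPICIOUS keyword list and the "any non-zero count" rule are
illustrative choices, and the names parse_pdfid, suspicious_hits and
run_pdfid are made up for the example.

```python
import re
import subprocess
import sys

# Keywords that often justify a closer look. The list and the rule
# "alert on any non-zero count" are illustrative assumptions.
SUSPICIOUS = ("/JS", "/JavaScript", "/OpenAction", "/AA", "/Launch")

# One keyword, whitespace, an integer, optionally "(n)", end of line.
_COUNT_LINE = re.compile(r"^\s*(\S+)\s+(\d+)(?:\(\d+\))?\s*$")


def parse_pdfid(text):
    """Map each keyword in pdfid-style output to its integer count."""
    counts = {}
    for line in text.splitlines():
        match = _COUNT_LINE.match(line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def suspicious_hits(counts, keywords=SUSPICIOUS):
    """Return only the watched keywords that occur at least once."""
    return {k: counts[k] for k in keywords if counts.get(k, 0) > 0}


def run_pdfid(path, tool="pdfid.py", timeout=30):
    """Run the scanner on one dumped file and return its text output.

    `tool` is a hypothetical path to pdfid.py. The command is passed as
    an argument list, so no shell ever interprets the file name.
    """
    out = subprocess.check_output([sys.executable, tool, path],
                                  timeout=timeout)
    return out.decode("utf-8", "replace")
```

A client would call suspicious_hits(parse_pdfid(run_pdfid(path))) for
each dumped file and only send an event back to Bro (or write a log
line) when the result is non-empty, which keeps the slow scan out of
Bro's packet path.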
And potentially triggering pdf-parser.py to add additional content for analysis in the alert. I would be very interested in seeing what you are doing. Will On Mon, Mar 21, 2011 at 4:27 PM, Scott Campbell wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > I implemented a straw man version of what you are doing for html file > transfers - in particular looking at PDF files via the pdfid tool. As > Jim pointed out, it is trivial to do a python->bro event call back via > Broccoli. I will post the code when I get back home - it is more of a > hack, but might prove to be helpful. > > cheers, > scott > > On 3/21/11 2:44 PM, Will wrote: > > On Mon, Mar 21, 2011 at 2:49 PM, Seth Hall wrote: > > > >> > >> On Mar 21, 2011, at 2:16 PM, Will wrote: > >> > >>> I will without a doubt eventually incorporate > >> "http-ext-identified-files.sig" instead of what I am currently using, > but I > >> am having trouble determining where to integrate the logic for handling > each > >> file type. As it currently works, I am saving off every pdf and word > doc, > >> which would be unnecessary if I used bro to call the external tools and > >> evaluate the results. > >> > >>>> That won't actually work quite right. The > http-ext-identified-files.sig > >> file uses special signature keywords that the http analyzer >>provides > to > >> detect file types. It's not directly applicable to SMTP/MIME transfers. > >> > >> Understandable. Being that there are so many different types it would be > > beneficial enough to create a signature file for SMTP/MIME. I would be > happy > > to share it when I get it done. > > > > > >>> Current logic (this method calls for the external tools to be run > against > >> the directory by cron and are independent of Bro): > >>> hot_attachment_dump_fh = open( hot_attachment_dumpname ); > >>> write_file(hot_attachment_dump_fh, data); > >>> close(hot_attachment_dump_fh); > >> > >>>> In what event are you currently running using this code? 
> >> > > > > Here is the entire event: > > > > event mime_entity_data(c: connection, length: count, data: string) > > { > > local session = get_session(c, T); > > > > #md5 hashing is now a builtin function, so just call it and > dumpthe > > result into the content_hash field > > #that field in the info struct was already there, just had to add > > this to fill it. > > session$content_hash = md5_hash(data); > > > > #log the first 256 bytes of the attachment and the MD5 hash. > > mime_log_msg(session, "data", fmt("%d: %s", length, > sub_bytes(data, > > 0, 256))); > > mime_log_msg(session, "all data", fmt("MD5: %s", > > session$content_hash)); > > > > #if the hot flag is set then we dump the MIME-decoded attachment > to > > it's own file for analysis > > if( session$entity_is_hot ) > > { > > if ( session$entity_filename == hot_pdf_attachment_filenames ) > > { > > #build the filename out of MD5, length and filename > > hot_attachment_dumpname = fmt("dumped_pdf_files\/%s:%d:%s", > > session$content_hash, length, session$entity_filename); > > } > > if ( session$entity_filename == hot_word_attachment_filenames ) > > { > > hot_attachment_dumpname = fmt("dumped_doc_files\/%s:%d:%s", > > session$content_hash, length,session$entity_filename); > > } > > > > #get a raw filehandle, notice open() instead of open_log_file(), > > write the data out, and be sure to close the fh > > hot_attachment_dump_fh = open( hot_attachment_dumpname ); > > write_file(hot_attachment_dump_fh, data); > > close(hot_attachment_dump_fh); > > > > #log stuff to the hot logfile as well > > mime_log_hot_msg(session, "hot data", fmt("%d: %s", length, > > sub_bytes(data, 0, 256))); > > mime_log_hot_msg(session, "hot data", fmt("File dumped: %s MD5: > %s", > > session$entity_filename, session$content_hash)); > > } > > > > I attached the modifed mime.bro in case anyone wanted to see the how the > > rest of it. 
> > > >> The scan for office docs would be similiar, but use 'OfficeMalScanner' > >> instead of pdfid.py and pdf-parser.py. If I get this to work, I would > like > >> to do something very similar with http files. > >> > >> Makes sense. > >> > >>> How can I call the external tools? Is this the right place to be doing > >> this? > >> > >> You can't currently do this in a way that would be feasible on live > >> traffic. The problem is that the call to the external tool would block > Bro > >> and cause it to start dropping packets. There is a "when" statement > that > >> can help build asynchronous function calls though. So that the stack > state > >> will be saved and used again when the function call returns. I don't > know > >> if the system() (I think this is what you're looking for to run external > >> programs) function can be used with the when statement though. > > > > > > I suppose the short answer is yes. I was looking for something like the > > system() call. Like modifying the PyBroccoli Example from below: > > PyBroccoli Example: > > @event > > def pong(src_time, dst_time): > > print "pong event: time=%f/%f s" % \ > > (dst_time - src_time, current_time() - src_time) > > bc = Connection("127.0.0.1:47758") > > bc.send("ping", time(current_time())) > > > > To: > > > > @event (event == dumped pdf file) > > def pass_pdf(file): > > system(pdf_scan.py -f dumped_file.pdf > tempdir) > > > > With what you mentioned taken into account, we can't ask bro to wait on > the > > results, but maybe we could dump the results to a logfile for alerting? > > > > > >> If you are looking to run this on tracefiles for now though, you can > >> certainly just use the system function to call your external tool. It > takes > >> a single argument (a string) that is the command line you'd like to run. > >> There is a function for defanging data if you need to do that too > (taking > >> something off the line and using it in the command line) named > >> str_shell_escape. 
You do need to make sure that the data that is > defanged > >> with str_shell_escape is placed within double-quotes. > >> > >>> I would be surprised if this capability doesn't already exist and > suppose > >> I might be going about this all wrong. I would just prefer to > incorporate > >> the file scans in Bro vice running them completely independently. If I > >> wasn't clear or am completely out in left field feel free to be honest. > I > >> won't be offended. > >> > >> Nope, not out in left field at all and personally I'm a bit ashamed I > never > >> wrote a mime-ext.bro script that was a bit more capable like the > http-ext > >> script. I'm going to be rewriting the mime.bro script for the next > release > >> though and it will definitely have file extraction and identification > >> capabilities built into it. However, we are going to be working toward > a > >> much more generalized notion of files for some future release of Bro. > I've > >> worked a bit on how that may proceed, but unfortunately we definitely > won't > >> be anywhere close to ready with that for the next release. > >> > > > > > > Maybe you should charge "more" for Bro... > > > > > > No, you all are doing a great job on this project. I just wish I could do > > more to help. > > > > > >> .Seth > >> > >> -- > >> Seth Hall > >> International Computer Science Institute > >> (Bro) because everyone has a network > >> http://www.bro-ids.org/ > >> > >> > > Will > > > > > > > > > > _______________________________________________ > > Bro mailing list > > bro at bro-ids.org > > http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/bro > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.9 (Darwin) > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ > > iD8DBQFNh7TGK2Plq8B7ZBwRAkMzAKDirnfa8BLlP75GVRi6jl7V7jgXUQCgmKeZ > AmyCF5VQsQXYdhRxyTaqanw= > =gSvv > -----END PGP SIGNATURE----- > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mailman.ICSI.Berkeley.EDU/pipermail/bro/attachments/20110321/1ac0cfa1/attachment.html

From jmellander at lbl.gov Mon Mar 21 14:03:24 2011
From: jmellander at lbl.gov (Jim Mellander)
Date: Mon, 21 Mar 2011 14:03:24 -0700
Subject: [Bro] File Scanning Capability
In-Reply-To:
References: <0A4B2FE3-9CB6-4398-91AA-177619767CA2@icir.org>
Message-ID:

Hi Will,

When bro receives the event, it will raise a notice that will execute
a custom host-pair-drop-connectivity script that drops the
source/destination host pair for a short period to interrupt the
connection in real time.

seqno is used by bro to keep track of which request it sent, so that
the event can identify the request that was made. This is in a table
whose entries expire rapidly (the timeout > the expected response time
of the python program).

BTW:

I believe there was a bug in my code above (I put it down half-baked a
while ago, and haven't picked it up in a while) - the broccoli event
should have the same number of arguments as the bro event that sends
it, and vice versa.

On Mon, Mar 21, 2011 at 1:34 PM, Will wrote:
> Thanks for that example Jim!
>
> That gives me a bunch of other ideas. The best thing about using this method
> would be near real-time scanning and notifications vice running a cron'd
> script at a given interval.
>
> In your code below, what are you asking bro to do, if anything with the
> returned value?
>
>            # If the category signals a block
>            bro_conn.send("stomper_block",seqno)
>>
>>    return
>>
>> #Main program - Initialize and call event loop
>>
>> # Setup the connection to bro
>> bro_conn = broccoli.Connection("127.0.0.1:47758")
>>
>> # Event loop
>> bro_event_loop(bro_conn)
>
> Will
>
> On Mon, Mar 21, 2011 at 4:05 PM, Jim Mellander wrote:
>>
>> Hi Will:
>>
>> Seems like you would probably want to use the python broccoli bindings
>> to send an event to a python client, here's what I'm doing with my
>> "stomper" code, which looks up urls on the fly in a malware database:
>>
>> # In your bro startup script
>> @load listen-clear
>>
>> redef Remote::destinations += {
>>        ["remote_stomper"] = [ $host=127.0.0.1, $events = /remote_check_URL/,
>>                               $connect=F, $ssl=F ]
>> ...
>>
>> #within bro policy
>>
>> # Here we send to the broccoli client for checking/processing
>> event remote_check_URL(++stomper_seqno, c, is_orig, host, uri, ts);
>>
>>
>> .....................
>>
>> On the python side, the relevant sections from the python code, which
>> is running as a daemon accepting events from bro and acting on them:
>>
>> #! /usr/bin/env python
>> #
>>
>> import broccoli
>> import sqlite3
>> import random
>> import sys
>> import re
>> import select   # for select loop
>>
>>
>> # Bro event loop
>> def bro_event_loop(bro_conn):
>>    try:
>>        bro_conn_fd=bro_conn_get_fd(bro_conn)
>>        while True:
>>            select.select((bro_conn_fd),(bro_conn_fd),(bro_conn_fd))
>>            bro_conn.processInput()
>>    except:
>>        while True:
>>            bro_conn.processInput()
>>            sleep(.1)
>>
>> @broccoli.event
>> def remote_check_URL(seqno, host, uri):
>>    # Receive a URL from bro, and send a return signal back
>>    #  if it should be blocked.
>>    category = check_database(host,uri)
>>    if category:
>>        if check_category(category):
>>            # If the category signals a block
>>            bro_conn.send("stomper_block",seqno)
>>    return
>>
>> #Main program - Initialize and call event loop
>>
>> # Setup the connection to bro
>> bro_conn = broccoli.Connection("127.0.0.1:47758")
>>
>> # Event loop
>> bro_event_loop(bro_conn)
>> # Everything under this is never executed.
>> sys.exit(0)
>>
>>
>> Hope this will help you kick the can down the road a bit....

From baxterw3232 at gmail.com Mon Mar 21 14:09:35 2011
From: baxterw3232 at gmail.com (Will)
Date: Mon, 21 Mar 2011 17:09:35 -0400
Subject: [Bro] File Scanning Capability
In-Reply-To:
References: <0A4B2FE3-9CB6-4398-91AA-177619767CA2@icir.org>
Message-ID:

Understood. Thanks again for the info.

Will

On Mon, Mar 21, 2011 at 5:03 PM, Jim Mellander wrote:
> Hi Will,
>
> When bro receives the event, it will raise a notice that will execute
> a custom host-pair-drop-connectivity script that drops the
> source/destination host pair for a short period to interrupt the
> connection in realtime.
>
> seqno is used by bro to keep track of which request it sent, so that
> the event can identify the request that was made. This is in a table
> whose entries expire rapidly (the timeout > the expected response time
> of the python program)
>
> BTW:
>
> I believe there was a bug in my code above (i put it down half-baked a
> while ago, and haven't picked it up in a while) - the broccoli event
> should have the same number of arguments as the bro event that sends
> it, and vice versa.
>
> On Mon, Mar 21, 2011 at 1:34 PM, Will wrote:
> > Thanks for that example Jim!
> >
> > That gives me a bunch of other ideas. The best thing about using this method
> > would be near real-time scanning and notifications vice running a cron'd
> > script at a given interval.
> >
> > In your code below, what are you asking bro to do, if anything with the
> > returned value?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ICSI.Berkeley.EDU/pipermail/bro/attachments/20110321/10b0402b/attachment.html

From jmellander at lbl.gov Tue Mar 29 13:52:58 2011
From: jmellander at lbl.gov (Jim Mellander)
Date: Tue, 29 Mar 2011 13:52:58 -0700
Subject: [Bro] Fwd: Bug in drop.bro and patch
In-Reply-To:
References:
Message-ID:

---------- Forwarded message ----------
From: Jim Mellander
Date: Tue, Mar 29, 2011 at 1:49 PM
Subject: Bug in drop.bro and patch
To: bro-dev at bro-ids.org

Hi folks:

In drop.bro, if use_catch_release is F (indicating that you don't want
to use catch & release), bro will still attempt to unblock hosts after
a 1 day timeout by executing the clear_host function (see the drop_info
table), and if there is a restore-connectivity script in the path, it
will get executed, so you actually get a pseudo catch & release.

The fix is to add a one liner to the clear_host function, which returns
immediately if catch & release is not enabled. See patch below:

====================================
*** drop.bro        Tue Mar 29 13:39:44 2011
--- drop.bro.new    Tue Mar 29 13:37:16 2011
***************
*** 283,288 ****
--- 283,289 ----
  function clear_host(t: table[addr] of drop_rec, a: addr): interval
        {
+       if ( ! use_catch_release )      return 0 secs;
        if ( is_dropped(a) )
                # Restore address.
                do_restore(a, T);
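
For anyone who wants to see the effect of that guard without reading drop.bro, here is a rough sketch in Python of the control flow the patch describes. It is only a model, not Bro code: the config dict, the restored list and the simplified is_dropped/do_restore helpers are all made up for illustration, and the return value of 0 merely stands in for "0 secs" in the patch.

```python
# Sketch only: models the guard that the patch adds to clear_host().
# Every name here is an illustrative stand-in, not Bro's or drop.bro's API.

config = {"use_catch_release": False}  # stands in for the use_catch_release flag
restored = []                          # addresses for which a restore was attempted


def is_dropped(dropped_hosts, addr):
    """Return True if addr is currently marked as dropped."""
    return addr in dropped_hosts


def do_restore(addr):
    """Stand-in for running the restore-connectivity script."""
    restored.append(addr)


def clear_host(dropped_hosts, addr):
    """Expiry callback; the return value models the interval in the patch."""
    if not config["use_catch_release"]:
        # The one-line fix: return before any restore is attempted.
        return 0
    if is_dropped(dropped_hosts, addr):
        do_restore(addr)
    return 0


dropped = {"192.0.2.5"}
clear_host(dropped, "192.0.2.5")
print(restored)  # prints [] because catch & release is disabled
```

With the flag switched on, the same call reaches do_restore, which is the behaviour the original code had in both cases.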