[Bro] Couple elasticsearch questions

Michael Wenthold michael.wenthold at gmail.com
Wed Jul 23 13:20:27 PDT 2014


I'm far from being an expert, but I think that using the built-in grok
patterns and/or being more specific with the regex syntax will result in
better logstash performance.

For example, I'm profiling the performance of some of our dns grok parsing
patterns:

match => [ "message",
"(?<bro_event_time>[0-9\.]{14})[0-9]+\t%{IP:dns_requester}\s%(?<dns_query_src_port>[0-9]{1,5})\t%{IP:dns_server}\s%(?<dns_query_dst_port>[0-9]{1,5})\t%{WORD:dns_query_proto}\t(?<dns_query_transid>[0-9]+)\t%{HOSTNAME:dns_query}\t%{NOTSPACE:dns_query_class}\t(?<dns_query_type>[A-Za-z0-9\-\*]+)\t%{NOTSPACE:dns_query_result>[A-Z\*]+)\t(?<dns_authoritative_answer>[TF])\t(?<dns_recursion_desired>[TF])\t(?<dns_recursion_available>[TF])\t%{GREEDYDATA:dns_response}"
]
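
For what it's worth, anchoring the expression and leaning on built-in
patterns like INT for the ports and transaction id should cut down on
backtracking.  A rough, untested sketch of the same match (the \t
separators and field names are just my assumptions about our log layout):

match => [ "message",
"^(?<bro_event_time>[0-9\.]{14})[0-9]+\t%{IP:dns_requester}\t%{INT:dns_query_src_port}\t%{IP:dns_server}\t%{INT:dns_query_dst_port}\t%{WORD:dns_query_proto}\t%{INT:dns_query_transid}\t%{HOSTNAME:dns_query}\t%{NOTSPACE:dns_query_class}\t(?<dns_query_type>[A-Za-z0-9\-\*]+)\t(?<dns_query_result>[A-Z\*]+)\t(?<dns_authoritative_answer>[TF])\t(?<dns_recursion_desired>[TF])\t(?<dns_recursion_available>[TF])\t%{GREEDYDATA:dns_response}"
]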

I'm also sure there are more efficient ways to write it than what I
did.  The odd parsing of the timestamp is because, where possible, we use
logstash to rewrite event times to the actual event time with the
date filter:

    date {
      match => [ "bro_event_time", "UNIX" ]
    }
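
(For the curious: a raw bro timestamp is epoch seconds plus microseconds,
something like 1406123456.789012 as a made-up example, so the
[0-9\.]{14} above keeps the first 14 characters, roughly millisecond
precision, the trailing [0-9]+ discards the rest, and the date filter
then parses the captured value as UNIX time into @timestamp.)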

Just my .02.


On Wed, Jul 23, 2014 at 11:58 AM, Craig Pluchinsky <craigp at iup.edu> wrote:

> I've done most of them using grok and custom patterns; conn.log is below.
> I'm using logstash to read the log files, process them, and insert into
> elasticsearch, then kibana as a web front end.
>
>        grok {
>          match => [ "message",
>
> "(?<ts>(.*?))\t(?<uid>(.*?))\t(?<id.orig_h>(.*?))\t(?<id.orig_p>(.*?))\t(?<id.resp_h>(.*?))\t(?<id.resp_p>(.*?))\t(?<proto>(.*?))\t(?<service>(.*?))\t(?<duration>(.*?))\t(?<orig_bytes>(.*?))\t(?<resp_bytes>(.*?))\t(?<conn_state>(.*?))\t(?<local_orig>(.*?))\t(?<missed_bytes>(.*?))\t(?<history>(.*?))\t(?<orig_pkts>(.*?))\t(?<orig_ip_bytes>(.*?))\t(?<resp_pkts>(.*?))\t(?<resp_ip_bytes>(.*?))\t(?<tunnel_parents>(.*))"
> ]
>        }
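>
> Since the bro logs are tab-delimited anyway, the csv filter might be a
> simpler (and likely faster) route than chained (.*?) groups.  A rough,
> untested sketch with the same column order as the grok above:
>
>        csv {
>          # literal tab below; "\t" isn't always interpreted in configs
>          separator => "	"
>          columns => [ "ts", "uid", "id.orig_h", "id.orig_p", "id.resp_h",
>                       "id.resp_p", "proto", "service", "duration",
>                       "orig_bytes", "resp_bytes", "conn_state",
>                       "local_orig", "missed_bytes", "history",
>                       "orig_pkts", "orig_ip_bytes", "resp_pkts",
>                       "resp_ip_bytes", "tunnel_parents" ]
>        }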
>
>
>
> -------------------------------
> Craig Pluchinsky
> IT Services
> Indiana University of Pennsylvania
> 724-357-3327
>
>
> On Wed, 23 Jul 2014, James Lay wrote:
>
> > On 2014-07-23 09:40, Seth Hall wrote:
> >> On Jul 23, 2014, at 11:10 AM, James Lay <jlay at slave-tothe-box.net>
> >> wrote:
> >>
> >>> 1.  Is there a proper way to set which logs to send to elasticsearch
> >>> that I can use in local.bro instead of modifying
> >>> logs-to-elasticsearch.bro?
> >>
> >> Yes, there are settings that you can change.  In local.bro, you can
> >> do this...
> >>
> >> @load tuning/logs-to-elasticsearch
> >> redef LogElasticSearch::send_logs += {
> >>      Conn::LOG,
> >>      HTTP::LOG
> >> };
> >>
> >> That will only send the conn.log and http.log to ElasticSearch.
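> >>
> >> If memory serves, that script also exports an excluded_log_ids set in
> >> case you'd rather blacklist than whitelist:
> >>
> >> redef LogElasticSearch::excluded_log_ids += { Syslog::LOG };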
> >>
> >>> 2.  The docs say to add @load tuning/logs-to-elasticsearch in
> >>> local.bro...how can I send bro data to a remote elasticsearch server
> >>> instead?
> >>
> >> redef LogElasticSearch::server_host = "1.2.3.4";
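> >>
> >> If the remote instance isn't listening on the default port, there
> >> should be a companion option as well (double-check the name in the
> >> module's export section):
> >>
> >> redef LogElasticSearch::server_port = 9200;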
> >>
> >>> 3.  And lastly, as I look at the Brownian demo, I see that all the
> >>> fields are correctly laid out...was this done with Brownian, or with
> >>> elasticsearch itself?
> >>
> >> Could you explain what you mean by "correctly laid out"?
> >>
> >>> I'm trying to get bro data into logstash directly, instead of using
> >>> log files.  Thanks for any insight.
> >>
> >> Cool!  With the current mechanism, you could encounter overload
> >> situations that cause Bro to grow in memory until you run out of
> >> memory.  We're slowly working on extensions to the ES writer to make
> >> it write to a disk-backed queuing system so that things should remain
> >> more stable over time.  I am interested to hear about any experiences
> >> you have with this, though.
> >>
> >>   .Seth
> >
> > Thanks for the responses, gents...they do help.  So...for example
> > here...I have snort currently going to logstash.  In order to match
> > fields I have this:
> >
> > filter {
> >         grok {
> >                 match => [ "message", "%{SYSLOGTIMESTAMP:date}
> > %{IPORHOST:device} %{WORD:snort}\[%{INT:snort_pid}\]\:
> > \[%{INT:gid}\:%{INT:sid}\:%{INT:rev}\] %{DATA:ids_alert}
> > \[Classification\: %{DATA:ids_classification}\] \[Priority\:
> > %{INT:ids_priority}\] \{%{WORD:proto}\}
> > %{IP:ids_src_ip}\:%{INT:ids_src_port} \-\>
> > %{IP:ids_dst_ip}\:%{INT:ids_dst_port}" ]
> >         }
> > }
> >
> > to match:
> >
> > Jul 23 09:44:46 gateway snort[13205]: [1:2500084:3305] ET COMPROMISED
> > Known Compromised or Hostile Host Traffic TCP group 43 [Classification:
> > Misc Attack] [Priority: 2] {TCP} 61.174.51.229:6000 -> x.x.x.x:22
> >
> > I'm guessing I'm going to have to create something like the above grok
> > for each bro log file...which...is going to be a hoot ;)  I was hoping
> > that work was already done somewhere...and I think I had a working one
> > for conn.log at one point, which I posted here some time ago.  Thanks
> > again...after looking at the Brownian source I think I'm going to have
> > to just bite the bullet and generate the grok lines.
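> >
> > One other thing I may try first: if I'm reading the 2.3 NEWS right, bro
> > can now write its logs as JSON (redef LogAscii::use_json=T;), which
> > would let logstash's json codec do the field splitting and skip grok
> > entirely.  Untested on my end, though.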
> >
> > James