[Bro] elastic search / bro questions

Joe Blow blackhole.em at gmail.com
Thu Nov 6 18:37:50 PST 2014


Yuck.  I was really hoping this wasn't the way.  From everything you said,
the river is where I'm focusing.  I really, really dislike Logstash (I'd
rather bend rsyslog + the ES output plugin to my liking any day).
I've written a few custom ES output/input parsers and many Solr parsers
that will parse Bro logs, proxy logs, etc., but would rather focus on
something more native for output to ES if possible.

I guess it might be time to dig into some src...

Thanks for the feedback.

Cheers,

JB



On Thu, Nov 6, 2014 at 9:25 PM, M K <mkhan04 at gmail.com> wrote:

> Unless it's changed within the past month or so, the ElasticSearch writer
> that comes with Bro is very alpha-level code. For the most part it fires
> and forgets and can be prone to losing messages if your cluster isn't able
> to keep up or some other situation causes it not to be able to ingest the
> data properly.
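
For contrast with fire-and-forget, here is a minimal Python sketch of checking
an Elasticsearch bulk response and keeping the events that should be resent.
It is not the Bro writer's code. The function name events_to_retry and the
choice of 429/503 as the retryable statuses are assumptions for illustration;
the response is assumed to be the already-decoded JSON of a _bulk call whose
actions were all "index" operations, with items in the same order as sent.

```python
# Statuses treated as "cluster busy, try again later" (an assumption here).
RETRYABLE = {429, 503}


def events_to_retry(events, bulk_response):
    """Return the events whose bulk items failed with a retryable status."""
    # The top-level "errors" flag is false when every item succeeded.
    if not bulk_response.get("errors"):
        return []
    retry = []
    for event, item in zip(events, bulk_response["items"]):
        # Each item is a one-key dict named after its action, e.g. "index".
        result = next(iter(item.values()))
        if result.get("status") in RETRYABLE:
            retry.append(event)
    return retry
```

A sender that queues the returned events and retries them after a delay would
not silently drop data when the cluster falls behind. Non-retryable failures
(a 400 for a mapping conflict, say) are left out on purpose and would need
their own handling.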
>
> Your best bet, as of now, is to write the logs out to disk and use some
> intermediary program to process the logs and ingest them into ES. Logstash
> can help, but it can't properly parse the custom format Bro uses by
> default. If you're using Bro 2.3, you can change the output format of
> the ASCII writer to JSON instead and then use Logstash to feed the data
> relatively easily into ES. Further, I'd recommend using a RabbitMQ river so
> ES can ingest the data at its leisure.
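
As a rough illustration of the "intermediary program" idea, here is a minimal
Python sketch that turns JSON-formatted Bro log lines into request bodies for
the Elasticsearch _bulk endpoint, a fixed number of events at a time. The
function name bulk_bodies, the index name "bro", and the batch size are all
assumptions; doc_type is optional because older ES versions expect a _type.
Sending the bodies over HTTP, and any queueing in front of ES, is left out.

```python
import json
from itertools import islice


def bulk_bodies(lines, index, batch_size=1000, doc_type=None):
    """Yield Elasticsearch _bulk request bodies, batch_size events at a time."""
    meta = {"_index": index}
    if doc_type is not None:
        meta["_type"] = doc_type
    # The same action line precedes every document in the bulk body.
    action = json.dumps({"index": meta})
    # Skip blank lines; each remaining line is one JSON-encoded Bro event.
    events = (line.strip() for line in lines if line.strip())
    while True:
        batch = list(islice(events, batch_size))
        if not batch:
            return
        # Re-encode each event so malformed lines fail early and every
        # document stays on a single line, as the bulk format requires.
        docs = (json.dumps(json.loads(event)) for event in batch)
        yield "".join(action + "\n" + doc + "\n" for doc in docs)
```

Something like `for body in bulk_bodies(open("conn.log"), "bro"):` would then
give one body per POST. Batch size is the main knob to experiment with when
the cluster can't keep up.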
>
> If you're stuck with the non-json format, well your options are kinda
> limited. You can write a crazy custom logstash conf using grok (which is
> super inefficient) or figure out some other mechanism.
>
> As an aside, I've written a custom Logstash filter that processes the
> custom Bro format and is, to a limited extent, Bro type aware, so it can
> take old-style Bro logs relatively easily and make them more usable (numbers
> are turned into numbers, and sets, vectors and tables are turned into arrays
> -- same as how I've seen the ES writer output data). There are some caveats
> in its usage though. I'm putting the finishing touches on it and plan to
> release it when I get a chance (hopefully within the next week or two).
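
To make the "type aware" idea concrete, here is a minimal Python sketch (not
the Logstash filter described above) that reads the #fields and #types header
lines of a default-format Bro log and converts each value accordingly. It
assumes the default separators: tab between fields, "," inside sets and
vectors, "(empty)" for empty containers, and "-" for unset fields. The names
convert and parse_bro_log are made up for this example.

```python
INT_TYPES = {"count", "int", "port"}
FLOAT_TYPES = {"double", "time", "interval"}


def convert(value, bro_type, set_sep=",", empty="(empty)", unset="-"):
    """Convert one Bro ASCII field to a Python value based on its Bro type."""
    if value == unset:
        return None
    if bro_type.startswith(("set[", "vector[", "table[")):
        if value == empty:
            return []
        # Element type is whatever sits between the brackets, e.g. "count".
        inner = bro_type[bro_type.index("[") + 1:-1]
        return [convert(v, inner, set_sep, empty, unset)
                for v in value.split(set_sep)]
    if bro_type in INT_TYPES:
        return int(value)
    if bro_type in FLOAT_TYPES:
        return float(value)
    if bro_type == "bool":
        return value == "T"
    # addr, string, enum and anything unrecognized stay as strings.
    return value


def parse_bro_log(lines):
    """Yield one dict per record from a default-format (tab-separated) Bro log."""
    fields, types = [], []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields\t"):
            fields = line.split("\t")[1:]
        elif line.startswith("#types\t"):
            types = line.split("\t")[1:]
        elif line and not line.startswith("#"):
            values = line.split("\t")
            yield {f: convert(v, t) for f, t, v in zip(fields, types, values)}
```

Each yielded dict can be passed straight to json.dumps, so numbers arrive in
ES as numbers and sets as arrays. Logs written with non-default separators
would need the #separator and #set_separator headers honored as well.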
>
> On Thu, Nov 6, 2014 at 7:54 PM, Joe Blow <blackhole.em at gmail.com> wrote:
>
>> Hey all,
>>
>> Just going to throw this out there and hope some people are willing to
>> potentially share some learning experiences if they have any.
>>
>> We have a system which generates around 15k-30k Bro events/sec and are
>> trying to ingest these logs into a fairly beefy Elasticsearch cluster.
>> Total cluster memory ~300GB, storage ~300TB.
>>
>> Long story short, we're having some problems keeping up with this feed.
>> Does anyone have any performance tuning tips for this module?  I've played a
>> lot with rsyslog batch sizes with Elasticsearch and was hoping there would
>> be some simple directive I could try and apply to Bro.
>>
>> Does anyone here have experience with this?  Does this module batch anything?
>>
>> Thanks in advance.
>>
>> Cheers,
>>
>> JB
>>
>> _______________________________________________
>> Bro mailing list
>> bro at bro-ids.org
>> http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/bro
>>
>
>