[Bro] [BRO-ISSUE]: Bro crashes when there are many Reporter::Error calls

Myth Ren email4myth at gmail.com
Thu Jan 25 08:18:07 PST 2018


Hi all,

    I'm using Bro 2.5.1 for network security monitoring. The message queue
is Kafka (the bro-to-kafka plugin version is v0.5.0, the librdkafka version
is v0.9.5).

    I have run into a crash when network traffic reaches 1.6 Gbps: Bro dies
with a segmentation fault at `src/Event.cc#90`.

    Our test environment is as follows:

> CPU: 32 core Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
> Memory: 64 GB
> NIC: 10000 Mb/s
> Storage: 2 TB SATA, 100 GB SSD


Below is the backtrace from the core dump (more in the gist
<https://gist.github.com/MythRen/b55220647ca28654c6f7e1db12ee6036>):

> #0  SetNext (this=0x0, n=0x7fe292ebd490)
>     at /opt/download/bro/src/Event.h:21
> #1  EventMgr::QueueEvent (this=0xc302c0 <mgr>,
>     event=event@entry=0x7fe292ebd490)
>     at /opt/download/bro/src/Event.cc:90
> #2  0x00000000005fe6a7 in QueueEvent (obj=0x0, mgr=0x0, aid=0, src=0,
>     vl=0x7fe2e2bedb80, h=..., this=<optimized out>)
>     at /opt/download/bro/src/Event.h:88
> #3  Reporter::DoLog (this=0x29aabb0, prefix=prefix@entry=0x908cd7 "error",
>     event=..., out=0x0, conn=conn@entry=0x0, addl=addl@entry=0x0,
>     location=location@entry=true, time=time@entry=true,
>     postfix=postfix@entry=0x0,
>     fmt=fmt@entry=0x7fe36c719d70 "Kafka send failed: %s",
>     ap=ap@entry=0x7fe36aa3eaf8)
>     at /opt/download/bro/src/Reporter.cc:350
> #4  0x00000000005fee8f in Reporter::Error (this=<optimized out>,
>     fmt=fmt@entry=0x7fe36c719d70 "Kafka send failed: %s")
>     at /opt/download/bro/src/Reporter.cc:76
> #5  0x00007fe36c717fa9 in logging::writer::KafkaWriter::DoWrite
>     (this=0x6369270, num_fields=<optimized out>, fields=<optimized out>,
>     vals=0x69d2080)
>     at /opt/download/bro/aux/plugins/kafka/src/KafkaWriter.cc:156
> #6  0x000000000089e495 in logging::WriterBackend::Write (this=0x6369270,
>     arg_num_fields=<optimized out>, num_writes=1000, vals=0x6dc7bf0)
>     at /opt/download/bro/src/logging/WriterBackend.cc:301
> #7  0x0000000000662180 in threading::MsgThread::Run (this=0x6369270)
>     at /opt/download/bro/src/threading/MsgThread.cc:371
> #8  0x000000000065eaa8 in threading::BasicThread::launcher (arg=0x6369270)
>     at /opt/download/bro/src/threading/BasicThread.cc:205
> #9  0x00007fe36e8ce2b0 in ?? () from /lib64/libstdc++.so.6
> #10 0x00007fe36ed2ce25 in start_thread () from /lib64/libpthread.so.0
> #11 0x00007fe36e03634d in clone () from /lib64/libc.so.6



Variables in frame 1:

> (gdb) f 1
> #1  EventMgr::QueueEvent (this=0xc302c0 <mgr>,
>     event=event@entry=0x7fe292ebd490)
>     at /opt/download/bro/src/Event.cc:90
> 90          tail->SetNext(event);
> (gdb) info args
> this = 0xc302c0 <mgr>
> event = 0x7fe292ebd490
> (gdb) info locals
> done = <optimized out>
> (gdb) p head
> $1 = (Event *) 0x7fe3540c81c0
> (gdb) p tail
> $2 = (Event *) 0x0
> (gdb) p event
> $3 = (Event *) 0x7fe292ebd490



In our tests, the variable `tail` is always a NULL pointer when Bro crashes,
whereas the variable `head` is sometimes NULL and sometimes not.

From my investigation: under heavy network traffic, KafkaWriter queues more
log messages for Kafka than librdkafka allows under the configuration limits
`queue.buffering.max.messages` (default 100000) or
`queue.buffering.max.kbytes` (default 4000000, i.e. about 4 GB).
librdkafka then raises a QUEUE_FULL error, and KafkaWriter calls
Reporter::Error to report the runtime error.
As a result, KafkaWriter::DoWrite ends up calling Reporter::Error a very
large number of times.
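
One way to cut down the number of Reporter::Error calls while the queue stays full would be to throttle the report itself. The following is only a minimal sketch of that idea: `ErrorThrottle` is a hypothetical helper, not part of Bro, the kafka plugin, or librdkafka, and the one-second interval in the usage note is an arbitrary assumption. It would reduce the call volume but would not by itself fix whatever makes the call unsafe.

```cpp
#include <cstdint>

// Hypothetical helper (not a Bro or librdkafka API): lets at most one
// error report through per `min_interval_secs` and counts the rest.
class ErrorThrottle {
public:
    explicit ErrorThrottle(double min_interval_secs)
        : min_interval_(min_interval_secs) {}

    // Returns true if the caller should emit a report at time `now`
    // (in seconds). Otherwise the error is counted as suppressed.
    bool ShouldReport(double now) {
        if (!reported_once_ || now - last_report_ >= min_interval_) {
            reported_once_ = true;
            last_report_ = now;
            return true;
        }
        ++suppressed_;
        return false;
    }

    // Returns how many errors were suppressed since the last call,
    // and resets the counter so the total can be included in a report.
    std::uint64_t TakeSuppressed() {
        std::uint64_t n = suppressed_;
        suppressed_ = 0;
        return n;
    }

private:
    double min_interval_;
    double last_report_ = 0.0;
    bool reported_once_ = false;
    std::uint64_t suppressed_ = 0;
};
```

A writer could keep one `ErrorThrottle throttle(1.0);` per instance and only report a failed send when `throttle.ShouldReport(now)` returns true, adding `TakeSuppressed()` to the message so the dropped count is not lost.
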
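My guess is that the issue is caused by concurrent access to the variable
`tail` without a lock (the backtrace shows QueueEvent being reached from the
writer thread via threading::MsgThread::Run), which leaves `tail` set to
NULL, although I don't know exactly how.
`SetNext` is then called through that NULL pointer, which raises the
segmentation fault.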
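
To illustrate the kind of locking I have in mind, here is a minimal sketch of a head/tail linked queue guarded by a mutex. It is not Bro's code: `Node`, `LockedQueue`, and `RunDemo` are hypothetical names, and the dequeue side (which resets `tail_` when the queue becomes empty) is my assumption about how such a queue is usually drained, not something I have confirmed in Event.cc. Without the lock, a producer could see a non-NULL head, the consumer could then empty the queue and reset the tail, and the producer would dereference a NULL tail.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for an event node; not Bro's Event class.
struct Node {
    int payload = 0;
    Node* next = nullptr;
};

// Singly linked FIFO whose head/tail updates are serialized by a mutex.
class LockedQueue {
public:
    void Enqueue(Node* n) {
        std::lock_guard<std::mutex> guard(lock_);
        n->next = nullptr;
        if (!head_) {
            head_ = tail_ = n;
        } else {
            tail_->next = n;  // tail_ cannot be reset underneath us here
            tail_ = n;
        }
    }

    // Removes and returns the oldest node, or nullptr if the queue is empty.
    Node* Dequeue() {
        std::lock_guard<std::mutex> guard(lock_);
        Node* n = head_;
        if (!n)
            return nullptr;
        head_ = n->next;
        if (!head_)
            tail_ = nullptr;  // queue became empty: reset both ends together
        return n;
    }

private:
    std::mutex lock_;
    Node* head_ = nullptr;
    Node* tail_ = nullptr;
};

// Starts `producers` threads that each enqueue `per_thread` nodes while the
// calling thread drains the queue. Returns the number of nodes drained.
std::size_t RunDemo(int producers, int per_thread) {
    LockedQueue q;
    std::vector<std::thread> threads;
    for (int p = 0; p < producers; ++p) {
        threads.emplace_back([&q, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                Node* n = new Node;
                n->payload = i;
                q.Enqueue(n);
            }
        });
    }

    const std::size_t expected =
        static_cast<std::size_t>(producers) * static_cast<std::size_t>(per_thread);
    std::size_t drained = 0;
    while (drained < expected) {
        if (Node* n = q.Dequeue()) {
            delete n;
            ++drained;
        } else {
            std::this_thread::yield();  // nothing queued yet; let producers run
        }
    }

    for (std::thread& t : threads)
        t.join();
    return drained;
}
```

Calling `RunDemo(4, 2000)` runs four producers against one consumer and should drain all 8000 nodes. An alternative design would be to have worker threads never touch the main thread's queue directly and instead hand errors over through a separate thread-safe channel.
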

The above is only my guess; there may be another cause.

I hope someone can help.


Best regards,
Myth