[Bro] Bro's limitations with high worker count and memory exhaustion

Baxter Milliwew baxter.milliwew at gmail.com
Mon Jun 29 16:22:13 PDT 2015


The manager still crashes.  Interestingly, the core dump below shows glibc aborting with "buffer overflow detected".


[manager]


Bro 2.4

Linux 3.16.0-38-generic


core

[New LWP 18834]

[Thread debugging using libthread_db enabled]

Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Core was generated by `/usr/local/3rd-party/bro/bin/bro -U .status -p
broctl -p broctl-live -p local -'.

Program terminated with signal SIGABRT, Aborted.

#0  0x00007f163bb46cc9 in __GI_raise (sig=sig@entry=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:56


Thread 1 (Thread 0x............ (LWP 18834)):

#0  0x00007f163bb46cc9 in __GI_raise (sig=sig@entry=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:56

#1  0x00007f163bb4a0d8 in __GI_abort () at abort.c:89

#2  0x00007f163bb83394 in __libc_message (do_abort=do_abort@entry=2,
fmt=fmt@entry=0x............ "*** %s ***: %s terminated\n") at
../sysdeps/posix/libc_fatal.c:175

#3  0x00007f163bc1ac9c in __GI___fortify_fail (msg=<optimized out>,
msg@entry=0x............ "buffer overflow detected") at fortify_fail.c:37

#4  0x00007f163bc19b60 in __GI___chk_fail () at chk_fail.c:28

#5  0x00007f163bc1abe7 in __fdelt_chk (d=<optimized out>) at fdelt_chk.c:25

#6  0x00000000005e962a in Set (set=0x............, this=0x............) at
/home/bro/Bro-IDS/bro-2.4/src/iosource/FD_Set.h:59

#7  SocketComm::Run (this=0x............) at
/home/bro/Bro-IDS/bro-2.4/src/RemoteSerializer.cc:3406

#8  0x00000000005e9c31 in RemoteSerializer::Fork (this=0x............) at
/home/bro/Bro-IDS/bro-2.4/src/RemoteSerializer.cc:687

#9  0x00000000005e9d4f in RemoteSerializer::Enable (this=0x............) at
/home/bro/Bro-IDS/bro-2.4/src/RemoteSerializer.cc:575

#10 0x00000000005b6943 in BifFunc::bro_enable_communication
(frame=<optimized out>, BiF_ARGS=<optimized out>) at bro.bif:4480

#11 0x00000000005b431d in BuiltinFunc::Call (this=0x............,
args=0x............, parent=0x............) at
/home/bro/Bro-IDS/bro-2.4/src/Func.cc:586

#12 0x0000000000599066 in CallExpr::Eval (this=0x............,
f=0x............) at /home/bro/Bro-IDS/bro-2.4/src/Expr.cc:4544

#13 0x000000000060ceb4 in ExprStmt::Exec (this=0x............,
f=0x............, flow=@0x............: FLOW_NEXT) at
/home/bro/Bro-IDS/bro-2.4/src/Stmt.cc:352

#14 0x000000000060b174 in IfStmt::DoExec (this=0x............,
f=0x............, v=<optimized out>, flow=@0x............: FLOW_NEXT) at
/home/bro/Bro-IDS/bro-2.4/src/Stmt.cc:456

#15 0x000000000060ced1 in ExprStmt::Exec (this=0x............,
f=0x............, flow=@0x............: FLOW_NEXT) at
/home/bro/Bro-IDS/bro-2.4/src/Stmt.cc:356

#16 0x000000000060b211 in StmtList::Exec (this=0x............,
f=0x............, flow=@0x............: FLOW_NEXT) at
/home/bro/Bro-IDS/bro-2.4/src/Stmt.cc:1696

#17 0x000000000060b211 in StmtList::Exec (this=0x............,
f=0x............, flow=@0x............: FLOW_NEXT) at
/home/bro/Bro-IDS/bro-2.4/src/Stmt.cc:1696

#18 0x00000000005c042e in BroFunc::Call (this=0x............,
args=<optimized out>, parent=0x0) at
/home/bro/Bro-IDS/bro-2.4/src/Func.cc:403

#19 0x000000000057ee2a in EventHandler::Call (this=0x............,
vl=0x............, no_remote=no_remote@entry=false) at
/home/bro/Bro-IDS/bro-2.4/src/EventHandler.cc:130

#20 0x000000000057e035 in Dispatch (no_remote=false, this=0x............)
at /home/bro/Bro-IDS/bro-2.4/src/Event.h:50

#21 EventMgr::Dispatch (this=this@entry=0x...... <mgr>) at
/home/bro/Bro-IDS/bro-2.4/src/Event.cc:111

#22 0x000000000057e1d0 in EventMgr::Drain (this=0xbbd720 <mgr>) at
/home/bro/Bro-IDS/bro-2.4/src/Event.cc:128

#23 0x00000000005300ed in main (argc=<optimized out>, argv=<optimized out>)
at /home/bro/Bro-IDS/bro-2.4/src/main.cc:1147
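
A note on frames #5 and #6: __fdelt_chk is the glibc fortify check behind
FD_SET(). It aborts when the descriptor number is negative or is FD_SETSIZE
or larger (FD_SETSIZE is 1024 on Linux with glibc). One plausible reading,
not confirmed in this thread, is that with a raised nofile limit the manager
ends up holding descriptors numbered 1024 or above, which a select()-style
fd_set cannot track. The Python sketch below only illustrates that bounds
check and how to read the current nofile limit. The helper names
(fits_in_fd_set, nofile_limits, select_is_safe) are hypothetical and are not
part of Bro or broctl.

```python
import resource

# Assumption: glibc on Linux fixes FD_SETSIZE at 1024. Python does not
# expose the constant, so it is hard-coded here for illustration.
FD_SETSIZE = 1024


def fits_in_fd_set(fd, fd_setsize=FD_SETSIZE):
    """Mirror the bounds check glibc's __fdelt_chk applies for FD_SET()."""
    return 0 <= fd < fd_setsize


def nofile_limits():
    """Return the (soft, hard) RLIMIT_NOFILE values of this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def select_is_safe(soft_limit, fd_setsize=FD_SETSIZE):
    """True if every descriptor the process may open fits in an fd_set."""
    if soft_limit == resource.RLIM_INFINITY:
        return False
    # Descriptors are numbered 0 .. soft_limit - 1.
    return soft_limit <= fd_setsize


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("nofile soft limit:", soft, "hard limit:", hard)
    print("all descriptors fit in an fd_set:", select_is_safe(soft))
    print("fd 1023 fits:", fits_in_fd_set(1023))
    print("fd 1024 fits:", fits_in_fd_set(1024))
```

If select_is_safe() reports False, the process is allowed to open
descriptors that FD_SET() would reject, which would match the abort in
frame #5. Whether that is what happens on this manager is an assumption.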




On Mon, Jun 29, 2015 at 4:09 PM, Baxter Milliwew <baxter.milliwew at gmail.com>
wrote:

> Never mind... new box, default nofile limits.  Thanks for the malloc tip.
>
>
> On Mon, Jun 29, 2015 at 4:03 PM, Baxter Milliwew <
> baxter.milliwew at gmail.com> wrote:
>
>> Switching to jemalloc fixed the stability issue but not the worker count
>> limitation.
>>
>> On Sun, Jun 28, 2015 at 7:18 PM, Baxter Milliwew <
>> baxter.milliwew at gmail.com> wrote:
>>
>>> Looks like malloc from glibc, default on Ubuntu.  I will try jemalloc
>>> and others.
>>>
>>>
>>>
>>> On Sun, Jun 28, 2015 at 1:03 AM, Jan Grashofer <jan.grashofer at cern.ch>
>>> wrote:
>>>
>>>>  I experienced similar problems (memory gets eaten up quickly and
>>>> workers crash with segfault) using tcmalloc. Which malloc do you use?
>>>>
>>>>
>>>>
>>>> Regards,
>>>>
>>>> Jan
>>>>
>>>>
>>>>  ------------------------------
>>>> *From:* bro-bounces at bro.org [bro-bounces at bro.org] on behalf of Baxter
>>>> Milliwew [baxter.milliwew at gmail.com]
>>>> *Sent:* Friday, June 26, 2015 23:03
>>>> *To:* bro at bro.org
>>>> *Subject:* [Bro] Bro's limitations with high worker count and memory
>>>> exhaustion
>>>>
>>>>   There's some sort of association between memory exhaustion and a
>>>> high number of workers.  The poor man's fix would be to purchase new
>>>> servers with higher CPU speeds as that would reduce the worker count.
>>>> Issues with high worker count and/or memory exhaustion appear to be a
>>>> well-known problem, based on the mailing list archives.
>>>>
>>>>  In the current version of bro-2.4 my previous configuration
>>>> immediately causes the manager to crash: 15 proxies, 155 workers.  To
>>>> resolve this I've lowered the count to 10 proxies and 140 workers.  However,
>>>> even with this configuration the manager process will exhaust all memory
>>>> and crash within about 2 hours.
>>>>
>>>>  The manager is threaded; I think this is an issue with the threading
>>>> behavior between manager, proxies, and workers.  Debugging threading
>>>> problems is complex and I'm a complete novice.  My current tutorial is
>>>> a Stack Overflow thread:
>>>>
>>>>
>>>> http://stackoverflow.com/questions/981011/c-programming-debugging-with-pthreads
>>>>
>>>>  Does anyone else have this problem?  What have you tried and what do
>>>> you suggest?
>>>>
>>>>  Thanks
>>>>
>>>>
>>>>
>>>>
>>>> 1435347409.458185       worker-2-18     parent  -       -       -       info    [#10000/10.1.1.1:36994] peer sent class "control"
>>>> 1435347409.458185       worker-2-18     parent  -       -       -       info    [#10000/10.1.1.1:36994] phase: handshake
>>>> 1435347409.661085       worker-2-18     parent  -       -       -       info    [#10000/10.1.1.1:36994] request for unknown event save_results
>>>> 1435347409.661085       worker-2-18     parent  -       -       -       info    [#10000/10.1.1.1:36994] registered for event Control::peer_status_response
>>>> 1435347409.694858       worker-2-18     parent  -       -       -       info    [#10000/10.1.1.1:36994] peer does not support 64bit PIDs; using compatibility mode
>>>> 1435347409.694858       worker-2-18     parent  -       -       -       info    [#10000/10.1.1.1:36994] peer is a Broccoli
>>>> 1435347409.694858       worker-2-18     parent  -       -       -       info    [#10000/10.1.1.1:36994] phase: running
>>>>
>>>>
>>>>
>>>
>>
>