[Bro] Bro's limitations with high worker count and memory exhaustion

Siwek, Jon jsiwek at illinois.edu
Tue Jun 30 07:44:23 PDT 2015


My guess is that you’re bumping into the FD_SETSIZE limit.  From what I can see at a glance, the way remote I/O is currently structured uses at least 5 file descriptors per remote connection: a pair of pipes (2 fds each) for signaling read/write readiness related to ChunkedIO, plus one fd for the actual socket.  FD_SETSIZE is typically 1024, so with ~150-200 remote connections at 5 fds each (750-1000 fds), plus whatever other descriptors Bro needs to have open (e.g. for file I/O), it seems reasonable to guess that’s the problem.  You could easily verify this with a small code modification that checks whether the FD_SET call is passed an fd >= FD_SETSIZE.
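A minimal sketch of that kind of check, in C++. The helper names checked_fd_set and estimated_fds are made up for illustration and are not part of Bro's source; the real change would go wherever Bro calls FD_SET (the backtrace quoted below points at FD_Set.h). That backtrace aborts in __fdelt_chk, glibc's fortified bounds check for the fd_set macros, which is consistent with the guess above.

```cpp
#include <sys/select.h>
#include <cstdio>

// Hypothetical helper (not Bro code): set fd in the fd_set only if it fits,
// and report the problem instead of letting FD_SET write out of bounds.
bool checked_fd_set(int fd, fd_set* set)
{
    if (fd < 0 || fd >= FD_SETSIZE) {
        std::fprintf(stderr, "fd %d is outside [0, FD_SETSIZE=%d)\n",
                     fd, static_cast<int>(FD_SETSIZE));
        return false;
    }
    FD_SET(fd, set);
    return true;
}

// Hypothetical helper: rough descriptor budget for the estimate above.
// fds_per_connection = 5 is the at-a-glance figure, not a measured value.
int estimated_fds(int remote_connections, int fds_per_connection, int other_fds)
{
    return remote_connections * fds_per_connection + other_fds;
}
```

If every worker and proxy counts as one remote connection to the manager, the 155 workers and 15 proxies from the original report give estimated_fds(170, 5, 0) = 850, before counting log files and any other open descriptors.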

Other than making involved code changes to Bro (e.g. moving away from select() for I/O event handling), the only suggestions I have are: 1) reduce the number of remote connections, or 2) see whether you can increase FD_SETSIZE via the preprocessor or CFLAGS/CXXFLAGS when running ./configure.  (I’ve never done this myself, so I don’t know if it works, but I’ve googled around before and the implication was that it may work on Linux.)

- Jon

> On Jun 29, 2015, at 6:22 PM, Baxter Milliwew <baxter.milliwew at gmail.com> wrote:
> 
> The manager still crashes.  Interesting note about a buffer overflow.
> 
> 
> [manager]
> 
> Bro 2.4
> Linux 3.16.0-38-generic
> 
> core
> [New LWP 18834]
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> Core was generated by `/usr/local/3rd-party/bro/bin/bro -U .status -p broctl -p broctl-live -p local -'.
> Program terminated with signal SIGABRT, Aborted.
> #0  0x00007f163bb46cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> 
> Thread 1 (Thread 0x............ (LWP 18834)):
> #0  0x00007f163bb46cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> #1  0x00007f163bb4a0d8 in __GI_abort () at abort.c:89
> #2  0x00007f163bb83394 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x............ "*** %s ***: %s terminated\n") at ../sysdeps/posix/libc_fatal.c:175
> #3  0x00007f163bc1ac9c in __GI___fortify_fail (msg=<optimized out>, msg@entry=0x............ "buffer overflow detected") at fortify_fail.c:37
> #4  0x00007f163bc19b60 in __GI___chk_fail () at chk_fail.c:28
> #5  0x00007f163bc1abe7 in __fdelt_chk (d=<optimized out>) at fdelt_chk.c:25
> #6  0x00000000005e962a in Set (set=0x............, this=0x............) at /home/bro/Bro-IDS/bro-2.4/src/iosource/FD_Set.h:59
> #7  SocketComm::Run (this=0x............) at /home/bro/Bro-IDS/bro-2.4/src/RemoteSerializer.cc:3406
> #8  0x00000000005e9c31 in RemoteSerializer::Fork (this=0x............) at /home/bro/Bro-IDS/bro-2.4/src/RemoteSerializer.cc:687
> #9  0x00000000005e9d4f in RemoteSerializer::Enable (this=0x............) at /home/bro/Bro-IDS/bro-2.4/src/RemoteSerializer.cc:575
> #10 0x00000000005b6943 in BifFunc::bro_enable_communication (frame=<optimized out>, BiF_ARGS=<optimized out>) at bro.bif:4480
> #11 0x00000000005b431d in BuiltinFunc::Call (this=0x............, args=0x............, parent=0x............) at /home/bro/Bro-IDS/bro-2.4/src/Func.cc:586
> #12 0x0000000000599066 in CallExpr::Eval (this=0x............, f=0x............) at /home/bro/Bro-IDS/bro-2.4/src/Expr.cc:4544
> #13 0x000000000060ceb4 in ExprStmt::Exec (this=0x............, f=0x............, flow=@0x............: FLOW_NEXT) at /home/bro/Bro-IDS/bro-2.4/src/Stmt.cc:352
> #14 0x000000000060b174 in IfStmt::DoExec (this=0x............, f=0x............, v=<optimized out>, flow=@0x............: FLOW_NEXT) at /home/bro/Bro-IDS/bro-2.4/src/Stmt.cc:456
> #15 0x000000000060ced1 in ExprStmt::Exec (this=0x............, f=0x............, flow=@0x............: FLOW_NEXT) at /home/bro/Bro-IDS/bro-2.4/src/Stmt.cc:356
> #16 0x000000000060b211 in StmtList::Exec (this=0x............, f=0x............, flow=@0x............: FLOW_NEXT) at /home/bro/Bro-IDS/bro-2.4/src/Stmt.cc:1696
> #17 0x000000000060b211 in StmtList::Exec (this=0x............, f=0x............, flow=@0x............: FLOW_NEXT) at /home/bro/Bro-IDS/bro-2.4/src/Stmt.cc:1696
> #18 0x00000000005c042e in BroFunc::Call (this=0x............, args=<optimized out>, parent=0x0) at /home/bro/Bro-IDS/bro-2.4/src/Func.cc:403
> #19 0x000000000057ee2a in EventHandler::Call (this=0x............, vl=0x............, no_remote=no_remote@entry=false) at /home/bro/Bro-IDS/bro-2.4/src/EventHandler.cc:130
> #20 0x000000000057e035 in Dispatch (no_remote=false, this=0x............) at /home/bro/Bro-IDS/bro-2.4/src/Event.h:50
> #21 EventMgr::Dispatch (this=this@entry=0x...... <mgr>) at /home/bro/Bro-IDS/bro-2.4/src/Event.cc:111
> #22 0x000000000057e1d0 in EventMgr::Drain (this=0xbbd720 <mgr>) at /home/bro/Bro-IDS/bro-2.4/src/Event.cc:128
> #23 0x00000000005300ed in main (argc=<optimized out>, argv=<optimized out>) at /home/bro/Bro-IDS/bro-2.4/src/main.cc:1147
> 
> 
> 
> On Mon, Jun 29, 2015 at 4:09 PM, Baxter Milliwew <baxter.milliwew at gmail.com> wrote:
> Never mind... new box, default nofile limits.  Thanks for the malloc tip.
> 
> 
> On Mon, Jun 29, 2015 at 4:03 PM, Baxter Milliwew <baxter.milliwew at gmail.com> wrote:
> Switching to jemalloc fixed the stability issue but not the worker count limitation. 
> 
> On Sun, Jun 28, 2015 at 7:18 PM, Baxter Milliwew <baxter.milliwew at gmail.com> wrote:
> Looks like malloc from glibc, default on Ubuntu.  I will try jemalloc and others.
> 
> 
> 
> On Sun, Jun 28, 2015 at 1:03 AM, Jan Grashofer <jan.grashofer at cern.ch> wrote:
> I experienced similar problems (memory gets eaten up quickly and workers crash with segfault) using tcmalloc. Which malloc do you use?
> 
>  
> Regards,
> 
> Jan 
> 
>  
> From: bro-bounces at bro.org [bro-bounces at bro.org] on behalf of Baxter Milliwew [baxter.milliwew at gmail.com]
> Sent: Friday, June 26, 2015 23:03
> To: bro at bro.org
> Subject: [Bro] Bro's limitations with high worker count and memory exhaustion
> 
> There's some sort of association between memory exhaustion and a high number of workers.  The poor man's fix would be to purchase new servers with higher CPU speeds, as that would reduce the worker count.  Issues with high worker count and/or memory exhaustion appear to be a well-known problem based on the mailing list archives.
> 
> In the current version of bro-2.4 my previous configuration immediately causes the manager to crash: 15 proxies, 155 workers.  To resolve this I've lowered the count to 10 proxies and 140 workers.  However even with this configuration the manager process will exhaust all memory and crash within about 2 hours.
> 
> The manager is threaded; I think this is an issue with the threading behavior between manager, proxies, and workers.  Debugging threading problems is complex and I'm a complete novice. My current tutorial is a Stack Overflow thread:
> 
> http://stackoverflow.com/questions/981011/c-programming-debugging-with-pthreads
> 
> Does anyone else have this problem?  What have you tried and what do you suggest?
> 
> Thanks
> 
> 
> 
> 
> 1435347409.458185       worker-2-18     parent  -       -       -       info    [#10000/10.1.1.1:36994] peer sent class "control"
> 1435347409.458185       worker-2-18     parent  -       -       -       info    [#10000/10.1.1.1:36994] phase: handshake
> 1435347409.661085       worker-2-18     parent  -       -       -       info    [#10000/10.1.1.1:36994] request for unknown event save_results
> 1435347409.661085       worker-2-18     parent  -       -       -       info    [#10000/10.1.1.1:36994] registered for event Control::peer_status_response
> 1435347409.694858       worker-2-18     parent  -       -       -       info    [#10000/10.1.1.1:36994] peer does not support 64bit PIDs; using compatibility mode
> 1435347409.694858       worker-2-18     parent  -       -       -       info    [#10000/10.1.1.1:36994] peer is a Broccoli
> 1435347409.694858       worker-2-18     parent  -       -       -       info    [#10000/10.1.1.1:36994] phase: running
> 
> 
> 
> 
> 
> _______________________________________________
> Bro mailing list
> bro at bro-ids.org
> http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/bro



