[Bro] Bro's limitations with high worker count and memory exhaustion
Jan Grashofer
jan.grashofer at cern.ch
Sun Jun 28 01:03:49 PDT 2015
I experienced similar problems (memory gets eaten up quickly and workers crash with a segfault) when using tcmalloc. Which malloc do you use?
Regards,
Jan
________________________________
From: bro-bounces at bro.org [bro-bounces at bro.org] on behalf of Baxter Milliwew [baxter.milliwew at gmail.com]
Sent: Friday, June 26, 2015 23:03
To: bro at bro.org
Subject: [Bro] Bro's limitations with high worker count and memory exhaustion
There's some sort of association between memory exhaustion and a high number of workers. The poor man's fix would be to purchase new servers with higher CPU speeds, as that would reduce the worker count. Issues with high worker count and/or memory exhaustion appear to be a well-known problem, based on the mailing list archives.
With the current version, bro-2.4, my previous configuration (15 proxies, 155 workers) immediately causes the manager to crash. To resolve this I've lowered the count to 10 proxies and 140 workers. However, even with this configuration the manager process exhausts all memory and crashes within about 2 hours.
The manager is threaded; I think this is an issue with the threading behavior between manager, proxies, and workers. Debugging threading problems is complex and I'm a complete novice. My current tutorial is this Stack Overflow thread:
http://stackoverflow.com/questions/981011/c-programming-debugging-with-pthreads
Does anyone else have this problem? What have you tried, and what do you suggest?
Thanks
1435347409.458185 worker-2-18 parent - - - info [#10000/10.1.1.1:36994] peer sent class "control"
1435347409.458185 worker-2-18 parent - - - info [#10000/10.1.1.1:36994] phase: handshake
1435347409.661085 worker-2-18 parent - - - info [#10000/10.1.1.1:36994] request for unknown event save_results
1435347409.661085 worker-2-18 parent - - - info [#10000/10.1.1.1:36994] registered for event Control::peer_status_response
1435347409.694858 worker-2-18 parent - - - info [#10000/10.1.1.1:36994] peer does not support 64bit PIDs; using compatibility mode
1435347409.694858 worker-2-18 parent - - - info [#10000/10.1.1.1:36994] peer is a Broccoli
1435347409.694858 worker-2-18 parent - - - info [#10000/10.1.1.1:36994] phase: running