[Bro] Bro cluster requirements and manager logging backlog bug
Hovsep Levi
hovsep.sanjay.levi at gmail.com
Tue Dec 20 10:56:36 PST 2016
On Mon, Dec 19, 2016 at 10:47 PM, Azoff, Justin S <jazoff at illinois.edu>
wrote:
>
> > On Dec 19, 2016, at 4:26 PM, Hovsep Levi <hovsep.sanjay.levi at gmail.com>
> wrote:
> >
> > Hello all,
> >
> >
> > We are still having a problem with our Bro cluster and logging. During
> peak times the manager will slowly consume all available memory while the
> logs sent to disk are delayed by an hour or more.
>
>
> You're saying "the manager" but do you mean "the manager node" or "the
> manager process"?
>
The manager node.
>
> With the added logger process the manager process does not have anything
> to do with logs.
> The last time you mentioned these issues the logger node capability did not
> exist yet. A lot has changed since then but the logs you show are from 4
> months ago.
>
The email was a saved draft accidentally sent before finishing the edit.
I've been using the logger process since October. As of yesterday I'm
using the latest Bro-2.5 with a default local.bro file.
> We need to see what this command outputs when your cluster is having log
> issues:
>
> broctl top manager logger
>
>
Ok. Today I find this:
[bro at mgr /opt/bro]$ bin/broctl top manager logger
Name Type Host Pid Proc VSize Rss Cpu Cmd
logger logger 169.232.234.36 52852 parent 109G 100G 0% bro
logger logger 169.232.234.36 52867 child 837M 498M 0% bro
manager manager 169.232.234.36 52935 child 485M 17M 0% bro
manager manager 169.232.234.36 52892 parent 2G 557M 0% bro
In this condition all the workers are at 100% CPU and the worker nodes have
all 128GB of RAM in use. The manager node had to be rebooted because
"killall -9 bro" had no effect. This is what happens if Bro isn't restarted
every 30 minutes.
> Also, you've never mentioned the actual rate of logs you are seeing at
> these peak times
>
> Running this in your log directory would help:
>
> du -ms;cat *|wc -l;sleep 60;du -ms;cat *|wc -l
>
>
[bro at mgr /opt/bro_data/logs/current]$ du -ms;cat *|wc -l;sleep 60;du -ms;cat *|wc -l
56 .
789695
220 .
2801719
Bro was started @ Tue Dec 20 18:39:26 UTC 2016, the command below was run a
minute later.
[bro at mgr /opt/bro]$ bin/broctl top manager logger
Name Type Host Pid Proc VSize Rss Cpu Cmd
logger logger 169.232.234.36 18832 parent 289M 139M 194% bro
logger logger 169.232.234.36 18874 child 485M 78M 36% bro
manager manager 169.232.234.36 18905 parent 530M 384M 72% bro
manager manager 169.232.234.36 18947 child 510M 231M 51% bro
@ Tue Dec 20 18:46:48 UTC 2016 the logger already has 5G of memory:
[bro at mgr /opt/bro]$ bin/broctl top manager logger
Name Type Host Pid Proc VSize Rss Cpu Cmd
logger logger 169.232.234.36 18832 parent 5G 5G 192% bro
logger logger 169.232.234.36 18874 child 1G 1G 58% bro
manager manager 169.232.234.36 18947 child 510M 255M 55% bro
manager manager 169.232.234.36 18905 parent 11G 1G 25% bro
[bro at mgr /opt/bro_data/logs/current]$ du -ms;cat *|wc -l;sleep 60;du -ms;cat *|wc -l
593 .
7117478
809 .
9573974
@ Tue Dec 20 18:51:05 UTC 2016 the logger has 10G of memory and the manager
parent's VSize has increased by 5G as well (its Rss is still 1G):
[bro at mgr /opt/bro]$ bin/broctl top manager logger
Name Type Host Pid Proc VSize Rss Cpu Cmd
logger logger 169.232.234.36 18832 parent 10G 10G 222% bro
logger logger 169.232.234.36 18874 child 3G 3G 64% bro
manager manager 169.232.234.36 18947 child 510M 255M 65% bro
manager manager 169.232.234.36 18905 parent 16G 1G 23% bro
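For a rough sense of how fast the logger parent is growing, the two Rss readings above (5G at 18:46:48, 10G at 18:51:05) can be turned into a per-second figure. This is only a sketch: the helper names are made up for illustration, 1G is assumed to mean 1024M, and broctl top rounds to whole gigabytes, so the result is approximate at best.

```shell
#!/bin/sh
# hms_to_sec HH:MM:SS -> seconds since midnight.
# awk treats "08" as decimal 8, which avoids octal surprises in $(( )).
hms_to_sec() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# growth_mb_per_sec START_MB END_MB START_TIME END_TIME
# Integer MB/s of growth between two readings taken on the same day.
growth_mb_per_sec() {
    elapsed=$(( $(hms_to_sec "$4") - $(hms_to_sec "$3") ))
    echo $(( ($2 - $1) / elapsed ))
}

# Logger parent Rss: 5G -> 10G, assuming 1G = 1024M.
growth_mb_per_sec 5120 10240 18:46:48 18:51:05
```

With those readings the interval is 257 seconds, so this prints 19, i.e. somewhere around 19-20 MB/s of resident memory growth in the logger parent.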
[bro at mgr /opt/bro_data/logs/current]$ du -ms;cat *|wc -l;sleep 60;du -ms;cat *|wc -l
1346 .
15357570
1623 .
18708418
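To put the three du/wc samples above in terms of a log rate, each before/after line count can be divided by the 60 second sleep. A minimal sketch follows; the function name is just for illustration. The real interval is a bit longer than 60 seconds because du and "cat *|wc -l" take time to run over this much data, so these numbers likely overstate the rate somewhat.

```shell
#!/bin/sh
# lines_per_sec BEFORE AFTER SECONDS
# Integer number of log lines written per second between two wc -l counts.
lines_per_sec() {
    echo $(( ($2 - $1) / $3 ))
}

# The three before/after pairs from the samples above, nominally 60s apart.
lines_per_sec 789695 2801719 60
lines_per_sec 7117478 9573974 60
lines_per_sec 15357570 18708418 60
```

This prints 33533, 40941 and 55847, so the logs on disk were growing by roughly 33k to 56k lines per second across the three samples.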