[Bro] Bro cluster requirements and manager logging backlog bug
Azoff, Justin S
jazoff at illinois.edu
Tue Dec 20 12:40:37 PST 2016
> On Dec 20, 2016, at 1:56 PM, Hovsep Levi <hovsep.sanjay.levi at gmail.com> wrote:
>
>
> [bro at mgr /opt/bro]$ bin/broctl top manager logger
> Name Type Host Pid Proc VSize Rss Cpu Cmd
> logger logger 169.232.234.36 52852 parent 109G 100G 0% bro
> logger logger 169.232.234.36 52867 child 837M 498M 0% bro
> manager manager 169.232.234.36 52935 child 485M 17M 0% bro
> manager manager 169.232.234.36 52892 parent 2G 557M 0% bro
>
> In this condition all the workers are at 100% CPU and the worker nodes have all 128GB RAM used. The manager node had to be rebooted as "killall -9 bro" had no effect. This is what happens if Bro isn't restarted every 30 minutes.
This output, with the CPU at 0%, is kind of odd, unless the machine was already swapping or something.
>
>> Also, you've never mentioned the actual rate of logs you are seeing at these peak times
>>
>> Running this in your log directory would help:
>>
>> du -ms;cat *|wc -l;sleep 60;du -ms;cat *|wc -l
>
> [bro at mgr /opt/bro_data/logs/current]$ du -ms;cat *|wc -l;sleep 60;du -ms;cat *|wc -l
> 56 .
> 789695
> 220 .
> 2801719
So this shows only about 33k logs/sec and 3 MB/sec.
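For reference, the arithmetic behind those numbers is just the difference between the two samples divided by the 60 second sleep. A minimal sketch, assuming awk is available; the rate function name and its argument order are made up for illustration and are not part of Bro or broctl:

```shell
# Illustrative helper: average rate between two samples.
# usage: rate START END SECONDS
rate() {
    awk -v a="$1" -v b="$2" -v s="$3" 'BEGIN { printf "%.1f\n", (b - a) / s }'
}

rate 789695 2801719 60   # log lines per second: prints 33533.7
rate 56 220 60           # MB per second: prints 2.7
```

The same function works on any other before/after pair from the du -ms and wc -l output.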
>
>
> @ Tue Dec 20 18:46:48 UTC 2016 already the logger has 5G memory:
>
> [bro at mgr /opt/bro]$ bin/broctl top manager logger
> Name Type Host Pid Proc VSize Rss Cpu Cmd
> logger logger 169.232.234.36 18832 parent 5G 5G 192% bro
> logger logger 169.232.234.36 18874 child 1G 1G 58% bro
> manager manager 169.232.234.36 18947 child 510M 255M 55% bro
> manager manager 169.232.234.36 18905 parent 11G 1G 25% bro
This shows that your logger process seems to just have issues keeping up with the volume...
>
> [bro at mgr /opt/bro_data/logs/current]$ du -ms;cat *|wc -l;sleep 60;du -ms;cat *|wc -l
> 593 .
> 7117478
> 809 .
> 9573974
but based on this you are only doing about 40k logs/sec and 4 MB/sec, and shouldn't really be having issues. We have users doing over 200k logs/sec.
Can you check the following:
After Bro has been running for a bit, run:
wc -l *.log | sort -n
to show which log files are the largest.
Also send the output of this command:
top -b -n 1 -H -o TIME |grep bro:|head -n 20
or just run top and press H. That should show all the Bro logging threads (it works on Linux, at least). They may show up truncated, but it's enough to tell them apart.
What CPU model and count does your manager have?
Are you writing out logs as the default ASCII or using JSON?
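If it helps, the wc -l check above can also be run twice to see which log streams are growing fastest, rather than only which files are largest. A rough sketch, assuming two saved snapshots of the wc -l *.log output; the log_rates name and the snapshot file names are hypothetical and not part of Bro or broctl:

```shell
# Illustrative helper: per-log line rates from two `wc -l *.log` snapshots.
# usage: log_rates BEFORE_FILE AFTER_FILE SECONDS
log_rates() {
    awk -v secs="$3" '
        NR == FNR { before[$2] = $1; next }   # first snapshot: remember counts
        $2 != "total" {                       # second snapshot: skip the wc total line
            printf "%d %s\n", ($1 - before[$2]) / secs, $2
        }
    ' "$1" "$2" | sort -n                     # slowest first, fastest last
}
```

Something like "wc -l *.log > /tmp/before.txt; sleep 60; wc -l *.log > /tmp/after.txt; log_rates /tmp/before.txt /tmp/after.txt 60" would then print lines per second for each log, truncated to whole numbers. A log that rotates between the two snapshots will show a bogus negative rate.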
--
- Justin Azoff