[Bro] Bro manager dies in a large cluster

Michał Purzyński michalpurzynski1 at gmail.com
Mon Aug 31 19:01:41 PDT 2015


Hello :-)

I've finished the long process of merging all sensors into a large
cluster. To my surprise, every time I enable all of them and run
"broctl deploy", all workers start and so do the proxies, but the
manager dies right away.

This cluster has almost 200 workers across 9 servers, and between 8 and
16 proxies (I tried both 8 and 16; it didn't change anything).

I have lots of traffic, lots of connections, lots of everything ;-) My
guess is that the manager can't keep up with the amount of logs it is
expected to generate, and it gives up.

The manager and proxies run on a server dedicated just to them: 64GB
RAM, 16 physical cores, and a dedicated network for the cluster traffic.

Now, when I divide the cluster more or less in half (4 nodes enabled,
5 disabled) everything is stable.
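To put a rough number on that guess, here is a back-of-envelope sketch. It takes the "total 24G" per hour from the listing further down in this message and assumes log volume scales linearly with the number of enabled sensor nodes. That linear scaling is only an assumption; the sensors may well see very different traffic.

```python
# Back-of-envelope estimate. Every number is either quoted from this
# message or an explicitly labelled assumption.
ENABLED_NODES = 4          # sensor nodes enabled in the stable configuration
TOTAL_NODES = 9            # sensor nodes in the full cluster (4 enabled + 5 disabled)
STABLE_GIB_PER_HOUR = 24   # "total 24G" from the hourly log listing


def estimate_full_load(stable_gib_per_hour, enabled_nodes, total_nodes):
    """Scale the observed hourly volume linearly to all nodes (an assumption)."""
    per_node = stable_gib_per_hour / enabled_nodes
    full_gib_per_hour = per_node * total_nodes
    mib_per_second = full_gib_per_hour * 1024 / 3600
    return full_gib_per_hour, mib_per_second


full_gib, mib_s = estimate_full_load(STABLE_GIB_PER_HOUR, ENABLED_NODES, TOTAL_NODES)
print(f"estimated full-cluster volume: {full_gib:.0f} GiB/hour, "
      f"{mib_s:.1f} MiB/s sustained")
```

If the assumption holds, the manager would have to sustain a bit more than twice the write rate it handles in the stable half-cluster setup.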

Here is the amount of logs with 4 sensors enabled, covering almost
exactly an hour (I'm about 2 minutes from rotation); the listing
follows below. Hm. Maybe I should do something about the MySQL
traffic ;-)

What can I do? I'd like to help debug, if that's a bug I'm running into.

total 24G

-rw-rw-r-- 1 bro bro  84K Sep  1 01:57 capture_loss.log
-rw-rw-r-- 1 bro bro 5.6M Sep  1 01:58 communication.log
-rw-rw-r-- 1 bro bro 1.2G Sep  1 01:58 conn.log
-rw-rw-r-- 1 bro bro 476M Sep  1 01:58 conn-noise.log
-rw-rw-r-- 1 bro bro 1.8M Sep  1 01:58 dhcp.log
-rw-rw-r-- 1 bro bro 309M Sep  1 01:58 dns.log
-rw-rw-r-- 1 bro bro 609M Sep  1 01:58 dns-noise.log
-rw-rw-r-- 1 bro bro 115K Sep  1 01:58 dpd.log
-rw-rw-r-- 1 bro bro 1.5G Sep  1 01:58 files.log
-rw-rw-r-- 1 bro bro 1.5G Sep  1 01:58 http.log
-rw-rw-r-- 1 bro bro  65M Sep  1 01:58 http-noise.log
-rw-rw-r-- 1 bro bro 1.6M Sep  1 01:58 intel.log
-rw-rw-r-- 1 bro bro  37K Sep  1 01:54 intel-noise.log
-rw-rw-r-- 1 bro bro  68K Sep  1 01:58 irc.log
-rw-rw-r-- 1 bro bro 7.8M Sep  1 01:58 kerberos.log
-rw-rw-r-- 1 bro bro 566K Sep  1 01:58 known_certs.log
-rw-rw-r-- 1 bro bro  41K Sep  1 01:58 known_devices.log
-rw-rw-r-- 1 bro bro 244K Sep  1 01:58 known_hosts.log
-rw-rw-r-- 1 bro bro 330K Sep  1 01:58 known_services.log
-rw-rw-r-- 1 bro bro 4.9G Sep  1 01:58 mysql.log
-rw-rw-r-- 1 bro bro 636K Sep  1 01:58 notice.log
-rw-rw-r-- 1 bro bro 6.0K Sep  1 01:58 pe.log
-rw-rw-r-- 1 bro bro  559 Sep  1 01:31 reporter.log
-rw-rw-r-- 1 bro bro 168K Sep  1 01:57 sip.log
-rw-rw-r-- 1 bro bro  12M Sep  1 01:58 smtp.log
-rw-rw-r-- 1 bro bro  25M Sep  1 01:58 snmp.log
-rw-rw-r-- 1 bro bro  73M Sep  1 01:58 software.log
-rw-rw-r-- 1 bro bro 3.9M Sep  1 01:58 ssh.log
-rw-rw-r-- 1 bro bro  23K Sep  1 01:57 sslcipherstat_log1.log
-rw-rw-r-- 1 bro bro 766K Sep  1 01:58 sslcipherstat_log2.log
-rw-rw-r-- 1 bro bro 783M Sep  1 01:58 ssl.log
-rw-rw-r-- 1 bro bro  17K Sep  1 01:57 sslprotostat_log1.log
-rw-rw-r-- 1 bro bro 773K Sep  1 01:58 sslprotostat_log2.log
-rw-rw-r-- 1 bro bro  492 Sep  1 00:12 stderr.log
-rw-rw-r-- 1 bro bro  188 Sep  1 00:12 stdout.log
-rw-rw-r-- 1 bro bro 7.8K Sep  1 01:12 subnet.log
-rw-rw-r-- 1 bro bro 3.9G Sep  1 01:58 syslog.log
-rw-rw-r-- 1 bro bro 1.6M Sep  1 01:58 tunnel.log
-rw-rw-r-- 1 bro bro  46M Sep  1 01:58 weird.log
-rw-rw-r-- 1 bro bro 1.6G Sep  1 01:58 x509.log
-rw-rw-r-- 1 bro bro  683 Sep  1 01:31 xss.log
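To see which logs dominate, the listing can be ranked by size. This is a minimal sketch that parses a few lines copied from the listing above; `parse_size` and `rank_logs` are hypothetical helper names, and the sizes are only as precise as the rounded `ls -h` output.

```python
import re

# A few lines copied from the listing above; paste the full listing to rank everything.
LISTING = """\
-rw-rw-r-- 1 bro bro 1.2G Sep  1 01:58 conn.log
-rw-rw-r-- 1 bro bro 1.5G Sep  1 01:58 http.log
-rw-rw-r-- 1 bro bro 4.9G Sep  1 01:58 mysql.log
-rw-rw-r-- 1 bro bro  559 Sep  1 01:31 reporter.log
-rw-rw-r-- 1 bro bro 783M Sep  1 01:58 ssl.log
-rw-rw-r-- 1 bro bro 3.9G Sep  1 01:58 syslog.log
"""

UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert an `ls -h` size such as '4.9G' or '559' to bytes (approximate)."""
    match = re.fullmatch(r"([0-9.]+)([KMGT]?)", text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return float(match.group(1)) * UNITS[match.group(2)]


def rank_logs(listing):
    """Return (name, bytes) pairs sorted from largest to smallest."""
    rows = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 9 or not fields[0].startswith("-"):
            continue  # skip the 'total' line and blank lines
        rows.append((fields[8], parse_size(fields[4])))
    return sorted(rows, key=lambda row: row[1], reverse=True)


ranked = rank_logs(LISTING)
total = sum(size for _, size in ranked)
for name, size in ranked:
    print(f"{name:15s} {size / 1024**3:6.2f} GiB  {100 * size / total:5.1f}%")
```

On the listing above, mysql.log and syslog.log come out on top, so those two would be the first candidates for filtering if log volume really is the problem.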

The crash logs aren't really helpful.

cat post-terminate-2015-09-01-00-12-15-61637-crash/.crash-diag.log


Bro 2.4
Linux 3.19.0-26-generic

==== No reporter.log

==== stderr.log
warning in /opt/bro/share/bro/brozilla/./intel-dns.bro, line 99: deprecated (join_string_array)
warning in /nsm/bro/spool/installed-scripts-do-not-touch/site/local.bro, line 176: multiple initializations for index (10.248.75.6)
warning in /nsm/bro/spool/installed-scripts-do-not-touch/site/local.bro, line 176: multiple initializations for index (10.248.75.7)
warning in /nsm/bro/spool/installed-scripts-do-not-touch/site/local.bro, line 177: multiple initializations for index (10.248.22.1)

==== stdout.log
max memory size         (kbytes, -m) unlimited
data seg size           (kbytes, -d) unlimited
virtual memory          (kbytes, -v) unlimited
core file size          (blocks, -c) unlimited

==== .cmdline
-U .status -p broctl -p broctl-live -p local -p nsmserver1-manager local.bro broctl base/frameworks/cluster local-manager.bro broctl/auto

==== .env_vars
PATH=/opt/bro/bin:/opt/bro/share/broctl/scripts:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
BROPATH=/nsm/bro/spool/installed-scripts-do-not-touch/site::/nsm/bro/spool/installed-scripts-do-not-touch/auto:/opt/bro/share/bro:/opt/bro/share/bro/policy:/opt/bro/share/bro/site
CLUSTER_NODE=nsmserver1-manager

==== .status
TERMINATED [atexit]

==== No prof.log

==== No packet_filter.log

==== No loaded_scripts.log

bro@nsmserver1:/nsm/bro/spool/tmp$
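For comparing several of these crash reports, the diag output can be split on its "==== " headers. This is only a sketch: `split_sections` is a hypothetical helper, and the sample text is a shortened copy of the output above. It assumes nothing about the file beyond the header convention visible in this report.

```python
def split_sections(diag_text):
    """Split .crash-diag.log text into {header: [non-empty lines]}."""
    sections = {}
    current = None
    for line in diag_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("==== "):
            current = line[5:]
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections


# Shortened copy of the crash report shown above.
SAMPLE = """\
Bro 2.4
==== No reporter.log
==== stderr.log
warning in /opt/bro/share/bro/brozilla/./intel-dns.bro, line 99:
deprecated (join_string_array)
==== .status
TERMINATED [atexit]
==== No prof.log
"""

sections = split_sections(SAMPLE)
missing = [header for header in sections if header.startswith("No ")]
print("status: ", sections[".status"])
print("missing:", missing)
```

In this report the only status recorded is "TERMINATED [atexit]", and reporter.log, prof.log, packet_filter.log and loaded_scripts.log are all absent, which is why it says so little about the cause.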

