[Bro] 5 node cluster

Fri Oct 7 14:35:26 PDT 2016

> On Oct 7, 2016, at 5:18 PM, Darrain Waters <dwaters at bioteam.net> wrote:
> 
> Thanks for the quick reply. I put a proxy on everything because I was grasping at straws. I originally had only 1 proxy, on the manager, with the same results.
> 
> 
> Why are you using 7,8,9,10,11,18,19,20,21,22 in particular?  What CPUs do you have?  This is potentially not doing what you intend.  Most likely 7/19, 8/20, 9/21, and 10/22 are the same CPU.
> 
> Those are the cores that belong to node 1, and node 1 is associated with the Myricom card.
> 
> [bromgr@bromgr 2016-10-07]$ lscpu
> Architecture:          x86_64
> CPU op-mode(s):        32-bit, 64-bit
> Byte Order:            Little Endian
> CPU(s):                24
> On-line CPU(s) list:   0-23
> Thread(s) per core:    2
> Core(s) per socket:    6

I see.  You have two 6-core CPUs with hyperthreading, so those are the two sets of CPUs that make up each hyperthreading pair.  We haven't gotten to do performance testing for this yet, but you might get better performance by just using 2,3,4,5,6,7,8,9,10,11.  The tradeoff is that half of the packets have to be copied across to the other NUMA node, but you use more of the 'real' cores and fewer of the hyperthreaded ones.
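
If you want to check the pairing rather than take my guess, here is a rough Python sketch that groups a pin list by physical core. It assumes a Linux box that exposes thread_siblings_list under /sys/devices/system/cpu. The function names are made up for illustration and are not part of broctl.

```python
from collections import defaultdict
from pathlib import Path


def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0-2,12' into [0, 1, 2, 12]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def read_sibling_lists(sysfs="/sys/devices/system/cpu"):
    """Map each logical CPU number to its thread_siblings_list text (Linux only)."""
    out = {}
    for path in Path(sysfs).glob("cpu[0-9]*/topology/thread_siblings_list"):
        cpu = int(path.parent.parent.name[3:])  # directory name is e.g. 'cpu7'
        out[cpu] = path.read_text()
    return out


def physical_cores_used(pinned, sibling_lists):
    """Group the pinned logical CPUs by the physical core they share."""
    cores = defaultdict(list)
    for cpu in pinned:
        key = tuple(sorted(parse_cpu_list(sibling_lists[cpu])))
        cores[key].append(cpu)
    return dict(cores)
```

Calling physical_cores_used(your_pin_list, read_sibling_lists()) on the worker shows how many distinct physical cores the workers really land on. If the numbering is what I guessed (logical CPU N paired with N+12, which is an assumption until you check), your current list covers 6 physical cores and 2-11 covers 10.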

> 
> Your underlying problem is probably that a firewall is enabled on your hosts and the worker processes can't reach the manager. 
> I have ip6 & iptables off

On all the machines?  "everything is working but there are no logs" almost always turns out to be firewall rules.  The last time it turned out that another admin had re-enabled the firewall.. :-)

One thing to check is the logs written to spool/ on each worker.  There will be a local communication.log for each worker that may be complaining about something.

Now that I reread your first message I see "I am not getting any log information in prefix/logs".  Do you mean that there are literally no log files in there?  Under current/ you should at least have stderr.log and communication.log.  If you literally have no log files, you may have a permissions issue if you are not running Bro as root.

You can also run tcpdump on the manager and see if the workers are even trying to send it anything.
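
A quicker test from the other direction, if that is easier than tcpdump: try a plain TCP connect from each worker to the manager. This is a minimal Python sketch. The host and port you pass in are placeholders, so use the manager's address and whatever port your manager process is actually listening on.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A False here does not prove it is a firewall (the manager may simply not be listening on that port), but a True from every worker would rule out the most common cause.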

> peerstatus
> 
> 
> 
> [BroControl] > peerstatus
>     manager
> 1475875039.738664 peer=worker-2-2 host=10.0.40.17 events_in=3165 events_out=3165 ops_in=0 ops_out=3472 bytes_in=? bytes_out=?
> 1475875039.738664 peer=worker-1-3 host=10.0.40.18 events_in=3165 events_out=3165 ops_in=0 ops_out=3472 bytes_in=? bytes_out=?
> 1475875039.738664 peer=proxy-2 host=10.0.40.17 events_in=3165 events_out=3165 ops_in=0 ops_out=3472 bytes_in=? bytes_out=?
> 1475875039.738664 peer=proxiy-5 host=10.0.40.19 events_in=3165 events_out=3165 ops_in=0 ops_out=3472 bytes_in=? bytes_out=?
> 1475875039.738664 peer=worker-3-4 host=10.0.40.16 events_in=3165 events_out=3165 ops_in=0 ops_out=3472 bytes_in=? bytes_out=?
> 1475875039.738664 peer=worker-3-3 host=10.0.40.16 events_in=3165 events_out=3165 ops_in=0 ops_out=3472 bytes_in=? bytes_out=?
> 
That appears normal. I'm not sure what bytes_in and bytes_out were supposed to be; it doesn't look like we output those anymore.
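
If you would rather compare those counters across peers than eyeball them, the lines are just a timestamp followed by key=value pairs. Here is a small Python sketch that splits one line. The function name is made up, and it assumes the format stays as pasted above.

```python
def parse_peerstatus_line(line):
    """Split one peerstatus line into (timestamp, {key: value})."""
    # Drop any leading email quote marker and whitespace before splitting.
    fields = line.strip().lstrip("> ").split()
    timestamp = float(fields[0])
    stats = dict(field.split("=", 1) for field in fields[1:])
    return timestamp, stats
```

Values stay as strings, since bytes_in and bytes_out come back as "?" here.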

What does 'broctl netstats' show?

-- 
- Justin Azoff