[Bro] High-CPU on just a single worker in the cluster

Azoff, Justin S jazoff at illinois.edu
Wed Apr 13 18:43:26 PDT 2016


Can you load this script that will add a node column to the conn.log that says which node handled that connection:

https://github.com/broala/bro-snippets/blob/master/add-node-to-conn.bro

Also, the output of 'broctl netstats' would be useful to see.
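
Once the extra column is being logged, a quick per-node tally of conn.log shows whether one worker is handling far more connections than the rest. The following is only a rough sketch: it assumes the new column is named "node" and that the log is in Bro's default tab-separated format with a "#fields" header line. The worker names in the sample data are made up.

```python
from collections import Counter


def count_by_node(lines, column="node"):
    """Count data rows per value of `column` in a Bro TSV log.

    Assumes the default ASCII writer format: tab-separated values with a
    '#fields' header line naming the columns. The column name 'node' is an
    assumption about what the add-node script calls its field.
    """
    fields = None
    counts = Counter()
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("#fields"):
            # Header looks like "#fields<TAB>ts<TAB>uid<TAB>...".
            fields = line.split("\t")[1:]
            continue
        if not line or line.startswith("#"):
            # Skip blank lines and other metadata (#separator, #close, ...).
            continue
        if fields is None:
            raise ValueError("no #fields header seen before data rows")
        row = dict(zip(fields, line.split("\t")))
        # Rows without the column are grouped under "-".
        counts[row.get(column, "-")] += 1
    return counts


# Hypothetical sample data; real logs have many more columns.
SAMPLE = [
    "#separator \\x09\n",
    "#fields\tts\tuid\tnode\n",
    "1460598206.0\tC1\tworker-1-1\n",
    "1460598207.0\tC2\tworker-1-1\n",
    "1460598208.0\tC3\tworker-2-4\n",
    "1460598209.0\tC4\tworker-1-1\n",
    "#close\t2016-04-13-18-43-26\n",
]

counts = count_by_node(SAMPLE)
for node, n in counts.most_common():
    print(node, n)
```

To run it against a real log, pass an open file instead of the sample list, for example count_by_node(open("conn.log")).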


-- 
- Justin Azoff

> On Apr 13, 2016, at 7:03 PM, Dave Crawford <bro at pingtrip.com> wrote:
> 
> I'm in the process of trying to debug an odd high-cpu issue and looking for guidance.
> 
> The deployment is as follows:
>  - Cluster has two nodes, each with 10 workers, and the workers are pinned to specific CPU cores.
>  - x520 with PF_RING
>  - Traffic to each node is load balanced equally
> 
> The issue is that one worker on one of the nodes is always at 100% CPU while all other workers are around 50%. If I restart Bro a different worker will pin to 100%, but always on the same node.
> 
> I ran 'strace' on both a "bad" and "good" worker and one anomaly I spotted was that the "bad" worker never called 'nanosleep', whereas the "good" worker had about 84,000 'nanosleep' calls in the same amount of time.
> 
> I'm wondering if it's possible for a queue to go bad on the x520, which might explain why it's a random worker on the same node after restarting.
> 
> Is there a way to determine which x520 queue a specific worker is reading from? 
> 
> Thanks,
> -Dave
> 
> 
> _______________________________________________
> Bro mailing list
> bro at bro-ids.org
> http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/bro



