[Bro-Dev] Bro 2.5 Packet Drop Issue

Rajput, Jawad (CONTR) Jawad.Rajput at hq.doe.gov
Thu Aug 30 14:13:32 PDT 2018


Thank you so much Justin, the solution worked. We were literally troubleshooting for more than a month and did not find anything online.  


Jawad Rajput 
System Administrator
U.S. Department of Energy 
IM-62 /Germantown Building
HQ Network Security Team
Email: Jawad.Rajput at hq.doe.gov
Office: 301-903-2176
Office: 301-903-3895
Cell: 301-795-5406



-----Original Message-----
From: Azoff, Justin S [mailto:jazoff at illinois.edu] 
Sent: Thursday, August 30, 2018 4:29 PM
To: Rajput, Jawad (CONTR) <Jawad.Rajput at hq.doe.gov>
Cc: bro-dev at bro.org; Danis, Andrew (CONTR) <Andrew.Danis at hq.doe.gov>
Subject: Re: [Bro-Dev] Bro 2.5 Packet Drop Issue


> On Aug 30, 2018, at 4:11 PM, Rajput, Jawad (CONTR) <Jawad.Rajput at hq.doe.gov> wrote:
> 
> Hello Everyone,
>  
> I am reaching out with the hope that someone will be able to help us with an issue we are having with the Bro upgrade from 2.4.1 to 2.5.X.
>  
> We have a system with 12 cores (3 GHz), 128 GB RAM, and a 10G NIC (Intel X520-SR2 10GbE dual-port), monitoring between 1.5 and 2.5 Gbps of traffic.
>  
> Bro 2.4.1 is working great and periodically drops 2-5% of packets when traffic peaks at ~2.5 Gbps. However, when we upgrade to Bro 2.5.3/4 on the exact same system, the drops go up to 90%.
>  
> We are using CentOS 7 and tried installing Bro and PF_RING from both RPM and source without any luck. I wonder if anyone has seen this issue and can give some clues to resolve it.
>  
> Bro Node Conf: 
> [manager]
> type=manager
> host=localhost
> #
> [proxy-1]
> type=proxy
> host=localhost
>  
> #
> [worker-1]
> type=worker
> host=localhost
> interface=ens1f1
> lb_method=pf_ring
> lb_procs=11
> pin_cpus=1,2,3,4,5,6,7,8,9,10,11

You're missing a logger process; adding one will make the cluster run better:

[logger]
type=logger
host=localhost


> [root at bro-test ~]# cat /proc/net/pf_ring/info
> PF_RING Version          : 7.3.0 (unknown)
> Total rings              : 11

you should have 1, not 11...

> Standard (non ZC) Options
> Ring slots               : 65534
> Slot version             : 17
> Capture TX               : No [RX only]
> IP Defragment            : No
> Socket Mode              : Standard
> Cluster Fragment Queue   : 0
> Cluster Fragment Discard : 0

Looks like you are hitting the issue where bro is not actually using pf_ring load balancing, which happens if you installed it from rpms.
What you're effectively doing is running 11 workers that are all receiving 100% of the traffic, so you are doing 11 times the work.
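
To put rough numbers on that, here is a back-of-the-envelope sketch using only the figures quoted in this thread (a peak of about 2.5 Gbps and 11 workers). The even split in the balanced case is an idealisation; real flow hashing will not divide traffic perfectly.

```shell
# Rough arithmetic with the thread's figures. Values are in Mbps because
# POSIX shell arithmetic is integer-only.
peak_mbps=2500
workers=11

# Load balancing not in effect: every worker sees the full stream.
unbalanced_total=$((peak_mbps * workers))

# Load balancing in effect: the stream is split across workers
# (idealised even split, integer division).
balanced_per_worker=$((peak_mbps / workers))

echo "unbalanced: ${peak_mbps} Mbps per worker, ${unbalanced_total} Mbps analysed in total"
echo "balanced:   ~${balanced_per_worker} Mbps per worker, ${peak_mbps} Mbps analysed in total"
```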

You can further confirm that this is the problem you are having by running

	broctl config | grep -i clusterid

and seeing if the id is set to 0:

	pfringclusterid = 0

if so, edit /opt/bro/etc/broctl.cfg and add

	PFRINGClusterID = 11

and run broctl deploy to restart everything.
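
If you want to script the check above, something like the following might do. It is only a sketch: cluster_id_status is a hypothetical helper name (not part of broctl), and it assumes the `broctl config` output contains a line of the form `pfringclusterid = N`, as shown above.

```shell
#!/bin/sh
# Sketch: classify the PF_RING cluster ID found in `broctl config` output.
# cluster_id_status is a hypothetical helper, not a broctl command.
cluster_id_status() {
    # Read config text on stdin and take the value after "=" on the
    # first pfringclusterid line (matched case-insensitively).
    id=$(grep -i '^pfringclusterid' | head -n 1 | cut -d= -f2 | tr -d '[:space:]')
    if [ -z "$id" ]; then
        echo "unknown"        # option not present in the output
    elif [ "$id" = "0" ]; then
        echo "disabled"       # the problem case described above
    else
        echo "enabled:$id"    # a non-zero cluster ID is configured
    fi
}

# Usage against a live install (assumes broctl is on PATH):
#   broctl config | cluster_id_status
```

A result of "disabled" corresponds to the case where PFRINGClusterID needs to be added to broctl.cfg.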

This is already fixed and won't happen in bro >= 2.6; it just keeps tripping people up on 2.5.x.

You should also look into switching to the native bro pf_ring plugin or the bro af_packet plugin which are both better choices than using the pcap wrapper method.

-- 
Justin Azoff


