[Bro] FW: Issue: load balancer PF_RING drops 25% of incoming packets
Rosinger, Enno (DualStudy)
enno.rosinger at hpe.com
Tue Jul 26 17:27:43 PDT 2016
Hi Justin,
Thank you for the fast reply.
21 million received packets: Bro receives its traffic on an isolated network, where the traffic is generated on another server by tcpreplay. I manually take the NIC's received-packet counter before and after a replay by issuing "ifconfig eno2" (eno2 is the interface name).
16 million handled packets: I use broctl and issue the command "netstats" to see each worker process' received-packet count. Summing those gives 16 million (NOTE: now 18 million, as I upgraded to the Zero Copy drivers since the last mail).
###Ifconfig on Bro system###
###Before replaying###
[root at slinky-3-4 kernel]# ifconfig eno2
[...]
RX packets 25758824 bytes 20353552393 (18.9 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 182 bytes 36558 (35.7 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 [...]
###After replaying###
[root at slinky-3-4 kernel]# ifconfig eno2
[...]
RX packets 47447181 bytes 37400251832 (34.8 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 268 bytes 54486 (53.2 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 [...]
That makes 47447181 - 25758824 = 21,688,357 received packets.
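The same delta can be computed straight from the kernel's RX counter; a minimal sketch using the two snapshots above (in practice you would read /sys/class/net/eno2/statistics/rx_packets before and after the replay, which is the same counter ifconfig reports):

```shell
# RX packet counts copied from the ifconfig snapshots above; in practice:
#   before=$(cat /sys/class/net/eno2/statistics/rx_packets)
#   ... run tcpreplay on the sender ...
#   after=$(cat /sys/class/net/eno2/statistics/rx_packets)
before=25758824
after=47447181
echo "received: $((after - before)) packets"   # received: 21688357 packets
```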
###netstats in broctl on Bro system###
### after replaying ###
[BroControl] > netstats
worker-1-1: 1469577816.953862 recvd=5088052 dropped=0 link=5088052
worker-1-2: 1469577817.153796 recvd=4205599 dropped=0 link=4205599
worker-1-3: 1469577817.353889 recvd=4562288 dropped=0 link=4562288
worker-1-4: 1469577817.554795 recvd=4546975 dropped=0 link=4546975
The sum of these is 18,402,914 packets, which Bro sees as "on the link".
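The sum of the recvd= counters can be computed with a one-liner; a sketch using the netstats lines above pasted into a heredoc (in practice you would pipe `broctl netstats` into the same awk):

```shell
# Sum the recvd= counters from the netstats output above.
awk -F'recvd=' '{ split($2, a, " "); sum += a[1] } END { print sum }' <<'EOF'
worker-1-1: 1469577816.953862 recvd=5088052 dropped=0 link=5088052
worker-1-2: 1469577817.153796 recvd=4205599 dropped=0 link=4205599
worker-1-3: 1469577817.353889 recvd=4562288 dropped=0 link=4562288
worker-1-4: 1469577817.554795 recvd=4546975 dropped=0 link=4546975
EOF
# prints 18402914
```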
Thanks to your help on the build issue I can also support this number with the stats of pfcount (NOTE: this is another run, so the numbers differ slightly):
###pfcount result###
Absolute Stats: [18'416'555 pkts total][0 pkts dropped][0.0% dropped]
                [18'416'555 pkts rcvd][17'225'248'719 bytes rcvd][58'886.73 pkt/sec][440.62 Mbit/sec]
=========================
Actual Stats:   [0 pkts rcvd][722.14 ms][0.00 pps][0.00 Gbps]
As you requested, here are the capture_loss stats. I currently do not understand what is going on with these.
I hope you can help me track down the cause of these numbers ...
###first capture loss file###
#path capture_loss
#open 2016-07-26-16-48-14
#fields ts ts_delta peer gaps acks percent_lost
#types time interval string count count double
1469576894.926898 900.000078 worker-1-4 1156978 1683888 68.708726
1469576894.926602 900.000073 worker-1-1 1396713 1911004 73.087916
1469576894.977632 900.000080 worker-1-2 1055436 1544723 68.32526
1469576895.027647 900.000080 worker-1-3 1218489 1710519 71.235046
#close 2016-07-26-17-00-0
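As a sanity check on these values: capture_loss's percent_lost column is simply gaps / acks * 100. A minimal sketch reproducing worker-1-4's line from the first file above:

```shell
# percent_lost = gaps / acks * 100; gaps and acks taken from
# worker-1-4's entry in the first capture_loss file above.
awk 'BEGIN { gaps = 1156978; acks = 1683888; printf "%.6f\n", gaps / acks * 100 }'
# prints 68.708726, matching the percent_lost column
```

In other words, in the first run roughly 70% of the observed TCP ACKs pointed at data the workers never saw, while the second run reported no gaps at all.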
###second capture loss file###
#open 2016-07-26-17-03-37
#fields ts ts_delta peer gaps acks percent_lost
#types time interval string count count double
1469577794.926695 900.000093 worker-1-1 0 0 0.0
1469577794.977721 900.000089 worker-1-2 0 0 0.0
1469577795.027754 900.000107 worker-1-3 0 0 0.0
1469577794.927012 900.000114 worker-1-4 0 0 0.0
#close 2016-07-26-17-05-030
Looking forward to your response. It already helps a lot to have more support on this issue.
Best,
Enno
-----Original Message-----
From: Azoff, Justin S [mailto:jazoff at illinois.edu]
Sent: Tuesday, 26 July 2016 14:09
To: Rosinger, Enno (DualStudy) <enno.rosinger at hpe.com>
Cc: bro at bro.org
Subject: Re: [Bro] Issue: load balancer PF_RING drops 25% of incoming packets
> On Jul 26, 2016, at 4:23 PM, Rosinger, Enno (DualStudy) <enno.rosinger at hpe.com> wrote:
>
> Strangely only 16 Million of my 21 Million packet input pass through the PF_RING kernel module. Nevertheless they are then distributed correctly on the Bro processes.
> How can I avoid this loss of 5 Million packets and how can I verify that PF_RING is configured correctly?
What are you using to measure the difference in packet counts? Where is the 21 and 16 coming from?
Can you add this to your local.bro and see what it logs to capture_loss.log after 30 minutes or so?
@load misc/capture-loss
>
> I use Intel Corporation I350 Gigabit Network Connection as NICs. They work with the igb drivers.
> The input rate is 0.5Gb/s = 60k to 80k packets/s and currently I am
> working without the ZeroCopy drivers. It is verified that all of my 21 Million packets are received by my NIC’s driver.
> The PF_Ring module itself exists and BRO is running with load balancing.
>
> Looking forward to your response and hope to solve this problem with you. Below you will find more detailed information about my system.
> If you need something else let me know.
>
> Best,
> Enno
>
> Additional information:
>
> One interesting fact: I cannot run “make” in
> “PF_RING/userland/examples”, because
> gcc: error: ../libpcap/libpcap.a: No such file or directory
>
> PF_RING/userland looks like this. Indeed “libpcap” is missing
> c++ examples examples_zc fast_bpf go lib libpcap-1.7.4 Makefile
> c++ snort tcpdump-4.7.4
This should fix your build issue:
cd PF_RING/userland
ln -s libpcap-1.7.4 libpcap
--
- Justin Azoff