[Bro] FW: Issue: load balancer PF_RING drops 25% of incoming packets

Rosinger, Enno (DualStudy) enno.rosinger at hpe.com
Tue Jul 26 17:27:43 PDT 2016


Hi Justin,

Thank you for the fast reply.

21 Million received packets: Bro receives its traffic on an isolated network (the traffic is generated on another server by tcpreplay). I manually take the NIC's received-packet stats before and after replaying by issuing "ifconfig eno2" (eno2 is the interface name).
16 Million handled packets: I use broctl and issue the command "netstats" to see each worker process's number of received packets. Summing those gives 16 Million (NOTE: now 18 Million, as I upgraded to the Zero Copy drivers since the last mail).

###Ifconfig on Bro system###
###Before replaying###
[root@slinky-3-4 kernel]# ifconfig eno2
[...]
        RX packets 25758824  bytes 20353552393 (18.9 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 182  bytes 36558 (35.7 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0 [...]

###After replaying###
[root@slinky-3-4 kernel]# ifconfig eno2
[...]
        RX packets 47447181  bytes 37400251832 (34.8 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 268  bytes 54486 (53.2 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0 [...]

That makes 47447181 - 25758824 = 21,688,357 received packets.
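For reference, that delta can be checked mechanically in the shell; a minimal sketch using the two counter values quoted above:

```shell
# Sanity-check the ifconfig RX counter delta (values copied from the
# before/after output above).
before=25758824
after=47447181
delta=$((after - before))
echo "received: $delta packets"
```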

###netstats in broctl on Bro system###
### after replaying ###
[BroControl] > netstats
 worker-1-1: 1469577816.953862 recvd=5088052 dropped=0 link=5088052
 worker-1-2: 1469577817.153796 recvd=4205599 dropped=0 link=4205599
 worker-1-3: 1469577817.353889 recvd=4562288 dropped=0 link=4562288
 worker-1-4: 1469577817.554795 recvd=4546975 dropped=0 link=4546975

The sum of these is 18,402,914 packets, which Bro reports as seen "on the link".
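That sum can also be computed directly from the netstats output with awk. A sketch with the four worker lines inlined (in practice the input would be piped from "broctl netstats"):

```shell
# Sum the recvd= counters from the broctl netstats output (awk sketch;
# the worker lines from above are inlined here for illustration).
sum=$(printf '%s\n' \
  'worker-1-1: 1469577816.953862 recvd=5088052 dropped=0 link=5088052' \
  'worker-1-2: 1469577817.153796 recvd=4205599 dropped=0 link=4205599' \
  'worker-1-3: 1469577817.353889 recvd=4562288 dropped=0 link=4562288' \
  'worker-1-4: 1469577817.554795 recvd=4546975 dropped=0 link=4546975' \
  | awk '{ split($3, a, "="); total += a[2] } END { print total }')
echo "total recvd: $sum"
```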

Thanks to your help on the build issue I can also support this number with the stats of pfcount (NOTE: this is another run, so the numbers differ slightly):

###pfcount result###
Absolute Stats: [18'416'555 pkts total][0 pkts dropped][0.0% dropped]
[18'416'555 pkts rcvd][17'225'248'719 bytes rcvd][58'886.73 pkt/sec][440.62 Mbit/sec]
=========================
Actual Stats: [0 pkts rcvd][722.14 ms][0.00 pps][0.00 Gbps]

As requested, here are the capture_loss stats. I currently do not understand what the issue is with them.
I hope you can help me track down the cause of these numbers ...
###first capture_loss file###
#path   capture_loss
#open   2016-07-26-16-48-14
#fields ts      ts_delta        peer    gaps    acks    percent_lost
#types  time    interval        string  count   count   double
1469576894.926898       900.000078      worker-1-4      1156978 1683888 68.708726
1469576894.926602       900.000073      worker-1-1      1396713 1911004 73.087916
1469576894.977632       900.000080      worker-1-2      1055436 1544723 68.32526
1469576895.027647       900.000080      worker-1-3      1218489 1710519 71.235046
#close  2016-07-26-17-00-0 
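For what it's worth, percent_lost in this log is just gaps / acks * 100; the worker-1-1 row above can be reproduced with a one-line awk sketch (values copied from the log):

```shell
# Recompute percent_lost = gaps / acks * 100 for worker-1-1
# (gaps and acks copied from the capture_loss.log above).
pct=$(awk 'BEGIN { printf "%.6f", 1396713 / 1911004 * 100 }')
echo "worker-1-1 percent_lost: $pct"
```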

###second capture_loss file###
#open   2016-07-26-17-03-37
#fields ts      ts_delta        peer    gaps    acks    percent_lost
#types  time    interval        string  count   count   double
1469577794.926695       900.000093      worker-1-1      0       0       0.0
1469577794.977721       900.000089      worker-1-2      0       0       0.0
1469577795.027754       900.000107      worker-1-3      0       0       0.0
1469577794.927012       900.000114      worker-1-4      0       0       0.0
#close  2016-07-26-17-05-030

Looking forward to your response. It already helps me a lot to have more support on this issue.

Best,
Enno

-----Original Message-----
From: Azoff, Justin S [mailto:jazoff at illinois.edu]
Sent: Tuesday, 26 July 2016 14:09
To: Rosinger, Enno (DualStudy) <enno.rosinger at hpe.com>
Cc: bro at bro.org
Subject: Re: [Bro] Issue: load balancer PF_RING drops 25% of incoming packets


> On Jul 26, 2016, at 4:23 PM, Rosinger, Enno (DualStudy) <enno.rosinger at hpe.com> wrote:
> 
> Strangely only 16 Million of my 21 Million packet input pass through the PF_RING kernel module. Nevertheless they are then distributed correctly on the Bro processes.
> How can I avoid this loss of 5 Million packets and how can I verify that PF_RING is configured correctly?

What are you using to measure the difference in packet counts?  Where is the 21 and 16 coming from?

Can you add this to your local.bro and see what it logs to capture_loss.log after 30 minutes or so?

    @load misc/capture-loss


  
> 
> I use Intel Corporation I350 Gigabit Network Connection as NICs. They work with the igb drivers.
> The input rate is 0.5Gb/s = 60k to 80k packets/s, and currently I am
> working without the Zero Copy drivers. It is verified that all of my 21 Million packets are received by my NIC’s driver.
> The PF_Ring module itself exists and BRO is running with load balancing.
>  
> Looking forward to your response and hope to solve this problem with you. Below you will find more detailed information about my system.
> If you need something else let me know.
>  
> Best,
> Enno
>  
> Additional information:
>  
> One interesting fact: I cannot run “make” in 
> “PF_RING/userland/examples”, because
> gcc: error: ../libpcap/libpcap.a: No such file or directory
>  
> PF_RING/userland looks like this. Indeed “libpcap” is missing
> c++  examples  examples_zc  fast_bpf  go  lib  libpcap-1.7.4  Makefile
> snort  tcpdump-4.7.4

This should fix your build issue:

    cd PF_RING/userland
    ln -s libpcap-1.7.4 libpcap



--
- Justin Azoff



