[Bro] Bro Packet Loss / 10gb ixgbe / pf_ring

Gary Faulkner gfaulkner.nsm at gmail.com
Thu Jan 7 18:14:38 PST 2016


Ah, I'm still on 6.0.3 with DNA/Libzero, so I didn't realize that
adapters_to_enable had been changed/removed. I'm about to start testing
6.2 with ZC, probably using zbalance_ipc rather than relying on RSS. If
you are running into RSS limits but have more cores on another socket,
zbalance_ipc should let you aggregate NICs and do 2-, 4-, or 5-tuple
hashing to as many worker queues/ring buffers as you can handle, and
also duplicate traffic to a secondary app such as capstats, since it
takes over as the on-host load balancer and doesn't rely on RSS. In
that case you set RSS=1 for each interface feeding zbalance_ipc.

Also, the 2.4.1 broctl code in
<path-to-bro>/lib/broctl/plugins/lb_pf_ring.py seems to imply that the
ZC-style interface naming is only supported when using zbalance_ipc,
though I may be wrong. The relevant snippet is below:

            if nn.interface.startswith("zc"):
                # For the case where a user is running zbalance_ipc
                nn.interface = "%s@%d" % (nn.interface, app_instance)

For DNA, there are specific entries in the code for DNA with RSS and
DNA with pfdnacluster_master. So possibly try ZC with zbalance_ipc,
using an interface name in node.cfg of zc:<whatever cluster id you assigned>.
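Roughly, that setup might look something like the sketch below. This is
untested on my end, so treat the exact flags, the cluster id 99, and the
queue/core counts as placeholders and check zbalance_ipc -h for your build:

    # one hw queue per port; zbalance_ipc does the balancing instead of RSS
    insmod ./ixgbe.ko RSS=1,1
    # 16 IP-hashed output queues on cluster id 99, capture thread pinned to core 0
    # (something like -n 16,1 is supposed to also feed a second app such as capstats)
    zbalance_ipc -i zc:eth3 -c 99 -n 16 -m 1 -g 0 &

    # node.cfg worker entry then points at the cluster id rather than the NIC;
    # broctl appends @0 through @15 per the lb_pf_ring.py snippet above
    [worker-1]
    type=worker
    host=10.99.99.15
    interface=zc:99
    lb_method=pf_ring
    lb_procs=16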

There is a thread in the Bro archives discussing zbalance_ipc usage,
including the syntax, that may be more helpful than I can be; it starts
with this post
<http://mailman.icsi.berkeley.edu/pipermail/bro/2015-February/008154.html>.
It is worth reading through the whole thread, as it walks through the
troubleshooting.

Bro probably isn't going to like duplicate packets, such as if you are
tapping both the inside and outside interfaces of a firewall. Have you
checked weird.log to see if it is complaining about that? Are the taps
you refer to plugged directly into your Bro sensor, or coming off some
sort of tap-aggregation load balancer, or are you really using a SPAN
port (the latter can sometimes take a performance hit due to sampling
or router/switch CPU load)? If you are using an optical tap, is there
any chance the fiber plant isn't cabled such that you see both send and
receive? Do you do any sort of packet slicing that might throw off the
loss numbers? weird.log can also give you an indication of whether you
are seeing one-sided conversations or have other upstream network issues.
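A quick way to eyeball that is to tally the weird names; adjust the log
path for your install prefix and this assumes bro-cut is on your PATH.
Names hinting at one-sided visibility (possible_split_routing, for
example) would be worth a closer look:

    # count weird types in the current logs, most frequent first
    cat /usr/local/bro/logs/current/weird.log | bro-cut name | sort | uniq -c | sort -rn | head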

Another thought: if you have jumbo frames enabled on your network, you
may want to check MTU sizes. I currently have mine set to 9216 to match
the max packet size on our upstream router. If you are collecting flows
somewhere, it might also be worth checking whether you have any sources
of large flows that could be impacting overall sensor performance.
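For example (iproute2 syntax; substitute your own capture interface and
whatever MTU your upstream gear actually uses):

    # check the current MTU on the capture NIC, then raise it if jumbo frames are in play
    ip link show eth3 | grep -o 'mtu [0-9]*'
    ip link set dev eth3 mtu 9216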

~Gary

On 1/7/2016 7:03 PM, Nash, Paul wrote:
> Thanks Gary. 
>   Sorry to top post, I'm stuck on OWA at the moment.  Thanks for your suggestions - here are some quick replies:
>
> - capture_loss.bro - running it, every 15min it reports ~70% packet loss (or greater) across all of the workers
> - 'adapters_to_enable' ixgbe.ko argument doesn't exist in the latest driver bundled w/pf_ring 6.2.0
> - I've enabled the multi-queue stuff (MQ=1) on the 2nd interface (MQ=0,2) as well as enabled the 16 hw RSS queues = (RSS=1,16)
> - I have a license in /etc/pf_ring
> - bro is linked against the pf_ring enabled libpcap 
> - I've confirmed that the .ko's I'm loading are the latest from pf_ring 6.2.0
>
>
> Right now, pfcount says that eth3 is receiving 462Mbit/sec - I left it running for 5 minutes or so and there are zero dropped packets.  As soon as I start up bro, I'm already dropping 50%+ packets per worker. 
>
> The only other thing I can think of could be packet duplication from some new taps that we deployed and potentially protocols that bro isn't parsing?
>
>  -Paul
>
> ________________________________________
> From: Gary Faulkner [gfaulkner.nsm at gmail.com]
> Sent: Thursday, January 07, 2016 7:04 PM
> To: Nash, Paul
> Cc: bro at bro.org
> Subject: Re: [Bro] Bro Packet Loss / 10gb ixgbe / pf_ring
>
> Some thoughts inline...
>
> On 1/7/16 3:37 PM, Nash, Paul wrote:
>> I’m trying to debug some packet drops that I’m experiencing and am turning to the list for help.   The recorded packet loss is ~50 – 70% at times.   The packet loss is recorded in broctl’s netstats as well as in the notice.log file.
>>
>> Running netstats at startup – I’m dropping more than I’m receiving from the very start.
> Have you tried enabling the bro capture_loss script in your local.bro as
> a way to double check your loss numbers? It will give you per worker
> loss on 15 minute intervals in a separate log file.
>
> In local.bro:
> @load policy/misc/capture-loss
>
>> insmod /lib/modules/2.6.32-431.11.2.el6.x86_64/kernel/net/pf_ring/pf_ring.ko enable_tx_capture=0 min_num_slots=32768 quick_mode=1
>>
>> insmod  /lib/modules/2.6.32-431.11.2.el6.x86_64/kernel/drivers/net/ixgbe/ixgbe.ko numa_cpu_affinity=0,0 MQ=0,1 RSS=0,0
>>
>> I checked /proc/sys/pci/devices to confirm that the interface is running on numa_node 0.  ‘lscpu’ shows that cpus 0-7 are on node 0, socket 0, and cpus 8-15 are on node 1, socket 0.  I figured having the 16 RSS queues on the same socket is probably better than having them bounce around.
>>
>>
>> The node.cfg looks like this:
>>
>> [manager]
>> type=manager
>> host=10.99.99.15
>> #
>> [proxy-1]
>> type=proxy
>> host=10.99.99.15
>> #
>> [worker-1]
>> type=worker
>> host=10.99.99.15
>> interface=eth3
>> lb_method=pf_ring
>> lb_procs=16
>> pin_cpus=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
>>
>> I have a license for ZC, and if I change the interface from eth3 to zc:eth3, it will spawn up 16 workers, but only one of them is receiving any traffic.  I’m assuming that it is looking at zc:eth3@0 only.   Netstats proves that out.   If I run pfcount -i zc:eth3, it will show me that I’m receiving ~1Gbps of traffic on the interface and not dropping anything.
> As far as ZC usage goes, when running in ZC mode did you specify which adapters
> to enable at the end of your ixgbe insmod statement like this -->
> adapters_to_enable=<insert comma separated list of licensed mac
> addresses you want to use>? Also did you try setting RSS to match the
> number of workers instead of leaving it up to the NIC? Example RSS=16
> instead of 0 (comma separated per NIC if more than 1 NIC). Did you try
> pfcount -i zc:eth3@0 (through 15), etc., to test each RSS queue? Did you put
> the necessary license files in /etc/pf_ring? Also, just to be certain,
> are you using the IXGBE drivers that come with PF_RING and have you
> compiled Bro against the PF_RING libpcap?
>
>> Am I missing something obvious?  I saw many threads about disabling hyperthreading, but that seems specific to Intel processors – I’m running AMD Opterons with their own HyperTransport stuff, which doesn’t create virtual CPUs.
> I'm not sure I understand AMD architecture well enough to know how cores
> map to nodes, so I can't comment on your pinning configuration in terms
> of workers per core, but assuming each worker is pinned to a physical
> core and you truly have 16 physical cores on that socket, have you left
> any cores unpinned somewhere else (maybe a processor in another socket),
> for the system, bro manager, proxy etc to use? If not you could have
> other processes stomping on your workers. If any workers are sharing
> physical cores that could be problematic as well. Do you have htop or
> something similar installed where you can easily watch whether processes
> seem to be competing for the same physical core?
>
> Have you tried running capstats (broctl capstats if using broctl) to see
> what sort of traffic bro thinks it is seeing across all workers when you
> are seeing loss? Depending on the clock speed and efficiency of each
> core you may be able to process anywhere from 100-300+Mbps per core, but
> if that 1Gbps of traffic was only representative of a single RSS queue
> on your 10G NIC you could be oversubscribed. If you have free cores on
> another socket it might be worth taking whatever small performance hit
> there is over the bus to have more workers running on those other cores.
> Also, I tend to leave the 1st couple logical cores open for the system
> as Linux at least seems to prefer them for system use. I do tend to find
> pinning workers to specific cores helps overall in the loss department
> vs letting workers bounce between cores, so I think you are on the right
> track.
>
> ~Gary
