[Bro] Capture Loss
dwdixon at umich.edu
Tue Mar 14 19:49:17 PDT 2017
Glad to hear you're now on the right track! You're very welcome. FWIW, I
think the person on the other similar thread I copied my reply from might
not have known about installing PF_RING with DKMS, so I wanted to cover
possible kernel module issues etc. I was going to guess your issue was
upstream based on what you described in your first email, but I didn't want
to speculate too much, heh. Taps are definitely the way to go if you have
the option to use them instead of SPAN ports.
On Tue, Mar 14, 2017 at 5:47 PM, Arash Fallah <af7 at umbc.edu> wrote:
> Hey Drew,
> I've been on the list for over a year. I searched for similar issues but
> didn't find any. We are capturing from a SPAN port, and we have 3 edge
> routers and tons of asymmetrical routing. We are experiencing packet loss
> at such a high rate that we believe the error might be upstream (thanks to
> Seth)! We are going to try passive taps instead of capturing from SPAN
> ports.
> PF_RING is installed with DKMS. All offloading has been disabled and I
> have been checking reporter.log for invalid checksums (none so far). CPU
> pinning is enabled. Though I did not know about ring slots for PF_RING,
> from my research I do not think our network at 3Gbps requires increasing
> them.
> Thanks so much, you were on point with your questions.
> On Thu, Mar 9, 2017 at 4:27 PM, Drew Dixon <dwdixon at umich.edu> wrote:
>> Did you search the email list already or did you just join the list? Are
>> you capturing the traffic from a SPAN port or a Tap? Is your network full
>> of asymmetrical traffic/routing? Answers to these questions first are
>> pretty important IMO. I responded to a very similar question around 6 days
>> ago or so on list...here's what I said again:
>> First, I think the recommended number of workers is something like the
>> number of *real* cores (not counting hyperthreading) minus 2. So for 8
>> *real* cores you would use 6 workers; if you have 16 *real* cores you
>> probably want closer to 14 workers if this is a dedicated Bro box. Maybe
>> try bumping up your number of workers and enabling CPU pinning if you
>> haven't done so.
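A rough Python sketch of the rule of thumb above, for anyone who wants to script it. The hostname, the interface name, and the physical core count you pass in are placeholders. `lb_method`, `lb_procs` and `pin_cpus` are the BroControl node.cfg options for PF_RING load balancing and pinning. Which CPU IDs map to real cores varies per box, so the plain 0..N-1 list here is only an assumption.

```python
def recommended_workers(physical_cores: int, reserved: int = 2) -> int:
    """Rule of thumb from this thread: real cores minus two, never below one."""
    return max(1, physical_cores - reserved)


def worker_stanza(physical_cores: int, host: str = "localhost",
                  interface: str = "eth0") -> str:
    """Render a node.cfg worker section (host and interface are placeholders)."""
    workers = recommended_workers(physical_cores)
    # Assumes CPU IDs 0..workers-1 are distinct physical cores; verify with lscpu.
    cpus = ",".join(str(cpu) for cpu in range(workers))
    return "\n".join([
        "[worker-1]",
        "type=worker",
        f"host={host}",
        f"interface={interface}",
        "lb_method=pf_ring",
        f"lb_procs={workers}",
        f"pin_cpus={cpus}",
    ])


if __name__ == "__main__":
    print(worker_stanza(8))
```

With 8 real cores this renders a stanza with 6 load-balanced processes; treat it as a starting point and adjust the pinned CPU list to your actual core layout.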
>> Have you reviewed everything located here? :
>> Specifically, a few things come to mind. I know you mentioned NIC
>> settings, but are you sure you disabled all the NIC offloading features
>> using ethtool? More detail on that at this link:
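One way to double check the offloading point is to parse the output of `ethtool -k <interface>` and list whatever is still reported as "on". This is only a sketch: the function names are made up for illustration, and not every feature that shows up as "on" is necessarily a problem for capture, so read the result against the offloads you meant to disable (tso, gso, gro, lro, checksumming and so on).

```python
import subprocess


def offloads_still_on(ethtool_output: str) -> list:
    """Return feature names reported as 'on' in `ethtool -k` output."""
    enabled = []
    for line in ethtool_output.splitlines():
        name, sep, state = line.partition(":")
        if not sep:
            continue
        # State looks like "on", "off", or "off [fixed]"; keep the first word.
        words = state.split()
        if words and words[0] == "on":
            enabled.append(name.strip())
    return enabled


def check_interface(interface: str) -> list:
    """Run `ethtool -k <interface>` and list features that are still enabled."""
    result = subprocess.run(["ethtool", "-k", interface],
                            capture_output=True, text=True, check=True)
    return offloads_still_on(result.stdout)
```

`check_interface("eth0")` needs ethtool installed and a real interface name; `offloads_still_on` can be fed saved output from any box.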
>> Also, it wouldn't hurt to double check that the pf_ring kernel module is
>> loaded and staying loaded. If you patch the server and the kernel gets
>> updated, then unless you have something automated to reload/reinstall the
>> pf_ring module you will probably need to reload the pf_ring module for
>> the new kernel.
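A quick sketch for checking whether the module is loaded, by reading /proc/modules on Linux (the same list `lsmod` prints). The helper names are hypothetical, and this only tells you the module is loaded right now, not that it will come back after the next kernel update.

```python
def module_loaded(proc_modules_text: str, name: str = "pf_ring") -> bool:
    """True if `name` is the first field of any line in /proc/modules text."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


def pf_ring_loaded(path: str = "/proc/modules") -> bool:
    """Read the kernel's module list (Linux only) and look for pf_ring."""
    with open(path) as handle:
        return module_loaded(handle.read())
```

Running `pf_ring_loaded()` from cron or a monitoring check after patching is one way to catch a module that did not come back.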
>> Also, did you configure the number of ring slots for PF_RING ?
>> Check to be sure that /etc/modprobe.d/pf_ring.conf exists for your
>> PF_RING installation...this is where you will configure the number of ring
>> slots for PF_RING, the default is 4096 I believe but on busy networks this
>> needs to be increased as appropriate (in increments of 4096)...the max
>> value is 65534. I would try that if you've tried everything else at the
>> first link above to no avail...
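For the ring slots, here is a small sketch that builds the line for /etc/modprobe.d/pf_ring.conf. It assumes the slots are set through the PF_RING module parameter `min_num_slots`. The 4096 default and the 65534 ceiling are simply the numbers quoted above, not verified here, so check them against your PF_RING version.

```python
DEFAULT_SLOTS = 4096  # default quoted in this thread ("I believe"), unverified
MAX_SLOTS = 65534     # ceiling quoted in this thread, unverified


def pf_ring_options_line(increments: int) -> str:
    """Build a modprobe options line with `increments` steps of 4096 slots."""
    if increments < 1:
        raise ValueError("increments must be at least 1")
    # Grow in steps of 4096, but never past the ceiling quoted in the thread.
    slots = min(DEFAULT_SLOTS * increments, MAX_SLOTS)
    return f"options pf_ring min_num_slots={slots}"


if __name__ == "__main__":
    print(pf_ring_options_line(8))
```

The module has to be reloaded after changing the file for the new value to take effect.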
>> This is also a great resource re: PF_RING and number of ring slots:
>> Hope this helps,
>> On Tue, Mar 7, 2017 at 10:34 AM, Arash Fallah <af7 at umbc.edu> wrote:
>>> I'm running Bro in a clustered configuration using PF_RING to have 8
>>> separate workers on one box. Additionally, I have commented out almost
>>> everything in the default local.bro to run Bro as efficiently as
>>> possible. Together, these 8 workers are using less than 20% of total CPU.
>>> However, we are experiencing capture loss consistently in the 50% range,
>>> even though CPUs are idle 80% of the time on average.
>>> Does anyone have any experience with this? I would greatly appreciate
>>> the help.
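To see which workers account for the loss, the numbers can be pulled out of capture_loss.log. This sketch assumes the usual tab-separated Bro log with a `#fields` header containing `peer`, `gaps` and `acks`, and computes loss as gaps over acks. Check the field names against your own log before relying on it.

```python
def percent_lost(gaps: int, acks: int) -> float:
    """Loss as a percentage of ACKs seen; zero when there were no ACKs."""
    return 100.0 * gaps / acks if acks else 0.0


def loss_by_peer(log_text: str) -> dict:
    """Sum gaps and acks per peer over the log and return percent lost."""
    fields = []
    totals = {}
    for line in log_text.splitlines():
        if line.startswith("#fields"):
            # Header row names the columns of every data row that follows.
            fields = line.split("\t")[1:]
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split("\t")))
        gaps, acks = totals.get(row["peer"], (0, 0))
        totals[row["peer"]] = (gaps + int(row["gaps"]), acks + int(row["acks"]))
    return {peer: percent_lost(g, a) for peer, (g, a) in totals.items()}
```

If every worker shows roughly the same loss while the CPUs sit idle, that points away from the Bro box and toward whatever is feeding it traffic.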