[Bro] Bro 2.0 packets dropped

Mon Feb 6 11:30:22 PST 2012

Here's my problem. I have a single server and have defined 10 workers on it to divide up the load. Here's the output of "broctl status":

[root at homey manager]# broctl status
Name       Type       Host       Status        Pid    Peers  Started              
manager    manager    homey.tacc.utexas.edu running       6202   11     05 Feb 15:22:15  
proxy-1    proxy      homey.tacc.utexas.edu running       6237   11     05 Feb 15:22:17  
worker-1   worker     mojo1.tacc.utexas.edu running       18356  2      05 Feb 15:26:41  
worker-10  worker     mojo1.tacc.utexas.edu running       18350  2      05 Feb 15:26:41  
worker-2   worker     mojo1.tacc.utexas.edu running       18348  2      05 Feb 15:26:41  
worker-3   worker     mojo1.tacc.utexas.edu running       18349  2      05 Feb 15:26:41  
worker-4   worker     mojo1.tacc.utexas.edu running       18357  2      05 Feb 15:26:41  
worker-5   worker     mojo1.tacc.utexas.edu running       18352  2      05 Feb 15:26:41  
worker-6   worker     mojo1.tacc.utexas.edu running       18353  2      05 Feb 15:26:41  
worker-7   worker     mojo1.tacc.utexas.edu running       18354  2      05 Feb 15:26:41  
worker-8   worker     mojo1.tacc.utexas.edu running       18355  2      05 Feb 15:26:41  
worker-9   worker     mojo1.tacc.utexas.edu running       18351  2      05 Feb 15:26:41  

I now add worker-11 to the configuration and "broctl status" returns:

[BroControl] > status
Name       Type       Host       Status        Pid    Peers  Started              
manager    manager    homey.tacc.utexas.edu running       29316  12     06 Feb 13:16:59  
proxy-1    proxy      homey.tacc.utexas.edu running       29351  12     06 Feb 13:17:01  
worker-1   worker     mojo1.tacc.utexas.edu running       25026  2      06 Feb 13:17:06  
worker-10  worker     mojo1.tacc.utexas.edu running       25028  2      06 Feb 13:17:06  
worker-11  worker     mojo1.tacc.utexas.edu running       25033  2      06 Feb 13:17:06  
worker-2   worker     mojo1.tacc.utexas.edu running       25032  2      06 Feb 13:17:06  
worker-3   worker     mojo1.tacc.utexas.edu running       25025  2      06 Feb 13:17:06  
worker-4   worker     mojo1.tacc.utexas.edu running       25031  2      06 Feb 13:17:06  
worker-5   worker     mojo1.tacc.utexas.edu running       25029  2      06 Feb 13:17:06  
worker-6   worker     mojo1.tacc.utexas.edu running       25027  2      06 Feb 13:17:06  
worker-7   worker     mojo1.tacc.utexas.edu running       25034  2      06 Feb 13:17:06  
worker-8   worker     mojo1.tacc.utexas.edu running       25030  2      06 Feb 13:17:06  
worker-9   worker     mojo1.tacc.utexas.edu running       25036  ???    06 Feb 13:17:06  

Notice the ??? in the Peers column for worker-9. It is an indication that something is not working correctly in the Bro communication library.
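
To spot this automatically, the status output can be scanned for rows whose Peers column is not a number. This is only a rough sketch: it assumes the whitespace-separated column layout shown above, and the function name find_unknown_peers is made up for illustration, not part of BroControl.

```python
def find_unknown_peers(status_text):
    """Return the names of running nodes whose Peers column is not a number."""
    suspects = []
    for line in status_text.splitlines():
        fields = line.split()
        # Data rows look like: name type host status pid peers started...
        if len(fields) < 6 or fields[0] == "Name":
            continue  # skip blank lines and the header row
        if fields[3] != "running":
            continue  # assumption: only running nodes report a peer count
        if not fields[5].isdigit():
            suspects.append(fields[0])
    return suspects


sample = """\
Name       Type       Host       Status        Pid    Peers  Started
manager    manager    homey.tacc.utexas.edu running       29316  12     06 Feb 13:16:59
worker-8   worker     mojo1.tacc.utexas.edu running       25030  2      06 Feb 13:17:06
worker-9   worker     mojo1.tacc.utexas.edu running       25036  ???    06 Feb 13:17:06
"""

print(find_unknown_peers(sample))
```

Feeding it the second status listing above would flag worker-9 only.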

-----Original Message-----
From: bro-bounces at bro-ids.org [mailto:bro-bounces at bro-ids.org] On Behalf Of Seth Hall
Sent: Monday, February 06, 2012 12:55 PM
To: Machiel van Veen
Cc: bro at bro-ids.org
Subject: Re: [Bro] Bro 2.0 packets dropped

On Feb 3, 2012, at 10:18 AM, Machiel van Veen wrote:

> It is one interface, there might be a problem load balancing. I've switched to 
> a standalone setup for now.

If you aren't taking any steps to load balance the traffic then it definitely isn't working.  We don't have automated load balanced configuration available in BroControl yet. :)

Today, I did just write a script that automates a BPF based load balancing technique on clusters which will be getting merged in along with the rest of the automated load balancing code soon.
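
For anyone wondering what a BPF based split can look like, here is a minimal sketch in Python. It is one possible scheme and not necessarily what the script mentioned above does: each worker gets a filter that matches IPv4 packets whose source and destination addresses hash to that worker's index. The function names are hypothetical, and the modulo is spelled out with integer division on the assumption that the filter compiler in use may not offer a modulo operator.

```python
import ipaddress


def bpf_filters(num_workers):
    """Build one BPF expression per worker, splitting IPv4 traffic by address hash."""
    # ip[14:2] is the low 16 bits of the source address,
    # ip[18:2] is the low 16 bits of the destination address.
    key = "(ip[14:2]+ip[18:2])"
    filters = []
    for k in range(num_workers):
        # x - n*(x/n) equals x mod n when "/" is integer division.
        filters.append(
            f"ip and ({key} - ({num_workers}*({key}/{num_workers})) == {k})"
        )
    return filters


def worker_for(src, dst, num_workers):
    """Pure-Python model of the same hash, handy for checking the split."""
    s = int(ipaddress.IPv4Address(src)) & 0xFFFF
    d = int(ipaddress.IPv4Address(dst)) & 0xFFFF
    x = s + d
    return x - num_workers * (x // num_workers)


for f in bpf_filters(2):
    print(f)
```

Because the hash adds the two addresses, both directions of a connection land on the same worker, which matters for any per-connection analysis. Non-IPv4 traffic is not covered by these filters and would need separate handling.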

> "bro: 1328281729.277621 recvd=3553337 dropped=4503 link=3557842"
> "2012-02-03-15:39:46 CaptureLoss::Too_Much_Loss

> The capture loss script detected an estimated loss rate above 27.282%"

Are you sniffing from a tap or a SPAN port?  I'm a little suspicious because the first line indicates that the NIC was showing 0.1% packet loss, but the second line indicates much more loss.  The misc/capture-loss.bro script can detect loss due to reasons beyond the monitoring host (like an overloaded SPAN port), so I'm just trying to figure out why there is such a huge disparity between the two measurements.
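
The 0.1% figure can be reproduced from the quoted stats line. The sketch below assumes the loss rate is taken as dropped divided by link (packets seen on the wire); the function name nic_drop_percent is made up for illustration.

```python
import re


def nic_drop_percent(stats_line):
    """Compute the NIC-level drop percentage from a recvd/dropped/link stats line."""
    m = re.search(r"recvd=(\d+)\s+dropped=(\d+)\s+link=(\d+)", stats_line)
    if m is None:
        raise ValueError("not a recognised stats line")
    recvd, dropped, link = map(int, m.groups())
    # Assumption: loss is measured against packets seen on the link.
    return 100.0 * dropped / link


line = "bro: 1328281729.277621 recvd=3553337 dropped=4503 link=3557842"
print(f"{nic_drop_percent(line):.3f}%")
```

That works out to roughly 0.13%, which is two orders of magnitude below the 27.282% reported in the capture loss notice.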

Oh, one other thought.  Are you disabling all of the offload features of your NIC?  Here's an article about it:
	1. http://securityonion.blogspot.com/2011/10/when-is-full-packet-capture-not-full.html
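
As a rough sketch of what disabling offloading can involve on Linux, the snippet below builds an "ethtool -K" command that switches a set of offload features off. The feature list is an assumption (not every driver supports every feature, and ethtool may report an error for unsupported ones), the interface name is a placeholder, and the helper names are hypothetical. By default it only prints the command rather than running it.

```python
import subprocess

# Assumed set of offload features to turn off; adjust to what the driver supports.
OFFLOAD_FEATURES = ("rx", "tx", "sg", "tso", "gso", "gro", "lro")


def offload_disable_command(interface, features=OFFLOAD_FEATURES):
    """Return the ethtool argument list that turns the given features off."""
    cmd = ["ethtool", "-K", interface]
    for feature in features:
        cmd += [feature, "off"]
    return cmd


def disable_offload(interface, dry_run=True):
    """Print the command (dry run) or execute it; executing requires root."""
    cmd = offload_disable_command(interface)
    if dry_run:
        print(" ".join(cmd))
    else:
        subprocess.run(cmd, check=True)
    return cmd


disable_offload("eth0")  # "eth0" is a placeholder interface name
```

Running "ethtool -k" (lower case) on the interface afterwards shows which features actually ended up disabled.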

Is the MTU on your NIC larger than 8192 (Bro 2.0's default snaplen)?  If there are packets larger than that, they won't be seen by default.
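
A quick way to check this on Linux is to read the interface MTU from sysfs and compare it against the snaplen. This is a minimal sketch: the 8192 value is the one quoted above, the sysfs path is the usual Linux location, and the function names are made up for illustration.

```python
from pathlib import Path

DEFAULT_SNAPLEN = 8192  # Bro 2.0 default snaplen as quoted above


def read_mtu(interface, sysfs_root="/sys/class/net"):
    """Read the configured MTU of a Linux network interface from sysfs."""
    return int(Path(sysfs_root, interface, "mtu").read_text().strip())


def mtu_exceeds_snaplen(mtu, snaplen=DEFAULT_SNAPLEN):
    """True if packets up to this MTU could be larger than the snaplen."""
    return mtu > snaplen


print(mtu_exceeds_snaplen(1500))  # standard Ethernet MTU
print(mtu_exceeds_snaplen(9000))  # typical jumbo-frame MTU
```

Note that the MTU does not include the link-layer header, so a frame on the wire is slightly larger than the MTU value; the comparison here follows the simpler MTU-versus-snaplen check described above.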

>> Oh, that brings up another question.  What NICs are you using?
> 
> Broadcom NetXtreme II BCM5708 1000Base-T (B2) PCI-X 64-bit 133MHz
> driver: bnx2
> version: 2.1.11
> firmware-version: bc 4.6.0 ipms 1.6.0

I usually recommend not using Broadcom NICs for monitoring.  I've run into weird problems with various Broadcom NICs at times, so I tend to avoid them.

  .Seth

--
Seth Hall
International Computer Science Institute
(Bro) because everyone has a network
http://www.bro-ids.org/
