[Bro] Unexplained Performance Differences Between Like Servers

Mike Reeves luke at geekempire.com
Fri Jun 13 07:05:20 PDT 2014


Different traffic profiles can cause different performance. My guess is that one box is seeing more traffic of a certain type than the other. To know for sure you would need to profile the traffic, but think of it this way: more HTTP traffic, for instance, means more files being processed, and so on.
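For a quick, rough look at the mix, the service column of conn.log is enough to start with. A minimal sketch, assuming the default tab-separated conn.log layout where service is column 8 (check your log's #fields header and adjust the index if it differs):

```shell
# Count connections per service from a Bro conn.log.
# Assumes the default field order: ts, uid, id.orig_h, id.orig_p,
# id.resp_h, id.resp_p, proto, service, ...
service_mix() {
    awk -F'\t' '!/^#/ { svc = ($8 == "" ? "-" : $8); n[svc]++ }
                END { for (s in n) print n[s], s }' "$1" | sort -rn
}
```

Running `service_mix conn.log` on both boxes and comparing the top entries would show whether one server really is handling a heavier protocol mix.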

Mike

On Jun 13, 2014, at 9:49 AM, John Hoyt <john.h.hoyt at gmail.com> wrote:

> Hey Jason,
> 
> What versions of Bro, and is it the same on both?  I had some serious resource issues with one of the Beta versions recently, and switched back to the stable version. 
> 
> -John
> 
> 
> On Fri, Jun 13, 2014 at 9:01 AM, Jason Batchelor <jxbatchelor at gmail.com> wrote:
> Hello everyone:
>  
> I have Bro installed on two Dell r720s each with the following specs...
>  
> Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz x 32
> 48GB RAM
>  
> Running: CentOs 6.5
>  
> Both have the following PF_RING configuration:
>  
> PF_RING Version          : 6.0.2 ($Revision: 7746$)
> Total rings              : 16
> Standard (non DNA) Options
> Ring slots               : 32768
> Slot version             : 15
> Capture TX               : No [RX only]
> IP Defragment            : No
> Socket Mode              : Standard
> Transparent mode         : Yes [mode 0]
> Total plugins            : 0
> Cluster Fragment Queue   : 1917
> Cluster Fragment Discard : 26648
> The only PF_RING difference is that the other server (Server A) is on revision 7601, while B is on rev 7746.
>  
> I've tuned the NIC to the following settings...
>  
> ethtool -K p4p2 tso off
> ethtool -K p4p2 gro off
> ethtool -K p4p2 lro off
> ethtool -K p4p2 gso off
> ethtool -K p4p2 rx off
> ethtool -K p4p2 tx off
> ethtool -K p4p2 sg off
> ethtool -K p4p2 rxvlan off
> ethtool -K p4p2 txvlan off
> ethtool -N p4p2 rx-flow-hash udp4 sdfn
> ethtool -N p4p2 rx-flow-hash udp6 sdfn
> ethtool -n p4p2 rx-flow-hash udp6
> ethtool -n p4p2 rx-flow-hash udp4
> ethtool -C p4p2 rx-usecs 1000
> ethtool -C p4p2 adaptive-rx off
> ethtool -G p4p2 rx 4096
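Since the goal is identical tuning on both boxes, the offload settings above can be scripted rather than typed per server. A sketch (the `echo` makes it a dry run that just prints the commands; drop it to actually apply them, and the interface name is a parameter):

```shell
# Disable the same set of NIC offload features on a given interface,
# mirroring the ethtool -K commands above.
apply_offload_settings() {
    iface="$1"
    for feat in tso gro lro gso rx tx sg rxvlan txvlan; do
        echo ethtool -K "$iface" "$feat" off   # remove 'echo' to apply
    done
}
```

Running the same script on both servers removes one source of accidental drift between them.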
> I've got the following sysctl settings on each.
>  
> # turn off selective ACK and timestamps
> net.ipv4.tcp_sack = 0
> net.ipv4.tcp_timestamps = 0
> # memory allocation min/pressure/max.
> # read buffer, write buffer, and buffer space
> net.ipv4.tcp_rmem = 10000000 10000000 10000000
> net.ipv4.tcp_wmem = 10000000 10000000 10000000
> net.ipv4.tcp_mem = 10000000 10000000 10000000
> net.core.rmem_max = 524287
> net.core.wmem_max = 524287
> net.core.rmem_default = 524287
> net.core.wmem_default = 524287
> net.core.optmem_max = 524287
> net.core.netdev_max_backlog = 300000
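To rule out drift in kernel settings beyond the ones listed here, full dumps from both servers can be compared. A sketch, assuming `sysctl -a` output has been saved from each box:

```shell
# Print only the sysctl lines that differ between two 'sysctl -a' dumps.
# comm -3 suppresses lines common to both (inputs must be sorted).
diff_sysctl() {
    sort "$1" > /tmp/_a.sorted
    sort "$2" > /tmp/_b.sorted
    comm -3 /tmp/_a.sorted /tmp/_b.sorted
}
```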
> Each server's Bro configuration (node.cfg) uses the following...
>  
> [manager]
> type=manager
> host=localhost
> [proxy-1]
> type=proxy
> host=localhost
> [worker-1]
> type=worker
> host=localhost
> interface=p4p2
> lb_method=pf_ring
> lb_procs=16
> Both have the same NIC driver version (ixgbe):
> 3.15.1-k
>  
> Same services installed (min install).
>  
> Slightly different Kernel versions...
> Server A (2.6.32-431.11.2.el6.x86_64)
> Server B (2.6.32-431.17.1.el6.x86_64)
>  
>  
> At the moment Server A is getting about 700MB/s and Server B is getting about 600Mb/s.
>  
> What I don't understand is why Server A is seeing several orders of magnitude better performance compared to Server B.
>  
> TOP from A (included a few bro workers):
>  
> top - 12:48:45 up 1 day, 17:03,  2 users,  load average: 5.30, 3.99, 3.13
> Tasks: 706 total,  19 running, 687 sleeping,   0 stopped,   0 zombie
> Cpu(s): 33.9%us,  6.6%sy,  1.1%ni, 57.2%id,  0.0%wa,  0.0%hi,  1.2%si,  0.0%st
> Mem:  49376004k total, 33605828k used, 15770176k free,    93100k buffers
> Swap:  2621432k total,     9760k used,  2611672k free,  9206880k cached
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  5768 root      20   0 1808m 1.7g 519m R 100.0  3.6  32:24.92 bro
>  5760 root      20   0 1688m 1.6g 519m R 99.7  3.4  34:08.36 bro
>  3314 root      20   0 2160m 269m 4764 R 96.1  0.6  30:14.12 bro
>  5754 root      20   0 1451m 1.4g 519m R 82.8  2.9  36:40.02 bro
> TOP from B (included a few bro workers)
>  
> top - 12:49:33 up 14:24,  2 users,  load average: 10.28, 9.31, 8.06
> Tasks: 708 total,  25 running, 683 sleeping,   0 stopped,   0 zombie
> Cpu(s): 41.6%us,  6.0%sy,  1.0%ni, 50.4%id,  0.0%wa,  0.0%hi,  1.1%si,  0.0%st
> Mem:  49376004k total, 31837340k used, 17538664k free,   147212k buffers
> Swap:  2621432k total,        0k used,  2621432k free, 13494332k cached
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  3178 root      20   0 1073m 1.0g 264m R 100.0  2.1 401:47.31 bro
>  3188 root      20   0  881m 832m 264m R 100.0  1.7 377:48.90 bro
>  3189 root      20   0 1247m 1.2g 264m R 100.0  2.5 403:22.95 bro
>  3193 root      20   0  920m 871m 264m R 100.0  1.8 429:45.98 bro
> Both have the same number of Bro workers. I just do not understand why Server A is running at literally half the utilization while also seeing more traffic. The only real and consistent difference I see between the two is that Server A's workers seem to have twice as much SHR (shared memory) as Server B's.
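For a less eyeball-based comparison than picking a few workers out of top, the per-process numbers can be averaged from a batch-mode snapshot (`top -b -n 1 > snap.txt` on each server). A sketch, assuming the default top column layout shown above, where %CPU is field 9 and COMMAND is the last field:

```shell
# Average %CPU across all bro processes in a 'top -b -n 1' snapshot.
worker_cpu() {
    awk '$NF == "bro" { sum += $9; n++ }
         END { if (n) printf "%d bro procs, avg %.1f%% CPU\n", n, sum/n }' "$1"
}
```

Comparing the averages from both boxes (rather than a handful of PIDs) would confirm whether the utilization gap is across the board or driven by a few hot workers.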
>  
> Could this be part of the issue, if not the root cause? How might I go about rectifying the issue?
>  
> FWIW, neither is dropping packets and both are doing well. However, I want to run other apps on top of this, and the poor performance on Server B is likely to affect them.
>  
> Thanks in advance for the advice!
>  
> -Jason
>  
>  
>  
> 
> _______________________________________________
> Bro mailing list
> bro at bro-ids.org
> http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/bro
> 
