[Bro] Unexplained Performance Differences Between Like Servers

Jason Batchelor jxbatchelor at gmail.com
Fri Jun 13 06:01:54 PDT 2014


Hello everyone:

I have Bro installed on two Dell R720s, each with the following specs...

Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz x 32
48GB RAM

Running: CentOS 6.5

Both have the following PF_RING configuration:

PF_RING Version          : 6.0.2 ($Revision: 7746$)
Total rings              : 16
Standard (non DNA) Options
Ring slots               : 32768
Slot version             : 15
Capture TX               : No [RX only]
IP Defragment            : No
Socket Mode              : Standard
Transparent mode         : Yes [mode 0]
Total plugins            : 0
Cluster Fragment Queue   : 1917
Cluster Fragment Discard : 26648

The only PF_RING difference is that Server A is running revision 7601, while Server B is on revision 7746.
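
To rule out other PF_RING drift beyond the revision number, it can help to capture the module state on both boxes and diff it. A minimal sketch; the printf lines are hypothetical stand-ins for dumps of /proc/net/pf_ring/info from each server:

```shell
# On each server, capture PF_RING state, e.g.:
#   cat /proc/net/pf_ring/info > pfring-$(hostname).txt
# Hypothetical sample dumps standing in for the real captures:
printf 'PF_RING Version : 6.0.2 (7601)\nRing slots : 32768\n' > pfring-A.txt
printf 'PF_RING Version : 6.0.2 (7746)\nRing slots : 32768\n' > pfring-B.txt
# Show only what differs (diff exits non-zero when files differ, hence || true):
diff pfring-A.txt pfring-B.txt || true
```
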

I've tuned the NIC with the following commands (the two "ethtool -n" invocations just display the flow-hash settings for verification)...

ethtool -K p4p2 tso off
ethtool -K p4p2 gro off
ethtool -K p4p2 lro off
ethtool -K p4p2 gso off
ethtool -K p4p2 rx off
ethtool -K p4p2 tx off
ethtool -K p4p2 sg off
ethtool -K p4p2 rxvlan off
ethtool -K p4p2 txvlan off
ethtool -N p4p2 rx-flow-hash udp4 sdfn
ethtool -N p4p2 rx-flow-hash udp6 sdfn
ethtool -n p4p2 rx-flow-hash udp6
ethtool -n p4p2 rx-flow-hash udp4
ethtool -C p4p2 rx-usecs 1000
ethtool -C p4p2 adaptive-rx off
ethtool -G p4p2 rx 4096
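
Since both boxes are meant to be identical, it may also be worth verifying that every offload actually took effect; some driver builds silently refuse certain ethtool changes. A minimal sketch that scans ethtool -k style output for features still enabled; the sample text below is a hypothetical stand-in for a live "ethtool -k p4p2" run:

```shell
# Hypothetical sample output (replace with: features=$(ethtool -k p4p2))
features='
generic-receive-offload: off
tcp-segmentation-offload: off
large-receive-offload: on
'
# Print any offload still enabled:
echo "$features" | awk -F': ' '$2 == "on" {print $1}'
```
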

I've set the following sysctl settings on each:


# turn off selective ACK and timestamps
net.ipv4.tcp_sack = 0
net.ipv4.tcp_timestamps = 0
# memory allocation min/pressure/max.
# read buffer, write buffer, and buffer space
net.ipv4.tcp_rmem = 10000000 10000000 10000000
net.ipv4.tcp_wmem = 10000000 10000000 10000000
net.ipv4.tcp_mem = 10000000 10000000 10000000
net.core.rmem_max = 524287
net.core.wmem_max = 524287
net.core.rmem_default = 524287
net.core.wmem_default = 524287
net.core.optmem_max = 524287
net.core.netdev_max_backlog = 300000
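
One value worth double-checking here: net.ipv4.tcp_mem is measured in pages (typically 4 KiB), not bytes, while tcp_rmem/tcp_wmem are in bytes — so 10000000 in tcp_mem means roughly 38 GiB on a 48 GB box. A quick sanity calculation, assuming 4 KiB pages:

```shell
pages=10000000          # the configured tcp_mem max
page_size=4096          # assumed page size; check with: getconf PAGESIZE
awk -v p="$pages" -v s="$page_size" 'BEGIN {printf "%.1f GiB\n", p*s/(1024^3)}'
# prints 38.1 GiB
```
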

Each Bro configuration (node.cfg) uses the following:


[manager]
type=manager
host=localhost
[proxy-1]
type=proxy
host=localhost
[worker-1]
type=worker
host=localhost
interface=p4p2
lb_method=pf_ring
lb_procs=16
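
One knob that can make two otherwise-identical hosts behave differently is where the scheduler places the workers. Recent BroControl versions support pinning each of the lb_procs workers to a core via pin_cpus, so both servers get the same layout; a sketch (the core list is illustrative):

```
[worker-1]
type=worker
host=localhost
interface=p4p2
lb_method=pf_ring
lb_procs=16
pin_cpus=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
```
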

Both have the same NIC driver (ixgbe), version 3.15.1-k.

Both have the same services installed (minimal install).

Slightly different Kernel versions...
Server A (2.6.32-431.11.2.el6.x86_64)
Server B (2.6.32-431.17.1.el6.x86_64)
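
Since the kernels already differ, other package-level drift may be worth ruling out too. A minimal sketch: dump a sorted package list on each box and compare; the printf lines are hypothetical stand-ins for real "rpm -qa | sort" output:

```shell
# On each box:  rpm -qa | sort > pkgs-$(hostname).txt
# Hypothetical stand-ins for the two package lists:
printf 'kernel-2.6.32-431.11.2.el6\n' > pkgs-A.txt
printf 'kernel-2.6.32-431.17.1.el6\n' > pkgs-B.txt
# Lines unique to either box (column 1 = A only, column 2 = B only):
comm -3 pkgs-A.txt pkgs-B.txt
```
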


At the moment Server A is getting about 700 Mb/s and Server B is getting
about 600 Mb/s.

What I don't understand is why Server A shows dramatically better
performance than Server B.

top output from Server A (including a few Bro workers):

top - 12:48:45 up 1 day, 17:03,  2 users,  load average: 5.30, 3.99, 3.13
Tasks: 706 total,  19 running, 687 sleeping,   0 stopped,   0 zombie
Cpu(s): 33.9%us,  6.6%sy,  1.1%ni, 57.2%id,  0.0%wa,  0.0%hi,  1.2%si,
0.0%st
Mem:  49376004k total, 33605828k used, 15770176k free,    93100k buffers
Swap:  2621432k total,     9760k used,  2611672k free,  9206880k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 5768 root      20   0 1808m 1.7g 519m R 100.0  3.6  32:24.92 bro
 5760 root      20   0 1688m 1.6g 519m R 99.7  3.4  34:08.36 bro
 3314 root      20   0 2160m 269m 4764 R 96.1  0.6  30:14.12 bro
 5754 root      20   0 1451m 1.4g 519m R 82.8  2.9  36:40.02 bro
top output from Server B (including a few Bro workers):

top - 12:49:33 up 14:24,  2 users,  load average: 10.28, 9.31, 8.06
Tasks: 708 total,  25 running, 683 sleeping,   0 stopped,   0 zombie
Cpu(s): 41.6%us,  6.0%sy,  1.0%ni, 50.4%id,  0.0%wa,  0.0%hi,  1.1%si,
0.0%st
Mem:  49376004k total, 31837340k used, 17538664k free,   147212k buffers
Swap:  2621432k total,        0k used,  2621432k free, 13494332k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3178 root      20   0 1073m 1.0g 264m R 100.0  2.1 401:47.31 bro
 3188 root      20   0  881m 832m 264m R 100.0  1.7 377:48.90 bro
 3189 root      20   0 1247m 1.2g 264m R 100.0  2.5 403:22.95 bro
 3193 root      20   0  920m 871m 264m R 100.0  1.8 429:45.98 bro
Both have the same number of Bro workers. I just do not understand
why Server A runs at literally half the utilization while also seeing more
traffic. The only real and consistent difference I see between the two is
that Server A's workers show about twice the shared memory (SHR) of
Server B's.
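
To put a firmer number on the utilization gap than eyeballing a few top rows, the per-worker %CPU can be averaged from a batch-mode top snapshot. A minimal sketch; the embedded rows are a stand-in for real "top -b -n 1" output:

```shell
# Stand-in for:  top -b -n 1 | grep ' bro$'
snapshot='
 5768 root  20   0 1808m 1.7g 519m R 100.0  3.6  32:24.92 bro
 5760 root  20   0 1688m 1.6g 519m R  99.7  3.4  34:08.36 bro
 5754 root  20   0 1451m 1.4g 519m R  82.8  2.9  36:40.02 bro
'
# Column 9 is %CPU in top's default layout; average over all bro workers:
echo "$snapshot" | awk '$NF == "bro" {sum += $9; n++}
                        END {printf "%.1f%% average over %d workers\n", sum/n, n}'
# prints 94.2% average over 3 workers
```
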

Could this be part of the issue, if not the root cause? How might I go
about rectifying the issue?

FWIW, neither is dropping packets, and both are keeping up. However, I want
to run other apps on top of this, and the higher load on Server B is likely
to affect them.

Thanks in advance for the advice!

-Jason