[Bro] Unexplained Performance Differences Between Like Servers
Gilbert Clark
gc355804 at ohio.edu
Fri Jun 13 06:54:51 PDT 2014
Hi Jason:
Is the type of traffic in the 600 Mbps stream similar to the type of
traffic in the 700 Mbps stream?
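A quick, rough way to compare the two mixes (assuming tcpdump is available on both sensors; the interface name p4p2 comes from your config below):

```shell
# Sample ~100k packets and tally TCP vs. UDP from tcpdump's
# one-line-per-packet output; run the same command on both sensors.
tcpdump -nn -i p4p2 -c 100000 2>/dev/null |
awk '/ UDP,/ { udp++ }
     /Flags \[/ { tcp++ }
     END { print "tcp:", tcp, "udp:", udp }'
```

A very different TCP/UDP split, or very different average packet sizes, between the two streams could explain different per-worker costs.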
Cheers,
Gilbert Clark
On 6/13/14, 9:01 AM, Jason Batchelor wrote:
> Hello everyone:
> I have Bro installed on two Dell r720s each with the following specs...
> Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz x 32
> 48GB RAM
> Running: CentOs 6.5
> Both have the following PF_RING configuration:
> PF_RING Version : 6.0.2 ($Revision: 7746$)
> Total rings : 16
> Standard (non DNA) Options
> Ring slots : 32768
> Slot version : 15
> Capture TX : No [RX only]
> IP Defragment : No
> Socket Mode : Standard
> Transparent mode : Yes [mode 0]
> Total plugins : 0
> Cluster Fragment Queue : 1917
> Cluster Fragment Discard : 26648
> The only PF_RING difference is that Server A is running revision 7601,
> while Server B is on revision 7746.
> I've tuned the NIC to the following settings...
> ethtool -K p4p2 tso off
> ethtool -K p4p2 gro off
> ethtool -K p4p2 lro off
> ethtool -K p4p2 gso off
> ethtool -K p4p2 rx off
> ethtool -K p4p2 tx off
> ethtool -K p4p2 sg off
> ethtool -K p4p2 rxvlan off
> ethtool -K p4p2 txvlan off
> ethtool -N p4p2 rx-flow-hash udp4 sdfn
> ethtool -N p4p2 rx-flow-hash udp6 sdfn
> ethtool -n p4p2 rx-flow-hash udp6
> ethtool -n p4p2 rx-flow-hash udp4
> ethtool -C p4p2 rx-usecs 1000
> ethtool -C p4p2 adaptive-rx off
> ethtool -G p4p2 rx 4096
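These per-interface settings could be scripted so both sensors stay identical; a minimal sketch, assuming the interface name p4p2 from above (the function name is illustrative; run as root):

```shell
# tune_nic: apply the offload, flow-hash, and coalescing settings from
# above to one interface, so both sensors get the exact same treatment.
tune_nic() {
    ifc=$1
    # Disable offloads so Bro sees packets as they appear on the wire.
    for feat in tso gro lro gso rx tx sg rxvlan txvlan; do
        ethtool -K "$ifc" "$feat" off
    done
    # Hash UDP on src/dst addr+port so flows spread across RX queues.
    ethtool -N "$ifc" rx-flow-hash udp4 sdfn
    ethtool -N "$ifc" rx-flow-hash udp6 sdfn
    # Interrupt coalescing and ring size, as in the settings above.
    ethtool -C "$ifc" rx-usecs 1000
    ethtool -C "$ifc" adaptive-rx off
    ethtool -G "$ifc" rx 4096
}
# e.g.: tune_nic p4p2
```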
> I've got the following sysctl settings on each.
> # turn off selective ACK and timestamps
> net.ipv4.tcp_sack = 0
> net.ipv4.tcp_timestamps = 0
> # memory allocation min/pressure/max.
> # read buffer, write buffer, and buffer space
> net.ipv4.tcp_rmem = 10000000 10000000 10000000
> net.ipv4.tcp_wmem = 10000000 10000000 10000000
> net.ipv4.tcp_mem = 10000000 10000000 10000000
> net.core.rmem_max = 524287
> net.core.wmem_max = 524287
> net.core.rmem_default = 524287
> net.core.wmem_default = 524287
> net.core.optmem_max = 524287
> net.core.netdev_max_backlog = 300000
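One thing worth double-checking on both boxes: values set with `sysctl -w` don't survive a reboot. A fragment along these lines in `/etc/sysctl.conf` (reloaded with `sysctl -p`) keeps the two servers consistent:

```
# /etc/sysctl.conf fragment -- reload with: sysctl -p
net.ipv4.tcp_sack = 0
net.ipv4.tcp_timestamps = 0
# ...plus the tcp_rmem/tcp_wmem/tcp_mem and net.core.* values listed above
```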
> Each bro configuration is using the following...
> [manager]
> type=manager
> host=localhost
> [proxy-1]
> type=proxy
> host=localhost
> [worker-1]
> type=worker
> host=localhost
> interface=p4p2
> lb_method=pf_ring
> lb_procs=16
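Since lb_method=pf_ring hashes flows across the 16 workers, a skewed hash (a few elephant flows, for instance) would make some workers run hot on one box but not the other. Assuming the ixgbe driver exposes per-queue counters through `ethtool -S`, a rough spread check:

```shell
# Summarize per-RX-queue packet counts to spot skew in the flow-hash
# distribution; a large max/min gap means a few queues (and thus a few
# Bro workers) are carrying most of the load.
ethtool -S p4p2 |
awk -F': *' '/rx_queue_[0-9]+_packets/ {
                 n++; tot += $2
                 if ($2 > max) max = $2
                 if (min == "" || $2 + 0 < min + 0) min = $2
             }
             END { printf "queues=%d total=%d min=%d max=%d\n", n, tot, min, max }'
```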
> Both have the same NIC driver version (ixgbe):
> 3.15.1-k
> Same services installed (min install).
> Slightly different Kernel versions...
> Server A (2.6.32-431.11.2.el6.x86_64)
> Server B (2.6.32-431.17.1.el6.x86_64)
> At the moment Server A is seeing about 700 Mbps and Server B about
> 600 Mbps. What I don't understand is why Server A shows markedly
> better performance than Server B.
> TOP from A (including a few Bro workers):
> top - 12:48:45 up 1 day, 17:03, 2 users, load average: 5.30, 3.99, 3.13
> Tasks: 706 total, 19 running, 687 sleeping, 0 stopped, 0 zombie
> Cpu(s): 33.9%us, 6.6%sy, 1.1%ni, 57.2%id, 0.0%wa, 0.0%hi, 1.2%si,
> 0.0%st
> Mem: 49376004k total, 33605828k used, 15770176k free, 93100k buffers
> Swap: 2621432k total, 9760k used, 2611672k free, 9206880k cached
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 5768 root 20 0 1808m 1.7g 519m R 100.0 3.6 32:24.92 bro
> 5760 root 20 0 1688m 1.6g 519m R 99.7 3.4 34:08.36 bro
> 3314 root 20 0 2160m 269m 4764 R 96.1 0.6 30:14.12 bro
> 5754 root 20 0 1451m 1.4g 519m R 82.8 2.9 36:40.02 bro
> TOP from B (including a few Bro workers):
> top - 12:49:33 up 14:24, 2 users, load average: 10.28, 9.31, 8.06
> Tasks: 708 total, 25 running, 683 sleeping, 0 stopped, 0 zombie
> Cpu(s): 41.6%us, 6.0%sy, 1.0%ni, 50.4%id, 0.0%wa, 0.0%hi, 1.1%si,
> 0.0%st
> Mem: 49376004k total, 31837340k used, 17538664k free, 147212k buffers
> Swap: 2621432k total, 0k used, 2621432k free, 13494332k cached
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 3178 root 20 0 1073m 1.0g 264m R 100.0 2.1 401:47.31 bro
> 3188 root 20 0 881m 832m 264m R 100.0 1.7 377:48.90 bro
> 3189 root 20 0 1247m 1.2g 264m R 100.0 2.5 403:22.95 bro
> 3193 root 20 0 920m 871m 264m R 100.0 1.8 429:45.98 bro
> Both run the same number of Bro workers. I just do not understand why
> Server A shows roughly half the CPU utilization while seeing more
> traffic. The only real and consistent difference between the two that
> I see is that Server A's workers show about twice the shared memory
> (SHR) of Server B's.
> Could this be part of the issue, if not the root cause? How might I go
> about rectifying it?
> FWIW, neither server is dropping packets, and both are otherwise doing
> well. However, I want to run other applications on top of this, and
> the poorer performance on Server B is likely to affect them.
> Thanks in advance for the advice!
> -Jason