[Bro] Unexplained Performance Differences Between Like Servers
Jason Batchelor
jxbatchelor at gmail.com
Fri Jun 13 07:32:42 PDT 2014
FWIW:
I just ran iptraf for a bit on both and one thing really stuck out to me:
Server A:
Other IP: 5273 633087 5273 633087 0 0
Server B:
Other IP: 952797 445867K 952797 445867K 0 0
So Server A is seeing 633087 bytes of 'Other IP' traffic, while Server B is
seeing 445867 kilobytes of it. Do you think this other traffic could be the
root cause of the issues here? If so, would a BPF filter that matches only
IPv4 TCP/UDP traffic be sufficient? How might I apply that within Bro?
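As a rough sanity check on those two 'Other IP' samples, the arithmetic can be sketched in Python. This assumes iptraf's K suffix means 1000 bytes (it may well mean 1024, which shifts the ratio only slightly), and the variable names are purely illustrative:

```python
# 'Other IP' counters from the short iptraf sample above.
# ASSUMPTION: the "K" suffix is treated as 1000 bytes; iptraf may use 1024.
BYTES_PER_K = 1000

server_a_other_packets = 5273
server_a_other_bytes = 633087                 # reported in plain bytes

server_b_other_packets = 952797
server_b_other_bytes = 445867 * BYTES_PER_K   # reported as 445867K

byte_ratio = server_b_other_bytes / server_a_other_bytes
packet_ratio = server_b_other_packets / server_a_other_packets

print(f"Server B / Server A 'Other IP' bytes:   {byte_ratio:.1f}x")
print(f"Server B / Server A 'Other IP' packets: {packet_ratio:.1f}x")
```

Both samples were taken over a short and not strictly identical interval, so the ratios are only indicative.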
Here is the full view taken some time after the metrics above:
Server A:
           Total     Total    Incoming  Incoming  Outgoing  Outgoing
           Packets   Bytes    Packets   Bytes     Packets   Bytes
Total:     80187229  51270M   80187229  51270M    0         0
IPv4:      80187193  50026M   80187193  50026M    0         0
IPv6:      36        1296     36        1296      0         0
TCP:       70040618  47342M   70040618  47342M    0         0
UDP:       10052947  2676M    10052947  2676M     0         0
ICMP:      85189     6652550  85189     6652550   0         0
Other IP:  8475      1060993  8475      1060993   0         0
Server B:
           Total     Total    Incoming  Incoming  Outgoing  Outgoing
           Packets   Bytes    Packets   Bytes     Packets   Bytes
Total:     89718860  53317M   89718860  53317M    0         0
IPv4:      89712988  51882M   89712988  51882M    0         0
IPv6:      5872      51778    5872      51778     0         0
TCP:       79615124  49170M   79615124  49170M    0         0
UDP:       7627607   1682M    7627607   1682M     0         0
ICMP:      86620     5619078  86620     5619078   0         0
Other IP:  2389509   1023M    2389509   1023M     0         0
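To put the 'Other IP' rows of the full view in proportion, here is a minimal Python sketch that computes their share of each server's totals. The helper names (`parse_count`, `other_ip_share`) are made up for illustration, and the K/M suffixes are assumed to be powers of 1000, which may not match how iptraf actually scales them:

```python
# ASSUMPTION: iptraf's K/M suffixes are taken as powers of 1000.
SUFFIXES = {"K": 1000, "M": 1000 ** 2}


def parse_count(text: str) -> int:
    """Convert an iptraf-style counter such as '1023M' or '1060993' to an int."""
    text = text.strip()
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)


def other_ip_share(other: str, total: str) -> float:
    """Return the 'Other IP' counter as a percentage of the total counter."""
    return 100.0 * parse_count(other) / parse_count(total)


# (other bytes, total bytes, other packets, total packets) from the full view
servers = {
    "A": ("1060993", "51270M", "8475", "80187229"),
    "B": ("1023M", "53317M", "2389509", "89718860"),
}

for name, (o_bytes, t_bytes, o_pkts, t_pkts) in servers.items():
    print(
        f"Server {name}: 'Other IP' is "
        f"{other_ip_share(o_bytes, t_bytes):.4f}% of bytes and "
        f"{other_ip_share(o_pkts, t_pkts):.4f}% of packets"
    )
```

Whether a share of that size is enough to explain the CPU difference is still an open question; the sketch only quantifies the gap.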
Many thanks in advance for the quick and helpful replies!
On Fri, Jun 13, 2014 at 9:19 AM, Jason Batchelor <jxbatchelor at gmail.com>
wrote:
> Wow, thanks for all the quick replies :)
>
> > What versions of Bro, and it is the same for both?
>
> I am using the same version of Bro for each server (1.2).
>
> > Is the type of traffic in the 600 Mbps stream similar to the type of
> > traffic in the 700 Mbps stream?
>
> I'm not 100% sure but I think that is a really good question to ask. Do
> you know of any good tools that might help inform an answer? I know of
> iptraf for example, is there one that folks generally prefer the most?
>
> > Are you only running 4 workers or did you truncate the output?
>
> Yes, I truncated the output to show four workers each (I have 16 total).
>
> > Are you doing 4 tuple load balancing or 2 tuple load balancing between
> > the two servers?
>
> Sorry, I am not sure what you mean by this or the implications of one over
> the other. Is there an easy way I can find out (I am kinda new to this)? I
> agree with the likelihood that B may be receiving more flows.
>
> Thanks!
> Jason
>
>
>
>
> On Fri, Jun 13, 2014 at 9:09 AM, Justin Azoff <JAzoff at albany.edu> wrote:
>
>> On Fri, Jun 13, 2014 at 08:01:54AM -0500, Jason Batchelor wrote:
>> > At the moment Server A is getting about 700 Mb/s and Server B is getting
>> > about 600 Mb/s.
>> >
>> > What I don't understand is why Server A is having several orders of
>> > magnitude better performance compared to Server B?
>> >
>> > TOP from A (included a few bro workers):
>> >
>> > top - 12:48:45 up 1 day, 17:03, 2 users, load average: 5.30, 3.99, 3.13
>> > Tasks: 706 total, 19 running, 687 sleeping, 0 stopped, 0 zombie
>> > Cpu(s): 33.9%us, 6.6%sy, 1.1%ni, 57.2%id, 0.0%wa, 0.0%hi, 1.2%si, 0.0%st
>> > Mem: 49376004k total, 33605828k used, 15770176k free, 93100k buffers
>> > Swap: 2621432k total, 9760k used, 2611672k free, 9206880k cached
>> > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>> > 5768 root 20 0 1808m 1.7g 519m R 100.0 3.6 32:24.92 bro
>> > 5760 root 20 0 1688m 1.6g 519m R 99.7 3.4 34:08.36 bro
>> > 3314 root 20 0 2160m 269m 4764 R 96.1 0.6 30:14.12 bro
>> > 5754 root 20 0 1451m 1.4g 519m R 82.8 2.9 36:40.02 bro
>>
>> Server A Bro cpu utilization = 378.6%
>>
>> > TOP from B (included a few bro workers)
>> >
>> > top - 12:49:33 up 14:24, 2 users, load average: 10.28, 9.31, 8.06
>> > Tasks: 708 total, 25 running, 683 sleeping, 0 stopped, 0 zombie
>> > Cpu(s): 41.6%us, 6.0%sy, 1.0%ni, 50.4%id, 0.0%wa, 0.0%hi, 1.1%si, 0.0%st
>> > Mem: 49376004k total, 31837340k used, 17538664k free, 147212k buffers
>> > Swap: 2621432k total, 0k used, 2621432k free, 13494332k cached
>> > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>> > 3178 root 20 0 1073m 1.0g 264m R 100.0 2.1 401:47.31 bro
>> > 3188 root 20 0 881m 832m 264m R 100.0 1.7 377:48.90 bro
>> > 3189 root 20 0 1247m 1.2g 264m R 100.0 2.5 403:22.95 bro
>> > 3193 root 20 0 920m 871m 264m R 100.0 1.8 429:45.98 bro
>>
>> > Both have the same number of Bro workers. I just do not understand why
>> > Server A is at literally half the utilization on top of seeing more
>> > traffic. The only real and consistent difference between the two I see is
>> > that Server A seems to have twice the amount of SHR (shared memory)
>> > compared to Server B.
>>
>> Server B Bro cpu utilization = 400%
>>
>> Are you only running 4 workers or did you truncate the output? Is that
>> running at 100% 24/7 or does it vary with the traffic?
>>
>> Are you doing 4 tuple load balancing or 2 tuple load balancing between
>> the two servers? Most likely Server B is seeing more flows.
>>
>>
>> --
>> -- Justin Azoff
>>
>
>
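On the 4-tuple versus 2-tuple load-balancing question quoted above, the difference can be illustrated with a small Python sketch. This is not how any particular load balancer or Bro itself hashes traffic; the CRC32 hash, the function names, and the two-server default are all assumptions chosen only to show the idea:

```python
import zlib


def _bucket(key: str, n_servers: int) -> int:
    """Map a flow key to a server index (CRC32 is only an illustrative hash)."""
    return zlib.crc32(key.encode("ascii")) % n_servers


def two_tuple_bucket(src_ip: str, dst_ip: str, n_servers: int = 2) -> int:
    """2-tuple balancing: only the IP pair decides which server gets the flow."""
    a, b = sorted((src_ip, dst_ip))  # sort so both directions hash alike
    return _bucket(f"{a}|{b}", n_servers)


def four_tuple_bucket(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
                      n_servers: int = 2) -> int:
    """4-tuple balancing: IPs and ports together decide the server."""
    a, b = sorted(((src_ip, src_port), (dst_ip, dst_port)))
    return _bucket(f"{a[0]}:{a[1]}|{b[0]}:{b[1]}", n_servers)


# Many connections between the same two hosts, differing only in source port.
ports = range(40000, 40020)
by_two = {two_tuple_bucket("10.0.0.1", "10.0.0.2") for _ in ports}
by_four = {four_tuple_bucket("10.0.0.1", p, "10.0.0.2", 80) for p in ports}

print("servers used with 2-tuple hashing:", sorted(by_two))
print("servers used with 4-tuple hashing:", sorted(by_four))
```

With 2-tuple hashing every connection between one pair of hosts goes to the same server, so a few very busy host pairs can skew the split. With 4-tuple hashing those connections can be spread across servers.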