[Bro] Unexplained Performance Differences Between Like Servers
Jason Batchelor
jxbatchelor at gmail.com
Wed Jun 18 18:10:27 PDT 2014
Thanks for the reply, Gilbert! I will take a closer look at these items
tomorrow, but off the top of my head I do not recall any great difference
in the size of the log files. On the surface (using iptraf), there seems to
be a significant amount of non-IP traffic, so I modified local.bro to
include the following:
redef cmd_line_bpf_filter = "ip";
I hope this has the desired effect.
One other question I had concerns the effect of TCP sequence number
randomization on performance (if it were enabled on an ASA, for example).
What impact would this have on flows (presumably a large increase)? How
might I best quantify the number of flows being processed compared to the
other server?
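One rough way to put a number on this might be to count the connections each server writes to conn.log over the same time window. Below is a minimal Python sketch, assuming Bro's default tab-separated ASCII log format with a `#fields` header line and the usual `id.orig_h`, `id.orig_p`, `id.resp_h`, `id.resp_p`, and `proto` columns. The function name `count_flows` and the sample rows are made up for illustration.

```python
from collections import Counter


def count_flows(lines):
    """Count conn.log entries and distinct 5-tuples from Bro ASCII log lines.

    Assumes the default tab-separated format with a '#fields' header line.
    """
    fields = []
    total = 0
    tuples = set()
    per_proto = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            # Column names follow the '#fields' tag, tab-separated.
            fields = line.split("\t")[1:]
            continue
        if not line or line.startswith("#"):
            continue  # skip other metadata lines and blank lines
        row = dict(zip(fields, line.split("\t")))
        total += 1
        per_proto[row["proto"]] += 1
        tuples.add((row["id.orig_h"], row["id.orig_p"],
                    row["id.resp_h"], row["id.resp_p"], row["proto"]))
    return {"connections": total,
            "distinct_tuples": len(tuples),
            "by_proto": dict(per_proto)}


if __name__ == "__main__":
    # Hypothetical sample rows; in practice pass open("conn.log") instead.
    sample = [
        "#fields\tts\tuid\tid.orig_h\tid.orig_p\tid.resp_h\tid.resp_p\tproto",
        "1402664514.0\tCa1\t10.0.0.1\t40000\t10.0.0.2\t80\ttcp",
        "1402664515.0\tCa2\t10.0.0.1\t40000\t10.0.0.2\t80\ttcp",
        "1402664516.0\tCa3\t10.0.0.3\t5353\t10.0.0.4\t53\tudp",
    ]
    print(count_flows(sample))
```

Running this against the conn.log from each server for the same hour should give a like-for-like comparison of connection counts, though it only reflects what Bro actually logged, not packets that were dropped before analysis.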
Sorry for all the questions. I am very much a novice at this but very
willing to learn, so I appreciate the help so far!
On Mon, Jun 16, 2014 at 1:54 AM, Gilbert Clark <gc355804 at ohio.edu> wrote:
> Hi Jason:
>
> I believe one way to set a BPF filter is to modify site/local.bro to
> include:
>
> redef cmd_line_bpf_filter = "ip or not ip";
>
> I think there's also a packet filter framework (
> http://www.bro.org/sphinx/scripts/base/frameworks/packet-filter/main.html)
> which supports more elaborate filtering schemes, but I don't really know
> much about it offhand :)
>
> Regarding the "other" traffic being the root cause of the issues: I think
> it's pretty difficult to say. A few ideas:
>
> * check the size of log files for significant differences. if http.log /
> reporter.log / weird.log / etc. is much longer on one system than on
> another, maybe that might be a place to start looking
> * try setting a filter to only accept a certain type of traffic (e.g.
> HTTP, SSH) to see relative load for that specific traffic type
> * try playing with which scripts bro loads (e.g. tweak local.bro and / or
> try running bro in bare mode with a very small set of loaded scripts) to
> see if that has any effect
> * bro can be told to dump performance statistics into a human-readable
> ASCII log by including the "misc/profiling.bro" script: some of the
> information included there might be useful to have
> * try capturing a trace and playing that trace back to a standalone bro
> process ... using tools like 'time' and 'perf' could help identify how
> performance changes based on the trace and scripts currently being loaded.
> This has the benefit of not dropping packets while scripts are being
> tweaked...
>
> As some food for thought: in general, bro does a few things every time
> there's a new packet:
>
> * Retrieve the packet from the NIC
> * Dissect the packet and generate events
> * Spend time in script-land processing events that have been generated
> * Spend time handling administrative overhead (e.g. check timers, check
> triggers)
>
> Thus, in general, making bro go faster is probably going to mean making
> one of those things take less time.
>
> Anyway, hope something in there is useful :)
>
> Cheers,
> Gilbert
>
>
> On 6/13/14, 10:32 AM, Jason Batchelor wrote:
>
> FWIW:
>
> I just ran iptraf for a bit on both and one thing really stuck out to me:
>
> Server A:
> Other IP:        5273     633087       5273     633087          0          0
>
> Server B:
> Other IP:      952797    445867K     952797    445867K          0          0
>
> So server A is seeing 633087 bytes of 'other' traffic, while B is seeing
> 445867 kilobytes of 'other' traffic. Do you think this other traffic could
> be the root cause of the issues here? If so, would a bpf filter looking for
> only tcp/udp/ipv4 traffic be sufficient? How might I apply that within Bro?
>
> Here is the full view taken some time after the metrics above:
>
> Server A:
>
>                 Total      Total   Incoming   Incoming   Outgoing   Outgoing
>               Packets      Bytes    Packets      Bytes    Packets      Bytes
> Total:       80187229     51270M   80187229     51270M          0          0
> IPv4:        80187193     50026M   80187193     50026M          0          0
> IPv6:              36       1296         36       1296          0          0
> TCP:         70040618     47342M   70040618     47342M          0          0
> UDP:         10052947      2676M   10052947      2676M          0          0
> ICMP:           85189    6652550      85189    6652550          0          0
> Other IP:        8475    1060993       8475    1060993          0          0
>
> Server B:
>
>                 Total      Total   Incoming   Incoming   Outgoing   Outgoing
>               Packets      Bytes    Packets      Bytes    Packets      Bytes
> Total:       89718860     53317M   89718860     53317M          0          0
> IPv4:        89712988     51882M   89712988     51882M          0          0
> IPv6:            5872      51778       5872      51778          0          0
> TCP:         79615124     49170M   79615124     49170M          0          0
> UDP:          7627607      1682M    7627607      1682M          0          0
> ICMP:           86620    5619078      86620    5619078          0          0
> Other IP:     2389509      1023M    2389509      1023M          0          0
>
> Many thanks in advance for the quick and helpful replies!
>
>
> On Fri, Jun 13, 2014 at 9:19 AM, Jason Batchelor <jxbatchelor at gmail.com>
> wrote:
>
>> Wow, thanks for all the quick replies :)
>>
>> > What versions of Bro, and it is the same for both?
>>
>> I am using the same version of Bro for each server (1.2).
>>
>> > Is the type of traffic in the 600 Mbps stream similar to the type of
>> > traffic in the 700 Mbps stream?
>>
>> I'm not 100% sure but I think that is a really good question to ask. Do
>> you know of any good tools that might help inform an answer? I know of
>> iptraf for example, is there one that folks generally prefer the most?
>>
>> > Are you only running 4 workers or did you truncate the output?
>> Yes, I truncated the output to show four workers each (I have 16 total).
>>
>> > Are you doing 4 tuple load balancing or 2 tuple load balancing between
>> > the two servers?
>>
>> Sorry, I am not sure what you mean by this or the implications of one
>> over the other. Is there an easy way I can find out (I am kinda new to
>> this)? I agree with the likelihood that B may be receiving more flows.
>>
>> Thanks!
>> Jason
>>
>>
>>
>>
>> On Fri, Jun 13, 2014 at 9:09 AM, Justin Azoff <JAzoff at albany.edu> wrote:
>>
>>> On Fri, Jun 13, 2014 at 08:01:54AM -0500, Jason Batchelor wrote:
>>> > At the moment Server A is getting about 700 Mb/s and Server B is getting
>>> > about 600 Mb/s.
>>> >
>>> > What I don't understand is why Server A is having several orders of
>>> > magnitude better performance compared to Server B.
>>> >
>>> > TOP from A (included a few bro workers):
>>> >
>>> > top - 12:48:45 up 1 day, 17:03, 2 users, load average: 5.30, 3.99, 3.13
>>> > Tasks: 706 total, 19 running, 687 sleeping, 0 stopped, 0 zombie
>>> > Cpu(s): 33.9%us, 6.6%sy, 1.1%ni, 57.2%id, 0.0%wa, 0.0%hi, 1.2%si, 0.0%st
>>> > Mem: 49376004k total, 33605828k used, 15770176k free, 93100k buffers
>>> > Swap: 2621432k total, 9760k used, 2611672k free, 9206880k cached
>>> > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>>> > 5768 root 20 0 1808m 1.7g 519m R 100.0 3.6 32:24.92 bro
>>> > 5760 root 20 0 1688m 1.6g 519m R 99.7 3.4 34:08.36 bro
>>> > 3314 root 20 0 2160m 269m 4764 R 96.1 0.6 30:14.12 bro
>>> > 5754 root 20 0 1451m 1.4g 519m R 82.8 2.9 36:40.02 bro
>>>
>>> Server A Bro cpu utilization = 378.6%
>>>
>>> > TOP from B (included a few bro workers)
>>> >
>>> > top - 12:49:33 up 14:24, 2 users, load average: 10.28, 9.31, 8.06
>>> > Tasks: 708 total, 25 running, 683 sleeping, 0 stopped, 0 zombie
>>> > Cpu(s): 41.6%us, 6.0%sy, 1.0%ni, 50.4%id, 0.0%wa, 0.0%hi, 1.1%si, 0.0%st
>>> > Mem: 49376004k total, 31837340k used, 17538664k free, 147212k buffers
>>> > Swap: 2621432k total, 0k used, 2621432k free, 13494332k cached
>>> > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>>> > 3178 root 20 0 1073m 1.0g 264m R 100.0 2.1 401:47.31 bro
>>> > 3188 root 20 0 881m 832m 264m R 100.0 1.7 377:48.90 bro
>>> > 3189 root 20 0 1247m 1.2g 264m R 100.0 2.5 403:22.95 bro
>>> > 3193 root 20 0 920m 871m 264m R 100.0 1.8 429:45.98 bro
>>>
>>> > Both have the same number of Bro workers. I just do not understand why
>>> > Server A shows literally half the utilization while also seeing more
>>> > traffic. The only real and consistent difference I see between the two
>>> > is that server A seems to have twice the amount of SHR (shared memory)
>>> > compared to server B.
>>>
>>> Server B Bro cpu utilization = 400%
>>>
>>> Are you only running 4 workers or did you truncate the output? Is that
>>> running at 100% 24/7 or does it vary with the traffic?
>>>
>>> Are you doing 4 tuple load balancing or 2 tuple load balancing between
>>> the two servers? Most likely Server B is seeing more flows.
>>>
>>>
>>> --
>>> -- Justin Azoff
>>>
>>
>>
>
>