[Bro] Unexplained Performance Differences Between Like Servers

Jason Batchelor jxbatchelor at gmail.com
Wed Jun 18 18:10:27 PDT 2014


Thanks for the reply, Gilbert! I will take a closer look into these items
tomorrow, but off the top of my head I do not recall there being any great
difference in size between the log files. On the surface (using iptraf), it
seems like there is a significant amount of non-IP traffic, so I modified
local.bro to include the following:

redef cmd_line_bpf_filter = "ip";

I am hoping this has the desired effect.
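
For a sense of scale, the iptraf totals quoted further down can be turned
into a share of all packets. Below is a minimal Python sketch; the packet
counts are copied from that output, while the helper name other_ip_share is
made up for illustration. One caveat: in that output the TCP, UDP, ICMP and
Other IP rows add up to the Total row, so "Other IP" looks like IP traffic of
other protocols rather than non-IP frames, in which case an "ip" filter may
not drop it.

```python
# Packet counts copied from the iptraf "full view" output quoted below.
COUNTS = {
    "Server A": {"total": 80187229, "other_ip": 8475},
    "Server B": {"total": 89718860, "other_ip": 2389509},
}


def other_ip_share(total: int, other_ip: int) -> float:
    """Return 'Other IP' packets as a percentage of all packets seen."""
    return 100.0 * other_ip / total


for name, c in COUNTS.items():
    share = other_ip_share(c["total"], c["other_ip"])
    print(f"{name}: {share:.2f}% of packets are 'Other IP'")
```

By this measure roughly 0.01% of the packets on Server A are "Other IP",
against about 2.7% on Server B, so B sees on the order of 280 times as many
of them.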

One other question I had was about the effect of TCP sequence number
randomization on performance (if it was enabled on an ASA, for example).
What impact would this have on flows (presumably a large increase)? How
might I best quantify the number of flows being processed compared to the
other server?
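
One rough way to compare flow counts might be to count the rows and the
distinct 4-tuples in each server's conn.log over the same time window. The
sketch below is only that, a sketch: it assumes the default tab-separated
ASCII log with a "#fields" header line, and the function name count_flows and
the sample rows are invented for illustration.

```python
from typing import Iterable, Tuple


def count_flows(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (connection rows, distinct 4-tuples) for an ASCII conn.log."""
    fields = []
    rows = 0
    four_tuples = set()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            # Header line: "#fields<TAB>ts<TAB>uid<TAB>id.orig_h..."
            fields = line.split("\t")[1:]
            continue
        if not line or line.startswith("#"):
            continue  # other metadata lines (#separator, #close, ...)
        if not fields:
            raise ValueError("data row seen before the #fields header")
        rec = dict(zip(fields, line.split("\t")))
        rows += 1
        four_tuples.add((rec["id.orig_h"], rec["id.orig_p"],
                         rec["id.resp_h"], rec["id.resp_p"]))
    return rows, len(four_tuples)


# Invented sample rows, only to show the expected input shape.
SAMPLE = [
    "#separator \\x09",
    "#fields\tts\tuid\tid.orig_h\tid.orig_p\tid.resp_h\tid.resp_p\tproto",
    "1402600000.0\tC1\t10.0.0.1\t40000\t10.0.0.2\t80\ttcp",
    "1402600001.0\tC2\t10.0.0.1\t40000\t10.0.0.2\t80\ttcp",
    "1402600002.0\tC3\t10.0.0.3\t40001\t10.0.0.2\t443\ttcp",
    "#close\t2014-06-18-18-10-27",
]

print(count_flows(SAMPLE))
```

Passing an open file object for conn.log on each server instead of SAMPLE
would give two pairs of numbers that can be compared directly.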

Sorry for all the questions. I am very much a novice at this, but very
willing to learn, so I appreciate the help so far!



On Mon, Jun 16, 2014 at 1:54 AM, Gilbert Clark <gc355804 at ohio.edu> wrote:

>  Hi Jason:
>
> I believe one way to set a BPF filter is to modify site/local.bro to
> include:
>
> redef cmd_line_bpf_filter = "ip or not ip";
>
> I think there's also a packet filter framework (
> http://www.bro.org/sphinx/scripts/base/frameworks/packet-filter/main.html)
> which supports more elaborate filtering schemes, but I don't really know
> much about it offhand :)
>
> Regarding the "other" traffic being the root cause of the issues: I think
> it's pretty difficult to say.  A few ideas:
>
> * check the size of log files for significant differences.  if http.log /
> reporter.log / weird.log / etc. is much longer on one system than on
> another, maybe that might be a place to start looking
> * try setting a filter to only accept a certain type of traffic (e.g.
> HTTP, SSH) to see relative load for that specific traffic type
> * try playing with which scripts bro loads (e.g. tweak local.bro and / or
> try running bro in bare mode with a very small set of loaded scripts) to
> see if that has any effect
> * bro can be told to dump performance statistics into a human-readable
> ASCII log by including the "misc/profiling.bro" script: some of the
> information included there might be useful to have
> * try capturing a trace and playing that trace back to a standalone bro
> process ... using tools like 'time' and 'perf' could help identify how
> performance changes based on the trace and scripts currently being loaded.
>     - this has the benefit of not dropping packets while scripts are being
>       tweaked...
>
> As some food for thought: in general, bro does a few things every time
> there's a new packet:
>
> * Retrieve the packet from the NIC
> * Dissect the packet and generate events
> * Spend time in script-land processing events that have been generated
> * Spend time handling administrative overhead (e.g. check timers, check
> triggers)
>
> Thus, in general, making bro go faster is probably going to mean making
> one of those things take less time.
>
> Anyway, hope something in there is useful :)
>
> Cheers,
> Gilbert
>
>
> On 6/13/14, 10:32 AM, Jason Batchelor wrote:
>
>  FWIW:
>
> I just ran iptraf for a bit on both and one thing really stuck out to me:
>
> Server A:
> Other IP:      5273     633087        5273     633087           0          0
>
> Server B:
> Other IP:    952797    445867K      952797    445867K           0          0
>
> So server A is seeing 633087 bytes of 'other' traffic, while B is seeing
> 445867 kilobytes of 'other' traffic. Do you think this other traffic could
> be the root cause of the issues here? If so, would a bpf filter looking for
> only tcp/udp/ipv4 traffic be sufficient? How might I apply that within Bro?
>
> Here is the full view taken some time after the metrics above:
>
> Server A:
>
>                 Total      Total   Incoming   Incoming   Outgoing   Outgoing
>               Packets      Bytes    Packets      Bytes    Packets      Bytes
> Total:       80187229     51270M   80187229     51270M          0          0
> IPv4:        80187193     50026M   80187193     50026M          0          0
> IPv6:              36       1296         36       1296          0          0
> TCP:         70040618     47342M   70040618     47342M          0          0
> UDP:         10052947      2676M   10052947      2676M          0          0
> ICMP:           85189    6652550      85189    6652550          0          0
> Other IP:        8475    1060993       8475    1060993          0          0
>
> Server B:
>
>                 Total      Total   Incoming   Incoming   Outgoing   Outgoing
>               Packets      Bytes    Packets      Bytes    Packets      Bytes
> Total:       89718860     53317M   89718860     53317M          0          0
> IPv4:        89712988     51882M   89712988     51882M          0          0
> IPv6:            5872      51778       5872      51778          0          0
> TCP:         79615124     49170M   79615124     49170M          0          0
> UDP:          7627607      1682M    7627607      1682M          0          0
> ICMP:           86620    5619078      86620    5619078          0          0
> Other IP:     2389509      1023M    2389509      1023M          0          0
>
>  Many thanks in advance for the quick and helpful replies!
>
>
> On Fri, Jun 13, 2014 at 9:19 AM, Jason Batchelor <jxbatchelor at gmail.com>
> wrote:
>
>>  Wow, thanks for all the quick replies :)
>>
>> > What versions of Bro, and it is the same for both?
>>
>>  I am using the same version of Bro for each server (1.2).
>>
>> > Is the type of traffic in the 600 Mbps stream similar to the type of
>> > traffic in the 700 Mbps stream?
>>
>>  I'm not 100% sure but I think that is a really good question to ask. Do
>> you know of any good tools that might help inform an answer? I know of
>> iptraf for example, is there one that folks generally prefer the most?
>>
>> > Are you only running 4 workers or did you truncate the output?
>>  Yes, I truncated the output to show four workers each (I have 16 total).
>>
>> > Are you doing 4 tuple load balancing or 2 tuple load balancing between
>> > the two servers?
>>
>>  Sorry, I am not sure what you mean by this or the implications of one
>> over the other. Is there an easy way I can find out (I am kinda new to
>> this)? I agree with the likelihood that B may be receiving more flows.
>>
>> Thanks!
>>  Jason
>>
>>
>>
>>
>>  On Fri, Jun 13, 2014 at 9:09 AM, Justin Azoff <JAzoff at albany.edu> wrote:
>>
>>> On Fri, Jun 13, 2014 at 08:01:54AM -0500, Jason Batchelor wrote:
>>> > At the moment Server A is getting about 700Mb/s and Server B is getting
>>> > about 600Mb/s.
>>> >
>>> > What I don't understand is why Server A is having several orders of
>>> > magnitude better performance compared to Server B?
>>> >
>>> > TOP from A (included a few bro workers):
>>> >
>>> > top - 12:48:45 up 1 day, 17:03,  2 users,  load average: 5.30, 3.99, 3.13
>>> > Tasks: 706 total,  19 running, 687 sleeping,   0 stopped,   0 zombie
>>> > Cpu(s): 33.9%us,  6.6%sy,  1.1%ni, 57.2%id,  0.0%wa,  0.0%hi,  1.2%si,  0.0%st
>>> > Mem:  49376004k total, 33605828k used, 15770176k free,    93100k buffers
>>> > Swap:  2621432k total,     9760k used,  2611672k free,  9206880k cached
>>> >   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>> >  5768 root      20   0 1808m 1.7g 519m R 100.0  3.6  32:24.92 bro
>>> >  5760 root      20   0 1688m 1.6g 519m R 99.7  3.4  34:08.36 bro
>>> >  3314 root      20   0 2160m 269m 4764 R 96.1  0.6  30:14.12 bro
>>> >  5754 root      20   0 1451m 1.4g 519m R 82.8  2.9  36:40.02 bro
>>>
>>>  Server A Bro cpu utilization = 378.6%
>>>
>>> > TOP from B (included a few bro workers)
>>> >
>>> > top - 12:49:33 up 14:24,  2 users,  load average: 10.28, 9.31, 8.06
>>> > Tasks: 708 total,  25 running, 683 sleeping,   0 stopped,   0 zombie
>>> > Cpu(s): 41.6%us,  6.0%sy,  1.0%ni, 50.4%id,  0.0%wa,  0.0%hi,  1.1%si,  0.0%st
>>> > Mem:  49376004k total, 31837340k used, 17538664k free,   147212k buffers
>>> > Swap:  2621432k total,        0k used,  2621432k free, 13494332k cached
>>> >   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>> >  3178 root      20   0 1073m 1.0g 264m R 100.0  2.1 401:47.31 bro
>>> >  3188 root      20   0  881m 832m 264m R 100.0  1.7 377:48.90 bro
>>> >  3189 root      20   0 1247m 1.2g 264m R 100.0  2.5 403:22.95 bro
>>> >  3193 root      20   0  920m 871m 264m R 100.0  1.8 429:45.98 bro
>>>
>>> > Both have the same amount of Bro workers. I just do not understand why
>>> > Server A is literally half the utilization on top of seeing more traffic?
>>> > The only real and consistent difference between the two I see is that
>>> > server A seems to have twice the amount of SHR (shared memory) compared
>>> > to server B.
>>>
>>>  Server B Bro cpu utilization = 400%
>>>
>>> Are you only running 4 workers or did you truncate the output?  Is that
>>> running at 100% 24/7 or does it vary with the traffic?
>>>
>>> Are you doing 4 tuple load balancing or 2 tuple load balancing between
>>> the two servers?  Most likely Server B is seeing more flows.
>>>
>>>
>>> --
>>> -- Justin Azoff
>>>
>>
>>
>
>