[Bro] Memory Consumption

Thu Jun 26 11:26:38 PDT 2014

Small follow up to this as well since it may be relevant. I notice the
timers for stale connections seems to increase in paralel with memory...

grep 'ConnectionInactivityTimer' prof.log | awk 'NR % 10 == 1'
1403802069.314888         ConnectionInactivityTimer = 5844
1403802219.315759         ConnectionInactivityTimer = 21747
1403802369.316387         ConnectionInactivityTimer = 32275
1403802519.317613         ConnectionInactivityTimer = 32716
1403802669.318303         ConnectionInactivityTimer = 32597
1403802819.319193         ConnectionInactivityTimer = 34207
1403802969.320204         ConnectionInactivityTimer = 39176
1403803119.321978         ConnectionInactivityTimer = 40394
1403803269.323058         ConnectionInactivityTimer = 38631
1403803419.323688         ConnectionInactivityTimer = 35847
1403803569.324716         ConnectionInactivityTimer = 34432
1403803719.325888         ConnectionInactivityTimer = 34591
1403803869.326713         ConnectionInactivityTimer = 34716
1403804019.327664         ConnectionInactivityTimer = 35361
1403804169.329254         ConnectionInactivityTimer = 35915
1403804319.330507         ConnectionInactivityTimer = 34994
1403804469.331842         ConnectionInactivityTimer = 33212
1403804619.332236         ConnectionInactivityTimer = 32290
1403804769.332993         ConnectionInactivityTimer = 32513
1403804919.333717         ConnectionInactivityTimer = 32592
1403805069.334477         ConnectionInactivityTimer = 32388
1403805219.334875         ConnectionInactivityTimer = 32932
1403805369.335753         ConnectionInactivityTimer = 31771
1403805519.337054         ConnectionInactivityTimer = 28749
1403805669.337563         ConnectionInactivityTimer = 26509
1403805819.339240         ConnectionInactivityTimer = 26654
1403805969.340812         ConnectionInactivityTimer = 26297
1403806119.341841         ConnectionInactivityTimer = 25362
1403806269.344342         ConnectionInactivityTimer = 24435
1403806419.345146         ConnectionInactivityTimer = 24954
1403806569.346057         ConnectionInactivityTimer = 24088
1403806719.347671         ConnectionInactivityTimer = 30207
1403806869.349643         ConnectionInactivityTimer = 34276

Notice the steady increase, then slight decrease, then steady increase
again. Is there a way to control these settings for performance testing
purposes?

I know while I was tuning Suricata, I needed to be mindful of connection
timeouts and due to the volume of flows I am getting I needed to be pretty
aggressive.

Thanks,
Jason

On Thu, Jun 26, 2014 at 12:29 PM, Jason Batchelor <jxbatchelor at gmail.com>
wrote:

> Thanks Seth:
>
> I'm not sure I have a license for an experianced bro memory debugger,
> however I will document what I've done here for folks in hopes it proves
> useful!
>
> I've enabled profiling by adding the following.
>
> Vim /opt/bro/share/bro/site/local.bro
> @load misc/profiling
>
> Then enforced the changes...
>
> broctl stop
> broctl install
> broctl start
>
> At the moment I have 46308184k used  3067820k free memory.
>
> In /var/opt/bro/spool/worker-1-1, prof.log content is captured as you
> mentioned (and likewise for all nodes).
>
> Earlier you wrote:
>
> >  Every so often in there will be an indication of the largest global
> variables
>
> Is this what you mean (taken from one worker)....?
>
> 1403803224.322453 Global_sizes > 100k: 0K
> 1403803224.322453                Known::known_services = 469K (3130/3130
> entries)
> 1403803224.322453                Cluster::manager2worker_events = 137K
> 1403803224.322453                Weird::weird_ignore = 31492K
> (146569/146569 entries)
> 1403803224.322453                Known::certs = 58K (310/310 entries)
> 1403803224.322453                SumStats::threshold_tracker = 668K
> (4/2916 entries)
> 1403803224.322453                FTP::ftp_data_expected = 181K (46/46
> entries)
> 1403803224.322453                Notice::suppressing = 595K (2243/2243
> entries)
> 1403803224.322453                Communication::connected_peers = 156K
> (2/2 entries)
> 1403803224.322453                SumStats::sending_results = 8028K (3/5545
> entries)
> 1403803224.322453                Software::tracked = 33477K (12424/31111
> entries)
> 1403803224.322453                FTP::cmd_reply_code = 48K (325/325
> entries)
> 1403803224.322453                SumStats::result_store = 27962K (5/19978
> entries)
> 1403803224.322453                SSL::cipher_desc = 97K (356/356 entries)
> 1403803224.322453                RADIUS::attr_types = 44K (169/169 entries)
> 1403803224.322453                Weird::actions = 35K (163/163 entries)
> 1403803224.322453                Known::known_hosts = 3221K (21773/21773
> entries)
> 1403803224.322453                Weird::did_log = 54K (287/287 entries)
> 1403803224.322453                SSL::recently_validated_certs = 8667K
> (24752/24752 entries)
> 1403803224.322453                Communication::nodes = 188K (4/4 entries)
> 1403803224.322453                SSL::root_certs = 204K (144/144 entries)
> 1403803224.322453 Global_sizes total: 116727K
> 1403803224.322453 Total number of table entries: 213548/260715
> 1403803239.322685 ------------------------
> 1403803239.322685 Memory: total=1185296K total_adj=1137108K malloced:
> 1144576K
>
> Any other pointers on how to interpret this data?
>
> FWIW, here are some additional statistics from the worker prof.log...
>
> grep  "Memory: " prof.log | awk 'NR % 10 == 1'
> 0.000000 Memory: total=48188K total_adj=0K malloced: 47965K
> 1403802189.315606 Memory: total=614476K total_adj=566288K malloced: 614022K
> 1403802339.316381 Memory: total=938380K total_adj=890192K malloced: 938275K
> 1403802489.317426 Memory: total=1006168K total_adj=957980K malloced:
> 1003385K
> 1403802639.318199 Memory: total=1041288K total_adj=993100K malloced:
> 1035422K
> 1403802789.319107 Memory: total=1063544K total_adj=1015356K malloced:
> 1058229K
> 1403802939.320170 Memory: total=1140652K total_adj=1092464K malloced:
> 1139608K
> 1403803089.321327 Memory: total=1184540K total_adj=1136352K malloced:
> 1179411K
> 1403803239.322685 Memory: total=1185296K total_adj=1137108K malloced:
> 1144576K
> 1403803389.323680 Memory: total=1185296K total_adj=1137108K malloced:
> 1118961K
> 1403803539.324677 Memory: total=1185296K total_adj=1137108K malloced:
> 1092719K
> 1403803689.325763 Memory: total=1185296K total_adj=1137108K malloced:
> 1091447K
>
>
> On Thu, Jun 26, 2014 at 11:49 AM, Seth Hall <seth at icir.org> wrote:
>
>>
>> On Jun 26, 2014, at 12:43 PM, Jason Batchelor <jxbatchelor at gmail.com>
>> wrote:
>>
>> > > Bro typically does consume quite a bit of memory and you're a bit
>> tight on memory for the number of workers you're running.
>> > Curious what would you recommend for just bro itself? Double, triple
>> this?
>>
>> It seems like most people just put 128G of memory in Bro boxes now
>> because the cost just isn't really worth going any lower if there's a
>> remote possibility you might use it.
>>
>> > I will definately take a look, thanks for the info!
>>
>> Feel free to ask again if you're having trouble.  We really should write
>> up some debugging documentation for this process sometime.  Anyone with
>> experience doing this memory debugging activity up for it?  Doesn't have to
>> be anything fancy, just the steps and various things to look at to figure
>> out what exactly is happening.
>>
>>   .Seth
>>
>> --
>> Seth Hall
>> International Computer Science Institute
>> (Bro) because everyone has a network
>> http://www.bro.org/
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ICSI.Berkeley.EDU/pipermail/bro/attachments/20140626/d5f4136a/attachment.html