[Bro] Memory Issue with Bro

Eric Ooi ericooi at gmail.com
Sat Oct 24 07:39:43 PDT 2015


I have two Security Onion sensors running 2.4. One monitors a combination of 100M general office internet traffic and a 10G network; the other monitors four 1G networks, which include a publicly available website and a lot of syslog and SMTP traffic.  I ran the default Security Onion configuration on both and noticed that the sensor monitoring the four 1G networks would run out of memory over the course of a few hours.  I spent a day turning off various analyzers until I isolated it to the intel analyzer.  Ever since I turned intel.log off for that sensor, it’s run great for weeks.  I tried both adding intel feeds (via CriticalStack) and using a blank intel file, with no luck: simply having the intel analyzer on always resulted in memory being used up over time.  I’m guessing it has something to do with the type of traffic that particular sensor sees (more HTTP, syslog, and SMTP), but I’m not entirely sure.

Don’t know if it’s related, but just thought I’d share my experience with Bro memory issues.

Eric

> On Oct 23, 2015, at 11:38 AM, Mike Waite <mfw113 at psu.edu> wrote:
> 
> After 30 min things look better. I will let you know how the rest of it goes after a bit.
> 
> Oct 23 12:09:52	manager	child	-	-	-	info	selects=100000 canwrites=97046 pending=0
> Oct 23 12:11:54	manager	child	-	-	-	info	selects=200000 canwrites=97046 pending=0
> Oct 23 12:14:04	manager	child	-	-	-	info	selects=300000 canwrites=97046 pending=0
> Oct 23 12:14:43	manager	child	-	-	-	info	selects=400000 canwrites=97046 pending=0
> Oct 23 12:15:20	manager	child	-	-	-	info	selects=500000 canwrites=97046 pending=0
> Oct 23 12:15:54	manager	child	-	-	-	info	selects=600000 canwrites=97046 pending=0
> Oct 23 12:16:38	manager	child	-	-	-	info	selects=700000 canwrites=97046 pending=0
> Oct 23 12:17:41	manager	child	-	-	-	info	selects=800000 canwrites=97046 pending=0
> Oct 23 12:19:03	manager	child	-	-	-	info	selects=900000 canwrites=97046 pending=0
> Oct 23 12:20:46	manager	child	-	-	-	info	selects=1000000 canwrites=97046 pending=0
> Oct 23 12:23:04	manager	child	-	-	-	info	selects=1100000 canwrites=97046 pending=0
> Oct 23 12:25:10	manager	child	-	-	-	info	selects=1200000 canwrites=104987 pending=0
> Oct 23 12:26:40	manager	child	-	-	-	info	selects=1300000 canwrites=104987 pending=0
> Oct 23 12:28:13	manager	child	-	-	-	info	selects=1400000 canwrites=104987 pending=0
> Oct 23 12:31:12	manager	child	-	-	-	info	selects=1600000 canwrites=110134 pending=0
> Oct 23 12:32:24	manager	child	-	-	-	info	selects=1700000 canwrites=110134 pending=0
> Oct 23 12:34:03	manager	child	-	-	-	info	selects=1800000 canwrites=110134 pending=0
> Oct 23 12:35:12	manager	child	-	-	-	info	selects=1900000 canwrites=110134 pending=0
> Oct 23 12:36:15	manager	child	-	-	-	info	selects=2000000 canwrites=110134 pending=0
> Oct 23 12:37:31	manager	child	-	-	-	info	selects=2100000 canwrites=110134 pending=0
> 
> 
> --
> Mike Waite
> CyberSecurity Intrusion Analyst
> Office of Information Security
> The Pennsylvania State University
> ↪ 15-10-23 11:09:58, Seth Hall <seth at icir.org>:
>> Mike, could you back out that patch and try my branch, topic/seth/remove-flare ?
>> 
>> .Seth
>> 
>> 
>>> On Oct 23, 2015, at 10:19 AM, Azoff, Justin S <jazoff at illinois.edu> wrote:
>>> 
>>> Well that doesn't look great, but could be a lot worse.  Hard to say without knowing what it looked like before the patch.
>>> 
>>> The fact that pending ever goes down at all is a good sign, but pending=0 is really the optimal state.
>>> 
>>> --
>>> - Justin Azoff
>>> 
>>>> On Oct 23, 2015, at 9:21 AM, Mike Waite <mfw113 at psu.edu> wrote:
>>>> 
>>>> Patch applied. After 15 minutes I am seeing:
>>>> 
>>>> Oct 23 09:00:43	manager	child	-	-	-	info	selects=300000 canwrites=216206 pending=0
>>>> Oct 23 09:01:29	manager	child	-	-	-	info	selects=400000 canwrites=216206 pending=0
>>>> Oct 23 09:02:08	manager	child	-	-	-	info	selects=500000 canwrites=216552 pending=0
>>>> Oct 23 09:02:49	manager	child	-	-	-	info	selects=600000 canwrites=216557 pending=0
>>>> Oct 23 09:03:34	manager	child	-	-	-	info	selects=700000 canwrites=216557 pending=0
>>>> Oct 23 09:04:29	manager	child	-	-	-	info	selects=800000 canwrites=255305 pending=4007
>>>> Oct 23 09:05:21	manager	child	-	-	-	info	selects=900000 canwrites=355305 pending=6593
>>>> Oct 23 09:06:13	manager	child	-	-	-	info	selects=1000000 canwrites=455305 pending=6003
>>>> Oct 23 09:07:04	manager	child	-	-	-	info	selects=1100000 canwrites=555305 pending=3077
>>>> Oct 23 09:07:55	manager	child	-	-	-	info	selects=1200000 canwrites=640438 pending=3399
>>>> Oct 23 09:08:45	manager	child	-	-	-	info	selects=1300000 canwrites=740438 pending=3163
>>>> Oct 23 09:09:36	manager	child	-	-	-	info	selects=1400000 canwrites=840438 pending=5245
>>>> Oct 23 09:10:25	manager	child	-	-	-	info	selects=1500000 canwrites=940438 pending=6027
>>>> Oct 23 09:11:15	manager	child	-	-	-	info	selects=1600000 canwrites=1040438 pending=6713
>>>> Oct 23 09:12:01	manager	child	-	-	-	info	selects=1700000 canwrites=1140438 pending=5713
>>>> Oct 23 09:12:50	manager	child	-	-	-	info	selects=1800000 canwrites=1240438 pending=6747
>>>> Oct 23 09:13:39	manager	child	-	-	-	info	selects=1900000 canwrites=1340438 pending=7417
>>>> Oct 23 09:14:32	manager	child	-	-	-	info	selects=2000000 canwrites=1440438 pending=13117
>>>> Oct 23 09:15:10	manager	child	-	-	-	info	selects=2100000 canwrites=1540438 pending=20825
>>>> Oct 23 09:15:59	manager	child	-	-	-	info	selects=2200000 canwrites=1640438 pending=18539
>>>> Oct 23 09:16:47	manager	child	-	-	-	info	selects=2300000 canwrites=1740438 pending=15881
>>>> Oct 23 09:17:35	manager	child	-	-	-	info	selects=2400000 canwrites=1840438 pending=15389
>>>> Oct 23 09:18:28	manager	child	-	-	-	info	selects=2500000 canwrites=1940438 pending=16685
>>>> Oct 23 09:19:18	manager	child	-	-	-	info	selects=2600000 canwrites=2040438 pending=17031
>>>> 
>>>> 
>>>> I will let you know about the mem usage after a bit
>>>> 
>>>> --
>>>> Mike Waite
>>>> CyberSecurity Intrusion Analyst
>>>> Office of Information Security
>>>> The Pennsylvania State University
>>>> ↪ 15-10-22 10:22:18, Azoff, Justin S <jazoff at illinois.edu>:
>>>>>> On Oct 22, 2015, at 8:12 AM, Mike Waite <mfw113 at psu.edu> wrote:
>>>>>> 
>>>>>> I know we are still seeing issues with the manager child process.  The process will consume over 200GB of RAM in 8 hours.
>>>>>> 
>>>>> 
>>>>> Give the attached patch a try.
>>>>> 
>>>>> 
>>>>> 
>>>>> Monitor by using
>>>>> 
>>>>> cat logs/current/communication.log |egrep 'manager.child'
>>>>> 
>>>>> And check to see if pending=0 or at least not growing.
>>>>> 
>>>>> 
>>>>> --
>>>>> - Justin Azoff
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> _______________________________________________
>>> Bro mailing list
>>> bro at bro-ids.org
>>> http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/bro
>>> 
>>> 
>> 
>> --
>> Seth Hall
>> International Computer Science Institute
>> (Bro) because everyone has a network
>> http://www.bro.org/
>> 
> _______________________________________________
> Bro mailing list
> bro at bro-ids.org
> http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/bro
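
For anyone who would rather not eyeball the log, the check suggested in the quoted thread (see whether pending=0, or at least is not growing) can be sketched in Python. This is only a rough sketch: it assumes the manager child lines keep the "selects=... canwrites=... pending=..." layout shown in the excerpts above, and the names pending_values and summarize, along with the three labels they return, are made up for illustration and are not part of Bro.

```python
import re

# Matches the counters on a "manager child" line as quoted in this thread.
# Assumption: the real log keeps this "selects= canwrites= pending=" layout.
STATS_RE = re.compile(
    r"manager\s+child\b.*?selects=(\d+)\s+canwrites=(\d+)\s+pending=(\d+)"
)


def pending_values(lines):
    """Return the pending= counter from each manager child stats line, in order."""
    values = []
    for line in lines:
        match = STATS_RE.search(line)
        if match:
            values.append(int(match.group(3)))
    return values


def summarize(values):
    """Classify a run of pending counters (the labels are illustrative only)."""
    if not values:
        return "no data"
    if max(values) == 0:
        return "healthy"      # pending=0 on every line
    if values[-1] > values[0]:
        return "growing"      # backlog is larger at the end than at the start
    return "not growing"
```

To use it, pass an open file handle for logs/current/communication.log (the path from the quoted command) to pending_values and feed the result to summarize. Comparing only the first and last values is a deliberately crude test; a window or a slope would be more robust against a single spike.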