[Bro] Manager memory requirements for the intel framework

Mon May 21 08:53:39 PDT 2018

All,

A quick update: cluster workers apparently must be receiving traffic to process Intel::new_item events, because the rapid manager memory growth didn't occur while our test system was receiving traffic.  Thanks for the help; we've moved on to generating a more representative intel input file and replaying traffic for further testing.

Brian

-----Original Message-----
From: "Azoff, Justin S" <jazoff at illinois.edu>
Date: Tuesday, May 15, 2018 at 2:06 PM
To: Brian OBerry <brian.oberry at bluvector.io>
Cc: "bro at bro.org" <bro at bro.org>, Jon Siwek <jsiwek at corelight.com>
Subject: EXT: Re: [Bro] Manager memory requirements for the intel framework

    On May 15, 2018, at 9:13 AM, Brian OBerry <brian.oberry at bluvector.io> wrote:
    > 
    >  It remained at 27G after many cycles of replacing the input file with 18K new unique items.

    That is interesting, because by default the intel framework doesn't expire items, so every time you replaced the file you were loading an additional 18k items.

    If I get a chance I will resurrect the benchmarking code I was working on a while ago. It would do things like create a table of hosts, add 10k, 20k, 30k, and 40k hosts to it, and record the memory usage at each count to see what the real-world data usage is for different sized data structures.  I never tried it with the intel framework though.

    > We commented out the conditional that invokes “event Intel::new_item(item)” in base/frameworks/intel/main.bro to disable remote synchronization with the workers, and the huge VSize disappeared.
    > 

    This makes more sense. I don't think your memory usage has anything to do with the intel itself; I think the communication code is falling behind.

    How many worker processes do you have configured?  Are they running on the same box or separate boxes?

    If you load up 18k indicators but have 100 worker nodes, the bro manager needs to send out a total of 1,800,000 events (18,000 indicators x 100 workers).  If the workers can't keep up, that data just ends up buffered in memory on the manager until it can be sent out.
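
The fan-out arithmetic above can be checked with a small Python sketch. The function names and the per-event size are hypothetical, chosen only for illustration; the real size of a buffered Intel::insert_indicator event on the manager is not stated in this thread.

```python
def fanout_events(indicators: int, workers: int) -> int:
    """Events the manager must send: one copy of each indicator per worker."""
    return indicators * workers


def buffered_bytes(pending_events: int, bytes_per_event: int) -> int:
    """Rough memory held by events queued on the manager but not yet sent."""
    return pending_events * bytes_per_event


if __name__ == "__main__":
    total = fanout_events(18_000, 100)
    print(f"events to send: {total:,}")

    # ASSUMPTION: 1 KiB per buffered event is a made-up figure, not a measurement.
    assumed_event_size = 1024
    print(f"worst-case backlog: {buffered_bytes(total, assumed_event_size):,} bytes")
```

If no worker drains its queue, the whole fan-out sits in manager memory, so the backlog scales with the product of indicator count and worker count rather than with the indicator count alone.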

    Jon: this is the use case I had for Cluster::relay_rr, offloading the messaging load from the manager:

    # On the manager, the new_item event indicates a new indicator that
    # has to be distributed.
    event Intel::new_item(item: Item) &priority=5
        {
        Broker::publish(indicator_topic, Intel::insert_indicator, item);
        }

    So Cluster::relay_rr should maybe be used there, instead of the manager having to do all the communication work.

    -- 
    Justin Azoff