[Bro] Manager and logger threads crash immediately on deploy
Azoff, Justin S
jazoff at illinois.edu
Wed Jul 12 12:49:04 PDT 2017
> On Jul 12, 2017, at 1:38 PM, Chris Herdt <cherdt at umn.edu> wrote:
> What I'm finding is that any time the number of worker processes exceeds ~160 (not a magic number--not consistent, but around that value based on observation), the manager and logger threads crash. If I keep the number of worker processes at or below ~160 (either by reducing processes per node, reducing nodes per host, or reducing hosts in the cluster) it runs successfully. Ideally, the cluster would have 288 worker processes.
Yes, this is a problem.
Bro currently uses select() internally for the I/O loop, and select() can't handle file descriptors numbered at or above FD_SETSIZE (typically 1024).
Around 170 worker processes is where the manager will accumulate more than 1024 fds.
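As a rough back-of-the-envelope sketch of that threshold: if the manager's fd count grows linearly with the number of workers (an assumption for illustration, not something I measured), the ~170 figure works out to about 6 fds per worker. The function names here are hypothetical.

```python
# Rough fd budget for the manager.
# ASSUMPTION: fd usage grows linearly with the number of workers. The
# ~170-worker threshold is the observed figure; everything derived from
# it is only an estimate.
FD_SETSIZE = 1024          # typical select() limit
OBSERVED_THRESHOLD = 170   # workers at which the manager passes 1024 fds

fds_per_worker = FD_SETSIZE / OBSERVED_THRESHOLD  # about 6 fds per worker


def estimated_fds(workers):
    """Estimate the manager's fd count for a given number of workers."""
    return round(workers * fds_per_worker)


def fits_in_select(workers):
    """True if the estimated fd count stays below the select() limit."""
    return estimated_fds(workers) < FD_SETSIZE


for n in (160, 288):
    print(n, estimated_fds(n), fits_in_select(n))
```

Under that assumption, the 288 workers you want would need far more fds than select() can track, which matches what you are seeing.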
There are a few options here:
* Run fewer lb_procs per port to stay under the limit.
* Run two separate bro manager installations so that each manager/logger only handles half the workers. You can currently run more than one logger, but that doesn't help for the manager.
* Wait until the broker work is done and the old select code is removed.
* Replace all uses of select() in the communication code with poll(). I had started doing this a while back, but it got put on hold. It's probably not that much work to update it for 2.5. From what I remember it seemed to work, but I didn't have a chance to do much testing on it.
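To illustrate why the last option helps: poll() takes a list of descriptors rather than a fixed-size bitmask, so it is not bound by FD_SETSIZE. The following is a minimal sketch in Python, not the actual Bro communication code; wait_readable is a hypothetical helper name used only for this example.

```python
import os
import select


def wait_readable(fds, timeout_ms=1000):
    """Return the fds that are readable, using poll() instead of select().

    Hypothetical helper for illustration; not part of Bro.
    """
    poller = select.poll()
    for fd in fds:
        # poll() registers each descriptor individually, so fd numbers
        # above FD_SETSIZE are not a problem.
        poller.register(fd, select.POLLIN)
    events = poller.poll(timeout_ms)
    return [fd for fd, event in events if event & select.POLLIN]


# Usage: a pipe with pending data is reported as readable.
r, w = os.pipe()
os.write(w, b"x")
ready = wait_readable([r])
print(ready == [r])
```

The loop structure stays the same as with select(); only the readiness call changes, which is why the swap should be a fairly mechanical change.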
- Justin Azoff