[Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)

Jan Grashöfer jan.grashoefer at gmail.com
Fri Nov 3 12:13:15 PDT 2017

On 03/11/17 18:07, Azoff, Justin S wrote:
> Partitioning the intel data set is a little tricky since it supports subnets and hashing
> and won't necessarily give you the same node.  Maybe subnets need to exist on all
> nodes but everything else can be partitioned?

Good point! Subnets are stored kind of separately to allow prefix matches 
anyway. However, I am a bit hesitant as it would become a quite complex 

> There would also need to be a method for
> re-distributing the data if the cluster configuration changes due to nodes being added or removed.

Right, that's exactly what I was thinking of. I guess this also applies 
to other use cases that will use HRW. I am just not sure whether 
dynamic layout changes are out of scope at the moment...
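
To make the redistribution concern a bit more concrete, here is a minimal
Python sketch of HRW (highest random weight, also known as rendezvous)
hashing. It is not Bro or Broker code: the node names, the indicator keys and
the helpers hrw_node, placement and moved_keys are all made up for
illustration, and SHA-256 is just an arbitrary choice of hash. The point it
tries to show is that when a node leaves, only the keys that node owned need
a new home.

```python
import hashlib


def hrw_node(key, nodes):
    """Pick the node with the highest hash weight for this key (HRW)."""
    def weight(node):
        # Hash node name and key together; the first 8 bytes act as the weight.
        digest = hashlib.sha256(f"{node}|{key}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")
    return max(nodes, key=weight)


def placement(keys, nodes):
    """Map every key to the node that would own it in this layout."""
    return {key: hrw_node(key, nodes) for key in keys}


def moved_keys(keys, old_nodes, new_nodes):
    """Keys whose owning node differs between two cluster layouts."""
    old = placement(keys, old_nodes)
    new = placement(keys, new_nodes)
    return sorted(key for key in keys if old[key] != new[key])


if __name__ == "__main__":
    # Hypothetical data nodes and intel indicators.
    nodes = ["data-1", "data-2", "data-3", "data-4"]
    keys = [f"indicator-{i}.example.com" for i in range(200)]

    # Simulate data-4 going away and count what has to be re-distributed.
    moved = moved_keys(keys, nodes, nodes[:-1])
    print(f"{len(moved)} of {len(keys)} keys need a new node")
```

With four nodes, roughly a quarter of the keys should move when one node
disappears. The sketch only computes the new owner; how the data itself would
be shipped to that node is exactly the open question above.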

> 'Each data node serving a part of a cluster' is kind of like what we have now with proxies,
> but that is statically configured and has no support for failover.  I've seen cluster setups where
> there are 4 worker boxes and run one proxy on each box.  The problem is if one box goes down,
> 1/4 of the workers on the remaining 3 boxes are configured to use a proxy that no longer exists.
> So minimally just having a copy of the data in another process and using RR would be an improvement.
> There may be an issue with scaling out data nodes to 8+ processes for things like scan detection and sumstats,
> if those 8 data nodes would also need to have a full copy of the intel data in memory. I don't know how much
> memory a large intel data set is inside a running bro process though.

Fully agreed! In that case it might be nice if one can define separate 
special purpose data nodes, e.g. "intel data nodes". But, I am not sure 
whether this is a good idea as this might lead to complex cluster 
definitions and poor usability as users need to know a bit about how the 
underlying mechanisms work. On the other hand, this would theoretically 
allow completely decoupling the intel data store (e.g. interfacing a 
"real" database with some pybroker-scripts).

