[Bro-Dev] design summary: porting Bro scripts to use Broker

Azoff, Justin S jazoff at illinois.edu
Mon Oct 9 11:46:30 PDT 2017


> On Oct 9, 2017, at 2:08 PM, Siwek, Jon <jsiwek at illinois.edu> wrote:
> 
> 
>> I got send_event_hashed to work via a bit of a hack (https://github.com/JustinAzoff/broker_distributed_events/blob/master/distributed_broker.bro),
>> but it needs support from inside broker or at least the bro/broker integration to work properly in the case of node failure.
>> 
>> My ultimate vision is a cluster with 2+ physical datanode/manager/logger boxes where one box can fail and the cluster will continue to function perfectly.
>> The only thing this requires is a send_event_hashed function that does consistent ring hashing and is aware of node failure.
> 
> Yeah, that sounds like a good idea that I can try to work into the design.  What is a “data node” though?  We don’t currently have that?

We did at one point, see

topic/seth/broker-merge / topic/mfischer/broker-integration

The data node replaced the proxies and handled work related to Broker data stores.

I think the idea was that a data node process would own the broker data store.

My usage of data nodes was for scaling out data aggregation; I never did anything with the data stores.  The data nodes were just a place to stream scan attempts to for aggregation.
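
To make the send_event_hashed idea concrete, here is roughly the kind of consistent ring hashing I have in mind.  This is a plain Python sketch, not the actual distributed_broker.bro hack; the node names, vnode count, and hash function are all placeholders:

import bisect
import hashlib

class HashRing:
    # Consistent hash ring with virtual nodes.  When a node fails
    # and is removed, only the keys that hashed to it get remapped;
    # keys owned by the surviving nodes stay put.

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self.ring = []  # sorted list of (hash, node) points
        for node in nodes:
            self.add_node(node)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.vnodes):
            self.ring.append((self._hash("%s-%d" % (node, i)), node))
        self.ring.sort()

    def remove_node(self, node):
        # This is the part that needs support from bro/broker:
        # something has to notice the peer went away and call this.
        self.ring = [(h, n) for (h, n) in self.ring if n != node]

    def node_for(self, key):
        i = bisect.bisect(self.ring, (self._hash(key), ""))
        return self.ring[i % len(self.ring)][1]

ring = HashRing(["data-1", "data-2", "data-3"])
ring.node_for("1.2.3.4")    # every worker picks the same node for a key
ring.remove_node("data-3")  # only data-3's keys get remapped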

> More broadly, it sounds like a user needs a way to specify which nodes they want to belong to a worker pool.  Do you still imagine that is done like you had in the example broctl.cfg from the earlier thread?  Do you need to be able to specify more than one type of pool?

People have asked for this now as a solution for fixing an overloaded manager process, but if we get load balancing/failover working, as well as QoS/priorities, there may not be a point in statically configuring things like that.  For example, someone might want to do

# a node for tracking spam
[spam]
type = data/spam

# a node for sumstats
[sumstats]
type = data/sumstats

# a node for known hosts/certs/etc tracking
[known]
type = data/known

But I think just having the ability to do

[data]
type = data
lb_procs = 6

would work better for everyone.  Sending one type of data to one type of data node is still going to eventually overload a single process.
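
As a toy illustration of that last point (Python again, with made-up event counts):

from collections import Counter

# 1000 spam-tracking events vs. 10 known-hosts events
events = [("spam", "key-%d" % i) for i in range(1000)] + \
         [("known", "key-%d" % i) for i in range(10)]

# static per-type nodes: the spam node takes all 1000 events
print(Counter(dtype for dtype, key in events))

# hashing the key across a generic 6-proc pool spreads the load,
# roughly 168 events per proc no matter which type spikes
print(Counter(hash((dtype, key)) % 6 for dtype, key in events))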

>> For things that don't necessarily need consistent partitioning - like maybe logs if you were using Kafka - a way to designate that a topic should be distributed round-robin between subscribers would be useful too.
> 
> Yeah, that seems like it would require pretty much the same set of functionality to get working, and then the user can just specify a different function to use for distributing events (e.g. hash vs. round-robin).
> 
> - Jon

Great!  Right now broctl configures this in a 'round-robin' type of way by assigning every other worker to a different logger node.  With support for this in broker, it could just connect every worker to every logger process and let broker handle the load balancing/failover.
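
The failover half of that is the easy part.  A hypothetical sketch (Python again, not Broker's actual API) of what broker could do per topic:

import itertools

class RoundRobin:
    # Cycle through live subscribers, dropping dead ones.

    def __init__(self, subscribers):
        self.live = list(subscribers)
        self.cycle = itertools.cycle(self.live)

    def mark_dead(self, sub):
        # called when the peering to a logger drops
        self.live.remove(sub)
        self.cycle = itertools.cycle(self.live)

    def next_subscriber(self):
        return next(self.cycle)

rr = RoundRobin(["logger-1", "logger-2"])
rr.next_subscriber()     # logger-1
rr.next_subscriber()     # logger-2
rr.mark_dead("logger-2")
rr.next_subscriber()     # logger-1 again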



-- 
Justin Azoff



