[Bro-Dev] design summary: porting Bro scripts to use Broker
Azoff, Justin S
jazoff at illinois.edu
Fri Oct 6 10:40:11 PDT 2017
> On Oct 6, 2017, at 12:53 PM, Siwek, Jon <jsiwek at illinois.edu> wrote:
> I want to check if there’s any feedback on the approach I’m planning to take when porting over Bro’s scripts to use Broker. There are two major areas to consider: (1) how users specify network topology, e.g. either for traditional cluster configuration or manually connecting Bro instances, and (2) replacing &synchronized with Broker’s distributed data storage features.
> Then subscriptions and auto-publications still get automatically set up by the cluster framework in bro_init().
> Other Manual/Custom Topologies
> I don’t see anything to do here as the Broker API already has enough to set up peerings and subscriptions in arbitrary ways. The old “communication” framework scripts can just go away as most of their functions have direct counterparts in the new “broker” framework.
> The one thing that is missing is the “Communication::nodes” table, which acts as both a state-tracking structure and an API that users may use to have the comm. framework automatically set up connections between the nodes in the table. I find this redundant: there are two APIs to accomplish the same thing, with the table being an additional layer of indirection over the connect/listen functions a user can just as easily call directly. I also think it’s not useful for state-tracking, since a user operating at this level can easily track nodes themselves, or already has some other notion of the state structures they need that is more intuitive for the particular problem they’re solving. Unless there are arguments for it, or I find it’s actually needed, I don’t plan to port this to Broker.
I had some feedback related to this sort of thing earlier in the year:
I got send_event_hashed to work via a bit of a hack (https://github.com/JustinAzoff/broker_distributed_events/blob/master/distributed_broker.bro),
but it needs support from inside broker or at least the bro/broker integration to work properly in the case of node failure.
My ultimate vision is a cluster with 2+ physical datanode/manager/logger boxes where one box can fail and the cluster will continue to function perfectly.
The only thing this requires is a send_event_hashed function that does consistent ring hashing and is aware of node failure.
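To make the idea concrete, here is a minimal sketch (in Python, purely for illustration) of the consistent ring hashing that such a send_event_hashed would need. The class name, node names, vnode count, and hash function are all assumptions, not any existing Bro/Broker API; the point is the failure property: when one data node drops out of the ring, only the keys that hashed to that node get remapped, and everything else keeps its assignment.

```python
# Hypothetical consistent-hash ring: keys map to nodes via vnode
# positions on a ring; removing a failed node only remaps the keys
# that had been assigned to it.
import bisect
import hashlib


class HashRing:
    def __init__(self, nodes, vnodes=64):
        # Each node gets `vnodes` positions on the ring for balance.
        self._ring = []  # sorted list of (hash, node)
        for node in nodes:
            for i in range(vnodes):
                self._ring.append((self._hash("%s:%d" % (node, i)), node))
        self._ring.sort()

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key):
        # The key belongs to the first vnode clockwise from its hash.
        h = self._hash(key)
        idx = bisect.bisect(self._ring, (h, "")) % len(self._ring)
        return self._ring[idx][1]

    def remove_node(self, node):
        # On failure, drop only that node's vnodes from the ring.
        self._ring = [(h, n) for h, n in self._ring if n != node]


ring = HashRing(["data-1", "data-2", "data-3"])
before = {k: ring.node_for(k) for k in map(str, range(1000))}
ring.remove_node("data-2")
after = {k: ring.node_for(k) for k in before}
# Only keys that had mapped to the failed node move.
moved = [k for k in before if before[k] != after[k]]
```

With this property, a failed data node only disturbs the partition of state it owned; the surviving nodes keep serving their keys untouched, which is what lets the 2+-box cluster keep functioning.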
For things that don't necessarily need consistent partitioning - like logs, if you were using Kafka - a way to designate that a topic should be distributed round-robin among subscribers would be useful too.
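The round-robin case is much simpler, since no key affinity is needed. A minimal sketch (again in Python, with illustrative names - Broker has no such API today) of publishing each message to exactly one subscriber in rotation:

```python
# Hypothetical round-robin topic: each publish is delivered to exactly
# one subscriber, rotating through the subscriber list in order.
from itertools import cycle


class RoundRobinTopic:
    def __init__(self, subscribers):
        self._next = cycle(subscribers)

    def publish(self, message):
        # Pick the next subscriber in rotation and deliver to it alone.
        target = next(self._next)
        return (target, message)


topic = RoundRobinTopic(["logger-1", "logger-2"])
deliveries = [topic.publish("log-%d" % i) for i in range(4)]
```

The difference from the hashed case is that any subscriber may receive any message, so this only fits workloads (like log writing) where messages are independent of each other.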