[Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)

Siwek, Jon jsiwek at illinois.edu
Tue Oct 31 15:35:23 PDT 2017

> On Oct 31, 2017, at 1:16 PM, Robin Sommer <robin at icir.org> wrote:
>    - One thing I can't quite tell is if this is still aiming to
>      maintain compatibility with the old communication system, like
>      by keeping the proxies and also the *_events patterns. Looking
>      at setup-connections, it seems so. I'd say just go ahead and
>      remove all legacy pieces. Maintain two schemes in parallel is
>      cumbersome, and I think it's fine to just force everything over
>      to Broker.

It does keep the old functionality around if one does “redef Cluster::use_broker=F", but the “T” branch of the code doesn’t aim to maintain compatibility.  For the moment, I like having the old functionality around as reference: e.g. as I port other scripts/frameworks I may find it helpful to switch back to the old version to test/compare what it was doing.  I’ve made a note to remove it after I get everything working/stable.

The "use_broker=T” code branch does keep the notion of proxies the same (at least in the way the connect to other nodes in the cluster).  My thought was they can conceptually still be used for the same type of stuff: data sharing and offloading other misc. analysis/calculation.  And the only change to the setup I think I’d have to make is that each worker would now connect with all proxies instead of just one and proxies do not connect to each other.

I’ve also been talking with Justin and it seems like he wants the ability for there to be multiple logger nodes in a cluster w/ ability to distribute logs between then, which seems possible, but would need some API changes in Bro to get that working (e.g. change the static log topic to one returned by user-defined function).  I think he also had been expecting ‘data’ nodes to be a thing (not sure how those differ from proxies), so generally I’m worried I missed a previous discussion on what people expect the new cluster layout to look like or maybe just no one has put forth a coherent plan/design for that yet?

>    - Is the idea for the "*_topic" constants that one just picks the
>      apppropiate one when sending events? Like if I want to publish
>      something to all workers, I'd publish to Cluster::worker_topic?

Yeah, you have the idea right.

>      I think that's fine, though I'm wondering if we could compress
>      the API there somehow so that Cluster doesn't need to export all
>      those constants indvidiually. One idea would be a function that
>      returns a topic based on node type?

Yeah, could do that, but also don't really see the problem with exporting things individually.  At least that way, the topic strings are guaranteed to be correct in the generated docs.  With a function, you’d have to maintain the topic strings in two places: the docs and the internal function implementation, which may seem trivial to get right, but I’ve seen enough instances of outdated documentation that I have doubts...

>    - I like the Pools! How about moving Pool with its functions out
>      of the main.bro, just for clarity.


>    - Looks like the hello/bye events are broadcasted by all nodes. Is
>      that on purpose, or should that be limited to just one, like
>      just the master sending them out? Or does it not matter and this
>      provides for more redundancy?

Mostly on purpose.  The point of the “hello” message is to map broker node ID to cluster node name.  E.g. node IDs provided by Broker::peer_added are a hash of a MAC address concatenated w/ process ID (hard to read and associate with a cluster node) and node names are “manager”, “worker-1”, etc.  At the point where two nodes connect, I don’t think we have any other information other than node IDs and we need the node names to be able to send more directed messages, thus the broadcast.  At least I don’t think there’s another way to send directed messages (e.g. based on node ID) in Bro’s current API, maybe I missed it?

And the “bye” event is only raised locally so users can potentially handle it to know when a cluster node goes down (i.e. it gives them the friendlier node name rather than the broker node ID that you’d get from handling Broker::peer_lost).

I might generally be missing some context here: I remember broker endpoints originally being able to self-identify with the friendly names, so these new hello/bye events wouldn’t have been needed, but it didn’t seem like that functionality was around anymore.

>    - create_store() vs "stores": Is the idea that I'd normally use
>      create_store() and that populates the table, but I could also
>      redef it myself instead of using create_store() to create more
>      custom entries? If so, maybe make that a bit more explicit in
>      the comments that there're two ways to configure that table.

That’s right, I’ll improve the docs.

- Jon

More information about the bro-dev mailing list