[Bro-Dev] Broker::publish API
Jan Grashöfer
jan.grashoefer at gmail.com
Tue Aug 7 03:05:53 PDT 2018
To be honest, I have somehow lost track of the discussion. As far as I
can recall, it's about simplifying the API in light of multi-hop
routing, which is not fully functional yet.
Regarding multi-hop routing, I am not even sure what goal we are
currently aiming at. However, from a conceptual perspective I think
"routing" needs either routing algorithms or strict conventions for how
the network that messages are routed through is structured. So, what
would a "deep cluster" look like and what kind of message flows do we
expect in there?
Some comments on the observations:
On 06/08/18 21:50, Robin Sommer wrote:
> - The main topics are bro/cluster/<node-type> and
> bro/cluster/node/<name>. For these we wouldn't have a problem
> with loops if we enabled automatic, topic-driven forwarding as
> far as I can see.
How does forwarding work if I add another node type? Do we assume a
certain cluster structure here? If yes: Is that a valid assumption?
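
To make the question concrete, here is a toy model in Python. It is not
Broker code: the node names, the "archiver" node type, the chain
topology and the forwarding rule (forward to every direct peer with a
matching subscription, except the sender) are all assumptions for
illustration.

```python
def matches(subscription, topic):
    # Assumption: prefix matching, i.e. a subscription covers every
    # topic that starts with it.
    return topic.startswith(subscription)


def deliver(peers, subs, origin, topic, ttl=8):
    """Count how often each node receives `topic` published at `origin`.

    A node forwards a message to every direct peer with a matching
    subscription, except the peer it got the message from. `ttl`
    bounds the number of hops.
    """
    received = {}
    queue = [(origin, None, ttl)]
    while queue:
        node, sender, hops = queue.pop(0)
        if hops == 0:
            continue
        for peer in peers[node]:
            if peer == sender:
                continue
            if any(matches(s, topic) for s in subs[peer]):
                received[peer] = received.get(peer, 0) + 1
                queue.append((peer, node, hops - 1))
    return received


# Hypothetical chain: worker-1 <-> manager <-> proxy-1 <-> archiver-1
peers = {
    "worker-1": ["manager"],
    "manager": ["worker-1", "proxy-1"],
    "proxy-1": ["manager", "archiver-1"],
    "archiver-1": ["proxy-1"],
}
subs = {
    "worker-1": ["bro/cluster/worker", "bro/cluster/node/worker-1"],
    "manager": ["bro/cluster/manager", "bro/cluster/node/manager"],
    "proxy-1": ["bro/cluster/proxy", "bro/cluster/node/proxy-1"],
    "archiver-1": ["bro/cluster/archiver", "bro/cluster/node/archiver-1"],
}

# Nobody between worker-1 and archiver-1 subscribes to the new topic,
# so the message never leaves worker-1.
stranded = deliver(peers, subs, "worker-1", "bro/cluster/archiver")

# It only arrives if the intermediate nodes subscribe as well, and then
# they receive the message themselves, too.
relay_subs = {node: list(topics) for node, topics in subs.items()}
relay_subs["manager"].append("bro/cluster/archiver")
relay_subs["proxy-1"].append("bro/cluster/archiver")
relayed = deliver(peers, relay_subs, "worker-1", "bro/cluster/archiver")
```

In this model, whether a message for a new node type arrives depends
entirely on where the new node is attached and on what the nodes in
between subscribe to, which is the structural assumption I am asking
about.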
> - bro/cluster/broadcast seems to be the main case with a looping
> problem, because everybody subscribes to it. It's hardly used
> though. (bro/config/change is used similarly though).
The topic-concept is a multicast scheme, isn't it? Having a broadcast
functionality on top of that feels odd. However, it's limited to the
cluster topic. This leads me to the question of which domains we operate
on. If I think of messages, I start to think about a cluster, but that
might be only one domain of application. I think it would be good to
define the layers of abstraction more precisely here.
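
The looping problem with a topic everybody subscribes to can be
sketched in a few lines of Python. Again this is a toy model, not
Broker's implementation: the triangle topology, the hop limit and the
"seen" set used for duplicate suppression are assumptions.

```python
def flood(peers, origin, ttl, dedup):
    """Count link transmissions for one message that every node wants.

    Each node forwards to all direct peers except the sender. With
    `dedup`, a node that has already seen the message drops it instead
    of forwarding it again. `ttl` bounds the number of hops.
    """
    seen = {origin}
    transmissions = 0
    queue = [(origin, None, ttl)]
    while queue:
        node, sender, hops = queue.pop(0)
        if hops == 0:
            continue
        for peer in peers[node]:
            if peer == sender:
                continue
            transmissions += 1
            if dedup:
                if peer in seen:
                    continue  # duplicate, do not forward again
                seen.add(peer)
            queue.append((peer, node, hops - 1))
    return transmissions


# Hypothetical fully meshed triangle, all nodes subscribed to the
# broadcast topic.
triangle = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}

looping = flood(triangle, "a", ttl=4, dedup=False)
bounded = flood(triangle, "a", ttl=4, dedup=True)
```

Without duplicate suppression the message keeps circulating until the
hop limit ends it, so the traffic grows with the limit. With
suppression it stays constant for this topology.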
> - There are a couple of script-specific topics where I'm wondering
> if these could switch to using bro/cluster/<node-type> instead
> (bro/intel/*, bro/irc/dcc_transfer_update). In other words: when
> clusterizing scripts, prefer not to introduce new topics.
From my understanding this would mean going back to the old
communication patterns. What's the point of having topics if we don't
use them?
> - There's a lot of checks in publishing code of the type "if I am
> (not) of node type X".
That's something I would have expected. I don't think this is
necessarily an indicator of bad design. Having these kinds of checks
means that roles are somehow fixed and responsibilities are explicitly
codified.
> - Pools are used for two different things: 1. the known-* scripts
> pick a proxy to process and log the information; whereas 2. the
> Intel scripts pick a proxy just as a relay to broadcast stuff
> out, reducing load. That 1st application is a good one, but the 2nd
> feels like it should be handled differently.
I think we should be careful about introducing too many abstractions.
Communication patterns tend to be complex, and the more of that
complexity is hidden, the easier it will be to generate
misunderstandings. For example, in the case of the intel framework,
proxy nodes might at some point implement more logic than just
relaying. Having the relay abstraction would then mean dealing with two
different levels of abstraction regarding intel on proxy nodes.
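
For reference, the first use of pools (pick a proxy that owns a piece
of information) can be sketched as key-based selection. The Python
below uses rendezvous (highest-random-weight) hashing as one possible
scheme. The function name, the pool contents and the choice of SHA-256
are assumptions, not the actual Bro implementation.

```python
import hashlib


def pick_proxy(pool, key):
    """Pick the pool member with the highest hash of (node, key).

    The same key always maps to the same node, and removing a node
    only remaps the keys that node was responsible for.
    """
    def weight(node):
        digest = hashlib.sha256(f"{node}|{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(pool, key=weight)


# Hypothetical proxy pool.
pool = ["proxy-1", "proxy-2", "proxy-3"]

owner = pick_proxy(pool, "10.0.0.1")

# Removing a proxy that was not chosen leaves the choice unchanged.
others = [p for p in pool if p != owner]
smaller_pool = [owner] + others[1:]
same_owner = pick_proxy(smaller_pool, "10.0.0.1")
```

For the known-* scripts the chosen proxy is the owner of the key, so a
stable mapping matters. For the intel case the proxy is only a relay,
so any member of the pool would do, which is why the two uses feel
different.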
> Overall I have to say I found it pretty hard to follow this all
> because we don't have much consistency right now in how scripts
> structure their communication. That's not surprising, given that we're
> just starting to use all this, but it suggests that we have room for
> improvement in our abstractions. :)
I totally agree here! I think it could help to come up with some more
use cases to identify the best abstractions.
Jan