From daviderobusto at gmail.com Wed Jun 10 08:06:57 2020
From: daviderobusto at gmail.com (Davide Robusto)
Date: Wed, 10 Jun 2020 17:06:57 +0200
Subject: [Zeek-Dev] Use zeek scripts only with the "manager" in a cluster configuration
Message-ID:

Hi,

I see abnormal behavior when I use the same script in two different configurations:

1. Zeek single-thread configuration
2. Zeek configuration with four cores and four workers.

When I start the script in the first mode (Zeek on a single thread), execution follows the code linearly, so the instructions are executed without any abnormal behavior.

When I start the script in the second mode (Zeek multithreaded with four cores and four workers), the code starts correctly, but at some point the workers are seen performing actions randomly.

Is it possible to inhibit the use of code fragments for workers? Can Zeek scripts be restricted to Zeek's "manager" or "logger", so as to prevent "workers" from doing unwanted actions? If yes, is it also possible to make a "worker" able to use some code fragments instead of others?

From jsiwek at corelight.com Wed Jun 10 09:28:13 2020
From: jsiwek at corelight.com (Jon Siwek)
Date: Wed, 10 Jun 2020 09:28:13 -0700
Subject: [Zeek-Dev] Use zeek scripts only with the "manager" in a cluster configuration
In-Reply-To:
References:
Message-ID:

On Wed, Jun 10, 2020 at 8:14 AM Davide Robusto wrote:

> Is it possible to inhibit the use of code fragments for workers?
> Can Zeek scripts be restricted to Zeek's "manager" or "logger",
> so as to prevent "workers" from doing unwanted actions? If yes, is it also possible to make a "worker"
> able to use some code fragments instead of others?

See if these examples help:

https://docs.zeek.org/en/current/frameworks/broker.html#cluster-framework-examples

This particular one shows 3 different ways to conditionalize the execution of code based on either the node type or a specific node's name:

https://docs.zeek.org/en/current/frameworks/broker.html#manager-sending-events-to-workers

- Jon

From daviderobusto at gmail.com Wed Jun 10 10:20:38 2020
From: daviderobusto at gmail.com (Davide Robusto)
Date: Wed, 10 Jun 2020 19:20:38 +0200
Subject: [Zeek-Dev] Use zeek scripts only with the "manager" in a cluster configuration
In-Reply-To:
References:
Message-ID:

Awesome, thanks!

From jsiwek at corelight.com Wed Jun 17 19:07:53 2020
From: jsiwek at corelight.com (Jon Siwek)
Date: Wed, 17 Jun 2020 19:07:53 -0700
Subject: [Zeek-Dev] Zeek Supervisor Command-Line Client
Message-ID:

Don't recall any basic "project infrastructure" discussions happening yet for the upcoming replacement/alternative for ZeekControl that we want to introduce in Zeek 3.2 (roadmap/design links found at [1]), so here's starting questions.

# What to Name It ?

Suggestion: `zeekcl`, Zeek (Command-Line) CLient.

Open to ideas, but will use `zeekcl` below.

# What Programming Language ?

`zeekcl` has different/narrower scope than ZeekControl. It's more clearly a "client" with sole job of handling requests/responses via Broker without many (any?) system-level operations/integrations. Meaning there may be less of an approachability/convenience gap between C++ versus Python with `zeekcl` than there was with ZeekControl.

Also nice if `zeekcl` doesn't require more dependencies beyond what `zeek` needs since they're expected to be used together.

Is use of Python still desirable for other reasons? Otherwise, I lean towards `zeekcl` being C++.

For reference/sanity-check in terms of what people expect `zeekcl` to be: in my testing of the SupervisorControl framework [2], I had a sloppy Zeek script implementing the full "client side" (essentially the majority of what `zeekcl` will do) in ~100 LOC. Most operations are that simple: send request and display response.

That does mean the third option to consider besides either Python or C++ is Zeek's scripting language (e.g. `ctl.zeek`), but I don't suggest that since (1) using a full `zeek` process is way more than we need and (2) the command-line interface is awkward (`zeek ctl Supervisor::cmd="status"` versus `zeekcl status`).

# Where's the Source Code Live ?

Past experiences with ZeekControl being in a separate repo than Zeek are negative in terms of CI/testing: changes in Zeek have broken ZeekControl, but go uncaught for a while since it is tested independently.

Since common use/maintenance will involve both `zeek` and `zeekcl`, and also don't expect the latter to accrue large amounts of code deserving of a separate project, I plan to have `zeekcl` code/tests live inside the main Zeek repo.

- Jon

[1] https://github.com/zeek/zeek/issues/582
[2] https://github.com/zeek/zeek/blob/689a242836092fba7818ba24724b74a7a7902e48/scripts/base/frameworks/supervisor/control.zeek

From vlad at es.net Wed Jun 17 20:32:52 2020
From: vlad at es.net (Vlad Grigorescu)
Date: Thu, 18 Jun 2020 03:32:52 +0000
Subject: [Zeek-Dev] Zeek Supervisor Command-Line Client
In-Reply-To:
References:
Message-ID:

I'm still fuzzy on the Supervisor framework, as we're still in the process of upgrading systems to the point of supporting the new C++ requirements.

As a concrete example, what does a cluster upgrade look like?
Today, that means install the new version on the manager, and then do `zeekctl deploy`, which copies the files to the nodes and restarts the cluster. All of that is done without Broker.

What does that look like with zeekcl + Broker? Let's say I install the new version on the manager. If I then tell zeekcl to destroy the running instance, will that work, or will the newer zeekcl be incompatible with the Broker version of the running Zeek?

Reading the script linked in [2], I notice that zeekcl would not support copying files from one node to another? Other features that would be missing that we routinely use are `zeekctl print` and `zeekctl exec`.

I'm assuming `zeekcl` would be running in some uber-bare mode if it's written in Zeek?

--Vlad

On Thu, Jun 18, 2020 at 2:15 AM Jon Siwek wrote:

> Don't recall any basic "project infrastructure" discussions happening
> yet for the upcoming replacement/alternative for ZeekControl that we
> want to introduce in Zeek 3.2 (roadmap/design links found at [1]), so
> here's starting questions.
>
> # What to Name It ?
>
> Suggestion: `zeekcl`, Zeek (Command-Line) CLient.
>
> Open to ideas, but will use `zeekcl` below.
>
> # What Programming Language ?
>
> `zeekcl` has different/narrower scope than ZeekControl. It's more
> clearly a "client" with sole job of handling requests/responses via
> Broker without many (any?) system-level operations/integrations.
> Meaning there may be less of an approachability/convenience gap
> between C++ versus Python with `zeekcl` than there was with
> ZeekControl.
>
> Also nice if `zeekcl` doesn't require more dependencies beyond what
> `zeek` needs since they're expected to be used together.
>
> Is use of Python still desirable for other reasons? Otherwise, I lean
> towards `zeekcl` being C++.

From robin at corelight.com Wed Jun 17 23:34:00 2020
From: robin at corelight.com (Robin Sommer)
Date: Thu, 18 Jun 2020 06:34:00 +0000
Subject: [Zeek-Dev] Zeek Supervisor Command-Line Client
In-Reply-To:
References:
Message-ID: <20200618063400.GG9200@corelight.com>

> Suggestion: `zeekcl`, Zeek (Command-Line) CLient.

"zeekcl" is very close to "zeekctl", which could lead to confusion. "zcl" maybe?

> Is use of Python still desirable for other reasons? Otherwise, I lean
> towards `zeekcl` being C++.

No particular preference from my side, I can see either. Effort is probably about the same in this model, and C++ does have the advantage of less dependency issues.

> Zeek's scripting language (e.g. `ctl.zeek`), but I don't suggest that

Ack, agree.

> I plan to have `zeekcl` code/tests live inside the main Zeek repo.

Makes sense to me as well.

Robin

--
Robin Sommer * Corelight, Inc. * robin at corelight.com * www.corelight.com

From robin at corelight.com Thu Jun 18 00:11:41 2020
From: robin at corelight.com (Robin Sommer)
Date: Thu, 18 Jun 2020 07:11:41 +0000
Subject: [Zeek-Dev] Zeek Supervisor Command-Line Client
In-Reply-To:
References:
Message-ID: <20200618071141.GH9200@corelight.com>

On Thu, Jun 18, 2020 at 03:32 +0000, Vlad Grigorescu wrote:

> As a concrete example, what does a cluster upgrade look like?

The idea is to handle this more like other system services: you'll be in charge of getting the new Zeek version onto all your systems yourself, using whatever method you use for other software as well. For example, if you're installing through a package manager, you'd just run "update" on all systems. If you're installing from source, you'll either need to compile on each system, or copy the installation over manually.

The underlying assumption is that people will already have a mechanism in place for administration of their systems, and we shouldn't be trying to reinvent the wheel, as ZeekControl oddly does. From a sysadmin perspective, ZeekControl is really doing a lot more right now than it should be doing; other tools don't work that way. We don't want it to look like an APT anymore (https://github.com/zeek/zeek/issues/259). :-)

> Today, that means install the new version on the manager, and then do
> `zeekctl deploy`, which copies the files to the nodes and restarts the
> cluster. All of that is done without Broker.

There are two parts here: (1) deploying the Zeek installation itself, and (2) deploying any configuration changes (incl. new Zeek scripts).

For (1), the above applies: we'll rely on standard sysadmin processes for updating. That means you'd use "zeekcl" to shutdown the cluster processes, then run "yum update" (or whatever), then use "zeekcl" again to start things up again. (The Zeek supervisor will be running already at that point, managed through systemd or whatever you're using).

(2) is still a bit up in the air. With 3.2, there won't be any support for distributing configurations automatically, but we could add that so that config files/scripts/packages do get copied around over Broker.
Feedback would be appreciated here: What's better, having zeekcl manage that, or leave it to standard sysadmin process as well?

> Reading the script linked in [2], I notice that zeekcl would not support
> copying files from one node to another?

Correct right now, (2) may or may not change that.

> zeekctl print

"print" will be supported (roadmap says not in 3.2 yet, but it should be easy to do, maybe we can get it in still).

> zeekctl exec.

"exec" will likely not be supported. We *could* support it, no technical reason for not doing that over Broker. It just seems like another thing that's better handled with different tools.

Robin

--
Robin Sommer * Corelight, Inc. * robin at corelight.com * www.corelight.com

From vlad at es.net Thu Jun 18 07:45:33 2020
From: vlad at es.net (Vlad Grigorescu)
Date: Thu, 18 Jun 2020 09:45:33 -0500
Subject: [Zeek-Dev] Zeek Supervisor Command-Line Client
In-Reply-To: <20200618071141.GH9200@corelight.com>
References:
<20200618071141.GH9200@corelight.com>
Message-ID:

Thanks Robin, that helps.

On Thu, Jun 18, 2020 at 2:11 AM Robin Sommer wrote:

>
> There are two parts here: (1) deploying the Zeek installation itself,
> and (2) deploying any configuration changes (incl. new Zeek scripts).
>
> For (1), the above applies: we'll rely on standard sysadmin processes
> for updating. That means you'd use "zeekcl" to shutdown the cluster
> processes, then run "yum update" (or whatever), then use "zeekcl"
> again to start things up again. (The Zeek supervisor will be running
> already at that point, managed through systemd or whatever you're
> using).
>
> (2) is still a bit up in the air. With 3.2, there won't be any support
> for distributing configurations automatically, but we could add that
> so that config files/scripts/packages do get copied around over
> Broker. Feedback would be appreciated here: What's better, having
> zeekcl manage that, or leave it to standard sysadmin process as well?
>

I re-read the design doc, and I think that the part I missed the first time through was suicide on orphaning. (Side-note: Given the much-needed trend towards bias-free terminology in technology, perhaps there's a better term here). My main concern was Broker version incompatibilities between the newly-installed zcl, and the running cluster, which I think would be addressed by that (i.e. to stop a cluster, you stop the supervisor service on the manager, and then the other services will lose their connection and also stop).

I'm still a bit unclear on how to start a cluster. In my mind, where simply using the standard process/job control falls short is the need to operate across multiple physical systems. So, would that be a job for zcl? Or would the desired goal be that I have my, say, systemd unit set to constantly be restarting Zeek on my worker systems? If it can't connect to the manager, it would presumably immediately die given the orphaned state.

The more tightly we couple the nodes together, the more quickly it'll detect failures, but the more sensitive it will be to flapping and unnecessary restarts.

The cluster is relatively fragile right now (e.g. a manager node going away even for a brief period of time tends to lead to a crash, as on even a relatively busy system the backlog won't clear as timers and other events stack up). So I think that if we're moving cluster supervision out of a parallel process in `zeekctl cron` and into Zeek itself, we'll need to improve error detection and graceful recovery where possible.

--Vlad

From jsiwek at corelight.com Thu Jun 18 13:00:44 2020
From: jsiwek at corelight.com (Jon Siwek)
Date: Thu, 18 Jun 2020 13:00:44 -0700
Subject: [Zeek-Dev] Zeek Supervisor Command-Line Client
In-Reply-To: <20200618071141.GH9200@corelight.com>
References:
<20200618071141.GH9200@corelight.com>
Message-ID:

On Thu, Jun 18, 2020 at 12:11 AM Robin Sommer wrote:

> For (1), the above applies: we'll rely on standard sysadmin processes
> for updating. That means you'd use "zeekcl" to shutdown the cluster
> processes, then run "yum update" (or whatever), then use "zeekcl"
> again to start things up again. (The Zeek supervisor will be running
> already at that point, managed through systemd or whatever you're
> using).

I have a slightly different take: isn't it more common to expect "start" and "stop" operations here to be done by the service-manager rather than Zeek client? I'm assuming "update/deploy Zeek installation" could involve a change in the `zeek` binary, which implements the supervisor process itself, so you'd want, at the level of system services, to stop the entire Zeek process tree, including the root supervisor.

That doesn't exclude the possibility of the client having operations like "start" (spawn `zeek -j `), "stop" (kill the root `zeek` supervisor process), or even others that dynamically add/remove cluster nodes from the tree, but that's probably not the common/expected usage to prioritize since it's again back to the model of the process tree being managed manually by the user, independent from a system's service-manager.

- Jon

From jsiwek at corelight.com Thu Jun 18 13:15:14 2020
From: jsiwek at corelight.com (Jon Siwek)
Date: Thu, 18 Jun 2020 13:15:14 -0700
Subject: [Zeek-Dev] Zeek Supervisor Command-Line Client
In-Reply-To:
References:
<20200618071141.GH9200@corelight.com>
Message-ID:

On Thu, Jun 18, 2020 at 7:45 AM Vlad Grigorescu wrote:

> My main concern was Broker version incompatibilities between the newly-installed zcl, and the running cluster, which I think would be addressed by that (i.e. to stop a cluster, you stop the supervisor service on the manager, and then the other services will lose their connection and also stop).

A clarification that may help you: the "orphaning" behavior isn't related to Broker connections, it's related to the parent-child relationship between processes. So there's a process tree here with `zeek` in supervisor-mode at the root and child processes that are individual cluster nodes (worker, manager, logger, proxy). The normal termination behavior for the supervisor process is to gracefully kill and wait for all children to exit. In the very exceptional case of the supervisor exiting/crashing without having cleaned up all children, those children will self-terminate upon noticing they are no longer parented to the supervisor.

- Jon

From johanna at icir.org Thu Jun 18 13:34:14 2020
From: johanna at icir.org (Johanna Amann)
Date: Thu, 18 Jun 2020 13:34:14 -0700
Subject: [Zeek-Dev] Zeek Supervisor Command-Line Client
In-Reply-To: <20200618063400.GG9200@corelight.com>
References: <20200618063400.GG9200@corelight.com>
Message-ID:

>> Suggestion: `zeekcl`, Zeek (Command-Line) CLient.
>
> "zeekcl" is very close to "zeekctl", which could lead to confusion.
> "zcl" maybe?
>
>> Is use of Python still desirable for other reasons? Otherwise, I
>> lean towards `zeekcl` being C++.
>
> No particular preference from my side, I can see either. Effort is
> probably about the same in this model, and C++ does have the advantage
> of less dependency issues.

I agree - I actually kind of like the idea that zeekcl does not have Python as a dependency.

>> I plan to have `zeekcl` code/tests live inside the main Zeek repo.
>
> Makes sense to me as well.

Agreed here too.

Johanna

From robin at corelight.com Fri Jun 19 01:38:10 2020
From: robin at corelight.com (Robin Sommer)
Date: Fri, 19 Jun 2020 08:38:10 +0000
Subject: [Zeek-Dev] Zeek Supervisor Command-Line Client
In-Reply-To:
References:
<20200618071141.GH9200@corelight.com>
Message-ID: <20200619083810.GE49063@corelight.com>

On Thu, Jun 18, 2020 at 13:00 -0700, Jon Siwek wrote:

> > For (1), the above applies: we'll rely on standard sysadmin processes
> > for updating. That means you'd use "zeekcl" to shutdown the cluster
> > processes, then run "yum update" (or whatever), then use "zeekcl"
> > again to start things up again.

> I have a slightly different take: isn't it more common to expect
> "start" and "stop" operations here to be done by the service-manager
> rather than Zeek client?

I believe we're pretty close to saying the same thing. I'm making a distinction between the supervisor Zeek process (which the service manager starts & stops), and the cluster's node processes (manager, workers, etc). The supervisor manages the latter and will by default shut them down when it gets the "stop" from its service-manager.

But I think we also want their state controllable from the client as well, so that one can have an orderly shutdown of a multi-system cluster without loss of data (e.g., one probably wants to shutdown workers first to collect remaining log data). This is what I meant above by "shutdown the cluster processes": "zeek-client stop" would tell the supervisors to shutdown their node processes (or rather: "zeek-client stop workers", or maybe "zeek-client" would know the order in which to stop nodes or systems). And I imagine one would do that before starting a cluster-wide upgrade to the next Zeek version.

That said, your note on Slack sounds right: let's figure out the single-system operation first and get that usable. I'm pretty confident that we will then be able to build the multi-system model on top of that without too much trouble, and it'll be easier to collect requirements for administration/management of multi-system setups once we've got some experience with single-system setups.

Robin

--
Robin Sommer * Corelight, Inc. * robin at corelight.com * www.corelight.com

From jsiwek at corelight.com Fri Jun 19 11:46:14 2020
From: jsiwek at corelight.com (Jon Siwek)
Date: Fri, 19 Jun 2020 11:46:14 -0700
Subject: [Zeek-Dev] Zeek Supervisor Command-Line Client
In-Reply-To: <20200619083810.GE49063@corelight.com>
References:
<20200618071141.GH9200@corelight.com> <20200619083810.GE49063@corelight.com>
Message-ID:

On Fri, Jun 19, 2020 at 1:38 AM Robin Sommer wrote:

> think we also want their state controllable from the client as well,
> so that one can have an orderly shutdown of a multi-system cluster
> without loss of data (e.g., one probably wants to shutdown workers
> first to collect remaining log data). This is what I meant above by
> "shutdown the cluster processes": "zeek-client stop" would tell the
> supervisors to shutdown their node processes (or rather: "zeek-client
> stop workers", or maybe "zeek-client" would know the order in which to
> stop nodes or systems).

Ack, got it and agree that the distinction is likely helpful: the supervisor node implements the low-level "dirty work" of stopping processes and can ensure shutdown of its entire process tree if it really has to, but the client can carry out shutdown logic with a higher-level of insight into directing a shutdown process (possibly across many hosts) in orderly fashion.

Also, based on "naming" feedback: plan to use `zeekc`.

- Jon

From jsiwek at corelight.com Tue Jun 30 01:39:17 2020
From: jsiwek at corelight.com (Jon Siwek)
Date: Tue, 30 Jun 2020 01:39:17 -0700
Subject: [Zeek-Dev] Zeek Supervisor: designing client and log archival behavior
Message-ID:

Looking for feedback on the design/plan for these two Zeek Supervisor components:

* https://github.com/zeek/zeek/wiki/Zeek-Supervisor-Client
* https://github.com/zeek/zeek/wiki/Zeek-Supervisor-Log-Handling

- Jon

From seth at corelight.com Tue Jun 30 06:35:12 2020
From: seth at corelight.com (Seth Hall)
Date: Tue, 30 Jun 2020 09:35:12 -0400
Subject: [Zeek-Dev] Zeek Supervisor Command-Line Client
In-Reply-To:
References:
<20200618071141.GH9200@corelight.com> <20200619083810.GE49063@corelight.com>
Message-ID:

Sorry for chiming in late on this...

On 19 Jun 2020, at 14:46, Jon Siwek wrote:

> Ack, got it and agree that the distinction is likely helpful: the
> supervisor node implements the low-level "dirty work" of stopping
> processes and can ensure shutdown of its entire process tree if it
> really has to, but the client can carry out shutdown logic with a
> higher-level of insight into directing a shutdown process (possibly
> across many hosts) in orderly fashion.

I think that the script we ship with zeek that effectively implements the supervisor behavior should understand the business logic of shutting down a cluster in the correct order. One way to think about it is that the supervisor script will presumably understand the business logic for starting a cluster in the right order so consequently it would seem that it should understand how to shut down the cluster as well.

We talked about it recently and now that I've had some more time to think about it I'm really starting to think that the business logic for correctly starting and stopping a cluster should be fully implemented in the supervisor script. The zeekc tool could then just be a dumb tool that says to start and stop and doesn't end up causing us to spread our logic around to other tooling.

.Seth

--
Seth Hall * Corelight, Inc * www.corelight.com

From robin at corelight.com Tue Jun 30 08:47:26 2020
From: robin at corelight.com (Robin Sommer)
Date: Tue, 30 Jun 2020 15:47:26 +0000
Subject: [Zeek-Dev] Zeek Supervisor Command-Line Client
In-Reply-To:
References:
<20200618071141.GH9200@corelight.com> <20200619083810.GE49063@corelight.com>
Message-ID: <20200630154726.GB33767@corelight.com>

On Tue, Jun 30, 2020 at 09:35 -0400, Seth Hall wrote:

> I think that the script we ship with zeek that effectively implements the
> supervisor behavior should understand the business logic of shutting down a
> cluster in the correct order.

How would that then work across multiple systems?

Robin

--
Robin Sommer * Corelight, Inc. * robin at corelight.com * www.corelight.com

From jsiwek at corelight.com Tue Jun 30 14:29:12 2020
From: jsiwek at corelight.com (Jon Siwek)
Date: Tue, 30 Jun 2020 14:29:12 -0700
Subject: [Zeek-Dev] Zeek Supervisor Command-Line Client
In-Reply-To:
References:
<20200618071141.GH9200@corelight.com> <20200619083810.GE49063@corelight.com>
Message-ID:

On Tue, Jun 30, 2020 at 6:35 AM Seth Hall wrote:

> I'm really starting to think that the business logic for
> correctly starting and stopping a cluster should be fully implemented in
> the supervisor script. The zeekc tool could then just be a dumb tool
> that says to start and stop and doesn't end up causing us to spread our
> logic around to other tooling.

Maybe the important observation is that the logic can be performed anywhere that has access to the Zeek-Supervisor process.

* The Supervisor process itself would be able to perform the logic via direct BIF access.

* External processes, like zeekc, have access to a Zeek-event interface to indirectly access those same BIFs, so they can also execute equivalent logic (either via multiple events, or a single "convenience" event that implements a sequence of BIF calls on remote)

When we bring multi-hosting into the mix, it's still a similar situation, just with beefed up logic for orchestrating node-type-specific steps across many peers: anyone with access to the Zeek-event interface could implement this logic. You could pick zeekc to orchestrate, or you could pick a single Zeek-Supervisor process to orchestrate between other Supervisors, or you could pick a regular Zeek process, or you could write a Python script just using Broker Python bindings, etc.

So where we put the logic at this point may not be important. If we can find a single-best-place for the logic to live, that's great, but if there's utility for others to have their own independent-yet-equivalent logic, I don't see a problem with that.

- Jon
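
To make the "orderly shutdown" idea from the thread a bit more concrete, here is a rough Python sketch of what an orchestrator (zeekc, a Supervisor, or a separate script) might do. It is only an illustration under assumptions: the thread states only that workers should probably stop first; the relative order of the other node types, the node names, and the `send_stop` callback (a stand-in for whatever request would actually go over Broker) are hypothetical and not part of any Zeek or Broker API.

```python
from typing import Callable, Dict, List

# Assumed stop priority. Only "workers first" comes from the thread;
# the ordering of proxy/manager/logger is a guess for illustration.
STOP_PRIORITY = {"worker": 0, "proxy": 1, "manager": 2, "logger": 3}


def stop_order(nodes: Dict[str, str]) -> List[str]:
    """Return node names sorted by stop priority, then by name.

    `nodes` maps a node name to its node type.
    """
    return sorted(nodes, key=lambda name: (STOP_PRIORITY[nodes[name]], name))


def stop_cluster(nodes: Dict[str, str],
                 send_stop: Callable[[str], None]) -> List[str]:
    """Ask each node to stop, one at a time, in the computed order.

    `send_stop` is a hypothetical callback; a real client would send
    a request to the Supervisor here instead.
    """
    stopped = []
    for name in stop_order(nodes):
        send_stop(name)
        stopped.append(name)
    return stopped


if __name__ == "__main__":
    cluster = {
        "manager": "manager",
        "logger-1": "logger",
        "worker-2": "worker",
        "worker-1": "worker",
        "proxy-1": "proxy",
    }
    stop_cluster(cluster, lambda name: print("stop", name))
```

Because the ordering is a pure function of the node list, the same logic could live in the client, in a Supervisor script, or in both, which is the point made above about independent-yet-equivalent implementations.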