[Zeek] Request for Feedback - Zeek Process Supervision Model

Fri Mar 22 07:30:11 PDT 2019

On 19 Mar 2019, at 16:06, Samuel Oehlert wrote:

> Personally, I think it would be poor design to rebuild host OS monitoring
> inside the Zeek supervisor. I think that should be left up to the many
> other projects specifically designed to monitor disk usage, etc. That being
> said, exposing some metrics about Zeek the application layer sounds like it
> would be a win. That being said, that might be outside the scope of a
> supervisor as well.

Just tell yourself that all of the processes that are being spawned and 
supervised are just threads and then you may think about this project 
differently.  The fact that we will be spawning and monitoring child 
processes is merely an implementation detail.  If we chose to offload the 
responsibility for starting and managing all of the processes to something 
like systemd then it would specifically tie us to systemd (and we 
definitely don't want to maintain compatibility with multiple 
supervisors).

The benefit to this approach is that from the OS perspective it's easy 
to run under any system supervisor and in Docker since it effectively 
has the same model of "run in the foreground and monitor that the 
process is still alive".  There is an additional benefit too: we've 
been discussing doing an "early fork" of the supervisor process so that 
all of the cluster processes derive from the same binary (same initial 
memory image).  You can think of the forked process like a stem cell: 
the supervisor can tell it to fork again and specialize into a 
particular cluster process.  This has the benefit of being sure that 
all of the processes are running the same code.  
Otherwise, if systemd restarted one of the workers and the binary on 
disk had changed in the intervening time it would end up being a 
different process (a different version of Zeek?).  I know it's a somewhat 
contrived example, but it's always surprising to see the problems that 
get encountered in the real world, so the more potential problems we 
can avoid up front in the design, the better.
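
To make the fork-and-specialize idea concrete, here is a minimal sketch 
in Python.  It is not the prototype's code: BUILD_ID, specialize(), 
spawn() and run_cluster() are hypothetical names made up for 
illustration, the supervisor plays the stem-cell role itself instead of 
keeping a separate stem process, and each child exits immediately 
rather than running as a long-lived cluster node.  The point it shows 
is that a child created with fork() and no exec() inherits whatever the 
parent loaded at startup, so the binary on disk is never consulted 
again.

```python
import os

# Stand-in for state loaded once at startup (hypothetical, not a Zeek name).
BUILD_ID = "image-loaded-at-startup"


def specialize(role, out_fd):
    """Runs in the forked child: take on a role and report back."""
    message = f"{role} {BUILD_ID}"
    os.write(out_fd, message.encode())
    os.close(out_fd)
    os._exit(0)  # exit without running the parent's cleanup handlers


def spawn(role):
    """Fork a child from the in-memory image; there is no exec()."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(read_fd)
        specialize(role, write_fd)  # never returns
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        report = reader.read().decode()  # reads until the child closes its end
    _, status = os.waitpid(pid, 0)
    return report, os.WEXITSTATUS(status)


def run_cluster(roles):
    """Spawn one child per role, one after another, and collect reports."""
    return [spawn(role) for role in roles]


if __name__ == "__main__":
    for report, code in run_cluster(["manager", "proxy", "worker"]):
        print(report, "exit", code)
```

Every child reports the same BUILD_ID because it was copied from the 
parent's memory at fork time.  A real supervisor would keep the 
children running and fork a replacement when one dies, which is where 
the "same image on restart" property pays off.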

Another benefit to this approach is that a full cluster can be started 
from the command line really easily and will run in the foreground.  
It's been really fascinating using the prototype as it is.

   .Seth

--
Seth Hall * Corelight, Inc * www.corelight.com