[Xorp-hackers] process auto restart

Mark Handley M.Handley@cs.ucl.ac.uk
Mon, 09 Aug 2004 20:33:11 +0100


>In other words, the rtrmgr has all the responsibility for restarting
>and configuring the process, and the process itself does not need to
>do anything special.
>
>Note that currently the rtrmgr only detects (using 1(a)) that a
>process has exited abnormally, but it does not attempt to restart it.
>The restarting should be added in one of the future releases.

This is correct.  There's one other question that would need to be
resolved before this can be implemented.  What should the rtrmgr do
about processes that depend on the failed process?  The rtrmgr knows
about process startup dependencies from the dependency information in
the process template files.  But should it use this information?

My feeling is no.  Our error handling rules state that a process
should terminate itself if a process it critically depends on fails.
So in principle, the rtrmgr shouldn't do anything based on this
dependency information - one process failure should cause all the
critically dependent processes to terminate, and the rtrmgr should
wait until all the processes that think they are dependent have
terminated, and then restart everything that exited.  But how does the
rtrmgr know when it's safe to start the restart procedure?  
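
One possible answer, sketched below purely as an illustration, is a
quiet period: the rtrmgr records every exit it sees and only starts
the restart once no further process has exited for some interval.
This is not existing rtrmgr code. The RestartTracker class, its
method names, and the 2-second settle_time are all assumptions made
up for this sketch, and the process names in the usage note are just
examples.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RestartTracker:
    """Hypothetical helper: collects exits and decides when to restart.

    It deliberately ignores the template-file dependency information
    and relies on dependent processes terminating themselves.
    """
    settle_time: float = 2.0          # assumed quiet period, in seconds
    exited: List[str] = field(default_factory=list)
    last_exit: Optional[float] = None  # time of the most recent exit

    def note_exit(self, name: str, now: float) -> None:
        """Record that a process exited at time `now`."""
        if name not in self.exited:
            self.exited.append(name)
        # Every new exit restarts the quiet-period clock.
        self.last_exit = now

    def ready(self, now: float) -> bool:
        """True once no process has exited for `settle_time` seconds."""
        if self.last_exit is None:
            return False
        return now - self.last_exit >= self.settle_time

    def take_restart_set(self) -> List[str]:
        """Return everything that exited (in exit order) and reset."""
        batch = self.exited
        self.exited = []
        self.last_exit = None
        return batch
```

For example, if "fea" exits at t=0.0 and "rib" follows at t=1.5,
ready() stays false until t=3.5, and take_restart_set() then returns
both names. The obvious weakness is that a fixed interval is only a
guess: a dependent process that is slow to notice the failure could
still exit after the restart has begun, which is why the rule needs
to be agreed on globally.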

This isn't a showstopper, but it needs to be decided globally, or
we'll have inconsistent behaviour.

 - Mark