[Xorp-hackers] process auto restart
Ray Qiu
ray.qiu@gmail.com
Tue, 10 Aug 2004 10:32:09 -0700
On Mon, 09 Aug 2004 20:33:11 +0100, Mark Handley <m.handley@cs.ucl.ac.uk> wrote:
>
> >In other words, the rtrmgr has all the responsibility for restarting
> >and configuring the process, and the process itself does not need to
> >do anything special.
> >
> >Note that currently the rtrmgr only detects (using 1(a)) that a
> >process has exited abnormally, but it does not attempt to restart it.
> >The restarting should be added in one of the future releases.
>
> This is correct. There's one other question that would need to be
> resolved before this can be implemented. What should the rtrmgr do
> about processes that depend on the failed process? The rtrmgr knows
> about process startup dependencies from the dependency information in
> the process template files. But should it use this information?
>
> My feeling is no. Our error handling rules state that a process
> should terminate itself if a process it critically depends on fails.
> So in principle, the rtrmgr shouldn't do anything based on this
> dependency information - one process failure should cause all the
> critically dependent processes to terminate, and the rtrmgr should
> wait until all the processes that think they are dependent have
> terminated, and then restart everything that exited. But how does the
> rtrmgr know when it's safe to start the restart procedure?
>
> This isn't a showstopper, but needs to be decided globally, or we'll
> have inconsistent behaviour.
>
> - Mark
>
It might be a good idea to define some processes as server components
(e.g. RIB) and others as clients (e.g. BGP). If a server component
dies, one easy thing to do is to simply restart the whole stack. If a
client dies, it can be restarted immediately. Overly complicated
dependencies might be too hard to deal with.
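
As a rough sketch of that policy (Python, purely illustrative): the role
table, the "rip" entry, and the function processes_to_restart are all
made-up names for this example and are not part of the rtrmgr. Only the
RIB-as-server and BGP-as-client split comes from the suggestion above.

```python
from enum import Enum


class Role(Enum):
    SERVER = "server"  # other processes depend on it (e.g. RIB)
    CLIENT = "client"  # depends on servers; nothing depends on it (e.g. BGP)


# Hypothetical role table. "rip" is an extra client added only to make
# the whole-stack case visible; how roles would really be declared
# (e.g. in the process template files) is an open question.
ROLES = {
    "rib": Role.SERVER,
    "bgp": Role.CLIENT,
    "rip": Role.CLIENT,
}


def processes_to_restart(failed, roles=ROLES):
    """Return the names of the processes to restart after `failed` dies."""
    if failed not in roles:
        raise KeyError(f"unknown process: {failed}")
    if roles[failed] is Role.SERVER:
        # A server died: restart the whole stack, servers first so that
        # clients find them running when they come back up.
        return sorted(
            roles, key=lambda name: (roles[name] is not Role.SERVER, name)
        )
    # A client died: nothing depends on it, so restart it alone.
    return [failed]


print(processes_to_restart("bgp"))  # ['bgp']
print(processes_to_restart("rib"))  # ['rib', 'bgp', 'rip']
```

Starting servers before clients is just one possible ordering; the point
is only that the rtrmgr needs a single server/client flag per process
rather than a full dependency graph.
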
Just my 2 cents.
--Ray