[Cluster-devel] cluster/group/daemon cman.c cpg.c gd_internal. ...
Robert Peterson
rpeterso at redhat.com
Tue Jun 20 19:43:50 UTC 2006
David Teigland wrote:
> Might be a good idea, I don't really know. I'm not even sure we'd need to
> save much or any additional state that couldn't be pulled from the gfs/dlm
> instances themselves. It seems to me the challenge would be writing the
> daemons so they could put all the pieces and interconnections back
> together again.
>
> If this ends up being a big enough problem to get more attention, I think
> the first practical improvement we could make is something like
> blocking/clearing i/o from the residual fs's (like we do in withdraw) and
> adding the ability to fully purge instances of gfs/dlm from the kernel
> without rebooting the node. Then the machines could all start from
> scratch without rebooting or fencing.
Here's another idea that came to me:

For critical cluster processes like cman and fenced, maybe we could use
init's ability to restart processes, i.e. the "respawn" option in
/etc/inittab. Maybe we can use "respawn" or something similar to ensure
that if a critical process like fenced dies, it gets restarted
automatically and immediately. Of course, that might cause problems for
shutdown, etc., and it would probably make it harder to test certain
things...
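
To make the "something similar" part concrete, here is a rough sketch of a
respawn-style supervisor loop. It is not how init implements "respawn" and
not anything in the cluster tree; the function name supervise() and its
parameters (max_restarts, delay, should_stop) are made up for illustration.
It also assumes the supervised command stays in the foreground, and it
treats a clean exit (status 0) as an intentional stop, which is one possible
way around the shutdown problem.

```python
import subprocess
import sys
import time


def supervise(cmd, max_restarts=5, delay=0.0, should_stop=lambda: False):
    """Run cmd and restart it whenever it exits with a non-zero status.

    Hypothetical helper: returns the number of times cmd was started
    (the initial start plus up to max_restarts restarts).
    """
    starts = 0
    while starts <= max_restarts and not should_stop():
        starts += 1
        proc = subprocess.Popen(cmd)
        proc.wait()  # blocks until the process exits or is killed
        if proc.returncode == 0:
            # Assumption: a clean exit means a deliberate shutdown,
            # so do not respawn.
            break
        print(f"{cmd[0]} exited with {proc.returncode}, respawning",
              file=sys.stderr)
        time.sleep(delay)  # small backoff so a crash loop does not spin
    return starts
```

The should_stop hook is where a shutdown script could tell the supervisor
to give up, and max_restarts keeps a daemon that dies on startup from being
restarted forever.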
Bob Peterson
Red Hat Cluster Suite