[Linux-cluster] Graceful recover after connectivity failure

Cliff Hones cliff-lc at cliff.hones.org.uk
Fri Jan 11 19:38:59 UTC 2008


I am using CentOS 5.1 with GNBD and GNBD fencing.

Following the failure of a cluster member (e.g. a temporary
loss of connectivity) that results in the node being
fenced, is there a clean way to rejoin the cluster without
having to reboot the affected node?

I am finding that it is impossible to shut down or restart the
cluster components on the affected node, and even trying to force
a reboot from an ssh session just hangs.

There seems to be a chicken-and-egg situation: a GFS filesystem
cannot be unmounted while the node is fenced, and cman/clvmd cannot
be stopped or restarted while a filesystem is mounted. Forcibly
trying to kill the cluster processes also fails.
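
The ordering constraint described above can be sketched as a small
shell check. This is only an illustration, not a fix: the function
names gfs_mounts and safe_to_stop_cluster are hypothetical, and
reading /proc/mounts to find GFS filesystems is an assumption about
how one might test the precondition for stopping cman/clvmd.

```shell
#!/bin/sh
# Hypothetical helper: print the mount point of every GFS filesystem
# found in a mounts table (defaults to /proc/mounts).
gfs_mounts() {
    # Fields: device, mount point, filesystem type, options, dump, pass
    awk '$3 == "gfs" || $3 == "gfs2" { print $2 }' "${1:-/proc/mounts}"
}

# Hypothetical helper: succeeds only when no GFS filesystem is mounted,
# i.e. when stopping cman/clvmd would not be blocked by a mount.
safe_to_stop_cluster() {
    [ -z "$(gfs_mounts "$1")" ]
}
```

On the fenced node, safe_to_stop_cluster would keep failing because
the GFS unmount itself cannot complete, which is the deadlock in
question.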

-- Cliff



