[Linux-cluster] Any idea on a stop problem with CS4 ?

Alain Moulle Alain.Moulle at bull.net
Tue Feb 21 09:15:46 UTC 2006


We use a two-node cluster to manage failover services via dedicated scripts.
When we use clusvcadm -r <service_name> to relocate a service from one node
to the other, CS4 occasionally gets stuck with a
"service_name stopping" diagnostic.
The stop target of the script associated with the service is not called.
Subsequent clusvcadm -d <service_name> calls report success but do
nothing at all: the service script is still not called.

The only way we have found to solve the problem is to restart the cluster suite
daemons (ccsd, cman, fenced and rgmanager) on both nodes.
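
For reference, the restart of the daemons listed above can be sketched roughly
as follows. This is only an illustration, not our exact procedure: it assumes
each daemon has an init script reachable through "service <name>", and that the
stop order is simply the reverse of the start order (rgmanager first, ccsd
last). The helper names (reverse_words, run, restart_cluster_stack) and the
DRY_RUN variable are made up for this sketch.

```shell
#!/bin/sh
# Sketch only: restart the cluster suite daemons on ONE node.
# Assumption: start order is ccsd, cman, fenced, rgmanager and the
# stop order is its reverse.
START_ORDER="ccsd cman fenced rgmanager"

# Print the whitespace-separated words of "$1" in reverse order.
reverse_words() {
    out=""
    for w in $1; do
        out="$w${out:+ $out}"
    done
    echo "$out"
}

# Run a command, or only print it when DRY_RUN=1 (the default here,
# so that nothing is touched unless DRY_RUN=0 is set explicitly).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Stop every daemon in reverse order, then start them again in order.
# Aborts at the first command that fails.
restart_cluster_stack() {
    for d in $(reverse_words "$START_ORDER"); do
        run service "$d" stop || return 1
    done
    for d in $START_ORDER; do
        run service "$d" start || return 1
    done
}
```

Calling restart_cluster_stack with the default settings only prints the eight
"service ... stop/start" commands; with DRY_RUN=0 it would actually run them,
and it has to be done on each node in turn.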

Often even this does not work, and both nodes have to be rebooted.
If we reboot only the node that appears to be failing, the cluster immediately
falls back into the deadlock, as if the state were being restored from the
memory of the peer node.

Any ideas about this problem?
Alain Moullé
