[Linux-cluster] Failover issues when shutting down node

Wed Jan 27 10:53:13 UTC 2010

Hi,

After a few tests with a four-node cluster (mainly shutting one node down to
see whether failover worked properly), we got the following messages:

Jan 27 03:31:02 node2_pub clurgmgrd[4240]: <err> #75: Failed changing service status
Jan 27 03:31:02 node2_pub clurgmgrd[4240]: <debug> Stopping failed service service:PID_PA-SA-R2
Jan 27 03:31:07 node2_pub clurgmgrd[4240]: <notice> Stopping service service:PID_PA-SA-R2
...
other checks
...
Jan 27 03:31:25 node2_pub openais[3480]: [TOTEM] entering GATHER state from 12.
Jan 27 03:31:27 node2_pub clurgmgrd[4240]: <err> #52: Failed changing RG status
Jan 27 03:31:27 node2_pub clurgmgrd[4240]: <crit> #13: Service service:PID_PA-SA-R2 failed to stop cleanly
Jan 27 03:31:27 node2_pub clurgmgrd[4240]: <debug> Handling failure request for RG service:PID_PA-SA-R2
Jan 27 03:31:30 node2_pub openais[3480]: [TOTEM] entering GATHER state from 11.
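
In case it helps with reading a larger log, here is a rough sketch of how the clurgmgrd error and critical lines could be filtered out of syslog. The regular expression only assumes the line layout shown in the excerpt above (with wrapped lines already rejoined); the function name and the sample list are made up for illustration.

```python
import re

# Matches syslog lines such as:
#   Jan 27 03:31:02 node2_pub clurgmgrd[4240]: <err> #75: Failed changing service status
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+clurgmgrd\[(?P<pid>\d+)\]:\s+"
    r"<(?P<level>\w+)>\s+(?P<msg>.*)$"
)


def parse_rgmanager(lines, levels=("err", "crit")):
    """Return (timestamp, level, message) for clurgmgrd lines at the given levels."""
    hits = []
    for line in lines:
        # Drop surrounding whitespace and any stray '*' emphasis markers.
        match = LINE_RE.match(line.strip().strip("*"))
        if match and match.group("level") in levels:
            hits.append((match.group("ts"), match.group("level"), match.group("msg")))
    return hits


# Hypothetical sample input: a few of the lines from the excerpt, unwrapped.
sample = [
    "Jan 27 03:31:02 node2_pub clurgmgrd[4240]: <err> #75: Failed changing service status",
    "Jan 27 03:31:02 node2_pub clurgmgrd[4240]: <debug> Stopping failed service service:PID_PA-SA-R2",
    "Jan 27 03:31:25 node2_pub openais[3480]: [TOTEM] entering GATHER state from 12.",
    "Jan 27 03:31:27 node2_pub clurgmgrd[4240]: <crit> #13: Service service:PID_PA-SA-R2 failed to stop cleanly",
]

for ts, level, msg in parse_rgmanager(sample):
    print(ts, level, msg)
```

Lines from openais and lower-severity clurgmgrd messages are skipped, so only the failures remain.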

The problem is that the same service ended up running on two nodes at once,
which isn't supposed to happen.

The service in question consists of a virtual IP and a script.
The script's stop action does not return an error under any circumstances.
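
For reference, this is roughly how the script's return codes could be checked by hand, outside the cluster. It is only a sketch: it assumes the script takes the usual init-style `status` and `stop` arguments, and the helper name and the example path are hypothetical.

```python
import subprocess


def exit_codes(command, actions=("status", "stop", "status")):
    """Run `command + [action]` for each action and return [(action, returncode)].

    `command` is a list, e.g. ["/path/to/service-script"] (hypothetical path).
    """
    results = []
    for action in actions:
        proc = subprocess.run(
            list(command) + [action],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        results.append((action, proc.returncode))
    return results


# Example call (not executed here; the path is a placeholder):
#   for action, rc in exit_codes(["/path/to/service-script"]):
#       print(f"{action}: rc={rc}")
```

If `stop` returns 0 here but the final `status` also returns 0, the script is probably reporting a clean stop while the application is still running.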

The cluster is running RHEL 5.3.

What could have caused these errors?

Thanks for any insight and help ;)
