[Linux-cluster] Cluster Suite 4 failover problem

Lon Hohberger lhh at redhat.com
Thu Oct 19 19:14:32 UTC 2006


On Thu, 2006-10-19 at 23:31 +0800, Dicky wrote:

> Both services were no longer working. When I restarted eth0 on
> node1 and restarted the cman service on node1, it still didn't work.
> Also, when I tried to restart rgmanager on node1, it only showed
> "Waiting for services to stop: " and waited forever. Even when I tried
> to kill the rgmanager process, it didn't work. Finally, I had to
> reset both machines to get the cluster service back to normal.

Sounds like fencing isn't completing.  Services won't fail over until
the dead node has been fenced, so after node2 decides node1 is dead,
you have to power off node1, then run "fence_ack_manual" on node2.  That
should let things fail over.
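
For reference, a rough sketch of what a manual fencing setup usually looks
like in cluster.conf.  This assumes the cluster really is using the
fence_manual agent, which the posted snippet doesn't show; the device name
"manual" and the method name "single" are arbitrary labels picked for
illustration, and the node2 entry would mirror the node1 one.

```xml
<!-- Sketch only: assumes manual fencing; names "manual" and "single" are illustrative. -->
<clusternodes>
        <clusternode name="node1" votes="1">
                <fence>
                        <method name="single">
                                <!-- refers to the fencedevice defined below -->
                                <device name="manual" nodename="node1"/>
                        </method>
                </fence>
        </clusternode>
        <!-- node2 is defined the same way, with nodename="node2" -->
</clusternodes>
<fencedevices>
        <!-- fence_manual waits for an operator to confirm the node is really down -->
        <fencedevice name="manual" agent="fence_manual"/>
</fencedevices>
```

With a setup like this, fencing blocks until someone acknowledges it by
hand, which would match the hang you're seeing.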

It looks like there's a typo in clustat, too, but I don't think that's
related :)


> ======cluster.conf=========
>         <failoverdomain name="aaa" ordered="0" restricted="0">
>                 <failoverdomainnode name="node1" priority="1"/>
>                 <failoverdomainnode name="node2" priority="1"/>
>         </failoverdomain>

FYI, you don't need to define a failover domain if all nodes in the
cluster are equal.
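
Something along these lines should behave the same as your unordered,
unrestricted domain.  It's only a sketch: the service name "example_svc"
is made up and the resources are left out, since your service definitions
weren't in the part you posted.

```xml
<!-- Sketch only: "example_svc" is a hypothetical service name; resources omitted. -->
<rm>
        <!-- No <failoverdomains> block and no domain="..." attribute on the
             service, so the service may run on any member of the cluster. -->
        <service name="example_svc">
                <!-- resources (ip, fs, script, ...) go here -->
        </service>
</rm>
```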

-- Lon



