[Linux-cluster] Failover not working

Dave Berry dave at eons.com
Thu Mar 8 16:15:32 UTC 2007


I have a 3-node GFS cluster sharing 2 virtual IPs as 2 different
services.  For some reason the failover is not working correctly.  The
IPs are defined as services in cluster.conf, and the failover domains
are set to ordered/restricted.  Below are the pertinent parts of
cluster.conf.  The IPs fail over when a box goes down but do not fail
back to the correctly prioritized box when it returns.  I have included
the errors from the log at the end.  Thanks.

<failoverdomains>
    <failoverdomain name="nfs_domain1" ordered="1" restricted="1">
        <failoverdomainnode name="fs101" priority="1"/>
        <failoverdomainnode name="fs102" priority="2"/>
        <failoverdomainnode name="fs103" priority="3"/>
    </failoverdomain>
    <failoverdomain name="nfs_domain2" ordered="1" restricted="1">
        <failoverdomainnode name="fs102" priority="1"/>
        <failoverdomainnode name="fs101" priority="2"/>
        <failoverdomainnode name="fs103" priority="3"/>
    </failoverdomain>
</failoverdomains>

<resources>
    <clusterfs device="/dev/mapper/mpath6p1" force_unmount="0"
               fsid="49841" fstype="gfs" mountpoint="/opt/eons/shared"
               name="nfs1" options=""/>
    <ip address="192.168.1.200" monitor_link="1"/>
    <ip address="192.168.1.201" monitor_link="1"/>
</resources>
<service autostart="1" domain="nfs_domain1" name="nfs_ip1">
    <ip ref="192.168.1.200"/>
</service>
<service autostart="1" domain="nfs_domain2" name="nfs_ip2">
    <ip ref="192.168.1.201"/>
</service>
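
For reference, the placement I expect from the two ordered, restricted
domains above can be sketched roughly as follows.  This is only an
illustration in Python of the ordered-domain rule as I understand it
(the online member with the lowest priority number should own the
service); the function name preferred_node and the DOMAINS table are
made up for this sketch and are not rgmanager code.

```python
# Hypothetical sketch of the expected placement; NOT rgmanager's implementation.
# Priorities are copied from the failoverdomain entries above (1 = most preferred).
DOMAINS = {
    "nfs_domain1": {"fs101": 1, "fs102": 2, "fs103": 3},
    "nfs_domain2": {"fs102": 1, "fs101": 2, "fs103": 3},
}


def preferred_node(domain, online):
    """Return the online domain member with the lowest priority number.

    Returns None when no member of the domain is online, since a
    restricted domain does not allow the service to run elsewhere.
    """
    members = DOMAINS[domain]
    candidates = [node for node in members if node in online]
    if not candidates:
        return None
    return min(candidates, key=lambda node: members[node])
```

With fs102 down, preferred_node("nfs_domain2", {"fs101", "fs103"}) picks
fs101; once fs102 rejoins, the same call with all three nodes online
picks fs102, which is the relocation the log below attempts for nfs_ip2.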

Mar  8 11:03:26 fs101 clurgmgrd[5684]: <debug> Relocating group nfs_ip2 to better node fs102
Mar  8 11:03:26 fs101 clurgmgrd[5684]: <debug> Event (0:2:1) Processed
Mar  8 11:03:26 fs101 clurgmgrd[5684]: <notice> Stopping service nfs_ip2
Mar  8 11:03:26 fs101 clurgmgrd[5684]: <err> #52: Failed changing RG status
Mar  8 11:03:26 fs101 clurgmgrd[5684]: <debug> Handling failure request for RG nfs_ip2
Mar  8 11:03:26 fs101 clurgmgrd[5684]: <err> #57: Failed changing RG status



