[Linux-cluster] Fencing issue using IPMI (nodes fencing each other ending in a loop)

Stevan Colaco stevan.colaco at gmail.com
Tue Sep 23 17:27:21 UTC 2008


Hello

Issue: with fencing via fence_ipmilan, each node keeps fencing the other
node, ending in a fence loop.

We have implemented RH Cluster on RHEL5.2 64bit.
Server Hardware: SUN X4150
Storage: SUN 6140
Fencing Mechanism: fence_ipmilan

We downloaded the fence_ipmilan agent and configured a two-node
cluster with IPMI fencing. However:

When we ifdown the NIC interface on one node, that node gets fenced, but
the service does not relocate to the other node. Then, when the
initially fenced node rejoins the cluster, it fences the other
node. This keeps repeating in a loop.

We downloaded the agent and followed the instructions from the IPMI page
mentioned below:
http://docs.sun.com/source/819-6588-13/ipmi_com.html#0_74891

We tested with the following command-line method, which works fine:
# fence_ipmilan -a "ip addr" -l root -p <Passkey> -o <on|off|reboot>
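Since cluster fencing is issued by the surviving node against its peer's BMC, it can help to repeat the manual test from each node against the other node's IPMI address. A minimal sketch that only builds those command lines, assuming the addresses and credentials from the cluster.conf in this post; the `status` action is an assumption here (the post only mentions on, off and reboot), and the helper names are hypothetical:

```python
import shlex

# BMC addresses as they appear in the cluster.conf of this post.
BMCS = {
    "tibco-node1": "172.16.71.41",
    "tibco-node2": "172.16.71.42",
}


def build_fence_cmd(ipaddr, login, passwd, action):
    """Return the fence_ipmilan argument list for one BMC (hypothetical helper)."""
    return ["fence_ipmilan", "-a", ipaddr, "-l", login, "-p", passwd, "-o", action]


def status_commands(login="root", passwd="changeme"):
    """Build one 'status' command per BMC; nothing is executed here."""
    return {
        name: build_fence_cmd(ip, login, passwd, "status")
        for name, ip in BMCS.items()
    }


if __name__ == "__main__":
    for name, argv in status_commands().items():
        # Print a copy-pasteable command line to run by hand on each node.
        print(name + ": " + " ".join(shlex.quote(a) for a in argv))
```

The commands are printed rather than executed so that nothing gets power-cycled by accident.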

here is my cluster.conf

<?xml version="1.0"?>
<cluster alias="tibcouat" config_version="12" name="tibcouat">
	<fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
	<clusternodes>
		<clusternode name="tibco-node1-uat.kmefic.com.kw" nodeid="1" votes="1">
			<fence>
				<method name="1">
					<device name="tibco-node1"/>
				</method>
			</fence>
		</clusternode>
		<clusternode name="tibco-node2-uat.kmefic.com.kw" nodeid="2" votes="1">
			<fence>
				<method name="1">
					<device name="tibco-node2"/>
				</method>
			</fence>
		</clusternode>
	</clusternodes>
	<cman expected_votes="1" two_node="1"/>
	<fencedevices>
		<fencedevice agent="fence_ipmilan" ipaddr="172.16.71.41" login="root" name="tibco-node1" passwd="changeme"/>
		<fencedevice agent="fence_ipmilan" ipaddr="172.16.71.42" login="root" name="tibco-node2" passwd="changeme"/>
	</fencedevices>
	<rm>
		<failoverdomains>
			<failoverdomain name="prefer_node1" nofailback="0" ordered="1" restricted="1">
				<failoverdomainnode name="tibco-node1-uat.kmefic.com.kw" priority="1"/>
				<failoverdomainnode name="tibco-node2-uat.kmefic.com.kw" priority="2"/>
			</failoverdomain>
		</failoverdomains>
		<resources>
			<ip address="172.16.71.55" monitor_link="1"/>
			<clusterfs device="/dev/vg0/gfsdata" force_unmount="0" fsid="63282" fstype="gfs" mountpoint="/var/www/html" name="gfsdata" self_fence="0"/>
			<apache config_file="conf/httpd.conf" name="docroot" server_root="/etc/httpd" shutdown_wait="0"/>
		</resources>
		<service autostart="1" domain="prefer_node1" exclusive="0" name="webby" recovery="relocate">
			<ip ref="172.16.71.55"/>
		</service>
	</rm>
</cluster>
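A quick consistency check of the configuration above can be sketched as follows. It assumes a trimmed copy of the posted cluster.conf (fence and rm sections only); the function names are hypothetical, and the check only reports cross-references inside the file. It does not diagnose the fence loop itself.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the cluster.conf from this post (attributes shortened).
CLUSTER_CONF = """
<cluster alias="tibcouat" config_version="12" name="tibcouat">
  <clusternodes>
    <clusternode name="tibco-node1-uat.kmefic.com.kw" nodeid="1" votes="1">
      <fence><method name="1"><device name="tibco-node1"/></method></fence>
    </clusternode>
    <clusternode name="tibco-node2-uat.kmefic.com.kw" nodeid="2" votes="1">
      <fence><method name="1"><device name="tibco-node2"/></method></fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_ipmilan" ipaddr="172.16.71.41" name="tibco-node1"/>
    <fencedevice agent="fence_ipmilan" ipaddr="172.16.71.42" name="tibco-node2"/>
  </fencedevices>
  <rm>
    <resources>
      <ip address="172.16.71.55" monitor_link="1"/>
      <clusterfs device="/dev/vg0/gfsdata" name="gfsdata"/>
      <apache config_file="conf/httpd.conf" name="docroot"/>
    </resources>
    <service autostart="1" domain="prefer_node1" name="webby">
      <ip ref="172.16.71.55"/>
    </service>
  </rm>
</cluster>
"""


def unresolved_fence_devices(root):
    """Return (node, device) pairs whose device has no <fencedevice> entry."""
    defined = {d.get("name") for d in root.findall("fencedevices/fencedevice")}
    missing = []
    for node in root.findall("clusternodes/clusternode"):
        for dev in node.findall("fence/method/device"):
            if dev.get("name") not in defined:
                missing.append((node.get("name"), dev.get("name")))
    return missing


def unused_resources(root):
    """Return names of <resources> entries that no <service> references."""
    used = {
        child.get("ref")
        for svc in root.findall("rm/service")
        for child in svc.iter()
        if child.get("ref")
    }
    unused = []
    for res in root.find("rm/resources"):
        # <ip> resources are referenced by address, the others by name.
        key = res.get("name") or res.get("address")
        if key not in used:
            unused.append(key)
    return unused


root = ET.fromstring(CLUSTER_CONF)
print("unresolved fence devices:", unresolved_fence_devices(root))
print("resources not used by any service:", unused_resources(root))
```

For the posted configuration, every fence device reference resolves, while the gfsdata and docroot resources are defined but not referenced by the webby service. Whether that is intended is not stated in the post.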


Kindly investigate and suggest a solution at the earliest.

Thanks & Best Regards,



