[Linux-cluster] fencing problem in 2 node cluster using apc fence device

garylua at singnet.com.sg
Thu Aug 17 05:28:46 UTC 2006


Hi, another related question has been puzzling me, since I'm new to Red Hat clustering. I understand that when configuring an APC fence device for each node, I need to fill in the port and switch of the APC device. Port refers to the power outlet that my node is connected to, right? In my case, node 1 is connected to Outlet 13 of the first PDU. Do I fill in Port=13 or Port=Outlet 13? As for switch, I have absolutely no idea what it refers to. Does it refer to the state I want the outlet to go to during fencing, like 1=Immediate ON, 2=Immediate OFF, 3=Immediate Reboot, etc.?
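
For reference, this is how I'm currently specifying it for coral1 in cluster.conf (the full config is quoted below). I'm just guessing that port takes the bare outlet number and that switch is only there to tell the two PDUs apart, so please correct me if that's wrong:

<clusternode name="coral1" votes="1">
	<fence>
		<method name="1">
			<device name="pdu1" option="off" port="13" switch="1"/>
			<device name="pdu2" option="off" port="13" switch="2"/>
			<device name="pdu1" option="on" port="13" switch="1"/>
			<device name="pdu2" option="on" port="13" switch="2"/>
		</method>
	</fence>
</clusternode>

My understanding is that both "off" actions have to come before the "on" actions, since the node has dual power supplies and both outlets must be off at the same time for the node to actually lose power. Please correct me if I've got that wrong too.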

Thanks.

Gary


--- garylua at singnet.com.sg wrote:

> Hi Lon, thanks for the reply. My cluster.conf is as follows.
> coral1 and coral2 are my 2 nodes.
> 
> 
> <?xml version="1.0"?>
> <cluster config_version="264" name="MF_Cluster">
> 	<fence_daemon clean_start="1" post_fail_delay="0" post_join_delay="3"/>
> 	<clusternodes>
> 		<clusternode name="coral1" votes="1">
> 			<fence>
> 				<method name="1">
> 					<device name="pdu1" option="off" port="13" switch="1"/>
> 					<device name="pdu2" option="off" port="13" switch="2"/>
> 					<device name="pdu1" option="on" port="13" switch="1"/>
> 					<device name="pdu2" option="on" port="13" switch="2"/>
> 				</method>
> 			</fence>
> 		</clusternode>
> 		<clusternode name="coral2" votes="1">
> 			<fence>
> 				<method name="1">
> 					<device name="pdu1" option="off" port="20" switch="1"/>
> 					<device name="pdu2" option="off" port="20" switch="2"/>
> 					<device name="pdu1" option="on" port="20" switch="1"/>
> 					<device name="pdu2" option="on" port="20" switch="2"/>
> 				</method>
> 			</fence>
> 		</clusternode>
> 	</clusternodes>
> 	<fencedevices>
> 		<fencedevice agent="fence_apc" ipaddr="10.10.50.100" login="apc" name="pdu1" passwd="apc"/>
> 		<fencedevice agent="fence_apc" ipaddr="10.10.50.101" login="apc" name="pdu2" passwd="apc"/>
> 	</fencedevices>
> 	<rm>
> 		<failoverdomains>
> 			<failoverdomain name="MF_Failover" ordered="0" restricted="1">
> 				<failoverdomainnode name="coral2" priority="1"/>
> 				<failoverdomainnode name="coral1" priority="1"/>
> 			</failoverdomain>
> 		</failoverdomains>
> 		<resources>
> 			<fs device="/dev/sda1" force_fsck="0" force_unmount="1" fstype="ext3" mountpoint="/MF/MF_v1.1/shared" name="testmount" options="" self_fence="0"/>
> 			<script file="/etc/rc.d/init.d/msgfwd" name="Message Forwarder"/>
> 			<ip address="10.10.50.22" monitor_link="1"/>
> 			<script file="/etc/rc.d/init.d/namesvc" name="Name Service"/>
> 		</resources>
> 		<service autostart="1" domain="MF_Failover" name="msgfwd" recovery="relocate">
> 			<fs ref="testmount"/>
> 			<script ref="Message Forwarder"/>
> 			<ip ref="10.10.50.22"/>
> 		</service>
> 	</rm>
> 	<cman expected_votes="1" two_node="1"/>
> </cluster>
> 
> 
> 
> 
> --- Lon Hohberger <lhh at redhat.com> wrote:
> 
> > On Wed, 2006-08-16 at 22:37 +0800, Gary Lua wrote:
> > > Hi,
> > > 
> > > I'm currently configuring fencing devices for my 2 nodes on a RHEL4
> > > cluster. The problem is quite long, so please bear with me.
> > > 
> > > I have 2 nodes (let's call them stone1 and stone2) and 2 APC fencing
> > > devices (pdu1 and pdu2, both APC 7952 devices). Both stone1 and
> > > stone2 have dual power supplies. Stone1's power supplies are
> > > connected to outlet 13 of pdu1 and pdu2. Stone2's power supplies are
> > > connected to outlet 20 of both PDUs. My question is: during the
> > > fencing configuration for each node, I need to specify which fence
> > > device to add to the fence level of each node. Is it correct to
> > > specify for stone1 as follows: pdu1 -> port=13, switch=1, pdu2 ->
> > > port=13, switch=2? And the same for stone2: pdu1 -> port=20,
> > > switch=1, pdu2 -> port=20, switch=2?
> > > 
> > > After configuring as mentioned above, with both nodes in the cluster
> > > running and my application running on stone1, I pulled out the
> > > ethernet cables on stone1 to simulate that the server is down. By
> > > right, my application should fail over to stone2 and stone1 should
> > > be fenced (i.e., stone1 should be rebooted/shut down). However, what
> > > happened is that my application was started on stone2, and stone1
> > > was not fenced. In fact, when I reconnected the cables, my
> > > application was still running on stone1! It seems that there were 2
> > > instances of my application running, one each on stone1 and stone2.
> > 
> > Post the cluster configuration.
> > 
> > -- Lon
> > 
> > 



