[Linux-cluster] cluster not doing failover
Sai Loganathan
sail at serverengines.com
Fri Mar 9 23:30:15 UTC 2007
Hello,
I am setting up a 2-node Red Hat cluster to test failover, as part of the
testing effort at my company for the iSCSI product we develop.
I have an iSCSI target that serves as the cluster's shared storage.
I downloaded and compiled the open source Red Hat cluster suite and installed
the cluster components on both nodes.
I logged in to the iSCSI target, created a GFS filesystem, and mounted the
LUN on both nodes.
I created cluster.conf using the system-config-cluster GUI. The resulting
cluster.conf is below:
<?xml version="1.0"?>
<cluster config_version="8" name="alpha_cluster">
  <fence_daemon post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="node1" votes="1">
      <fence>
        <method name="1">
          <device name="node1_fence"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2" votes="1">
      <fence>
        <method name="1">
          <device name="node2_fence"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_ilo" hostname="admin" login="admin" name="node1_fence" passwd="admin"/>
    <fencedevice agent="fence_ilo" hostname="admin" login="admin" name="node2_fence" passwd="admin"/>
  </fencedevices>
  <rm>
    <failoverdomains>
      <failoverdomain name="failover" ordered="1" restricted="0">
        <failoverdomainnode name="node1" priority="1"/>
        <failoverdomainnode name="node2" priority="2"/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <ip address="172.40.2.119" monitor_link="1"/>
      <clusterfs device="/dev/mapper/vg0-lv" force_unmount="0" fsid="10938" fstype="gfs" mountpoint="/test1" name="lun" options=""/>
    </resources>
    <service autostart="1" domain="failover" name="iscsi_ip">
      <ip ref="172.40.2.119"/>
    </service>
    <service autostart="1" domain="failover" name="iscsi_lun">
      <clusterfs ref="lun"/>
    </service>
  </rm>
</cluster>
Using the cluster IP address (172.40.2.119), I was able to NFS-mount
the shared LUN from a third machine. I then started an infinite ls loop on
that LUN.
To simulate a failover, I powered down node1, hoping to see
the read I/O stall and then resume via node2. Instead, I see the following
error messages on node2:
Mar 9 12:14:49 node2 fenced[7422]: fence "node1" failed
Mar 9 12:14:54 node2 fenced[7422]: fencing node "node1"
Mar 9 12:14:54 node2 fenced[7422]: agent "fence_ilo" reports: Can't call method "configure" on an undefined value at /sbin/fence_ilo line 169, <> line 4.
Mar 9 12:14:54 node2 fenced[7422]: fence "node1" failed
Mar 9 12:14:59 node2 fenced[7422]: fencing node "node1"
Mar 9 12:14:59 node2 fenced[7422]: agent "fence_ilo" reports: Can't call method "configure" on an undefined value at /sbin/fence_ilo line 169, <> line 4.
It seems I have not configured fencing correctly.
First of all, can I set up a cluster without fencing?
I don't have any power fencing devices. In that case, how do I
do fencing?
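For reference, one option sometimes used on test setups without power hardware is manual fencing. The fragment below is only a sketch of how the fencing parts of the cluster.conf above might look in that case. It assumes the fence_manual agent is installed along with the other cluster components, and the device name "manual_fence" is a made-up label, not something from my current setup. Manual fencing is generally described as suitable for testing only, since a person has to confirm that the failed node is really down.

```xml
<!-- Sketch only: replaces the <clusternodes> and <fencedevices> sections.
     "manual_fence" is a hypothetical device name chosen for illustration. -->
<clusternodes>
  <clusternode name="node1" votes="1">
    <fence>
      <method name="1">
        <!-- nodename tells the agent which node this entry fences -->
        <device name="manual_fence" nodename="node1"/>
      </method>
    </fence>
  </clusternode>
  <clusternode name="node2" votes="1">
    <fence>
      <method name="1">
        <device name="manual_fence" nodename="node2"/>
      </method>
    </fence>
  </clusternode>
</clusternodes>
<fencedevices>
  <!-- Assumes the fence_manual agent is available on both nodes -->
  <fencedevice agent="fence_manual" name="manual_fence"/>
</fencedevices>
```

With a setup like this, my understanding is that fenced on the surviving node waits until an operator verifies the failed node is powered off and acknowledges it (for example with fence_ack_manual -n node1), and recovery proceeds only after that. The config_version attribute would also need to be incremented when the file changes.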
Any help would be greatly appreciated.
Thanks,
Sai Logan