[Linux-cluster] RHCS not fence 2nd node in 2 nodes cluster
Wang2, Colin (NSN - CN/Cheng Du)
colin.wang2 at nsn.com
Wed Nov 4 07:40:34 UTC 2009
Hi Gurus,
I am setting up a 2-node cluster; the environment is:
Hardware:
IBM BladeCenter with 2 LS42 blades (AMD Opteron Quad-Core 2356 CPU, 16GB
memory).
Storage: EMC CX3-20f.
Storage switch: Brocade 4Gb 20-port switch module in the IBM BladeCenter.
Network switch: Cisco switch module in the IBM BladeCenter.
Software:
Red Hat EL 5.3 x86_64, kernel 2.6.18-128.el5.
Red Hat Cluster Suite 5.3.
This is a 2-node cluster, and my problem is:
- When I power off the 1st node with "halt -fp", the 2nd node can fence
the 1st node and take over its services.
- When I power off the 2nd node with "halt -fp", the 1st node cannot fence
the 2nd node and cannot take over its services.
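The fence path from node 1 can also be exercised by hand, independent of
failover; a rough sketch, assuming the stock RHCS 5 command-line tools are
installed (fence_node reads cluster.conf and runs the configured agent):

```shell
# On node 1 (198.18.9.33): fence the peer using the method defined in
# cluster.conf (here fence_bladecenter_ssh against blade 14).
fence_node 198.18.9.34

# Inspect membership and fence-domain state before and after the test:
cman_tool nodes   # cman's view of cluster membership
group_tool ls     # state of the fence/dlm/gfs groups
```

These commands require a running cluster, so the output here depends on the
live node state.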
fence_tool dump output:
---- for the successful test
dump read: Success
1257305495 our_nodeid 2 our_name 198.18.9.34
1257305495 listen 4 member 5 groupd 7
1257305511 client 3: join default
1257305511 delay post_join 3s post_fail 0s
1257305511 clean start, skipping initial nodes
1257305511 setid default 65538
1257305511 start default 1 members 1 2
1257305511 do_recovery stop 0 start 1 finish 0
1257305511 first complete list empty warning
1257305511 finish default 1
1257305611 stop default
1257305611 start default 3 members 2
1257305611 do_recovery stop 1 start 3 finish 1
1257305611 add node 1 to list 1
1257305611 node "198.18.9.33" not a cman member, cn 1
1257305611 node "198.18.9.33" has not been fenced
1257305611 fencing node 198.18.9.33
1257305615 finish default 3
1257305658 client 3: dump
---- for the failed test
dump read: Success
1257300282 our_nodeid 1 our_name 198.18.9.33
1257300282 listen 4 member 5 groupd 7
1257300297 client 3: join default
1257300297 delay post_join 3s post_fail 0s
1257300297 clean start, skipping initial nodes
1257300297 setid default 65538
1257300297 start default 1 members 1 2
1257300297 do_recovery stop 0 start 1 finish 0
1257300297 first complete list empty warning
1257300297 finish default 1
1257303721 stop default
1257303721 start default 3 members 1
1257303721 do_recovery stop 1 start 3 finish 1
1257303721 add node 2 to list 1
1257303721 averting fence of node 198.18.9.34
1257303721 finish default 3
1257303759 client 3: dump
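The timestamps in these dumps are epoch seconds; to line them up with
/var/log/messages they can be converted with date(1) from GNU coreutils,
e.g. for the "stop default" event of the failed test:

```shell
# Convert a fence_tool dump timestamp (epoch seconds) to a UTC date.
# 1257303721 is the "stop default" line from the failed test above.
date -u -d @1257303721 +'%Y-%m-%d %H:%M:%S'   # prints 2009-11-04 03:02:01
```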
I think the failure is caused by "averting fence of node 198.18.9.34", but
why does fenced avert the fence? Could you help me out? Thanks in advance.
The cluster.conf is below for reference.
<?xml version="1.0"?>
<cluster config_version="14" name="x">
<fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
<clusternodes>
<clusternode name="198.18.9.33" nodeid="1" votes="1">
<fence>
<method name="1">
<device blade="13" name="mm1"/>
</method>
</fence>
</clusternode>
<clusternode name="198.18.9.34" nodeid="2" votes="1">
<fence>
<method name="1">
<device blade="14" name="mm1"/>
</method>
</fence>
</clusternode>
</clusternodes>
<quorumd device="/dev/sdb1" interval="2" tko="7" votes="1">
<heuristic interval="3" program="ping 198.18.9.61 -c1 -t2"
score="10"/>
</quorumd>
<totem token="27000"/>
<cman expected_votes="3" two_node="0" quorum_dev_poll="23000">
<multicast addr="239.192.148.6"/>
</cman>
<fencedevices>
<fencedevice agent="fence_bladecenter_ssh" ipaddr="x" login="x"
name="mm1" passwd="x"/>
</fencedevices>
</cluster>
BRs,
Colin