[Linux-cluster] fence_apc_snmp woes
Brian Sheets
bsheets at singlefin.net
Fri Aug 24 11:06:15 UTC 2007
I have a 2-node cluster on Debian. Below is my cluster.conf. If I down node1's NIC, node2 sees it and tries to fence it:
Aug 24 10:57:38 oc-index4 fenced[7599]: oc-index3 not a cluster member after 0 sec post_fail_delay
Aug 24 10:57:38 oc-index4 fenced[7599]: fencing node "oc-index3"
Aug 24 10:57:38 oc-index4 fence_manual: Node 172.16.14.100 needs to be reset before recovery can procede. Waiting for 172.16.14.100 to rejoin the cluster or for manual acknowledgement that it has been reset (i.e. fence_ack_manual -n 172.16.14.100)
Aug 24 10:59:34 oc-index4 fenced[7599]: fence "oc-index3" success
The log claims the fence succeeded, but node1 never actually gets powered off, and if I run fence_ack_manual, then fence_apc_snmp gets run and node1 gets powered down.
What am I missing?
<?xml version="1.0"?>
<cluster name="index" config_version="2">
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="oc-index3" votes="1">
      <fence>
        <method name="single">
          <device name="oc-cab1-pdu2" port="18" option="off"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="oc-index4" votes="1">
      <fence>
        <method name="single">
          <device name="oc-cab1-pdu1" port="16" option="off"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="oc-cab1-pdu2" agent="fence_apc_snmp" ipaddr="172.16.14.9" login="apc" passwd="xxxx"/>
    <fencedevice name="oc-cab1-pdu1" agent="fence_apc_snmp" ipaddr="172.16.14.8" login="apc" passwd="xxxx"/>
  </fencedevices>
</cluster>
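For what it's worth, one way to rule out the agent itself is to invoke it by hand the same way fenced does: RHCS fence agents read their options as key=value pairs on stdin. This is a sketch only; the values below are the ones from the cluster.conf above, and the exact key names accepted by this version of fence_apc_snmp should be checked against its man page.

```shell
# Drive fence_apc_snmp directly, bypassing fenced, by feeding it the
# same key=value option stream fenced would write to its stdin.
# ipaddr/login/passwd/port are taken from the cluster.conf above.
fence_apc_snmp <<EOF
ipaddr=172.16.14.9
login=apc
passwd=xxxx
port=18
option=off
EOF
```

If this powers the outlet off, the agent and PDU credentials are fine and the problem is in which config fenced is actually running with (the logs above show fence_manual firing, which the posted cluster.conf does not define).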