[Linux-cluster] fence_apc_snmp woes

Brian Sheets bsheets at singlefin.net
Fri Aug 24 11:06:15 UTC 2007

I have a two-node cluster on Debian; my cluster.conf is below. If I take down node1's NIC,
node2 notices and tries to fence it:

Aug 24 10:57:38 oc-index4 fenced[7599]: oc-index3 not a cluster member after 0 sec post_fail_delay
Aug 24 10:57:38 oc-index4 fenced[7599]: fencing node "oc-index3"
Aug 24 10:57:38 oc-index4 fence_manual: Node needs to be reset before recovery can procede.  Waiting for to rejoin the cluster or for manual acknowledgement that it has been reset (i.e. fence_ack_manual -n
Aug 24 10:59:34 oc-index4 fenced[7599]: fence "oc-index3" success

The log says it is fencing, but the power-off never happens on its own. Only after I run fence_ack_manual does fence_apc_snmp get run, and node1 is powered down.

What am I missing?

<?xml version="1.0"?>
<cluster name="index" config_version="2">
        <cman two_node="1" expected_votes="1"/>
        <clusternodes>
                <clusternode name="oc-index3" votes="1">
                        <fence>
                                <method name="single">
                                        <device name="oc-cab1-pdu2" port="18" option="off"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="oc-index4" votes="1">
                        <fence>
                                <method name="single">
                                        <device name="oc-cab1-pdu1" port="16" option="off"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <fencedevices>
                <fencedevice name="oc-cab1-pdu2" agent="fence_apc_snmp" ipaddr="" login="apc" passwd="xxxx"/>
                <fencedevice name="oc-cab1-pdu1" agent="fence_apc_snmp" ipaddr="" login="apc" passwd="xxxx"/>
        </fencedevices>
</cluster>
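
One thing that can be ruled out quickly is a mismatch between the device names used under each clusternode and the names declared as fencedevice entries. The following is only a rough sketch of such a check, not part of the cluster suite: the function name undefined_fence_devices and the sample config (node-a, node-b, pdu-a, pdu-b, pdu-typo) are made up for illustration, and it assumes the file is well-formed XML. It does not by itself explain why fence_manual is being invoked.

```python
import xml.etree.ElementTree as ET


def undefined_fence_devices(conf_text):
    """Return (node, device) pairs whose device has no <fencedevice> entry."""
    root = ET.fromstring(conf_text)
    # Names declared anywhere in the file as <fencedevice name="...">
    defined = {fd.get("name") for fd in root.iter("fencedevice")}
    missing = []
    for node in root.iter("clusternode"):
        # Every <device> nested under this node, at any depth
        for dev in node.iter("device"):
            if dev.get("name") not in defined:
                missing.append((node.get("name"), dev.get("name")))
    return missing


# Hypothetical sample config: node-b references a device that is never declared.
SAMPLE = """<cluster name="demo" config_version="1">
  <clusternodes>
    <clusternode name="node-a" votes="1">
      <fence>
        <method name="single">
          <device name="pdu-a" port="1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node-b" votes="1">
      <fence>
        <method name="single">
          <device name="pdu-typo" port="2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="pdu-a" agent="fence_apc_snmp"/>
    <fencedevice name="pdu-b" agent="fence_apc_snmp"/>
  </fencedevices>
</cluster>"""

if __name__ == "__main__":
    print(undefined_fence_devices(SAMPLE))
```

An empty list means every device referenced by a node has a matching fencedevice declaration; anything else lists the node and the unmatched device name.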

