[Linux-cluster] Error while manual fencing and output of clustat

Parvez Shaikh parvez.h.shaikh at gmail.com
Mon Jan 10 08:51:14 UTC 2011


Dear experts,

I have a two-node cluster (node1 and node2), and manual fencing is
configured. Service S2 is running on node2. To verify that failover
happens, I shut down node2. I see the following message in
/var/log/messages:

    agent "fence_manual" reports: failed: fence_manual no node name

Running "fence_ack_manual -n node2" does not work; it says there is no
FIFO in /tmp. Running "fence_ack_manual -n node2 -e" does work, and
service S2 then fails over to node1.

I am trying to find out why fence_manual reports this error. node2 is
a pingable hostname and has an entry in /etc/hosts on node1 (and vice
versa). I also see that after failover, "clustat -x" reports the
cluster status (in XML format) as:

<?xml version="1.0"?>
<clustat version="4.1.1">
  <groups>
    <group name="service:S" state="111" state_str="starting" flags="0"
flags_str="" owner="node1" last_owner="node1" restarts="0"
last_transition="1294676678" last_transition_str="xxxxxxxxxx"/>
  </groups>
</clustat>

I was expecting last_owner to be node2 (because that is the node which
was running service S and has failed), which would indicate that the
service is failing over FROM node2. Is there a way for a node in the
cluster (the node to which the service is failing over) to determine
from which node the given service is failing over?
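
For reference, here is a minimal sketch of how the owner and last_owner
attributes could be read from "clustat -x" output on the surviving
node. It only parses the XML shown above; the function name
service_owners and the SAMPLE constant are made up for illustration,
and whether last_owner ever reports the failed node is exactly the
open question, so this is not a solution, just the lookup itself.

```python
import xml.etree.ElementTree as ET

# Sample copied from the "clustat -x" output quoted in this message.
SAMPLE = """<?xml version="1.0"?>
<clustat version="4.1.1">
  <groups>
    <group name="service:S" state="111" state_str="starting" flags="0"
flags_str="" owner="node1" last_owner="node1" restarts="0"
last_transition="1294676678" last_transition_str="xxxxxxxxxx"/>
  </groups>
</clustat>"""


def service_owners(clustat_xml):
    """Map each group name to its (owner, last_owner) attribute pair.

    Hypothetical helper: it assumes the XML layout seen in the sample
    above and does not interpret what the attributes mean.
    """
    root = ET.fromstring(clustat_xml)
    owners = {}
    for group in root.iter("group"):
        owners[group.get("name")] = (group.get("owner"),
                                     group.get("last_owner"))
    return owners


if __name__ == "__main__":
    for name, (owner, last_owner) in service_owners(SAMPLE).items():
        print(name, "owner:", owner, "last_owner:", last_owner)
```

In practice the XML would come from running "clustat -x" on the node
rather than from a hard-coded string.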

Any inputs would be greatly appreciated.

Thanks

Yours gratefully



