[Linux-cluster] Error while manual fencing and output of clustat
Parvez Shaikh
parvez.h.shaikh at gmail.com
Tue Jan 11 05:45:36 UTC 2011
Thanks Xavier.
It resolved the error on fencing.
However, I am still grappling with how to find the name of the failed
cluster node from the other cluster node, that is, the node to which the
service from the failed node has failed over.
I was using the output of "clustat -x -S service name" and parsing the
XML to obtain the value of the "last_owner" attribute.
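
A minimal sketch of that parsing step, in Python with only the standard
library. The names SAMPLE and service_owners are hypothetical, and the
sample document is the "clustat -x" output quoted further down in this
thread; in practice the string would come from running clustat on the
surviving node.

```python
import xml.etree.ElementTree as ET

# Sample taken from the "clustat -x" output quoted in this thread.
SAMPLE = """<?xml version="1.0"?>
<clustat version="4.1.1">
  <groups>
    <group name="service:S" state="111" state_str="starting" flags="0"
           flags_str="" owner="node1" last_owner="node1" restarts="0"
           last_transition="1294676678" last_transition_str="xxxxxxxxxx"/>
  </groups>
</clustat>
"""


def service_owners(clustat_xml, service_name):
    """Return (owner, last_owner) for the named group, or None if absent."""
    root = ET.fromstring(clustat_xml)
    for group in root.iter("group"):
        if group.get("name") == service_name:
            return group.get("owner"), group.get("last_owner")
    return None


if __name__ == "__main__":
    print(service_owners(SAMPLE, "service:S"))  # ('node1', 'node1')
```

With this sample both values come back as node1, which is exactly the
problem: last_owner does not name the node that failed.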
Any input on how to find out the name of the failed node from the other
cluster node, on which the services from the failed node are starting?
Thanks
On Mon, Jan 10, 2011 at 6:58 PM, Xavier Montagutelli
<xavier.montagutelli at unilim.fr> wrote:
> Hello Parvez,
>
> On Monday 10 January 2011 09:51:14 Parvez Shaikh wrote:
>> Dear experts,
>>
>> I have a two-node cluster (node1 and node2) with manual fencing
>> configured. Service S2 is running on node2. To verify that failover
>> happens, I shut down node2. I see the following messages in
>> /var/log/messages:
>>
>> agent "fence_manual" reports: failed: fence_manual
>> no node name
>
> I am not an expert, but could you show us your cluster.conf file?
>
> You need to give a "nodename" attribute to the fence_manual agent somewhere;
> the error message makes me think it is missing.
>
> For example:
>
> <fencedevices>
>   <fencedevice agent="fence_manual" name="my_fence_manual"/>
> </fencedevices>
> ...
> <clusternode name="node2" ...>
>   <fence>
>     <method name="1">
>       <device name="my_fence_manual" nodename="node2"/>
>     </method>
>   </fence>
> </clusternode>
>
>>
>> "fence_ack_manual -n node2" does not work, saying there is no FIFO in
>> /tmp. "fence_ack_manual -n node2 -e" does work, and then service S2
>> fails over to node1.
>>
>> I am trying to find out why fence_manual is reporting an error. node2
>> is a pingable hostname and has an entry in /etc/hosts on node1 (and
>> vice versa). I also see that after failover, "clustat -x" gives me the
>> cluster status (in XML format) with:
>>
>> <?xml version="1.0"?>
>> <clustat version="4.1.1">
>> <groups>
>> <group name="service:S" state="111" state_str="starting" flags="0"
>> flags_str="" owner="node1" last_owner="node1" restarts="0"
>> last_transition="1294676678" last_transition_str="xxxxxxxxxx"/>
>> </groups>
>> </clustat>
>>
>> I was expecting last_owner to be node2 (because this is the node that
>> was running service S and has failed), which would indicate that the
>> service is failing over FROM node2. Is there a way for a node in the
>> cluster (the node to which the service is failing over) to determine
>> from which node the given service is failing over?
>>
>> Any input would be greatly appreciated.
>>
>> Thanks
>>
>> Yours gratefully
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>
> --
> Xavier Montagutelli Tel : +33 (0)5 55 45 77 20
> Service Commun Informatique Fax : +33 (0)5 55 45 75 95
> Universite de Limoges
> 123, avenue Albert Thomas
> 87060 Limoges cedex
>