[Linux-cluster] cluster not fencing after filesystem failure

Robert Jacobson Robert.C.Jacobson at nasa.gov
Wed Apr 29 16:49:07 UTC 2015


Actually it does look like the failed node was fenced by the other node:

Apr 25 02:51:46 fenced fencing node sdo-dds-nfsnode2.dds.sdo
Apr 25 02:52:39 fenced fence sdo-dds-nfsnode2.dds.sdo success
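
For anyone wanting to pull the same lines out of their own logs, here is a
minimal sketch of a filter. It assumes the fenced messages are in a plain-text
log file whose path you pass in (the path in the example comment is an
assumption, not taken from this cluster), and fence_events is just a
hypothetical helper name.

```shell
# fence_events LOGFILE NODE
# Print log lines that mention both "fenced" and NODE.
# Both patterns are matched as fixed strings, not regular expressions.
fence_events() {
    logfile=$1
    node=$2
    grep -F 'fenced' "$logfile" | grep -F -- "$node"
}

# Example call (the log path is an assumption; adjust for your system):
# fence_events /var/log/cluster/fenced.log sdo-dds-nfsnode2.dds.sdo
```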

However, even after fencing, the working node (sdo-dds-nfsnode1) did not
resume the HA_nfs service.  The service was in a failed state:

[root@sdo-dds-nfsnode1 log]# clustat
Cluster Status for ddsnfs @ Sat Apr 25 03:45:11 2015
Member Status: Quorate

 Member Name                        ID   Status
 ------ ----                        ---- ------
 sdo-dds-nfsnode1.dds.sdo              1 Online, Local, rgmanager
 sdo-dds-nfsnode2.dds.sdo              2 Online, rgmanager

 Service Name            Owner (Last)                  State
 ------- ----            ----- ------                  -----
 service:HA_nfs          (sdo-dds-nfsnode2.dds.sdo)    failed
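
As far as I know, rgmanager does not restart a service on its own once it is
in the "failed" state; the usual recovery is to disable it and then enable it
again by hand, after checking that none of its resources (mounted file
systems, for example) are still in use. A minimal sketch of that check,
assuming clustat prints each service on a single line (name, owner, state) as
it does on an unwrapped terminal. service_state and recover_failed are
hypothetical helper names; clusvcadm -d and -e are the standard rgmanager
disable and enable options.

```shell
# service_state NAME
# Read clustat output on stdin and print the state (last field) of the
# row whose first field is "service:NAME". Prints nothing if not found.
service_state() {
    awk -v svc="service:$1" '$1 == svc { print $NF }'
}

# recover_failed NAME
# If NAME is in the failed state, disable it and then enable it again.
# Verify first that no resources of the service are still active.
recover_failed() {
    state=$(clustat | service_state "$1")
    if [ "$state" = "failed" ]; then
        clusvcadm -d "$1" && clusvcadm -e "$1"
    fi
}
```

With the output above, "clustat | service_state HA_nfs" would print the state
column for HA_nfs, and "recover_failed HA_nfs" would only act if that state
is "failed".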



On 2015-04-29 11:31 AM, Vasil Valchev wrote:
> Hi,
>
> You can check the log for "fenced" messages to see whether it tries to
> fence the node at all. Also check for "cman" messages.
> Does your cluster hang after a node failure? That would indicate the
> fencing didn't succeed for some reason.
>
>    >I've tested fencing from the command line and it works:
>    >fence_vmware_soap --ip 192.168.50.9 --username ddsfence --password
>    >secret -z --action reboot -U  "423d288c-03ff-74bf-9a4f-bf661f8ed87b"
>
>
> You can also test fencing with "fence_node <node-to-be-fenced>" - that
> way it is tested with the exact arguments from the cluster.conf and
> you can see if it works or not.
>
> BR,
> Vasil
>
>
>


-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Robert Jacobson               Robert.C.Jacobson at nasa.gov
Lead System Admin       Solar Dynamics Observatory (SDO)
Bldg 14, E222                             (301) 286-1591 
