[Linux-cluster] Fencing Logs

Gordan Bobic gordan at bobich.net
Tue Aug 25 22:43:17 UTC 2009


I have a really strange problem on one of my clusters. It exhibits all 
the signs of broken fencing, but the fencing agents work when tested 
manually, and I cannot find anything in syslog to suggest that the 
surviving node even attempts to fence (it just locks up on GFS access 
until the other node returns).

Has anybody got any suggestions on how to troubleshoot this?

The relevant extract from my cluster.conf is:

<clusternodes>
         <clusternode name="hades-cls" nodeid="1" votes="1">
...
                 <fence>
                         <method name = "1">
                                 <device name = "hades-oob"/>
                         </method>
                 </fence>
...
         </clusternode>
         <clusternode name="persephone-cls" nodeid="2" votes="1">
...
                 <fence>
                         <method name = "1">
                                 <device name ="persephone-oob"/>
                         </method>
                 </fence>
         </clusternode>
</clusternodes>
<fencedevices>
         <fencedevice agent="fence_eric" ipaddr="10.1.254.251"
                      login="fence" passwd="some_password" name="hades-oob"/>
         <fencedevice agent="fence_eric" ipaddr="10.1.254.252"
                      login="fence" passwd="some_password" name="persephone-oob"/>
</fencedevices>
...
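
One thing that can at least be ruled out mechanically is a name mismatch 
between the <device> references and the <fencedevice> definitions. A 
minimal sketch in Python, under stated assumptions: the function name 
unresolved_fence_devices is made up for illustration, the <cluster> 
wrapper element is added only so the fragment parses, and the inline 
sample is the extract above with the elided parts left out. It checks 
names only and says nothing about whether fenced actually invokes the 
agent.

```python
import xml.etree.ElementTree as ET

# Sample built from the extract above. The <cluster> root is an assumption
# added so the fragment is well-formed; elided ("...") parts are omitted.
SAMPLE = """<cluster>
<clusternodes>
  <clusternode name="hades-cls" nodeid="1" votes="1">
    <fence><method name="1"><device name="hades-oob"/></method></fence>
  </clusternode>
  <clusternode name="persephone-cls" nodeid="2" votes="1">
    <fence><method name="1"><device name="persephone-oob"/></method></fence>
  </clusternode>
</clusternodes>
<fencedevices>
  <fencedevice agent="fence_eric" ipaddr="10.1.254.251"
               login="fence" passwd="some_password" name="hades-oob"/>
  <fencedevice agent="fence_eric" ipaddr="10.1.254.252"
               login="fence" passwd="some_password" name="persephone-oob"/>
</fencedevices>
</cluster>"""


def unresolved_fence_devices(xml_text):
    """Return (node, device) pairs whose device has no <fencedevice> entry.

    A node with no fence device at all is reported as (node, None).
    Hypothetical helper for illustration; not part of any RHCS tooling.
    """
    root = ET.fromstring(xml_text)
    # Names of all defined fence devices.
    defined = {fd.get("name") for fd in root.iter("fencedevice")}
    problems = []
    for node in root.iter("clusternode"):
        # Device references under <fence>/<method> for this node.
        devices = [d.get("name") for d in node.findall("./fence/method/device")]
        if not devices:
            problems.append((node.get("name"), None))
        for dev in devices:
            if dev not in defined:
                problems.append((node.get("name"), dev))
    return problems


if __name__ == "__main__":
    for node, dev in unresolved_fence_devices(SAMPLE):
        print("node %s: unresolved fence device %r" % (node, dev))
```

To run it against a real file, read the file contents and pass them to 
the function in place of SAMPLE. An empty result only means the names 
line up, so it narrows the search rather than explaining the missing 
log entries.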

I have a near-identical setup on all my other clusters, so this is 
somewhat baffling. What else could be relevant here, specifically given 
that no fencing attempts show up in the logs at all? I have set up 
scores of RHCS clusters and have never seen anything like this before. 
The only unusual thing about this cluster is that I had to write a 
bespoke fencing agent for the machines, but it tests fine when I use it 
to power down or reboot them.

TIA.

Gordan



