[Linux-cluster] two fencing problems

Greg Forte gforte at leopard.us.udel.edu
Tue Dec 6 16:26:34 UTC 2005

Bryan Cardillo wrote:
 >         I'm in the process of testing the attached patch, basically
 >         just had to remove a portion of the match for the `Control
 >         Outlet' option.

Interesting ... I see you were getting hung on the menu after the one I
was stuck on.  It looks like my problem was that the author didn't expect
anyone to rename their outlets to something more useful than "Outlet 1",
"Outlet 2", etc.  The same problem plagues the next menu, because the
script was looking to match the "----- Outlet # -------" banner, but the
assigned name shows up there instead.  The following patch (against the
"original") seems to fix both of these problems in general, and
incorporates Bryan's fix as well.

--- /sbin/fence_apc     2005-08-01 19:01:17.000000000 -0400
+++ fence_apc   2005-12-06 09:09:55.000000000 -0500
@@ -244,10 +244,10 @@
                         /--\s*device manager.*(\d+)\s*-\s*Outlet Control/is ||

                         # "Device Manager", "1- Cluster Node 0   ON"
-                       /--\s*Outlet Control.*(\d+)\s*-\s+Outlet\s+$opt_n\D[^\n]*\s(?-i:ON|OFF)\*?\s/ism ||
+                       /--\s*Outlet Control.*($opt_n)\s*-[^\n]+\s(?-i:ON|OFF)\*?\s/ism ||

                         # Administrator Outlet Control menu
-                       /--\s*Outlet $opt_n\D.*(\d+)\s*-\s*control 
+                       /Outlet\s+:\s*$opt_n\D.*(\d+)\s*-\s*control 
                 ) {
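
To see why the old "Outlet Control" pattern misses renamed outlets, here
is a rough sketch that ports the old and new regexes to Python's re
module (a stand-in for the Perl in fence_apc, not the script itself).
The menu text is made up from the comment in the patch rather than
captured from a real PDU, and the function names are hypothetical.

```python
import re

# Perl's /ism modifiers map to these Python flags.
FLAGS = re.IGNORECASE | re.DOTALL | re.MULTILINE

def old_pattern(opt_n):
    """Pre-patch regex: expects the literal default name 'Outlet <n>'."""
    return re.compile(
        r"--\s*Outlet Control.*(\d+)\s*-\s+Outlet\s+" + opt_n +
        r"\D[^\n]*\s(?-i:ON|OFF)\*?\s", FLAGS)

def new_pattern(opt_n):
    """Patched regex: keys on the outlet number, ignores the name."""
    return re.compile(
        r"--\s*Outlet Control.*(" + opt_n +
        r")\s*-[^\n]+\s(?-i:ON|OFF)\*?\s", FLAGS)

# Hypothetical menu with renamed outlets (assumed layout).
RENAMED_MENU = """\
------- Outlet Control -------

     1- Cluster Node 0           ON
     2- Cluster Node 1           ON

"""

# Hypothetical menu with the factory-default outlet names.
DEFAULT_MENU = """\
------- Outlet Control -------

     1- Outlet 1                 ON
     2- Outlet 2                 ON

"""

if __name__ == "__main__":
    for label, menu in (("default", DEFAULT_MENU), ("renamed", RENAMED_MENU)):
        old_hit = old_pattern("1").search(menu) is not None
        new_hit = new_pattern("1").search(menu) is not None
        print(f"{label}: old matches={old_hit}, new matches={new_hit}")
```

On the made-up menus, the old pattern only finds outlet 1 while it still
carries its default name; the patched pattern finds it either way because
it no longer looks at the name at all.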

>         here is the clusternode elem I'm using, with the port
>         specified, and seems to work so far.  as far as I know, this
>         must be specified in the cluster.conf manually.
> <clusternode name="node1" votes="1">
>     <fence>
>         <method name="pdu">
>             <device name="pdu" port="1"/>
>         </method>
>     </fence>
> </clusternode>

Ah, I see I was confusing <fencedevice ...> with <fence> - it looks like
it is configurable in the configuration tool after all, under "manage
fencing for this node".  Here's what I got after setting it up with my
two cross-wired PDUs (the nodes have redundant power, so node 1 is
plugged into outlet 1 on each PDU, and node 2 into outlet 2 on each PDU):

        <clusternode name="NODE1" votes="1">
                <fence>
                        <method name="1">
                                <device name="FENCE1" option="off" port="1" switch="1"/>
                                <device name="FENCE2" option="off" port="1" switch="1"/>
                                <device name="FENCE1" option="on" port="1" switch="1"/>
                                <device name="FENCE2" option="on" port="1" switch="1"/>
                        </method>
                </fence>
        </clusternode>
        <clusternode name="NODE2" votes="1">
                <fence>
                        <method name="1">
                                <device name="FENCE1" option="off" port="2" switch="1"/>
                                <device name="FENCE2" option="off" port="2" switch="1"/>
                                <device name="FENCE1" option="on" port="2" switch="1"/>
                                <device name="FENCE2" option="on" port="2" switch="1"/>
                        </method>
                </fence>
        </clusternode>
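
With redundant power the ordering matters: both outlets have to be off
before either one is switched back on, or the node never actually loses
power.  A small sketch for checking that in a clusternode fragment,
assuming the devices of a method are run in the order listed (I haven't
confirmed that fenced does this).  The <fence> wrapper and closing tags
in the sample follow the layout of Bryan's example, and the function
names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# NODE1 entry as generated by the configuration tool; the <fence> wrapper
# and closing tags are assumed from the layout of Bryan's example.
NODE1 = """
<clusternode name="NODE1" votes="1">
    <fence>
        <method name="1">
            <device name="FENCE1" option="off" port="1" switch="1"/>
            <device name="FENCE2" option="off" port="1" switch="1"/>
            <device name="FENCE1" option="on" port="1" switch="1"/>
            <device name="FENCE2" option="on" port="1" switch="1"/>
        </method>
    </fence>
</clusternode>
"""

def fence_steps(clusternode_xml):
    """Return (device, option, port) tuples in document order."""
    node = ET.fromstring(clusternode_xml)
    return [(d.get("name"), d.get("option"), d.get("port"))
            for d in node.iter("device")]

def off_before_on(steps):
    """True if no "off" step appears after an "on" step."""
    seen_on = False
    for _name, option, _port in steps:
        if option == "on":
            seen_on = True
        elif option == "off" and seen_on:
            return False
    return True

if __name__ == "__main__":
    steps = fence_steps(NODE1)
    for name, option, port in steps:
        print(f"{name}: outlet {port} -> {option}")
    print("all 'off' steps precede 'on' steps:", off_before_on(steps))
```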

Except then, when I stopped the configurator and started it again, it
complained about the "switch=" options that it put there itself!
Removing them by hand seems to have fixed it.  *sigh*

And fencing still doesn't appear to work ... I can turn the outlets on
and off from the command line, but if I down the interface on a node,
the other node reports that it's removing the "failed" node from the
cluster, and that it's fencing the "failed" node, but the "failed" node
never gets shut down.  Does this get logged somewhere besides
/var/log/messages, or is there a way to force it to be more verbose?  If
I could see what command fenced is actually invoking, that might help ...


Greg Forte
gforte at udel.edu
IT - User Services
University of Delaware
Newark, DE
