[Linux-cluster] sudden unfencing problem

Laurence Schuler laurence.schuler at nasa.gov
Sat Mar 23 12:55:05 UTC 2013


I have a two-node cluster that has been running fine for a couple of
months (with few or no reboots, though). We recently updated it to the
latest CentOS 6 packages, and now the cluster will not start. It keeps
throwing errors during startup when attempting to unfence the disks. I
have hard reset the Fibre Channel switch and reset both hosts, but when
I run fence_sanbox2 I am unable to enable or disable the switch ports,
or even get their status. This is the error I get:

> [root@web1 lschule3]# /usr/sbin/fence_sanbox2 -a 192.168.1.190 -l admin -S FCpass.sh -o enable -n 5 -v
> telnet> set binary
> Negotiating binary mode with remote host.
> telnet> open 192.168.1.190 -23
> Trying 192.168.1.190...
> Connected to 192.168.1.190.
> Escape character is '^]'.
>
> Firmware V8.0.13.8.0
>
> r3fc1 login:
>
>
>   Establishing connection...   Please wait.
>
>        *****************************************************
>        *                                                   *
>        *       Command Line Interface SHell  (CLISH)       *
>        *                                                   *
>        *****************************************************
>
>        SystemDescription   SANbox 5800 FC Switch
>        HostName            r3fc1
>        EthIPv4NetworkAddr  192.168.1.190
>        EthIPv6NetworkAddr  fe80::2c0:00:00:90b
>        MACAddress          00:c0:dd:77:10:0b
>        WorldWideName       10:00:00:c0:dd:24:09:0b
>        SerialNumber        1236H00833
>        SymbolicName        r3fc1
>        ActiveSWVersion     V8.0.13.8.0
>        ActiveTimestamp     Mon Apr  2 18:32:33 2012
>        POSTStatus          Passed
>        LicensedPorts       12
>        SwitchMode          Full Fabric
>
>   The alarm log is empty.
>
> r3fc1 #> r3fc1 #> Failed: Unable to switch to admin section
> [root@web1 lschule3]#

I can manually telnet into the FC switch and execute the appropriate
commands to enable/disable ports, but the fence_sanbox2 script cannot.
The fence_sanbox2 code has not changed; however, Python has been
upgraded from 2.6.6-29 to 2.6.6-36.
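
To narrow down where the agent gives up, the verbose (-v) output can be
checked for the prompts it got through. What follows is only a rough
sketch, not code from fence_sanbox2: the function name and stage labels
are made up for illustration, the plain prompt pattern is taken from the
"r3fc1 #>" seen in the output above, and the "(admin) #>" pattern is an
assumption about what the switch prints once the admin session has
started.

```python
import re

# Prompt patterns, in the order a session is expected to pass through them.
# "login:" and "<host> #>" appear in the transcript above; "(admin) #>" is an
# ASSUMED shape for the prompt shown after entering the admin section.
STAGES = [
    ("login prompt", re.compile(r"login:")),
    ("CLI prompt", re.compile(r"^\S+ #>", re.MULTILINE)),
    ("admin prompt", re.compile(r"\(admin\) #>")),
]


def last_stage_reached(transcript):
    """Return the name of the last stage whose prompt appears, or None.

    Stages are checked in order and the scan stops at the first one
    that is missing from the transcript.
    """
    # Drop mail-style "> " quoting so pasted output can be checked as well.
    lines = [re.sub(r"^> ?", "", line) for line in transcript.splitlines()]
    text = "\n".join(lines)

    reached = None
    for name, pattern in STAGES:
        if not pattern.search(text):
            break
        reached = name
    return reached
```

Fed the output quoted above, this would report "CLI prompt" as the last
stage reached, i.e. the login succeeded and the switch prompt came back,
but nothing matching the assumed admin prompt was ever seen.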

Has anyone else seen this? Know of a fix? Am I doing/not doing something
stupid? I seem to recall running this command before during setup and it
worked just fine then.

Thanks for any help!

-- 
Laurence Schuler (Larry)                       Laurence.Schuler at nasa.gov
Systems Support                                       ADNET Systems, Inc
Scientific Visualization Studio                 http://svs.gsfc.nasa.gov



