[Linux-cluster] fence_apc unknown screen encountered

Matt Harrington mharrington at eons.com
Tue Aug 26 19:43:52 UTC 2008


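Verbose logging pinpoints the problem.  For some reason, fence_apc
sends the outlet selection twice.  For an outlet number greater than 2
this is harmless: the outlet's own menu has no such option, so the
screen is simply redrawn.  For an outlet number of 1 or 2, the repeated
number is a valid choice in that menu (1- Control Outlet, 2- Configure
Outlet), so the script ends up on a screen it does not expect and
fails.  Here is the output of a working call to illustrate the
duplicate selection; "13" is entered twice: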

^M------- Outlet Control/Configuration ------------------------------------------

     1- Outlet 1                 ON
     2- build                    ON
     3- www103                   ON
     4- www102                   ON
     5- Outlet 5                 ON
     6- Outlet 6                 ON
     7- Outlet 7                 ON
     8- fs102                    ON
     9- build                    ON
    10- app102                   ON
    11- Outlet 11                ON
    12- db103                    ON
    13- fs103                    ON
    14- Outlet 14                ON
    15- Outlet 15                ON
    16- Outlet 16                ON
    17- Master Control/Configuration

     <ESC>- Back, <ENTER>- Refresh, <CTRL-L>- Event Log
 > 13

^M------- fs103 -----------------------------------------------------------------

        Name         : fs103
        Outlet       : 13
        State        : ON

     1- Control Outlet   
     2- Configure Outlet 

     ?- Help, <ESC>- Back, <ENTER>- Refresh, <CTRL-L>- Event Log
 > 13

^M------- fs103 -----------------------------------------------------------------

        Name         : fs103
        Outlet       : 13
        State        : ON

     1- Control Outlet   
     2- Configure Outlet 

     ?- Help, <ESC>- Back, <ENTER>- Refresh, <CTRL-L>- Event Log
 > 1

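To make the failure mode concrete, here is a rough sketch in Python of
the two menu levels shown above.  It is not taken from fence_apc: the
screen names, the replay() and duplicated_selection() helpers, and the
16-outlet limit are assumptions made up for illustration, and the model
simply stops tracking input once it leaves the outlet's own menu.

```python
# Hypothetical model of the APC menus seen in the log above.
# None of these names come from fence_apc itself.
MAIN, OUTLET, CONTROL, CONFIGURE = "main", "outlet", "control", "configure"


def replay(keystrokes, outlet_count=16):
    """Return the screens visited as each keystroke is sent in turn."""
    screen = MAIN
    visited = [screen]
    for key in keystrokes:
        if screen == MAIN:
            # A valid outlet number opens that outlet's menu.
            if key.isdigit() and 1 <= int(key) <= outlet_count:
                screen = OUTLET
        elif screen == OUTLET:
            if key == "1":
                screen = CONTROL      # 1- Control Outlet
            elif key == "2":
                screen = CONFIGURE    # 2- Configure Outlet
            # Any other input just redraws the same screen.
        # Simplification: input on CONTROL/CONFIGURE is not modelled.
        visited.append(screen)
    return visited


def duplicated_selection(outlet):
    """Send the outlet number twice, then "1" for Control Outlet."""
    return replay([str(outlet), str(outlet), "1"])
```

In this model, duplicated_selection(13) ends on the Control Outlet
screen as intended, while duplicated_selection(2) ends on Configure
Outlet, which is the screen named in the traceback quoted below.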

Matt Harrington wrote:
> I am encountering an "unknown screen" exception from fence_apc when
> trying to fence a system in a 3-node cluster (CentOS 5.2,
> cman-2.0.84-2.el5).  Interestingly, I can fence the other two nodes
> in my cluster.  I believe the difference is that the problem node has
> two power supplies, which means that fence_apc is called with off/on
> instead of restart.  This also requires connecting to two different
> PDUs.  It could also be that there is something wrong with the config,
> which was taken from an older system and updated with luci.  I am
> unable to discern any differences between the menus of the two PDUs.
>
>
>
> [root at fs102 ~]# /sbin/fence_node fs103
> agent "fence_apc" reports: Traceback (most recent call last):
>  File "/sbin/fence_apc", line 829, in ?
>    main()
>  File "/sbin/fence_apc", line 303, in main
>    do_power_off(sock)
>  File "/sbin/fence_apc", line 813, in do_power_off
>    x = do_power_switch(sock, "off")
>  File "/sbi
> agent "fence_apc" reports: n/fence_apc", line 611, in do_power_switch
>    result_code, response = power_off(txt + ndbuf)
>  File "/sbin/fence_apc", line 817, in power_off
>    x = power_switch(buffer, False, "2", "3");
>  File "/sbin/fence_apc", line 810, in power_switch
>    raise "un
> agent "fence_apc" reports: known screen encountered in \n" + 
> str(lines) + "\n"
> unknown screen encountered in
> ['', '> 2', '', '', '------- Configure Outlet 
> ------------------------------------------------------', '', '    #  
> State  Ph  Name                     Pwr On Dly  Pwr Off D
> agent "fence_apc" reports: ly  Reboot Dur.', '   
> ----------------------------------------------------------------------------', 
> '    2  ON     1   fs103                    0 sec       0 sec        5 
> sec', '', '     1- Outlet Name         : fs103', '     2- Power On 
> Delay(sec) : 0',
> agent "fence_apc" reports:  '     3- Power Off Delay(sec): 0', '     
> 4- Reboot Duration(sec): 5', '     5- Accept Changes      : ', '', 
> '     ?- Help, <ESC>- Back, <ENTER>- Refresh, <CTRL-L>- Event Log']
>
>
> [root at fs102 ~]# /sbin/fence_apc -a 10.10.1.200 -l pdu -p pdu -n 13 -o status
> Status check successful. Port 13 is OFF
> [root at fs102 ~]# /sbin/fence_apc -a 10.10.1.201 -l pdu -p pdu -n 2 -o status
> Status check successful. Port 2 is ON
> [root at fs102 ~]# /sbin/fence_apc -a 10.10.1.201 -l pdu -p pdu -n 2 -o off
> Traceback (most recent call last):
>  File "/sbin/fence_apc", line 829, in ?
>    main()
>  File "/sbin/fence_apc", line 303, in main
>    do_power_off(sock)
>  File "/sbin/fence_apc", line 813, in do_power_off
>    x = do_power_switch(sock, "off")
>  File "/sbin/fence_apc", line 611, in do_power_switch
>    result_code, response = power_off(txt + ndbuf)
>  File "/sbin/fence_apc", line 817, in power_off
>    x = power_switch(buffer, False, "2", "3");
>  File "/sbin/fence_apc", line 810, in power_switch
>    raise "unknown screen encountered in \n" + str(lines) + "\n"
> unknown screen encountered in
> ['2', '', '', '------- Configure Outlet 
> ------------------------------------------------------', '', '    #  
> State  Ph  Name                     Pwr On Dly  Pwr Off Dly  Reboot 
> Dur.', '   
> ----------------------------------------------------------------------------', 
> '    2  ON     1   fs103                    0 sec       0 sec        5 
> sec', '', '     1- Outlet Name         : fs103', '     2- Power On 
> Delay(sec) : 0', '     3- Power Off Delay(sec): 0', '     4- Reboot 
> Duration(sec): 5', '     5- Accept Changes      : ', '', '     ?- 
> Help, <ESC>- Back, <ENTER>- Refresh, <CTRL-L>- Event Log']
>
>
>
>
> <cluster config_version="143" name="gfs_cluster">
>    <fence_daemon clean_start="0" post_fail_delay="0" 
> post_join_delay="3"/>
>    <clusternodes>
>        <clusternode name="fs101" nodeid="1" votes="1">
>            <fence>
>                <method name="1">
>                    <device name="pdu102.eons.dev" port="12"/>
>                </method>
>            </fence>
>        </clusternode>
>        <clusternode name="fs102" nodeid="2" votes="1">
>            <fence>
>                <method name="1">
>                    <device name="pdu101.eons.dev" port="8"/>
>                </method>
>            </fence>
>        </clusternode>
>        <clusternode name="fs103" nodeid="3" votes="1">
>            <fence>
>                <method name="1">
>                    <device name="pdu101.eons.dev" option="off" 
> port="13"/>
>                    <device name="pdu102.eons.dev" option="off" port="2"/>
>                    <device name="pdu101.eons.dev" option="on" port="13"/>
>                    <device name="pdu102.eons.dev" option="on" port="2"/>
>                </method>
>            </fence>
>        </clusternode>
>    </clusternodes>
>        <fencedevices>
>                <fencedevice agent="fence_apc" ipaddr="10.10.1.200" 
> login="pdu" name="pdu101.eons.dev" passwd="pdu"/>
>                <fencedevice agent="fence_apc" ipaddr="10.10.1.201" 
> login="pdu" name="pdu102.eons.dev" passwd="pdu"/>
>        </fencedevices>
> ...
> </cluster>
>
>
>
>
> [root at fs102 ~]# cat /etc/redhat-release
> CentOS release 5.2 (Final)
> [root at fs102 ~]# rpm -qf /sbin/fence_apc
> cman-2.0.84-2.el5
> [root at fs102 ~]# rpm -q luci
> luci-0.12.0-7.el5.centos.3
>
>
> pdu101:
> American Power Conversion               Network Management Card AOS      v3.5.9
> (c) Copyright 2008 All Rights Reserved  Rack PDU APP                     v3.5.8
>
> pdu102:
> American Power Conversion               Network Management Card AOS      v3.5.9
> (c) Copyright 2008 All Rights Reserved  Rack PDU APP                     v3.5.8
>
> -- 
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



