[Linux-cluster] fencing failing

Brad Filipek bfilipek at crscold.com
Fri Jul 20 21:26:35 UTC 2007


Hi Jim,

This is on RHEL5.

APC Firmware:
=======================================================================
Network Management Card AOS      v2.6.4
MSP APP                          v2.6.2
=======================================================================

cluster.conf file:
=======================================================================
<?xml version="1.0"?>
<cluster alias="cluster1" config_version="20" name="cluster1">
        <fence_daemon post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="server1.my.domain.com" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="APCMS62" port="7" switch="0"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="server2.my.domain.com" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="APCMS62" port="6" switch="0"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="1" two_node="1"/>
        <fencedevices>
                <fencedevice agent="fence_apc" ipaddr="192.168.1.61" login="name" name="APCMS61" passwd="pass"/>
                <fencedevice agent="fence_apc" ipaddr="192.168.1.62" login="name" name="APCMS62" passwd="pass"/>
        </fencedevices>
        <rm>
                <failoverdomains>
                        <failoverdomain name="main" ordered="0" restricted="0">
                                <failoverdomainnode name="server1.my.domain.com" priority="1"/>
                                <failoverdomainnode name="server2.my.domain.com" priority="1"/>
                        </failoverdomain>
                </failoverdomains>
                <resources>
                        <smb name="smb1" workgroup="WKGRP"/>
                        <ip address="192.168.1.20" monitor_link="1"/>
                </resources>
                <service autostart="1" domain="main" name="samba">
                        <smb ref="smb"/>
                        <ip ref="192.168.1.20"/>
                </service>
        </rm>
</cluster>
=======================================================================
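
For anyone comparing this file with the manual test quoted below, here is a minimal Python sketch (not part of the cluster tooling) that lists which fence device address and port each node resolves to, and flags service "ref" values that match no defined resource. The function names and the trimmed SAMPLE_CONF are hypothetical, and the rule that smb resources are referenced by name and ip resources by address is an assumption. On the file above it reports both nodes resolving to APCMS62 at 192.168.1.62, while the manual fence_apc test was run against 192.168.1.61; whether that difference is related to the failure is not established here.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the posted cluster.conf (logins and passwords omitted).
SAMPLE_CONF = """<?xml version="1.0"?>
<cluster alias="cluster1" config_version="20" name="cluster1">
  <clusternodes>
    <clusternode name="server1.my.domain.com" nodeid="1" votes="1">
      <fence><method name="1">
        <device name="APCMS62" port="7" switch="0"/>
      </method></fence>
    </clusternode>
    <clusternode name="server2.my.domain.com" nodeid="2" votes="1">
      <fence><method name="1">
        <device name="APCMS62" port="6" switch="0"/>
      </method></fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_apc" ipaddr="192.168.1.61" name="APCMS61"/>
    <fencedevice agent="fence_apc" ipaddr="192.168.1.62" name="APCMS62"/>
  </fencedevices>
  <rm>
    <resources>
      <smb name="smb1" workgroup="WKGRP"/>
      <ip address="192.168.1.20" monitor_link="1"/>
    </resources>
    <service autostart="1" domain="main" name="samba">
      <smb ref="smb"/>
      <ip ref="192.168.1.20"/>
    </service>
  </rm>
</cluster>
"""


def fence_targets(conf_xml):
    """Return (node, device, ipaddr, port) for every fence <device> entry."""
    root = ET.fromstring(conf_xml)
    devices = {fd.get("name"): fd for fd in root.iter("fencedevice")}
    rows = []
    for node in root.iter("clusternode"):
        for dev in node.iter("device"):
            fd = devices.get(dev.get("name"))
            # An undefined device name yields None instead of an address.
            ip = fd.get("ipaddr") if fd is not None else None
            rows.append((node.get("name"), dev.get("name"), ip, dev.get("port")))
    return rows


def dangling_refs(conf_xml):
    """Return (tag, ref) pairs under <service> that match no <resources> entry."""
    root = ET.fromstring(conf_xml)
    known = set()
    for res in root.findall("./rm/resources/*"):
        # ASSUMPTION: smb is referenced by its name, ip by its address.
        known.add((res.tag, res.get("name") or res.get("address")))
    missing = []
    for svc in root.findall("./rm/service"):
        for child in svc:
            ref = child.get("ref")
            if ref is not None and (child.tag, ref) not in known:
                missing.append((child.tag, ref))
    return missing


if __name__ == "__main__":
    for row in fence_targets(SAMPLE_CONF):
        print("fence target:", row)
    for tag, ref in dangling_refs(SAMPLE_CONF):
        print("no resource matches <%s ref=%r>" % (tag, ref))
```

To check the real file instead of the sample, pass the contents of /etc/cluster/cluster.conf to the same two functions.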



Brad Filipek


-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of jim parsons
Sent: Friday, July 20, 2007 4:11 PM
To: linux clustering
Subject: Re: [Linux-cluster] fencing failing

On Fri, 2007-07-20 at 15:55 -0500, Brad Filipek wrote:
> I have an APC MasterSwitch as my fencing device. I configured my
> cluster to use "APC" as the fencing device, and have confirmed that it
> has the correct username, password, and IP address configured. However,
> when it tries to reboot a failed node, I get this in /var/log/messages:
>
> Jul 20 15:51:28 server1 fenced[32169]: agent "fence_apc" reports: failed: unrecognised menu response
> Jul 20 15:51:28 server1 fenced[32169]: fence "server2.my.domain.com" failed
>
> However, when I run this command from a terminal, it runs fine and the
> failed node reboots:
>
> fence_apc -a 192.168.1.61 -l ***** -p ***** -n 6 -v

Ooohh...that is not good. Can you please tell me if this is rhel4 or
rhel5?

Can you send your cluster.conf file? If the agent works from the command
line but not within the cluster code, it could be an error in the conf
file. XXX out all passwords and such that you care about, of course,
before sending to the list.

Can you telnet into the APC switch and see what firmware version it is
using? There are two version values on the welcome screen that would be
nice to know:

  Network Management Card AOS      vx.x.x
  Rack PDU APP                     vx.x.x

-J

