[Linux-cluster] fencing failing

jim parsons jparsons at redhat.com
Mon Jul 23 20:20:23 UTC 2007


On Mon, 2007-07-23 at 12:31 -0700, Aravind Parchuri wrote:
> bfilipek at crscold.com wrote:
> > I have an APC MasterSwitch as my fencing device. I configured my cluster
> > to use "APC" as the fencing device, and have confirmed that it has the
> > correct username, password, and IP address configured. However, when it tries to
> > reboot a failed node, I get this in /var/log/messages:
> > 
> > Jul 20 15:51:28 server1 fenced[32169]: agent "fence_apc" reports:
> > failed: unrecognised menu response
> > 
> We faced the same problem in FC6, with an APC 7900 switch.
> > Jul 20 15:51:28 server1 fenced[32169]: fence "server2.my.domain.com"
> > failed
> > 
> > However, when I run this command from a terminal, it runs fine and the
> > failed node reboots:
> > 
> > fence_apc -a 192.168.1.61 -l ***** -p ***** -n 6 -v
> 
> In our case, even running it from the command line didn't work. The RPMs 
> in the repo have the old Perl script, which is probably the case with the 
> RHEL5 RPMs too. The Python script in CVS seems to work fine though:
> 
> http://sources.redhat.com/cgi-bin/cvsweb.cgi/cluster/fence/agents/apc/fence_apc.py?rev=1.5&content-type=text/x-cvsweb-markup&cvsroot=cluster
> 
> Try replacing /sbin/fence_apc with the python script and see if it helps.
Indeed, the Python version of the agent works much better and has a more
sane way of matching screens. The Perl agent was getting downright
undecipherable - even to grizzled old Perl veterans :)
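
As a rough illustration of what matching screens by pattern can look like,
here is a minimal sketch. It is not the code from the attached agent: the
names SCREEN_PATTERNS and identify_screen are hypothetical, and the screen
strings are made-up placeholders, since the real menu text depends on the
switch model and firmware.

```python
import re

# Hypothetical screen signatures, most specific first. These strings are
# illustrative placeholders, not taken from any real APC firmware.
SCREEN_PATTERNS = [
    ("login", re.compile(r"User Name\s*:", re.IGNORECASE)),
    ("confirm", re.compile(r"Enter 'YES' to continue", re.IGNORECASE)),
    ("outlet_control", re.compile(r"Outlet\s+Control", re.IGNORECASE)),
    ("main_menu", re.compile(r"Control Console", re.IGNORECASE)),
]


def identify_screen(text):
    """Return the name of the first signature found in text, or None."""
    for name, pattern in SCREEN_PATTERNS:
        if pattern.search(text):
            return name
    # No signature matched: the caller can report an unrecognised screen
    # instead of guessing which menu it is on.
    return None
```

Keeping each screen's signature in one table means that supporting a new
firmware revision is mostly a matter of adding or adjusting a pattern,
rather than rewriting the navigation logic.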

There is a new version checked into RHEL5 Head.

I am attaching it here so you have it quickly. It handles firmware v3.

-J
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fence_apc
Type: text/x-python
Size: 25560 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20070723/7774b1ec/attachment.py>

