[Linux-cluster] fence_apc failure

Chris Harms chris at cmiware.com
Tue Jul 31 15:00:39 UTC 2007


The last error I reported, where the node was actually powered off but
the cluster thought the fence had failed, was with the CVS python script.

 > /sbin/fence_apc -V
fence_apc New APC Agent - test release  September 21, 2006

We noticed the asterisk as well and thought that might be problematic.  
A co-worker has hacked on the python script some and reports that it now 
functions properly from the command line.  I have yet to begin testing 
with the cluster though.

The Perl script (renamed so that I could test the python script) reports:

 > /sbin/fence_apc.pl.bak -V
fence_apc.pl.bak 2.0.64 (built Mon Jun 25 14:34:20 EDT 2007)
Copyright (C) Red Hat, Inc.  2004  All rights reserved.
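
In case it is useful to anyone else checking which agent is installed,
here is a rough sketch in Python 3 of the version check Jim describes
below. The only facts taken from this thread are the -V flag and the
"1.32.45" version string; the function names, the regular expression, and
the assumption that the banner's first line is "<name> <dotted version>"
are mine, so treat it as an illustration rather than anything official.

```python
import re
import subprocess

# Version Jim quotes for the agent that handles 3.x APC firmware.
LATEST_AGENT_VERSION = "1.32.45"


def agent_version(output):
    """Return the dotted version from `-V` output, or None if absent.

    Assumes (hypothetically) that a versioned banner starts with
    "<program name> <dotted version>", as in the two banners above.
    """
    text = output.strip()
    first_line = text.splitlines()[0] if text else ""
    match = re.match(r"\S+\s+(\d+(?:\.\d+)+)\b", first_line)
    return match.group(1) if match else None


def is_latest_agent(output):
    """True only if the banner reports exactly the version Jim named."""
    return agent_version(output) == LATEST_AGENT_VERSION


def read_agent_banner(path="/sbin/fence_apc"):
    """Run the agent with -V and return whatever banner it printed.

    Assumption: the banner goes to stdout or stderr and the agent exits.
    """
    result = subprocess.run([path, "-V"], capture_output=True, text=True)
    return result.stdout or result.stderr
```

Calling is_latest_agent(read_agent_banner()) on a node gives a quick
yes/no. The CVS "test release" banner carries no dotted version, so it is
reported as not matching instead of being guessed at.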

I'd be happy to test for you, since that is what I'll be doing anyway.

Thanks,
Chris

James Parsons wrote:
> Chris,
>
> If you run /sbin/fence_apc -V, and the response is "fence_apc 
> 1.32.45", then you are using the latest
> apc agent that has been updated to handle the 3.x apc firmware.
>
> If you are running the latest agent, then you are obviously 
> encountering a problem. It is possible that it has something to do 
> with the asterisk that is appearing after an outlet group. Please let 
> me know if you are running 1.32.45. I should be able to fix this 
> today if you can help with some testing.
>
> Some background on the fence_apc agent for those who care:
> For years, the apc agent was written in perl. As the apc firmware 
> evolved from release to release, the regexp match patterns for screen 
> scraping the telnet session grew uglier and uglier. In addition, the 
> perl agent did not support outlet naming or grouping, and did not 
> handle the larger switches that took 2 or more screens to list out the 
> available outlets. To add these features and to make the agent easier 
> to maintain, it was decided to rewrite the agent in python. This was 
> done while apc firmware was still in the 2.x series.
>
> After some errors with MasterSwitchPlus switches were fixed, the agent 
> worked well in the field...until version 3.x firmware was released. 
> This firmware release changed a lot of things, including screen order.
>
> The current agent version worked well on my older apc switches as well 
> as the ones with newer firmware, so I released it into the beta and 
> plan on releasing it as an async errata release for RHEL4.5 Cluster 
> Suite.
>
> I will try to reproduce this issue this morning and have something 
> for you to test today.
>
> By the way, there is also a fence_apc_snmp agent. It works great, but 
> we have not switched to using it exclusively yet, because some admins 
> don't like having snmp packages on their systems. With the pain that 
> maintaining the telnet version of this agent is causing, though, I am 
> leaning more and more towards including just one apc solution: 
> snmp. :-/
>
> Thanks for your patience,
>
> -Jim
>
> Chris Harms wrote:
>
>> This appears to be a "no."
>> fence_node[26827]: agent "fence_apc" reports: Power Off 
>> unsuccessfulStatus check successful. Port 4 is ON
>>
>> However the node was powered off so something worked.  Also, it 
>> appears to have sent an Off command instead of Reboot.
>>
>>
>> Chris Harms wrote:
>>
>>> We now have some APC 7931 units at our disposal, however the 
>>> fence_apc perl script fails with "unrecognized menu response."  This 
>>> appears to be from new firmware on the APC units.  There is a python 
>>> script in CVS that looks like it may operate correctly with the new 
>>> firmware menus, is this correct?  Also, is it a drop-in replacement 
>>> for the perl script, i.e. will saving it as /sbin/fence_apc work 
>>> with RHCS 5?
>>>
>>> Thanks,
>>> Chris
>>>
>>> -- 
>>> Linux-cluster mailing list
>>> Linux-cluster at redhat.com
>>> https://www.redhat.com/mailman/listinfo/linux-cluster



