[Linux-cluster] Fencing issues with fence_apc_snmp (APC Firmware 6.x)
Marek "marx" Grac
mgrac at redhat.com
Wed Oct 15 14:12:13 UTC 2014
On 10/13/2014 09:10 PM, Thomas Meier wrote:
> When configuring PDU fencing in my two-node cluster I ran into some problems with
> the fence_apc_snmp agent. Turning a node off works fine, but
> fence_apc_snmp then exits with an error.
> When I do this manually (from node2):
> fence_apc_snmp -a node1 -n 1 -o off
> the output of the command is not the expected:
> Success: Powered OFF
> but in my case:
> Returned 2: Error in packet.
> Reason: (genError) A general failure occured
> Failed object: .220.127.116.11.4.1.318.104.22.168.22.214.171.124.21
> When I check the PDU, the port is without power, so this part works.
> But it seems that the fence agent can't read the status of the PDU
> and then exits with an error. The same seems to happen when fenced
> calls the agent: the agent exits with an error, so fencing can't succeed
> and the cluster hangs.
Yes, this is a known bug, as APC has changed a table in the 6.x firmware.
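The symptom described above (outlet is off, agent still fails) can be illustrated with a small sketch. This is not the fence_apc_snmp code; `power_off_and_verify` and `FakePdu` are hypothetical names, and the only assumption is that the agent confirms a power-off by reading the outlet status back, so an unreadable status makes it report failure:

```python
def power_off_and_verify(set_power, get_status, retries=3):
    """Hypothetical helper: switch an outlet off, then confirm by reading status."""
    set_power("off")
    for _ in range(retries):
        try:
            if get_status() == "off":
                return True
        except IOError:
            # Status read failed (for example an SNMP genError); try again.
            continue
    # The outlet may well be off, but it could not be confirmed.
    return False


class FakePdu(object):
    """Stand-in for a PDU; status_readable=False mimics a failing status read."""

    def __init__(self, status_readable):
        self.state = "on"
        self.status_readable = status_readable

    def set_power(self, action):
        self.state = action

    def get_status(self):
        if not self.status_readable:
            raise IOError("genError: status not readable")
        return self.state


readable = FakePdu(status_readable=True)
unreadable = FakePdu(status_readable=False)
ok_readable = power_off_and_verify(readable.set_power, readable.get_status)
ok_unreadable = power_off_and_verify(unreadable.set_power, unreadable.get_status)
```

In the second case the outlet ends up off while the helper still returns False, which matches what fenced sees: an agent that exits with an error even though the node has lost power.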
> I've already found the fence-agents repo: https://git.fedorahosted.org/cgit/fence-agents.git/
> Here https://git.fedorahosted.org/cgit/fence-agents.git/commit/?id=55ccdd79f530092af06eea5b4ce6a24bd82c0875
> it says: "fence_apc_snmp: Add support for firmware 6.x"
Yes, this should fix the issue.
> I've managed to build fence-agents-4.0.11.tar.gz on a CentOS 6.5 test box, but my build
> of fence_apc_snmp doesn't work.
> It gives:
> [root@box1]# fence_apc_snmp -v -a node1 -n 1 -o status
> Traceback (most recent call last):
>   File "/usr/sbin/fence_apc_snmp", line 223, in <module>
>   File "/usr/sbin/fence_apc_snmp", line 197, in main
>     options = check_input(device_opt, process_input(device_opt))
>   File "/usr/share/fence/fencing.py", line 705, in check_input
> TypeError: __init__() got an unexpected keyword argument 'stream'
Feel free to remove logging if it does not work. The other option is to
just take a patch from git and backport it. There should be no big
differences (I expect only very minor changes).
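As an alternative to removing logging entirely, here is a minimal sketch of a handler setup that should work on both old and new Python versions. It assumes the failing line is a `logging.StreamHandler(stream=...)` call, which the traceback does not actually show. CentOS 6 ships Python 2.6, where that constructor argument is named `strm`; it was renamed to `stream` in Python 2.7. Passing the stream positionally avoids the keyword. The logger name `fence_demo` is made up for the example:

```python
import logging
import sys


def make_stderr_handler():
    # Positional argument works whether the parameter is called
    # 'strm' (Python 2.6) or 'stream' (Python 2.7 and later).
    return logging.StreamHandler(sys.stderr)


logger = logging.getLogger("fence_demo")  # hypothetical logger name
handler = make_stderr_handler()
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)
logger.debug("status check requested")
```

If the call in fencing.py looks different, the same idea still applies: replace the keyword argument with a positional one, or drop the handler as suggested above.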
> I'd really like to see if a patched fence_apc_snmp agent fixes my problem, and if so,
> install the right version of fence_apc_snmp on the cluster without breaking things,
> but I'm a bit clueless about how to build a working version myself.
Sure, there will be a new official release for RHEL 6.7 (as 6.6 was
released a few days ago). Until then, the only options are upstream or patches.