[Linux-cluster] Fencing trouble

James Parsons jparsons at redhat.com
Mon May 22 13:01:41 UTC 2006


Olivier Thibault wrote:

> Hello,
>
> I'm testing RHCS on Fedora Core 5.
> I have two HP DL380G4 servers and an MSA500G2 shared between them.
> I have a GFS filesystem on the MSA500, with lock_dlm.
> This works fine.
> I created an IP address service with system-config-cluster, and it also
> works fine, i.e. if I shut down one node, the service starts on the other
> one.
> But I have trouble with fencing. I use iLO fencing. The fence_ilo
> script works if I call it directly with the parameters, but if I run
> "fence_node node2", I get the following two errors alternately:
>
> [root@filer1 ~]# fence_node filer2-ha
> agent "fence_ilo" reports: error: User login name was not found
> power_status: unexpected error
>
> [root@filer1 ~]# fence_node filer2-ha
> agent "fence_ilo" reports: error: Syntax error: Line #1: syntax error near "?>" in the line: "?xml version="1.0"?>"
> power_status: unexpected error
>
> [root@filer1 ~]# fence_node filer2-ha
> agent "fence_ilo" reports: error: User login name was not found
> power_status: unexpected error
>
> [root@filer1 ~]# fence_node filer2-ha
> agent "fence_ilo" reports: error: Syntax error: Line #1: syntax error near "?>" in the line: "?xml version="1.0"?>"
> power_status: unexpected error
>
> Here is my cluster.conf file :
> <?xml version="1.0"?>
> <cluster config_version="6" name="filer">
>         <fence_daemon post_fail_delay="0" post_join_delay="3"/>
>         <clusternodes>
>                 <clusternode name="filer1-ha" votes="1">
>                         <fence>
>                                 <method name="1">
>                                         <device name="filer1-ilo"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="filer2-ha" votes="1">
>                         <fence>
>                                 <method name="1">
>                                         <device name="filer2-ilo"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>         </clusternodes>
>         <cman expected_votes="1" two_node="1"/>
>         <fencedevices>
>                 <fencedevice agent="fence_ilo" hostname="filer1-ilo" login="Administrator" name="filer1-ilo" passwd="xxxxxx"/>
>                 <fencedevice agent="fence_ilo" hostname="filer2-ilo" login="Administrator" name="filer2-ilo" passwd="xxxxxx"/>
>         </fencedevices>
>         <rm>
>                 <failoverdomains>
>                         <failoverdomain name="test" ordered="1" restricted="1">
>                                 <failoverdomainnode name="filer1-ha" priority="1"/>
>                                 <failoverdomainnode name="filer2-ha" priority="2"/>
>                         </failoverdomain>
>                 </failoverdomains>
>                 <resources/>
>                 <service autostart="1" domain="test" name="test_ip" recovery="relocate">
>                         <ip address="10.68.5.7" monitor_link="1"/>
>                 </service>
>         </rm>
> </cluster>
>
>
> What's wrong?
>
> Thanks for your help.
>
> Best regards,
>
> Olivier
>
>
Well, the place to begin is with the version of iLO your machines are 
currently running. Can you check the firmware revision and post it here?

The conf file looks good. The fact that the agent works from the command 
line but the fence_node command does NOT suggests that either one of the 
parameter values in the conf file is incorrect (should Administrator be 
capitalized?) or that a parameter name is incorrect. The parameter names 
all look OK, though.

Please let us know the iLO firmware version, Olivier. This is troubling 
to me -- iLO is our most stable supported form of baseboard management 
fencing.

-J



