[Linux-cluster] PowerEdge R610 idrac express fencing

Digimer lists at alteeve.ca
Thu Feb 20 01:24:48 UTC 2014


On 19/02/14 07:30 PM, Michael Mendoza wrote:
> Good afternoon.
>
> We are trying to configure 2 dell R610 with idrac6 EXPRESS in cluster
> with redhat 5.10 x64.
>
> For testing we are using the command fence_ipmilan. We can ping idrac on
> the remote host.
>
> fence_ipmilan -a X.X.X.X -l usern -p xxxx  -t 200  -o status  -v   <-- works

Over 3 minutes to confirm a fence action is an extremely long time!

> fence_ipmilan -a X.X.X.X -l usern -p xxxx  -t 200  -o reboot -v
>
> The problem is that the server reboots, but while it reboots the idrac6
> reboots too, so host A loses the connection after approx. 120 seconds
> and gets the following message.

So you're saying that the IPMI interface, after rebooting the host, 
fails to respond for two full minutes? That strikes me as a reason to 
call Dell and ask for help. That can't be normal.
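
To put a number on that outage before calling Dell, something like the
following might help. It is only a sketch: BMC_HOST, BMC_USER and BMC_PASS
are placeholder variables, the function names are made up for illustration,
and it assumes ipmitool is installed on the node doing the polling. It
repeatedly asks for the chassis power status until the iDRAC answers again.

```shell
# Sketch only: BMC_HOST, BMC_USER and BMC_PASS are placeholders you must set.

# Ask the BMC for its chassis power status; succeeds only if it answers.
probe_bmc() {
    ipmitool -I lan -H "$BMC_HOST" -U "$BMC_USER" -P "$BMC_PASS" \
        chassis power status >/dev/null 2>&1
}

# wait_for PROBE MAX_TRIES INTERVAL
# Run PROBE until it succeeds, sleeping INTERVAL seconds between attempts.
# Prints how many attempts failed; returns 1 if MAX_TRIES is exhausted.
wait_for() {
    probe=$1
    max_tries=$2
    interval=$3
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        if "$probe"; then
            echo "responded after $tries failed attempts"
            return 0
        fi
        tries=$((tries + 1))
        sleep "$interval"
    done
    echo "no response after $tries attempts"
    return 1
}

# Example (start it right after issuing the reboot):
#   BMC_HOST=X.X.X.X BMC_USER=usern BMC_PASS=xxxx
#   wait_for probe_bmc 60 5    # up to 60 attempts, 5 seconds apart
```

Multiplying the failed-attempt count by the interval gives a rough lower
bound on how long the interface was unreachable.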

>     Spawning: '/usr/bin/ipmitool -I lan -H 'X.X.X.X' -U 'usern' -P
>     '[set]' -v -v -v chassis power on'...
>     Spawned: '/usr/bin/ipmitool -I lan -H 'X.X.X.X' -U 'usern' -P
>     '[set]' -v -v -v chassis power on' - PID 10104
>     Looking for:
>     'Password:', val = 1
>     'Unable to establish LAN', val = 11
>     'IPMI mutex', val = 14
>     'Unsupported cipher suite ID', val = 2048
>     'read_rakp2_message: no support for', val = 2048
>     'Up/On', val = 0
>     ExpectToken returned 11
>     Reaping pid 10104
>     Failed
>
>
> cman version is CMAN-2.0.115.118.e15_10.3

Is this an existing cluster, or a new one you're trying to build?

> However, I have another host with centos 6.4 and CMAN3.0... and the
> connection is not lost. I run the same command, the server reboots as
> well as the idrac, the ping comes back and the ipmi connection is not lost.

Are these nodes in the same cluster? cman 2 and 3 are only designed to 
work together in maintenance mode during rolling upgrades.

> Am I doing something wrong? I used the -t and -T options, even with
> 300 / 400, and it doesn't matter: the connection is shut after 120
> seconds. On centos it works fine. (I already opened a case with redhat
> and am waiting for an answer.)
> Thanks

It might be that the 120 second upper limit is a bug.
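
One way to check whether the cutoff really ignores the timeout you pass is
to time the fence call with a few different -t values and see if it always
gives up at about the same point. A rough sketch, where the helper name
"timed" is made up for illustration and the fence_ipmilan arguments are
simply the ones from your message:

```shell
# timed CMD [ARGS...]
# Run a command, then print its exit status and the elapsed wall-clock seconds.
timed() {
    start=$(date +%s)
    status=0
    "$@" || status=$?
    end=$(date +%s)
    echo "exit=$status elapsed=$((end - start))s"
    return "$status"
}

# Example, repeated with different -t values:
#   timed fence_ipmilan -a X.X.X.X -l usern -p xxxx -t 300 -o reboot -v
#   timed fence_ipmilan -a X.X.X.X -l usern -p xxxx -t 400 -o reboot -v
```

If the elapsed time stays near 120 seconds whatever -t is set to, that
would be worth adding to the Red Hat case.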

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?



