[Linux-cluster] One question about IPMI fencing with Cluster Suite v5.1

Celso K. Webber celso at webbertek.com.br
Fri Sep 12 22:21:13 UTC 2008


Hello all,

Sorry if this question has been answered before, but I didn't find anything 
in the archives.

We deployed Red Hat Cluster Suite at a customer site, and everything appears 
to work fine until one node needs to fence the other (for instance, when we 
power a node off to test failover).

As usual for us, we configured the fencing using IPMI, which is available on 
every modern branded server.

It seems that sometimes one machine can't fence the other. Although we can 
see the cluster starting "ipmitool -I lanplus -H xxx -U xxx -P xxx chassis 
power off", the command times out while trying to power off the other machine.

The strangest part is that if, at that exact moment, we issue an 
"ipmitool ... chassis power status" from the command line, it works fine 
against the same node that could not be fenced.
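
For anyone who wants to reproduce this outside the cluster, here is a minimal 
Python sketch that runs the same ipmitool invocation with an explicit timeout, 
so a status query and a power command can be compared side by side. The helper 
names (ipmi_command, run_ipmi) and the 20-second timeout are assumptions made 
for illustration; they are not part of Cluster Suite or of the fence agent.

```python
import subprocess


def ipmi_command(host, user, password, *action):
    """Build the ipmitool argument list quoted in this message."""
    return ["ipmitool", "-I", "lanplus",
            "-H", host, "-U", user, "-P", password, *action]


def run_ipmi(host, user, password, *action, timeout=20):
    """Run one ipmitool call and return (ok, text).

    The timeout is an arbitrary value chosen for this sketch; it is not
    the timeout the fence agent itself uses.
    """
    cmd = ipmi_command(host, user, password, *action)
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True,
                              timeout=timeout)
    except subprocess.TimeoutExpired:
        return False, "timed out after %d s" % timeout
    except FileNotFoundError:
        return False, "ipmitool not found"
    # Prefer stdout; fall back to stderr when the call printed only errors.
    return proc.returncode == 0, (proc.stdout or proc.stderr).strip()
```

Calling run_ipmi("xxx", "xxx", "xxx", "chassis", "power", "status") right 
after a failed fence attempt would show whether the BMC still answers queries 
while the power-off request is timing out.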

So I have a few questions:
* Can a problem like this (a fence agent being unable to fence) cause 
instability in the cluster? In our case, the cluster misbehaves even after we 
reboot the failed node: it does rejoin the cluster, but rgmanager never gets 
started.

* Has anyone faced this problem with IPMI? We have used IPMI as a fence 
agent in dozens of Red Hat Cluster Suite deployments, since version 3, and 
we have never had this kind of problem. The servers in question are 
Dell PowerEdge 2900s, and there is a crossover cable between the onboard 
#1 NICs of the two servers, so that we have a dedicated network path for one 
machine to power off the other.


Thank you all for your support.

Regards,

Celso.




