[Linux-cluster] Fencing problem

Eric Kerin eric at bootseg.com
Mon May 29 13:18:24 UTC 2006


On Mon, 2006-05-29 at 10:41 +0200, Tomasz Koczorowski wrote:
> Hi,
>  
> I have a problem with RHCS 4 in two node configuration (wayne and
> eastwood).
> Service ucpgw is running on wayne and httpd on eastwood.
> Every node is a Sun V40z server, fencing is done by IPMI.
> During cluster tests I unplug both power cables from one server (wayne),
> thus simulating unexpected poweroff (IPMI interface is also unavailable
> while server is out of power).
> <SNIP>
> Is this cluster misconfigured or is it a bug in fenced/ccsd subsystem?
> How can I solve this problem?
> 

This is an inherent flaw in using the on-board control devices (iLO,
IPMI, etc.) as fence devices: the controller loses power along with the
server.  Since the remaining node(s) can't successfully fence the failed
node, they won't continue.

Fenced also can't assume the machine is already powered down, since it
could be a network problem keeping it from accessing the other node (and
its IPMI device).
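
To make that concrete, here is a minimal sketch of what the surviving node
can actually observe.  This is not fenced's code; NodeState, ipmi_power_off
and survivor_may_take_over are made-up names for illustration, and the
model assumes the IPMI device answers only when it has power and is
reachable over the network.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Hypothetical model of the failed node as seen from outside."""
    has_power: bool
    network_reachable: bool


def ipmi_power_off(node: NodeState) -> bool:
    """Stand-in for an IPMI fence attempt (assumption, not a real agent).

    The on-board controller can only confirm a power-off if it has power
    itself and the survivor can reach it over the network.
    """
    return node.has_power and node.network_reachable


def survivor_may_take_over(node: NodeState) -> bool:
    """Services may be recovered only after a confirmed fence."""
    return ipmi_power_off(node)


# Power cables pulled: the node is really down, but IPMI is dead too.
unplugged = NodeState(has_power=False, network_reachable=True)
# Network problem: the node is still running and may still write to storage.
partitioned = NodeState(has_power=True, network_reachable=False)

# Both cases look identical to the survivor: the fence attempt fails.
print(survivor_may_take_over(unplugged), survivor_may_take_over(partitioned))
```

Because the two cases can't be told apart, treating a failed fence as
"node is off" would be unsafe in the partitioned case.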

I use two network-accessible power controllers for fencing my cluster,
with each power supply hooked up to a different controller to provide
redundant power paths.
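
As a rough sketch of the rule this setup implies: with two power feeds,
the fence only counts as successful once the outlets on both controllers
are confirmed off, otherwise the node may still be running on the other
supply.  The function below is hypothetical and not part of fenced or any
fence agent; each callable stands in for "switch off this node's outlet on
one controller and report whether that was confirmed".

```python
from typing import Callable, Sequence


def fence_dual_power(switch_off_calls: Sequence[Callable[[], bool]]) -> bool:
    """Return True only if every power feed was confirmed off.

    An empty list is treated as a failure: with no fence device
    configured, nothing was confirmed.
    """
    if not switch_off_calls:
        return False
    for switch_off in switch_off_calls:
        # Stop at the first feed that could not be confirmed off.
        if not switch_off():
            return False
    return True


# Illustrative stand-ins for the two controllers.
controller_a_ok = lambda: True
controller_b_ok = lambda: True
controller_b_unreachable = lambda: False

print(fence_dual_power([controller_a_ok, controller_b_ok]))
print(fence_dual_power([controller_a_ok, controller_b_unreachable]))
```

Since the controllers are powered independently of the servers, pulling
the cables from a node does not take the fence devices down with it.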

Thanks, 
Eric Kerin
eric at bootseg.com