[Linux-cluster] What if the fence device doesn't work?
Eric Kerin
eric at bootseg.com
Tue Nov 21 14:20:20 UTC 2006
Janne Peltonen wrote:
> On Tue, Nov 21, 2006 at 08:26:20AM -0500, Eric Kerin wrote:
>
>> So to keep that scenario from happening, the cluster software
>> ensures that a successful fence occurs before continuing operation.
>> It's a fail-safe style setup. Better to take 30 minutes downtime for an
>> admin to make the right decision than corrupt your filesystems and have
>> to take 8-24 hours downtime to restore the system.
>>
>
> I do understand the basics. I wouldn't want the cluster suite to think
> that a node couldn't access a resource such as an FS when it can. It
> would just be nice to configure the cluster suite so that if one method
> of fencing fails, it tries another, <SNIP>
Actually, that's entirely possible.
See: http://sources.redhat.com/cluster/faq.html#fence_levels
Here's an example block from the cluster.conf file. It first tries
ilo (HP Lights-Out), and if that fails, it falls back to apc (an APC
network power controller). The iLO section is probably not correct; I
just threw it in there as an example of how you'd set up the method
tags:
<fence>
    <method name="ilo">
        <device name="server1-ilo" option="off"/>
        <device name="server1-ilo" option="on"/>
    </method>
    <method name="apc">
        <device name="APC01a" port="1" option="off"/>
        <device name="APC01b" port="1" option="off"/>
        <device name="APC01a" port="1" option="on"/>
        <device name="APC01b" port="1" option="on"/>
    </method>
</fence>
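To spell out the fallback order that block relies on, here is a rough
Python sketch of the idea. It is not the actual fenced code. It assumes
that methods are tried in the order they are listed, that every device
line in a method has to succeed for the method to count, and that the
first method that succeeds ends the attempt. The names fence_node,
run_agent and METHODS are made up for illustration:

```python
from typing import Callable, Dict, List, Optional, Tuple

# One <device .../> line, e.g. {"name": "APC01a", "port": "1", "option": "off"}
Device = Dict[str, str]
# One <method name="..."> block: its name plus its device lines, in order
Method = Tuple[str, List[Device]]

# Mirrors the <fence> block above (illustrative data only)
METHODS: List[Method] = [
    ("ilo", [
        {"name": "server1-ilo", "option": "off"},
        {"name": "server1-ilo", "option": "on"},
    ]),
    ("apc", [
        {"name": "APC01a", "port": "1", "option": "off"},
        {"name": "APC01b", "port": "1", "option": "off"},
        {"name": "APC01a", "port": "1", "option": "on"},
        {"name": "APC01b", "port": "1", "option": "on"},
    ]),
]


def fence_node(methods: List[Method],
               run_agent: Callable[[Device], bool]) -> Optional[str]:
    """Return the name of the first method that fully succeeds, else None.

    run_agent is a hypothetical callback standing in for whatever
    actually talks to the fence device; it returns True on success.
    """
    for method_name, devices in methods:
        # all() stops at the first failing device, so the rest of a
        # failed method is skipped and the next method is tried.
        if all(run_agent(dev) for dev in devices):
            return method_name
    # No method worked: the node is not known to be fenced, so the
    # caller must not carry on as if it were.
    return None


if __name__ == "__main__":
    # Pretend the iLO is unreachable and the APC switches work.
    def ilo_is_down(dev: Device) -> bool:
        return dev["name"] != "server1-ilo"

    print(fence_node(METHODS, ilo_is_down))
```

With the iLO pretending to be dead, the sketch falls through to the apc
method. If every method fails it returns None, which is the case where
the cluster keeps waiting rather than risk the filesystems.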
Thanks,
Eric Kerin
eric at bootseg.com