[Linux-cluster] ipmi fencing

Jie Gao J.Gao at isu.usyd.edu.au
Wed Jul 19 22:11:31 UTC 2006




On Wed, 19 Jul 2006, Lon Hohberger wrote:

> Date: Wed, 19 Jul 2006 11:00:24 -0400
> From: Lon Hohberger <lhh at redhat.com>
> Reply-To: linux clustering <linux-cluster at redhat.com>
> To: linux clustering <linux-cluster at redhat.com>
> Subject: Re: [Linux-cluster] ipmi fencing
>
> On Wed, 2006-07-19 at 15:57 +1000, Jie Gao wrote:
> > Hi All
> >
> > I am trialing clustering and GFS on RHEL AS U4. I am using the ipmi
> > agent for fencing in a two-node setup.
> >
> > I have noticed that the agent sends "power off" to the node to be
> > fenced. This causes the fenced node to shut down uncleanly.
> >
> > I'd rather that it used the "power soft" option first and then used
> > "power off" as the last resort.
> >
> > After all, what good is a corrupt system?
>
> Linux-cluster's I/O fencing is a very paranoid action taken to cut a
> node off from shared data.
>
> Letting a node try to shut down gracefully - when we are not aware of why
> the node is misbehaving - goes against the 'very paranoid' approach.
>
> Consider the case where a node has a double-bit memory problem, causing
> quiet data corruption.  Letting it live a few extra seconds increases
> the chance for more corrupt data.  Sure, it is rare, but how can we be
> sure this *is not* the case when a node misbehaves?

For me, this is the difference between a corrupt system that could have
been avoided and corrupt data that gets more corrupt.

I'd prefer the former. At least there should be room for us users to choose.
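
To make the "soft first, off as last resort" idea concrete, here is a rough
sketch of the behaviour being asked for. It is not how the existing ipmi
fence agent works. It assumes ipmitool is installed and that the BMC answers
"chassis power status" with a line such as "Chassis Power is off". The
function names and the grace and poll parameters are made up for
illustration. Note that during the grace period the node can still reach
shared storage, which is exactly the risk raised above.

```python
import subprocess
import time


def ipmi_cmd(host, user, password, action):
    """Build an ipmitool chassis power command (action: soft, off, status)."""
    return ["ipmitool", "-I", "lan", "-H", host, "-U", user, "-P", password,
            "chassis", "power", action]


def run(cmd):
    """Run a command and return its stdout; raises if the command fails."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


def fence(host, user, password, grace=30, poll=2, runner=run, sleep=time.sleep):
    """Request a soft shutdown, then force power off if the node is still
    up after `grace` seconds. Returns "soft" or "hard".

    `runner` and `sleep` are injectable so the logic can be tried out
    without touching real hardware (hypothetical design, not the agent's).
    """
    runner(ipmi_cmd(host, user, password, "soft"))

    waited = 0
    while waited < grace:
        status = runner(ipmi_cmd(host, user, password, "status"))
        if "is off" in status.lower():
            return "soft"          # node shut itself down cleanly
        sleep(poll)
        waited += poll

    # Last resort: the node did not power down within the grace period.
    runner(ipmi_cmd(host, user, password, "off"))
    return "hard"
```

Something like fence("10.0.0.2", "admin", "secret", grace=20) would wait up
to 20 seconds before cutting power; the address and credentials here are
placeholders. A real agent would also need to handle ipmitool failures
during the wait, most likely by going straight to "power off".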

Anyway, I've set up a point-to-point connection between the nodes, and
that will forestall false alarms due to potential transient network
problems.

Regards,



Jie



