[Linux-cluster] Fencing agents

Adam Manthei amanthei at redhat.com
Wed Aug 3 22:11:44 UTC 2005


On Wed, Aug 03, 2005 at 04:42:23PM -0500, JACOB_LIBERMAN at Dell.com wrote:
> Hi Adam,
> 
> I noticed that you updated this script quite a bit from previous versions. If I'm not mistaken, the previous version actually used the "racadm serveraction powercycle/shutdown/etc" commands. This version uses telnet exclusively. How about adding some logic that checks whether racadm is installed locally and uses that if it is, and then uses telnet if it is not?

The problem that I experienced with the racadm utility is that there were
times when there was no way to query the power status of a node.
I know that I am unable to do that at all with the firmware that I have
installed for my PowerEdge 750's.  Another drawback to the racadm approach
is that `serveraction` returns right away instead of waiting for the
command to complete.  The combination of these two issues makes racadm
difficult to rely upon for a fencing agent, because it's possible for the
fencing agent to report success before the machine is powered off.  If
that were to happen, filesystem corruption could occur.
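
To make the timing problem concrete, the unsafe sequence looks roughly
like this (a sketch; the comments describe the failure mode rather than
actual racadm output):

       [root]# racadm serveraction powercycle
       # returns immediately; the node may still be up and writing
       # to shared storage while the fencing agent reports success,
       # so GFS recovery can begin against a live node.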

I've emailed a couple of people at Dell and the linux-poweredge list and
have not been able to get an adequate answer on how to use racadm
reliably.  As such, we only support the telnet interface.
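
If you want to test the agent by hand, the invocation looks something
like this (a sketch; check the agent's usage output for the exact option
letters, and substitute your own DRAC address and credentials):

       [root]# fence_drac -a <drac-ip> -l root -p <password>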

> I think that adding the racadm commands to enable telnet on the rac is a good idea, but if they can use racadm to configure telnet access, they should also be able to use racadm to fence the node.

I thought about adding that functionality, but forgot about it shortly after
getting the telnet interface enabled on my DRAC card ;)  Thanks for the
reminder, I'll look into adding that feature.  In the meantime, the commands
for enabling it are documented in the man page.

       [root]# racadm config -g cfgSerial -o cfgSerialTelnetEnable 1

       [root]# racadm racreset
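
Once the card comes back from the reset, a quick way to confirm that the
telnet interface is enabled is to connect to the card directly
(substitute your DRAC's IP address):

       [root]# telnet <drac-ip>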


> Just my 2 cents. I think it's great that you wrote an agent for the DRAC.
 
:)

Adam

> > -----Original Message-----
> > From: linux-cluster-bounces at redhat.com 
> > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Adam Manthei
> > Sent: Wednesday, August 03, 2005 4:30 PM
> > To: linux clustering
> > Subject: Re: [Linux-cluster] Fencing agents
> > 
> > On Wed, Aug 03, 2005 at 11:58:47AM +0000, "Sævaldur Arnar
> > Gunnarsson [Hugsmiðjan]" wrote:
> > > I'm implementing shared storage between multiple (2 at the
> > > moment) Blade machines (Dell PowerEdge 1855) running RHEL4 ES
> > > connected to an EMC AX100 through FC.
> > >
> > > The SAN has two FC ports, so the need for an FC switch has not
> > > yet come up; however, we will add other Blades in the coming
> > > months.  The one thing I haven't figured out with GFS and the
> > > Cluster-Suite is the whole idea of fencing.
> > 
> > Funny timing :)  I just checked in the fencing agent for the 
> > PowerEdge 1855's a couple days ago!  
> > 
> > (http://sources.redhat.com/cgi-bin/cvsweb.cgi/cluster/fence/agents/drac/fence_drac.pl?rev=1.3.4.2&content-type=text/x-cvsweb-markup&cvsroot=cluster)
> > 
> > > The fencing agent in that setup is manual fencing.
> > 
> > I would strongly discourage this.
> > 
> > > What does "automatic" fencing have to offer that manual fencing
> > > lacks?  If we decide to buy the FC switch right away, is it
> > > recommended that we buy one that has a fencing agent available
> > > for the Cluster-Suite?
> > 
> > In this case, you already have a fencing agent (fence_drac)
> > that works with the PE 1855 blades, so there is no need for
> > additional fencing hardware (unless you are going to be
> > connecting other machines to the cluster that won't have any
> > other form of fencing).
> > 
> > The main advantage that "automatic" fencing gives you over
> > manual fencing is that when a fencing operation is required,
> > your cluster can recover automatically (on the order of seconds
> > to minutes) instead of waiting for user intervention (which can
> > take minutes, hours, or days depending on how attentive the
> > admins are :).
> > 
> > --
> > Adam Manthei  <amanthei at redhat.com>
> > 

-- 
Adam Manthei  <amanthei at redhat.com>



