[Linux-cluster] Rhcs4 fence with no hardware power switch

Lon Hohberger lhh at redhat.com
Wed Aug 31 16:27:44 UTC 2005


On Tue, 2005-08-30 at 02:00 +0000, nattapon viroonsri wrote:
> >On Mon, 2005-08-29 at 02:26 +0000, nattapon viroonsri wrote:
> > > I use Red Hat Cluster v4 for HA and have no power switch hardware.
> > > When I disconnected the network cable used for client access to the
> > > resource, failover occurred as expected.
> > > But the server whose network cable was disconnected was not rebooted
> > > or fenced like in RHCS v3 (even with no power switch), and the failed
> > > server cannot join the cluster again.
> >
> >Eep, no failover should have occurred if you don't have fencing.
> 
> I see the resources (virtual IP, service) transfer to another node, but I
> don't see any message in /var/log/messages about fencing; maybe I missed
> something. I also tried to reboot the failed node manually with the command
> "init 6", but it reported that it can't stop the rgmanager service.

Interesting... in the event that the node has died, failover shouldn't
occur until the user-level service group recovers.

The user-level service groups should not recover until *after* fencing
recovers and completes.  Without fencing, there should never be a
failover (because fencing *fails*, unlike in RHCS3, where it just
complained and continued), unless something's been done that I'm not
aware of?
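
As a rough illustration of that ordering, here is a toy Python sketch.  It
is not rgmanager or fenced code; the names Policy, RecoveryResult and
recover_dead_node are made up for this example, and a fence device is
modelled as a plain function that returns True on success.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Policy(Enum):
    # Hypothetical labels for the two behaviours described in this message.
    RHCS3 = "complain and continue"
    RHCS4 = "block until fenced"


@dataclass
class RecoveryResult:
    fenced: bool
    failed_over: bool
    log: List[str]


def recover_dead_node(policy: Policy,
                      fence_devices: List[Callable[[], bool]]) -> RecoveryResult:
    """Toy model of the recovery order after a node is declared dead."""
    log: List[str] = []

    # Step 1: fencing. Try each device in turn; stop at the first success.
    fenced = any(device() for device in fence_devices)

    if fenced:
        log.append("fencing completed")
    elif policy is Policy.RHCS3:
        # RHCS3-style behaviour: complain, then carry on regardless.
        log.append("warning: fencing did not complete, continuing anyway")
    else:
        # RHCS4-style behaviour: fencing failed, so recovery stops here.
        log.append("fencing failed: service groups stay blocked")
        return RecoveryResult(fenced=False, failed_over=False, log=log)

    # Step 2: only after step 1 may the service groups recover and fail over.
    log.append("service groups recovered, services failed over")
    return RecoveryResult(fenced=fenced, failed_over=True, log=log)


if __name__ == "__main__":
    no_devices: List[Callable[[], bool]] = []
    for policy in (Policy.RHCS3, Policy.RHCS4):
        result = recover_dead_node(policy, no_devices)
        print(policy.name, result.failed_over, result.log)
```

With an empty device list, the RHCS4 branch never reaches the failover
step, which is the behaviour I would expect on a cluster with no fence
device configured.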


> > > So, is there any way to reboot a failed node (network problem or
> > > server problem) without a hardware power switch, to bring the failed
> > > node back to a clean state?
> >
> >You need fencing hardware with RHCS4/RHEL4.  I'd check eBay if you're
> >not looking to spend a lot; you can get a WTI NPS115 for pretty cheap
> >nowadays (under $100!).
> >
> Is there any additional module or modified version of the "fence_xxx" Perl
> script, or a way to use a watchdog, with RHCS v4?

We are looking into using watchdog timers as an additional means, though
I doubt we'll allow reliance on them like we did in RHEL2.1.  That is, I
suspect that there won't be a "fence_watchdog" driver, but you never know...

On RHCS3, they were used as an additional means of protection, but the
cluster software complained about lack of "STONITH devices" when only
using the watchdog...

I will state this plainly: Fencing hardware is *required* for RHCS4 (the
Red Hat product/support offering). 


> >You can run with manual fencing, but no automatic recovery will occur,
> >and this isn't a supported solution.
> >
> 
> Thanks for your suggestion. :)

To clarify - my suggestion was to get a supported fence device, and I
noted one which could be had cheaply.  Personally, I recommend against
using manual fencing for anything, including evaluation testing.

You can also use GNBD + fence_gnbd -- no fencing hardware required! :)

-- Lon



