[Linux-cluster] Graceful Degradation

Roger Peña orkcu at yahoo.com
Fri Dec 14 16:33:35 UTC 2007


--- gordan at bobich.net wrote:

> Hi,
> 
> I've got most of my cluster pretty much sorted out, apart from kicking
> nodes from the cluster when they fail.
> 
> Is there a way to make the node-kicking automated? I have 4 nodes. They
> are sharing 2 GFS file systems, a root FS and a data FS. If I pull the
> network cable from one of them, or just power it off, the rest of the
> cluster nodes just stop. The only way to get them to start responding
> again is to bring the missing node back, even if there are still enough
> nodes to maintain quorum (3 nodes out of 4).
> 
> Can anyone suggest a way around this? How can I make the 3 remaining
> nodes just kick the missing node out of the cluster and DLM group
> (possibly after some timeout, e.g. 10 seconds) and resume operation
> until the node rejoins?
> 
> This may or may not be related to the fact that I'm running a shared
> GFS root, but any pointers would be welcome.
> 
I think this is question #1 in the FAQ and on this
list :-)

The remaining nodes block until the failed node has been fenced
successfully, so the short answer, and the first place to look, is:
1- fencing is not configured, or is configured as manual
2- fencing problems: the fence devices are not working as they
should
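
A rough way to check for case 1 is to scan cluster.conf for nodes that
have no fence device, or only a manual one. The sketch below is just an
illustration in Python: the sample configuration, the node names, the
device names and the check_fencing() helper are all made up for the
example, and it assumes the usual clusternode/fence/method/device and
fencedevices/fencedevice layout. It does not cover case 2; a fence
device that is configured but not working only shows up when you
actually try to fence a test node.

```python
import xml.etree.ElementTree as ET

# Hypothetical cluster.conf contents, for illustration only.
# Node names, device names, addresses and credentials are invented.
SAMPLE_CONF = """
<cluster name="example" config_version="1">
  <clusternodes>
    <clusternode name="node1" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="apc1" port="1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="human" nodename="node2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node3" nodeid="3" votes="1">
      <fence/>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="apc1" agent="fence_apc" ipaddr="192.0.2.10"
                 login="apc" passwd="secret"/>
    <fencedevice name="human" agent="fence_manual"/>
  </fencedevices>
</cluster>
"""


def check_fencing(conf_xml):
    """Return {node_name: problem} for nodes whose fencing looks suspect."""
    root = ET.fromstring(conf_xml)
    # Map each defined fence device name to the agent that drives it.
    agents = {d.get("name"): d.get("agent") for d in root.iter("fencedevice")}
    problems = {}
    for node in root.iter("clusternode"):
        node_name = node.get("name")
        # Every <device> referenced under this node, across all methods.
        used = [dev.get("name") for dev in node.iter("device")]
        if not used:
            problems[node_name] = "no fence device configured"
        elif any(name not in agents for name in used):
            problems[node_name] = "references an undefined fence device"
        elif all(agents[name] == "fence_manual" for name in used):
            problems[node_name] = "manual fencing only"
    return problems


if __name__ == "__main__":
    for node_name, problem in sorted(check_fencing(SAMPLE_CONF).items()):
        print(f"{node_name}: {problem}")
```

With the sample above, node2 and node3 are reported and node1 is not.
To check a real cluster you would pass in the contents of your own
cluster.conf instead of SAMPLE_CONF.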

cu
roger

__________________________________________
RedHat Certified ( RHCE )
Cisco Certified ( CCNA & CCDA )





