[Linux-cluster] Graceful Degradation
Roger Peña
orkcu at yahoo.com
Fri Dec 14 16:33:35 UTC 2007
--- gordan at bobich.net wrote:
> Hi,
>
> I've got most of my cluster pretty much sorted out, apart from kicking
> nodes from the cluster when they fail.
>
> Is there a way to make the node-kicking automated? I have 4 nodes. They
> are sharing 2 GFS file systems, a root FS and a data FS. If I pull the
> network cable from one of them, or just power it off, the rest of the
> cluster nodes just stop. The only way to get them to start responding
> again is to bring the missing node back, even if there are still enough
> nodes to maintain quorum (3 nodes out of 4).
>
> Can anyone suggest a way around this? How can I make the 3 remaining nodes
> just kick the missing node out of the cluster and DLM group (possibly
> after some timeout, e.g. 10 seconds) and resume operation until the node
> rejoins?
>
> This may or may not be related to the fact that I'm running a shared GFS
> root, but any pointers would be welcome.
>
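I think this is question #1 in the FAQ and on this
list :-)
The short answer, and the first places to look:
1- fencing is not configured, or is configured as manual
2- fencing problems: the fence devices are not working as they
should
The remaining nodes block until the failed node has been fenced
successfully, so having quorum alone is not enough.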
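
For point 1, a quick way to spot the problem is to scan cluster.conf for nodes with no fence device, or with one that uses manual fencing. Below is a minimal sketch in Python, not an official tool. It assumes the usual layout (clusternode/fence/method/device entries that refer by name to fencedevice entries with an agent attribute). The function name and the sample configuration are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample config, for illustration only.
SAMPLE = """
<cluster name="example" config_version="1">
  <clusternodes>
    <clusternode name="node1" nodeid="1">
      <fence><method name="1"><device name="human" nodename="node1"/></method></fence>
    </clusternode>
    <clusternode name="node2" nodeid="2">
      <fence/>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="human" agent="fence_manual"/>
  </fencedevices>
</cluster>
"""

def fencing_problems(conf_xml):
    """Return a list of (node_name, problem) tuples for a cluster.conf string."""
    root = ET.fromstring(conf_xml)
    # Map each fence device name to the agent it uses.
    agents = {d.get("name"): d.get("agent")
              for d in root.findall("./fencedevices/fencedevice")}
    problems = []
    for node in root.findall("./clusternodes/clusternode"):
        name = node.get("name")
        devices = node.findall("./fence/method/device")
        if not devices:
            problems.append((name, "no fence device configured"))
            continue
        for dev in devices:
            agent = agents.get(dev.get("name"))
            if agent is None:
                problems.append((name, "fence device not defined"))
            elif agent == "fence_manual":
                problems.append((name, "manual fencing"))
    return problems

if __name__ == "__main__":
    for node_name, problem in fencing_problems(SAMPLE):
        print(node_name, "->", problem)
```

To check a real cluster, read /etc/cluster/cluster.conf and pass its contents to the function instead of SAMPLE. This only covers point 1. For point 2 the config can look fine while the device itself fails, so the only real test is to fence a test node by hand and watch the logs.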
cu
roger
__________________________________________
RedHat Certified ( RHCE )
Cisco Certified ( CCNA & CCDA )