[Linux-cluster] Repeated fencing

ESGLinux esggrupos at gmail.com
Wed Feb 24 09:34:22 UTC 2010


Hi Doug,

Split-brain is what is happening to you ;-)

From Wikipedia: http://en.wikipedia.org/wiki/High-availability_cluster
"HA clusters usually use a *heartbeat* private network connection which is
used to monitor the health and status of each node in the cluster. One
subtle, but serious condition every clustering software must be able to
handle is split-brain. Split-brain occurs when all of the private links go
down simultaneously, but the cluster nodes are still running. If that
happens, each node in the cluster may mistakenly decide that every other
node has gone down and attempt to start services that other nodes are still
running. Having duplicate instances of services may cause data corruption on
the shared storage."

The quorum disk is a good way to avoid it, as they have told you.
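
To make the idea concrete, here is a small Python sketch of the vote counting behind it. It is only a simplified model, not the real cman/qdiskd code. I am assuming a two-node setup where each node has one vote and a single vote is enough to be quorate (the usual two-node special case), and then a setup where a quorum disk adds one more vote so that three votes are expected. The function names and the choice of node1 as the quorum disk winner are made up for illustration.

```python
# Illustrative only: a simplified model of majority-vote quorum,
# not the actual cman/qdiskd implementation.

def quorum_threshold(expected_votes):
    """Votes a partition needs to be quorate: a strict majority."""
    return expected_votes // 2 + 1


def is_quorate(votes_seen, expected_votes):
    """True if the votes visible to one partition reach the threshold."""
    return votes_seen >= quorum_threshold(expected_votes)


# Case 1 (assumed): two nodes, one vote each, two-node special case
# where expected_votes is 1. The heartbeat link fails, so each node
# sees only its own vote.
no_qdisk = {
    "node1": is_quorate(1, expected_votes=1),
    "node2": is_quorate(1, expected_votes=1),
}

# Case 2 (assumed): a quorum disk worth 1 vote is added, so
# expected_votes is 3. Only one node can hold the quorum disk vote
# after the link fails; node1 is picked here arbitrarily.
with_qdisk = {
    "node1": is_quorate(1 + 1, expected_votes=3),
    "node2": is_quorate(1, expected_votes=3),
}

print("without quorum disk:", no_qdisk)
print("with quorum disk:   ", with_qdisk)
```

In the first case both nodes still believe they are quorate, so each one may try to fence the other. In the second case only the node holding the quorum disk vote stays quorate, so there is a single winner.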

Good luck,

Greetings,

ESG



2010/2/23 Doug Tucker <tuckerd at lyle.smu.edu>

> > Hi Doug, maybe you can avoid this kind of problem using a quorumdisk
> > partition. A two-node cluster is split-brain prone, and with a quorumdisk
> > partition you can avoid split-brain situations, which are probably causing
> > this behavior.
> >
> > As for using a cross-over (or straight) cable, I don't know of any issue
> > with it, but check whether it's running in full-duplex mode. Half-duplex mode
> > on cross-over linked machines will probably cause heartbeat problems.
> >
> > cya..
>
> Can you give me a little info about "split-brain" issues?  I don't
> understand what you mean by that, and what I'm solving with a
> quorumdisk.  And it has always worked fine, it just started happening
> after the install of this newer kernel/gfs module.  The other 2 node
> cluster is still rock solid.  Also, the network interfaces on the
> crossover are full duplex.  Thanks for writing back, you're the first
> person who offered anything.
>
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>