[Linux-cluster] Workings of Tiebreaker IP (RHCS)
rodgersr at yahoo.com
Sun Sep 24 00:20:57 UTC 2006
I pulled a message from 2005 about tiebreakers. I have some questions about it, and it does not seem to agree with what I see clumanager do.
>> To completely understand what the role of a tiebreaker IP within a two
>> or four node RHCS cluster is, I've searched redhat and Google. I can't
>> however find anything describing the precise workings of the
>> tiebreaker-IP. I would really like to know what happens exactly when
>> the tiebreaker is used and how (maybe even some kind of flow diagram).
>> Can anyone here maybe explain that to me, or point me in the direction
>> of more specific information regarding tiebreaker?
>The tiebreaker IP address is used as an additional vote in the event
>that half the nodes become unreachable or dead in a 2 or 4 node cluster.
>The IP address must reside on the same network as is used for cluster
>communication. To be a little more specific, if your cluster is using
>eth0 for communication, your IP address used for a tiebreaker must be
>reachable only via eth0 (otherwise, you will end up with a split brain).
>When enabled, the nodes ping the given IP address at regular intervals.
>When the IP address is not reachable, the tiebreaker is considered
>"dead". When it is reachable, it is considered "alive".
>It acts as an additional vote (like an extra cluster member), except for
>one key difference: Unless the default configuration is overridden, the
How does this work? Does the node trying to become the active node access the tiebreaker and put a lock on it? How does it reserve it?
Just pinging it would not prevent the other node from doing the same.
>IP tiebreaker may not be used to *form* a quorum where one did not exist.
>So, if one node of a two node cluster is online, it will never become
>quorate unless the other node comes online (or administrator override,
>see man pages for "cluforce" and "cludb").
>So, in a 2 node cluster, if one node fails and the other node is online
>(and the tiebreaker is still "alive" according to that node), the
>remaining node considers itself quorate and "shoots" (aka STONITHs, aka
>fences) the dead node and takes over services.
>If a network partition occurs such that both nodes see the tiebreaker
>but not each other, the first one to fence the other will naturally win.
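The quorum rule described in the quoted text above can be sketched roughly as follows. This is only an illustration of the description, not clumanager's actual code; the function name and every parameter (including `tiebreaker_can_form_quorum`, which stands in for the "default configuration is overridden" case) are hypothetical.

```python
def is_quorate(total_nodes, reachable_nodes, tiebreaker_alive,
               was_quorate, tiebreaker_can_form_quorum=False):
    """Return True if this node's partition should consider itself quorate.

    reachable_nodes counts this node plus every member it can still see.
    All names here are hypothetical; they model the quoted description only.
    """
    majority = total_nodes // 2 + 1

    # A true majority of members is quorate without any tiebreaker.
    if reachable_nodes >= majority:
        return True

    # Exactly half the members are visible: the tiebreaker acts as an
    # extra vote, but by default only to *keep* a quorum that already
    # existed, never to form a new one.
    if reachable_nodes * 2 == total_nodes and tiebreaker_alive:
        return was_quorate or tiebreaker_can_form_quorum

    return False


# Two-node cluster, peer just failed, tiebreaker still answers pings:
print(is_quorate(2, 1, tiebreaker_alive=True, was_quorate=True))
# Two-node cluster, single node booting alone, no prior quorum:
print(is_quorate(2, 1, tiebreaker_alive=True, was_quorate=False))
```

Note that nothing in this rule locks or reserves the tiebreaker. In a partition where both nodes can ping it, both evaluate as quorate, and per the quoted text the outcome is decided by whichever node fences the other first.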
>Ok, moving on...
>The disk tiebreaker works in a similar way, except that it lets the
>cluster limp along in a safe, semi-split-brain state during a
>network outage. What I mean is that because there's state information
>written to/read from the shared raw partitions, the nodes can actually
>tell via other means whether the other node is "alive", as
>opposed to relying solely on the network traffic.
>Both nodes update state information on the shared partitions. When one
>node detects that the other node has not updated its information for a
>period of time, that node is "down" according to the disk subsystem. If
>this coincides with a "down" status from the membership daemon, the node
>is fenced and services are failed over. If the node never goes down
>(and keeps updating its information on the shared partitions), then the
I do not use an IP tiebreaker. I have a two-node system. When the active node shows as down via membership but up via disk,
Clumanager determines that it is in an uncertain state and shoots it.
>node is never fenced and services never fail over.
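For comparison, the disk tiebreaker rule as the quoted text describes it might look like the sketch below. It models the quoted description only, not clumanager's real logic, and the behaviour reported above for a two-node system (membership down, disk up, node still shot) does not match it. The function name, parameters, and timeout values are all hypothetical.

```python
def should_fence(membership_down, last_disk_update, now, disk_timeout):
    """Decide whether to fence the peer, per the quoted description.

    membership_down  -- True if the membership daemon reports the peer down
    last_disk_update -- time (seconds) of the peer's last shared-partition update
    now              -- current time (seconds)
    disk_timeout     -- hypothetical staleness limit for the disk heartbeat
    """
    # The peer is "down" on disk once its state has gone stale.
    disk_down = (now - last_disk_update) > disk_timeout

    # Fence only when both the network and the disk views agree.
    return membership_down and disk_down


# Peer silent on the network and stale on disk: fence it.
print(should_fence(True, last_disk_update=0.0, now=20.0, disk_timeout=10.0))
# Peer silent on the network but still updating the shared partition:
# under the quoted description it is left alone.
print(should_fence(True, last_disk_update=15.0, now=20.0, disk_timeout=10.0))
```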