[Linux-cluster] Tiebreaker IP Address

Harun harun at mhd.co.om
Wed Jan 23 03:48:49 UTC 2008


Dear Barry,

As you said, fencing is a nice way of saying "make sure the non-responsive
node cannot write anything to our disks, by whatever means necessary".
This usually involves the equivalent of pulling the power plug out of the
non-responsive node.  Why be so harsh?  Why not do a normal shutdown?

So does that mean that in any case of cluster failure (a network failure,
say), the fenced node is always shut down abruptly, or can it be a clean
shutdown? And once a node has been shut down because of a failure, does it
come back up automatically, or does it need to be brought up manually?

Regards,
Harun

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Barry Brimer
Sent: Sunday, January 20, 2008 7:31 PM
To: linux clustering
Subject: Re: [Linux-cluster] Tiebreaker IP Address

> Can anyone explain to me what exactly the tiebreaker IP is and how it
> functions? What is the use of setting the tiebreaker IP to the default
> gateway address?

In clustering, it is important that the cluster nodes are able to
communicate with one another.  It is also important that the cluster nodes
agree on the status of the cluster.  To achieve this, each node uses
various methods to tell the other nodes that it is active and
participating in the cluster.  Quorum is usually defined as "greater than
one half".  In a cluster larger than 2 nodes, if the cluster nodes stop
receiving cluster communications (usually referred to as heartbeat) from a
particular node, they assume that the non-responsive node is not
functioning correctly, and one of the remaining nodes in the cluster will
fence the non-responsive node.

Fencing is a nice way of saying "make sure the non-responsive node cannot
write anything to our disks, by whatever means necessary".  This usually
involves the equivalent of pulling the power plug out of the
non-responsive node.  Why be so harsh?  Why not do a normal shutdown?  If
the non-responsive node has data in buffers that has not been written to
disk, and the other cluster nodes feel that this node is having a problem,
they want to ensure that the non-responsive node cannot write its buffers
out to disk, so that it has no chance of corrupting the data used by the
cluster.

This is all fine, because if you have more than 2 nodes, you should be
able to get agreement by a majority on whether a node is functioning, and
therefore whether the cluster is allowed to operate.  In a two-node
cluster, neither node can form a majority on its own, so we need some
other way to determine which cluster member is healthy and which one
isn't.  If a cluster node were functioning correctly, it would be able to
reach its default gateway.  The tiebreaker IP address is therefore set to
the default gateway, because both machines should be able to reach it if
they are functioning properly.  If one node is able to reach the
tiebreaker IP address and the other isn't, it is assumed that the properly
running node is the one that can reach the default gateway.  That breaks
the tie and allows that node to fence the other node.
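
To make the reasoning above concrete, here is a small Python sketch of the
two rules: the majority test for quorum, and the tiebreaker decision for a
two-node cluster.  It is only an illustration of the logic described in
this message, not the code the cluster software actually runs.  The
function names and the returned strings are made up for the example, and
the heartbeat and tiebreaker reachability results are simply passed in as
booleans.

```python
def has_quorum(nodes_responding: int, total_nodes: int) -> bool:
    """Quorum as described above: strictly more than half of the nodes."""
    return 2 * nodes_responding > total_nodes


def two_node_decision(peer_heartbeat_ok: bool, tiebreaker_reachable: bool) -> str:
    """Hypothetical decision for one node of a two-node cluster."""
    if peer_heartbeat_ok:
        return "run"         # the peer is responding, nothing to decide
    if tiebreaker_reachable:
        return "fence-peer"  # we can reach the gateway, so we win the tie
    return "stand-down"      # we are probably the isolated node


# Two of three nodes form a majority; one of two does not, which is why
# the two-node case needs a tiebreaker.
print(has_quorum(2, 3), has_quorum(1, 2))
print(two_node_decision(peer_heartbeat_ok=False, tiebreaker_reachable=True))
```

Note that if both nodes lose the heartbeat but both can still reach the
tiebreaker address, this simple rule does not pick a winner by itself.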

Barry


