[Linux-cluster] Quorum Disk on 2 nodes out of 4?

Lon Hohberger lhh at redhat.com
Thu Nov 19 19:16:01 UTC 2009


On Wed, 2009-11-18 at 11:08 +0000, Karl Podesta wrote:
> On Wed, Nov 18, 2009 at 06:32:25AM +0100, Fabio M. Di Nitto wrote:
> > > Apologies if a similar question has been asked in the past, any inputs, 
> > > thoughts, or pointers welcome. 
> > 
> > Ideally you would find a way to plug the storage into the 2 nodes that
> > do not have it now, and then run qdisk on top.
> > 
> > At that point you can also benefit from "global" failover of the
> > applications across all the nodes.
> > 
> > Fabio
> 
> Thanks for the reply and pointers; indeed, attaching all 4 nodes to storage
> with qdisk sounds best... I believe in the particular scenario above,
> 2 of the nodes don't have any HBA cards / attachment to storage. Maybe
> an IP tiebreaker would have to be introduced if storage connections could
> not be obtained and the cluster were to split in two.
> 
> I wonder how common that type of quorum disk setup would be these days;
> I gather most would use GFS in this scenario with 4 nodes, eliminating
> the need for any specific failover of an ext3 disk mount etc., and merely
> failing over the services across all cluster nodes instead.

We don't have an IP tiebreaker in the traditional sense.

I wrote a demo IP tiebreaker which works for 2-node clusters, but it
does not work in 4-node clusters: the demo application does no
coordination between the nodes in a partition about whether the others
can "see" the tiebreaker.

You can use a tweaked version of Carl's weighted voting scheme to
sustain 2-node failures half of the time in a 4-node cluster:

node# 1 2 3 4
votes 1 3 5 4

Total votes = 13
Quorum = 7 (floor(13/2) + 1)

Any 1 node can fail:

Nodes 1 2 3 = 9 votes
Nodes 2 3 4 = 12 votes
Nodes 1 3 4 = 10 votes
Nodes 1 2 4 = 8 votes

Half of the time, 2 nodes can fail and the cluster keeps quorum (e.g. if
you were worried about a random partition between 2 racks):

Nodes 2 3 = 8 votes
Nodes 3 4 = 9 votes
Nodes 2 4 = 7 votes

Obviously, in the other half of the possible 2-node failure
combinations, losing those nodes means loss of quorum:

Nodes 1 2 = 4 votes -> NO QUORUM
Nodes 1 3 = 6 votes -> NO QUORUM
Nodes 1 4 = 5 votes -> NO QUORUM

If you do this, put your critical applications on nodes 1 and 2.  In the
event that both fail, nodes 3 and 4 (9 votes) can pick up the load
without losing quorum.  Well, in theory ;)
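
For reference, here is a minimal sketch of how that weighting might look
in cluster.conf, assuming standard cman behaviour (quorum is computed as
floor(expected_votes/2) + 1); the cluster and node names are placeholders,
and the fencing/resource sections are omitted:

<?xml version="1.0"?>
<cluster name="mycluster" config_version="1">
  <!-- expected_votes is the sum of the per-node votes: 1 + 3 + 5 + 4 = 13 -->
  <cman expected_votes="13"/>
  <clusternodes>
    <clusternode name="node1.example.com" nodeid="1" votes="1"/>
    <clusternode name="node2.example.com" nodeid="2" votes="3"/>
    <clusternode name="node3.example.com" nodeid="3" votes="5"/>
    <clusternode name="node4.example.com" nodeid="4" votes="4"/>
  </clusternodes>
  <!-- fencedevices and rm (resource manager) sections omitted -->
</cluster>

With these values cman should compute quorum as 7 votes, matching the
arithmetic above.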

-- Lon



