[Linux-cluster] Re: CS5 two-nodes with quorum disk

Alain Moulle Alain.Moulle at bull.net
Thu Dec 13 13:00:54 UTC 2007


Hi Lon

I've carefully read your last detailed message. I have a better
understanding now, but one thing is still not clear to me:
in my two-node cluster node1/node2, with a quorum disk and without any heuristics,
when I run ifdown on node2's heartbeat Ethernet interface, what is the
mechanism via the quorum disk that ensures it will always
be node1 that fences node2, and never the other way around?
I think everything will be clear to me once I understand this case.
Thanks
Regards
Alain

>> Thanks for your information about vote values with quorumd.
>>
>> Another question about my tests:
>> Now that I have the quorum disk working correctly, I wanted
>> to try this test: ifdown on the heartbeat interface, to simulate
>> a heartbeat network breakdown. I expected the cluster NOT to fail
>> over, because the quorum disk was still available, but in fact
>> after 21s the node where I had downed the interface was fenced
>> despite the quorum disk ...
>>
>> Where is my misunderstanding?


QDisk provides additional votes based on user-defined heuristics (it
can also run with no heuristics at all).  The combination of heuristics
and votes can be used to do the following (a sample configuration
sketch follows the list):

* prevent even-split fence races in the event of a network partition -
one cluster partition can, given well-defined heuristics, decide it is
unfit for cluster participation (and usually remove itself), while the
other remains "fit" and therefore fences the bad partition

* allow a minority partition to become the surviving partition in a
split - similar to the above: given a 4-node cluster, the 3 nodes in a
majority partition could decide that they are *all* unfit for cluster
participation and remove themselves, while the 1-node minority
partition continues to operate

* prevent a partition from becoming quorate after being fenced - on
boot, if a node does not meet its heuristic requirements and a master
node exists in the cluster, it cannot become quorate unless it has
communications with the master qdisk node (optionally, you can have
qdisk stop CMAN in this case)

... and possibly other things, but those are the main ones.
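
For illustration, here's a minimal sketch of a two-node cluster.conf
using a qdisk and one heuristic.  The cluster name, device label, and
ping target are placeholders - adjust them for your environment - and
the fencing configuration is omitted for brevity:

  <cluster name="example" config_version="1">
    <!-- 3 expected votes: 1 per node + 1 from the quorum disk.
         two_node mode is not used; the qdisk is the tie-breaker. -->
    <cman expected_votes="3"/>
    <clusternodes>
      <clusternode name="node1" nodeid="1" votes="1"/>
      <clusternode name="node2" nodeid="2" votes="1"/>
    </clusternodes>
    <!-- qdiskd writes to the shared device every second; a node
         missing 10 cycles (tko) is declared dead -->
    <quorumd interval="1" tko="10" votes="1" label="myqdisk">
      <!-- optional heuristic: a node that can't ping the gateway
           scores 0, becomes "unfit", and removes itself -->
      <heuristic program="ping -c1 -w1 192.168.1.254" score="1"
                 interval="2" tko="3"/>
    </quorumd>
  </cluster>

With three expected votes, quorum requires two, so a single node plus
the qdisk vote (1 + 1 = 2) stays quorate, while a node that loses the
qdisk or fails its heuristic drops to 1 vote and gets fenced.  If I
remember right, the "stop CMAN" behavior from the third bullet is the
stop_cman attribute on <quorumd>, but check qdisk(5) for your version.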

It's not a replacement for cluster communications, nor is it a
replacement for CMAN's membership (in fact, it relies on CMAN's
membership - and fencing - to do its job).

Even if qdiskd told CMAN which nodes were online, much of the internal
network traffic (for example, DLM traffic) could not be pushed through
the disk in a meaningful way, so GFS access would still be blocked.

-- Lon



