[Linux-cluster] Qdisk question
lhh at redhat.com
Thu Aug 13 14:39:46 UTC 2009
On Thu, 2009-08-13 at 00:45 +0200, brem belguebli wrote:
> My understanding of qdisk is that it is used as a tie-breaker, but it
> looks like it is more of a heartbeat vector than a simple tie-breaker.
Right, it's a secondary membership algorithm.
> So far, no real problem: if one site gets cut off from the other
> prod site and also from the third site (hosting the iSCSI target
> qdisk), the 2 nodes at the failing site get evicted from the cluster.
> But what if my third site gets isolated while the 2 prod ones are
> fine?
Qdisk votes will not be presented to CMAN any more, but the two sites
should remain online if they still have a "majority" of votes.
> The real question is what happens in case all the nodes lose access
> to the qdisk while they're still able to see each other?
Qdisk is just a vote like other voting mechanisms. If all nodes lose
access at the same time, it should behave like a node death. However,
the default action if _one_ node loses access is to kill that node (even
if CMAN still sees it).
> The 4 nodes have each 1 vote and the qdisk 1 vote. The expected quorum
> is 3.
> If I lose the qdisk, the number of votes falls to 4, the cluster is
> quorate (4>3) but it looks like everything goes bad: each node
> deactivates itself as it can't write its alive status (--> heartbeat
> vector) to the qdisk, even if the network heartbeating is working.
What happens specifically? Most of the actions qdiskd performs are
configurable. For example, if the nodes are rebooting, you can turn
that behavior off.
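
For reference, the vote arithmetic in the scenario quoted above can be
sketched as follows. This is only an illustration, assuming the usual
majority rule (quorum = expected_votes / 2 + 1, rounded down); the
function names are made up for the example and are not part of CMAN or
qdiskd. It does not model qdiskd's own per-node actions (eviction or
reboot), only whether the remaining votes are enough for quorum.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Votes needed for quorum under a simple majority rule (assumed)."""
    return expected_votes // 2 + 1


def is_quorate(present_votes: int, expected_votes: int) -> bool:
    """True if the votes currently present meet the quorum threshold."""
    return present_votes >= quorum_threshold(expected_votes)


# Scenario from the thread: 4 nodes with 1 vote each, qdisk with 1 vote.
node_votes = [1, 1, 1, 1]
qdisk_votes = 1
expected = sum(node_votes) + qdisk_votes  # 5 expected votes in total

print("quorum threshold:", quorum_threshold(expected))
print("all nodes up, qdisk lost:", is_quorate(sum(node_votes), expected))
print("one site (2 nodes) alone, no qdisk:", is_quorate(2, expected))
```

With these numbers the threshold is 3, so losing only the qdisk vote
leaves 4 votes and the cluster stays quorate on paper, while a single
site of 2 nodes without the qdisk does not.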
I wrote a simple 'ping' tiebreaker based on the behaviors in RHEL3. It
functions in many ways in the same manner as qdiskd with respect to vote
advertisement to CMAN, but without needing a disk - maybe you would find