[Linux-cluster] Qdisk question
brem.belguebli at gmail.com
Thu Aug 13 09:48:01 UTC 2009
I have a question about the qdisk concept.
My understanding of qdisk is that it is used as a tie-breaker, but it looks
like it is more of a heartbeat vector than a simple tie-breaker.
My setup consists of 4 nodes located on 2 different production sites (2+2)
using SAN shared storage (2 disk frames, 1 per site).
The qdisk is an iSCSI shared LUN from a third site that I expected to use as
a tie-breaker in case one of my 2 prod sites experienced network problems
and became completely isolated.
Up to this point there is no real problem: if a site gets cut off from the
other prod site and also from the third site (hosting the iSCSI target for
the qdisk), the 2 nodes from the failing site get evicted from the cluster.
But what if my third site gets isolated while the 2 prod ones are fine?
The real question is: what happens if all the nodes lose access to the
qdisk while they're still able to see each other?
The 4 nodes have each 1 vote and the qdisk 1 vote. The expected quorum is 3.
When the cluster is running with all of its nodes and the qdisk, the number
of votes is 5.
If I lose the qdisk, the number of votes falls to 4. The cluster is still
quorate (4 > 3), but it looks like everything goes bad: each node deactivates
itself because it can't write its alive status (--> heartbeat vector) to the
qdisk, even though the network heartbeating is working fine.
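
To make the vote arithmetic above explicit, here is a minimal sketch in
Python. It assumes quorum is a simple majority of the expected votes
(floor(expected / 2) + 1), which matches the figures I gave (5 expected
votes, quorum of 3). The function names and scenario labels are made up for
illustration, and this only counts votes: it does not model what qdiskd
itself does when the device becomes unreachable.

```python
# Vote arithmetic only. Assumption: quorum is a simple majority of the
# expected votes, which matches the numbers in this post (5 votes -> 3).
# All names here are hypothetical and not part of any cluster tool.

def quorum_threshold(expected_votes: int) -> int:
    """Smallest number of votes that is more than half of expected_votes."""
    return expected_votes // 2 + 1


def is_quorate(votes_present: int, expected_votes: int) -> bool:
    """True if the votes currently present reach the quorum threshold."""
    return votes_present >= quorum_threshold(expected_votes)


NODES = 4          # 2 nodes per production site
NODE_VOTES = 1     # each node has 1 vote
QDISK_VOTES = 1    # the qdisk has 1 vote
EXPECTED = NODES * NODE_VOTES + QDISK_VOTES  # 5

scenarios = {
    "all 4 nodes + qdisk": 4 * NODE_VOTES + QDISK_VOTES,
    "all 4 nodes, qdisk lost": 4 * NODE_VOTES,
    "one site (2 nodes) + qdisk": 2 * NODE_VOTES + QDISK_VOTES,
    "one site (2 nodes), no qdisk": 2 * NODE_VOTES,
}

for name, votes in scenarios.items():
    state = "quorate" if is_quorate(votes, EXPECTED) else "NOT quorate"
    print(f"{name}: {votes}/{EXPECTED} votes, "
          f"quorum {quorum_threshold(EXPECTED)} -> {state}")
```

By the numbers alone, the "all 4 nodes, qdisk lost" case stays quorate,
which is why the behaviour I am seeing surprises me.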
I have tried to configure heuristics (pinging a node on the third site)
without a qdisk device, but they seem to be ignored.
Any comments or tips?
PS: added the [Linux-cluster] flag :-)