[Linux-cluster] CS5 / quorum disk and heuristics

Lon Hohberger lhh at redhat.com
Mon Jun 9 21:35:54 UTC 2008


On Mon, 2008-06-09 at 14:23 +0200, Alain Moulle wrote:

> My last understanding was that the quorum disk was NOT a redundancy of the
> heart-beat, meaning that if the heart-beat interface fails, there is a
> failover, but it is always the node with the expected min_score in the
> quorum disk which fences the other.

Qdiskd can never tell CMAN or openais that a computer is a member of the
cluster, but it can remove nodes from the cluster.

> So I thought that the quorum disk check was operational only if the node
> detects a problem on the heart-beat interface ... but when I took down the
> interface on the third machine, after a few seconds both nodes node1/node2
> were killed !!!

Think of the heuristics as asking the question:

  "Am I fit to participate in the cluster?"

If the answer is "yes" and suddenly changes to "no", the node removes
itself.
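
As a rough illustration of that question, here is a minimal Python sketch
of the scoring logic. It is not qdiskd's actual code. The Heuristic class,
the function names, and the single "ping-third-machine" heuristic are all
hypothetical; the only detail taken from qdisk(5) is the default min_score
of floor((n+1)/2), where n is the sum of the heuristic scores.

```python
from dataclasses import dataclass


@dataclass
class Heuristic:
    name: str      # hypothetical label, for illustration only
    score: int     # points contributed while the check passes
    passing: bool  # result of the most recent check


def default_min_score(heuristics):
    # Default per qdisk(5): floor((n + 1) / 2), n = sum of all scores.
    total = sum(h.score for h in heuristics)
    return (total + 1) // 2


def is_fit(heuristics, min_score=None):
    # "Am I fit to participate in the cluster?"
    if min_score is None:
        min_score = default_min_score(heuristics)
    current = sum(h.score for h in heuristics if h.passing)
    return current >= min_score


def should_remove_self(was_fit, now_fit):
    # Only a "yes" -> "no" transition makes the node remove itself.
    return was_fit and not now_fit


# Assumed setup: one heuristic that pings the third machine.
checks = [Heuristic("ping-third-machine", score=1, passing=True)]
before = is_fit(checks)
checks[0].passing = False  # the third machine's interface goes down
after = is_fit(checks)
```

If both nodes run the same heuristic against the same target, both see the
same "yes" to "no" transition at about the same time, which would be
consistent with both of them going down together.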

> Whereas the heart-beat interface was working fine.

You can disable this self-removal behavior by setting allow_kill="0"
and/or reboot="0" in the quorum disk configuration (see qdisk(5)).


> And after reboot, I can see "cluster not quorate" etc.

Does this happen after both nodes boot, or just one?  If both nodes boot up
with the third node off, they should still be able to form a quorum by
themselves, even if qdiskd isn't running or its score isn't sufficient.
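
The arithmetic behind that can be sketched as follows. The vote counts are
assumptions, since the cluster.conf is not shown in this thread: one vote
per machine and three expected votes in total. The function names are
hypothetical, and the threshold used is the usual majority rule of
expected_votes / 2 + 1 with integer division.

```python
def quorum_threshold(expected_votes):
    # Majority rule: more than half of the expected votes.
    return expected_votes // 2 + 1


def is_quorate(present_votes, expected_votes):
    return present_votes >= quorum_threshold(expected_votes)


EXPECTED_VOTES = 3  # assumed: node1, node2 and the third node, 1 vote each

both_nodes_up = is_quorate(2, EXPECTED_VOTES)  # node1 + node2 booted
one_node_up = is_quorate(1, EXPECTED_VOTES)    # only one of them booted
```

Under those assumptions two nodes reach the threshold of 2 and one node
does not, which is why it matters whether both nodes or just one had
booted. With a different vote layout the result can differ.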

-- Lon



