[Linux-cluster] qdiskd master election and loss of quorum

Lon H. Hohberger lhh at redhat.com
Wed Nov 11 16:49:29 UTC 2009

On Thu, 2009-11-05 at 15:28 +0100, Gianluca Cecchi wrote:

> Nov  5 12:52:53 mork clurgmgrd[2633]: <notice> Member 2 shutting down 
> Nov  5 12:52:57 mork qdiskd[2214]: <info> Node 2 shutdown 

> Nov  5 12:55:41 mork openais[2185]: [TOTEM] The token was lost in the

That's very interesting.  It looks like what caused the state change
failures was the huge lag between the time rgmanager sent its "good bye
kiss" and the time openais noticed the node was offline.  The timeout
was large enough that rgmanager gave up.
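
To put a number on that lag, here is a rough sketch that just subtracts
the timestamps in the log lines quoted above.  The helper names
(parse_stamp, lag_seconds) are made up for illustration, and the year is
an assumption taken from the date of this thread, since syslog
timestamps don't carry one.

```python
from datetime import datetime

# Log lines as quoted in this thread (the last one is truncated in the quote).
LOG_LINES = [
    "Nov  5 12:52:53 mork clurgmgrd[2633]: <notice> Member 2 shutting down",
    "Nov  5 12:52:57 mork qdiskd[2214]: <info> Node 2 shutdown",
    "Nov  5 12:55:41 mork openais[2185]: [TOTEM] The token was lost in the",
]


def parse_stamp(line, year=2009):
    """Parse the 15-character syslog timestamp at the start of a line.

    Assumption: syslog omits the year, so 2009 is supplied from the
    date of this thread.
    """
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def lag_seconds(first, second):
    """Seconds elapsed between two log lines."""
    return (parse_stamp(second) - parse_stamp(first)).total_seconds()


# rgmanager's shutdown notice -> openais noticing the lost token
print(lag_seconds(LOG_LINES[0], LOG_LINES[2]))  # 168.0
```

So going by these three lines alone, nearly three minutes passed between
the shutdown notice and openais reporting the lost token.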

This isn't actually the quorum disk master election problem at all...
It's also very strange.

- rgmanager should have known this was unnecessary.  The other node said
it was going away.
- cman probably should have caused a transition sooner, I think (??)

-- Lon

