[Linux-cluster] GFS+DRBD+Quorum: Help wrap my brain around this

Mon Nov 22 21:21:50 UTC 2010

On Sun, 21 Nov 2010 21:46:03 +0000, Colin Simpson wrote:

> I suppose what I'm saying is that there is no real way to get a quorum
> disk with DRBD. And basically it doesn't really gain you anything
> without actual shared storage.

I understand that.  That's why I'm looking for an "external" solution
(i.e. a separate iSCSI volume on a third machine) to act as a quorum
disk, effectively making that third machine a quorum server.

But I'm not clear how important this is.  I think the problem is that,
while I have some familiarity with clustering, I have less with DRBD.  I
don't understand how DRBD handles the matter of quorum given only two
potential voters.

[...]
> The scenario is well mitigated by DRBD on two nodes already without
> this. The system will not, if you config properly,  start DRBD (and all
> the cluster storage stuff after, presuming your start up files are in
> the right order) until it sees the second node. 

So if one node fails, the mirror is broken but storage is still 
available?  But if both nodes go down, storage only becomes available 
again once both nodes are up?  I've missed this in the documentation, I'm 
afraid.

[...]
> The situation of two nodes coming up when the out of date one comes up
> first should never arise if you give it sufficient time to see the other
> node (it will always pick the new good one's data), you can make it wait
> forever and then require manual intervention if you prefer (should a
> node be down for an extended period). 

Waiting forever for the second node seems a little strict to me, though I 
suppose if the second node is the node with the most up-to-date data then 
this is the proper thing to do.  But waiting forever for the node that 
has outdated information seems inefficient, though I see it is caused by 
the fact that DRBD has no way to know which node is more up-to-date.

Am I understanding that correctly?
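
To check my own understanding, here is a toy Python model of the startup
wait as I read your description.  It is only a sketch, not DRBD code: the
function name, the parameter names and the three result strings are all
made up for illustration, and None stands for "never" or "wait forever".

```python
from typing import Optional


def startup_decision(peer_seen_after: Optional[float],
                     wait_limit: Optional[float]) -> str:
    """Toy model of one node's choice at boot (hypothetical, not DRBD's logic).

    peer_seen_after: seconds until the peer shows up, or None if it never does.
    wait_limit: seconds this node is willing to wait, or None to wait forever.
    """
    # Peer appears inside the wait window: the nodes can compare state and
    # resynchronise from whichever one holds the newer data.
    if peer_seen_after is not None and (
            wait_limit is None or peer_seen_after <= wait_limit):
        return "connect-and-sync"
    # Waiting forever and the peer never shows up: someone has to step in.
    if wait_limit is None:
        return "blocked-until-manual-intervention"
    # A finite wait expired: this node carries on alone, possibly with
    # outdated data, which is the case that worries me.
    return "start-alone"


if __name__ == "__main__":
    for peer, limit in [(30.0, 120.0), (500.0, 120.0), (500.0, None), (None, None)]:
        print(peer, limit, startup_decision(peer, limit))
```

If that model is right, the only risky outcome is "start-alone" on the
node holding the older data, and a longer (or infinite) wait trades
availability for avoiding it.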

> For me a couple of minutes waiting
> for the other node is sufficient if it was degraded already, maybe a bit
> longer if the DRBD was sync'd before they went down.

I'm afraid I'm not clear what you mean by this.  Isn't the fact that each 
node cannot know the state of the other the problem?  So how can wait 
times be varied as you describe?

> I can send you config's I believe are correct from the Linbit docs of
> using DRBD Primary/Primary with GFS, if you like.

Something more than
http://www.drbd.org/users-guide/s-gfs-create-resource.html ?
That would be welcome.

> 
> But I'm told (from a thread I posted at DRBD) that this should always
> work. 

This is something I'm realizing: I need to ask some of my questions on
that list rather than here, since right now they are mostly about that
layer.

Thanks...
	- Andrew