[Linux-cluster] GFS+DRBD+Quorum: Help wrap my brain around this

Tue Nov 30 09:22:22 UTC 2010

On Mon, 29 Nov 2010 21:40:42 +0000 (UTC), A. Gideon wrote
> On Fri, 26 Nov 2010 15:04:40 +0000, Colin Simpson wrote:
> 
> >> but when I break the DRBD connection between two primary nodes,
> >> "disconnected" apparently means that the nodes both continue as if
> >> they've UpToDate disks.  But this lets the data go out of sync.  Isn't
> >> this a Bad Thing?
> > 
> > Yup, that could be an issue. However, you should never be in a situation
> > where you break the connection between the two nodes. This needs to be
> > heavily mitigated; I'm planning to bond two interfaces on two different
> > cards so this doesn't happen (or, I should say, is highly unlikely).
> 
> Since I'll be a person tasked with cleaning up from this situation, and 
> given that I've no idea how to achieve that cleanup once writes are 
> occurring on both sides independently, I think I'll want something more 
> than "highly unlikely".  That's rather the point of these tools, isn't it?
> 
> [...]
> > 2/ The node goes down totally so DRBD loses comms. But as all the comms
> > are down the other node will notice and Cluster Suite will fence the bad
> > node. Remember that GFS will suspend all operations (on all nodes) until
> > the bad node is fenced.
> 
> Does it make sense to have Cluster Suite do this fencing, or should DRBD 
> do it?  I'm thinking that DRBD's resource-and-stonith gets me pretty 
> close.
> 
> > I plan to further help the first situation by having my cluster comms
> > share the same bond with the DRBD. So if the comms fail, Cluster Suite
> > should notice, and the DRBD devices on both nodes shouldn't change, as
> > GFS will have suspended operations. Assuming the fence devices are
> > reachable, one of the nodes should fence the other (it might be a bit
> > of a shoot-out situation) and then GFS should resume on the remaining node.
> 
> This "shoot out situation" (race condition) is part of my worry.  A third 
> voter of any form eliminates this, in that it can arbitrate the matter of 
> which of the two nodes in a lost-comm situation should be "outdated" and 
> fenced.
> 
> And if the third voter can solve the "wait forever on startup", so much 
> the better.
> 
> I'm looking at how to solve this all at the DRBD layer.  But I'm also 
> interested in a more Cluster-Suite-centric solution.  I could use a 
> quorum disk, but a third node would also be useful.  I haven't figured 
> out, though, how to run clvmd with the shared storage available on only 
> two of three cluster nodes.  Is there a way to do this?
> 
> 	- Andrew
> 

Reading your thoughts, I guess you didn't get (or read) my earlier email to
this thread, which covers pretty much the same ground. It can be found here:
https://www.redhat.com/archives/linux-cluster/2010-November/msg00136.html
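
To illustrate the "third voter" point quoted above, here is a minimal sketch
of strict-majority quorum arithmetic. It is not Cluster Suite or DRBD code;
the function name, the node names and the one-vote-per-member weighting are
all assumptions made up for the example.

```python
def quorate_group(votes, partition):
    """Return the members of the one group that holds a strict majority
    of all votes after a network split, or an empty set if none does.

    votes:     dict mapping member name -> number of votes (hypothetical names)
    partition: list of sets, each set being members that can still talk
    """
    total = sum(votes.values())
    needed = total // 2 + 1  # strict majority
    for group in partition:
        if sum(votes[member] for member in group) >= needed:
            return set(group)
    return set()  # nobody has quorum, so nobody may fence or write


# Two nodes, one vote each: after a split neither side has 2 of 2 votes.
two_node = {"node1": 1, "node2": 1}
print(sorted(quorate_group(two_node, [{"node1"}, {"node2"}])))

# Add a third voter (a third node or a quorum disk, modelled as "arbiter").
# Whichever data node still reaches the arbiter holds 2 of 3 votes and wins.
with_arbiter = {"node1": 1, "node2": 1, "arbiter": 1}
print(sorted(quorate_group(with_arbiter, [{"node1", "arbiter"}, {"node2"}])))
```

With only two voters and a strict majority, a split leaves no side quorate;
letting each side stay quorate on its own is what produces the fencing race.
With three voters, at most one side of any split can hold the majority, so
the side that loses it is the one to be outdated and fenced.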

