[Linux-cluster] GFS+DRBD+Quorum: Help wrap my brain around this
ag8817282 at gideon.org
Thu Nov 25 16:39:19 UTC 2010
On Tue, 23 Nov 2010 12:28:41 +0000, Colin Simpson wrote:
> Since the third node is ignorant about the status of DRBD, I don't
> really see what help it gives for it to decide on quorum.
I've just read through the "Best Practice with DRBD RHCS and GFS2" thread
on the drbd-users list, and I'm still missing what seems to me to be a
fundamental point.
First: It seems like you no longer (since 8.3.8) need to have GFS startup
await the DRBD sync operation. That's good, but is this because DRBD
does the proper thing with I/O requests during a sync? That's what I
think is so, but then I don't understand why you'd have an issue with 8.2. Or
am I missing something?
But the real issue for me is quorum/consensus. I noted:
wfc-timeout 0;        # Wait forever for the initial connection
degr-wfc-timeout 60;  # Wait only 60 seconds if this node
                      # was part of a degraded cluster
but when I break the DRBD connection between two primary nodes,
"disconnected" apparently means that the nodes both continue as if
they have UpToDate disks. But this lets the data go out of sync. Isn't
this a Bad Thing?
Clearly, if there were some third party (i.e., a quorum disk or a third
node), this could be resolved. But these don't seem to be required in
the DRBD world, so how is this situation resolved?
DRBD supports fencing, so perhaps that is the answer? I'm reluctant to
make use of the cluster's own fencing because, as described in the thread
you referenced, Cluster Suite starts after DRBD.
I'm thinking of trying a fencing policy of resource-and-stonith where the
handler tries to take a shared semaphore (i.e., connect to a port on a
third server that accepts only a single connection at a time, or perhaps
even just take a lock on a file mounted via NFS from a third server). If
it raises the semaphore/gets the lock, it fences the DRBD peer. If it
doesn't, it either waits forever or marks itself as outdated.
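As a sketch of the NFS-file variant of that semaphore, the handler's core
decision could look like the following. This is a minimal illustration,
not DRBD-specific; the lock-file path is made up, and note that flock()
semantics over NFS depend on the kernel (fcntl byte-range locks are the
traditionally safer choice there):

```python
import fcntl
import os

def try_acquire_lock(path):
    """Try to take an exclusive, non-blocking lock on a shared file.

    Returns the open fd if we got the lock (the lock is held as long
    as the fd stays open), or None if another node already holds it.
    """
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return fd
    except OSError:
        os.close(fd)
        return None

# Decision a fence-peer handler could then apply:
#   got the lock  -> fence (outdate) the DRBD peer
#   lock refused  -> outdate ourselves, or wait and retry
```

A real handler would map those two outcomes onto the fence-peer exit
codes documented in the drbd.conf manpage, e.g. reporting the peer as
outdated on success versus running `drbdadm outdate` locally on failure.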
This may also work to solve the startup "wait forever" problem, in that
the starting node in WaitForConnect that takes the shared lock first gets
to come up while the other is blocked. I'm not yet sure how to implement
this from DRBD's perspective, though. I'm not clear that there's a
handler that's called if DRBD starts and cannot establish an initial
connection.
That I've found no mention of this idea leaves me suspicious that it
won't work or that it's overkill. Yet I cannot see why. It follows the
same model of quorum as the cluster software.