[Linux-cluster] GFS + DRBD Problems

gordan at bobich.net
Mon Mar 3 16:22:51 UTC 2008


On Mon, 3 Mar 2008, Lon Hohberger wrote:

>> I have a 2-node cluster with Open Shared Root on GFS on DRBD.
>
> Last week, I saw a car with a license plate from 'Wyoming'.  Now,
> someone's running GFS on shared root DRBD.  My world's turning upside
> down.

LOL! We live in interesting times. :)
And anyway, what's wrong with GFS shared root on DRBD? :)

>> A single
>> node mounts GFS OK and works, but after a while seems to just block for
>> disk. Very much as if it started trying to fence the other node and is
>> waiting for acknowledgement.
>
> If CMAN was trying to fence, you'd see it in /var/log/messages.  I'm not
> sure about DRBD.

I can't see any evidence of that, and I'd expect to see something on the 
console about it, too. I'll set up a remote syslog to double-check.
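
For the double-check itself, something along these lines could be run
over the collected syslog. It is only a minimal sketch: the assumption is
that fence-related messages contain the word "fence", "fenced" or
"fencing", which I have not verified against what CMAN actually logs.
The function name is made up for illustration.

```python
import re

# Assumption: fence-related syslog lines mention "fence", "fenced" or
# "fencing" somewhere in the message text.
FENCE_PATTERN = re.compile(r"\bfenc(e|ed|ing)\b", re.IGNORECASE)


def find_fence_lines(lines):
    """Return the lines (without trailing newline) that look fence-related."""
    return [line.rstrip("\n") for line in lines if FENCE_PATTERN.search(line)]
```

To use it, pass an open file, e.g.
find_fence_lines(open("/var/log/messages", errors="replace")), and print
whatever comes back. An empty list would support the view that no fencing
was attempted, at least as far as syslog is concerned.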

>> There are no fence devices defined (so this
>> could be a possibility),
>
> Unlikely.  Even if this was the cause, you'd still see it (and you could
> work around it).
>
>
>> Unfortunately, it doesn't end there. When an attempt is made to dual-mount
>> the GFS file system before the secondary is fully up to date (but is
>> connected and syncing), the 2nd node to join notices an inconsistency, and
>> withdraws from the cluster. In the process, GFS gets corrupted, and the
>> only way to get it to mount again on either node is to repair it with
>> fsck.
>
> Off the top of my head, this sounds like a DRBD thing.  If sync's
> completed, it works, right?

Not quite - it works only in the sense that the mount succeeds without 
the file system being flagged as inconsistent (presumably because the 
data is no longer changing underneath it). But the FS still gets corrupted.
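
To at least rule out the mid-sync case, the second mount could be gated
on DRBD's own state. A rough sketch, assuming the status lines in
/proc/drbd carry "cs:" (connection state) and "ds:local/peer" (disk
state) fields as they do in DRBD 8.x; the function name is hypothetical.

```python
import re

# Assumption: each resource line in /proc/drbd looks roughly like
#   " 0: cs:Connected st:Primary/Primary ds:UpToDate/UpToDate C r---"
# Only the cs: and ds: fields are used here.
STATUS_RE = re.compile(
    r"cs:(?P<cs>\S+).*?ds:(?P<local>[^/\s]+)/(?P<peer>\S+)"
)


def safe_to_dual_mount(proc_drbd_text):
    """True only if every resource is Connected with both disks UpToDate."""
    states = [m.groupdict() for m in STATUS_RE.finditer(proc_drbd_text)]
    if not states:
        # No recognisable resource lines: refuse rather than guess.
        return False
    return all(
        s["cs"] == "Connected"
        and s["local"] == "UpToDate"
        and s["peer"] == "UpToDate"
        for s in states
    )
```

Feed it the contents of /proc/drbd and only attempt the second mount when
it returns True. This only guards against mounting while the sync is
still running; it does nothing about the corruption that shows up after
the sync has completed.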

I cannot be sure right now, but I have a suspicion that both machines 
might be trying to mount the FS with the same journal. I could be 
mis-remembering and/or mis-interpreting what mount output says when it's 
connecting, though. I'll check it via the remote console in a bit and 
paste the output from each node.

Gordan



