[Linux-cluster] GFS on 3-node cluster corrupted after full network outage

Robert Peterson rpeterso at redhat.com
Fri Dec 8 17:47:29 UTC 2006


Klaas wrote:
> Hi David;
>
> Well the problem is that we *did* configure fencing,
> but as far as I understand (from the docs, FAQ and
> mailing lists archive),
>
> when 3 nodes cannot communicate anymore,
> they all are inquorate,
> and therefore they will not even commence fencing.
>
> So, GFS will continue to run on all three inquorate
> nodes. This seems to be acknowledged in
> http://www.webservertalk.com/archive391-2006-3-1430956.html
>
> But it would mean GFS corruption, in the end.
> What's wrong?
>
> Please advise,
> Klaas
>   
Hi Klaas,

I don't know what's happening at your shop, but here's what is supposed 
to happen:

http://sources.redhat.com/cluster/faq.html#gfs_fencefreeze

So cluster nodes may continue to operate on files, directories and so
forth under the GFS locks they have already acquired.  Any request for
a new GFS lock should hang until the nodes can communicate again and
quorum is reestablished (unless you've skewed the quorum rules with
high quorum disk vote counts).
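
To make the quorum arithmetic concrete, here is a minimal sketch in
Python.  It assumes a plain majority rule (a node is quorate when it can
see more than half of the expected votes) and one vote per node; the
function names and the 2-vote quorum disk are made up for illustration
and are not taken from the cluster code or from your configuration.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Votes needed for a strict majority of expected_votes (assumed rule)."""
    return expected_votes // 2 + 1


def is_quorate(visible_votes: int, expected_votes: int) -> bool:
    """True when the votes a node can see reach the majority threshold."""
    return visible_votes >= quorum_threshold(expected_votes)


NODE_VOTES = 1   # assumption: every node carries one vote
NODE_COUNT = 3

# Case 1: three nodes, no quorum disk, every node isolated from the others.
expected_plain = NODE_COUNT * NODE_VOTES
isolated_plain = is_quorate(NODE_VOTES, expected_plain)

# Case 2: same cluster plus a hypothetical quorum disk worth 2 votes.
# An isolated node that still counts the disk's votes sees 1 + 2 votes.
QDISK_VOTES = 2
expected_qdisk = NODE_COUNT * NODE_VOTES + QDISK_VOTES
isolated_qdisk = is_quorate(NODE_VOTES + QDISK_VOTES, expected_qdisk)

print("threshold without quorum disk:", quorum_threshold(expected_plain))
print("isolated node quorate without quorum disk:", isolated_plain)
print("threshold with quorum disk:", quorum_threshold(expected_qdisk))
print("isolated node quorate with quorum disk:", isolated_qdisk)
```

Under those assumptions an isolated node in a plain 3-node cluster holds
1 vote out of the 2 it needs, so it is inquorate.  With a heavily
weighted quorum disk the same isolated node can reach the threshold on
its own, which is the kind of skew I mean.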

This assumes you did not specify lock_nolock for your file system.  If
you did, I'd expect total chaos and rampant corruption if the storage
is shared.  If the storage isn't shared, then it doesn't matter.

Regards,

Bob Peterson
Red Hat Cluster Suite



