[Linux-cluster] GFS on 3-node cluster corrupted after full network outage
Robert Peterson
rpeterso at redhat.com
Fri Dec 8 17:47:29 UTC 2006
Klaas wrote:
> Hi David;
>
> Well the problem is that we *did* configure fencing,
> but as far as I understand (from the docs, FAQ and
> mailing lists archive),
>
> when 3 nodes cannot communicate anymore,
> they all are inquorate,
> and therefore they will not even commence fencing.
>
> So, GFS will continue to run on all three inquorate
> nodes. This seems to be acknowledged in
> http://www.webservertalk.com/archive391-2006-3-1430956.html
>
> But it would mean GFS corruption, in the end.
> What's wrong?
>
> Please advise,
> Klaas
>
Hi Klaas,
I don't know what's happening at your shop, but here's what is supposed
to happen:
http://sources.redhat.com/cluster/faq.html#gfs_fencefreeze
So cluster nodes may continue to operate on files, directories and so forth
under GFS locks they have already acquired. All new GFS lock requests should
hang until the nodes can communicate again and quorum is reestablished (unless
you've skewed the quorum rules with high quorum disk vote counts).
This assumes you did not specify lock_nolock for your file system (if you did,
I'd expect total chaos and rampant corruption if the storage is shared). If the
storage isn't shared, then it doesn't matter.
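
To make the quorum arithmetic concrete, here is a rough sketch in Python. It
assumes the usual simple-majority rule (quorum = expected_votes / 2 + 1, integer
division) and one vote per node. The function names and the 3-vote quorum disk
are made up for illustration and are not taken from your configuration.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Votes needed for quorum under a simple-majority rule (assumed)."""
    return expected_votes // 2 + 1


def is_quorate(visible_votes: int, expected_votes: int) -> bool:
    """True if the votes a partition can see reach the threshold."""
    return visible_votes >= quorum_threshold(expected_votes)


# Three nodes, one vote each (assumed): two votes are needed for quorum.
nodes_only = 3
print(quorum_threshold(nodes_only))   # 2
print(is_quorate(1, nodes_only))      # False: an isolated node is inquorate
print(is_quorate(2, nodes_only))      # True: two nodes that see each other

# Hypothetical quorum disk carrying 3 votes: expected votes become 6.
with_qdisk = 3 + 3
print(quorum_threshold(with_qdisk))   # 4
print(is_quorate(1 + 3, with_qdisk))  # True: one node plus the disk is quorate
```

The last case is what I mean by skewing the rules: with enough quorum disk
votes, a single node that still holds the disk's votes can stay quorate on its
own.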
Regards,
Bob Peterson
Red Hat Cluster Suite