[Linux-cluster] GFS on 3-node cluster corrupted after full network outage

Klaas klaas at klaas.nl
Fri Dec 8 15:49:09 UTC 2006


Hi David,

Well, the problem is that we *did* configure fencing.
But as far as I understand (from the docs, the FAQ and
the mailing list archives), when the 3 nodes can no longer
communicate with each other, each of them is inquorate,
and therefore none of them will even commence fencing.
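
To make the arithmetic concrete, here is a minimal sketch in Python of
the quorum reasoning as I understand it. It assumes a simple-majority
rule with one vote per node; the function names are made up for
illustration and are not part of the cluster software.

```python
# Illustrative only: assumes quorum is a simple majority of the
# expected votes and that every node carries the same number of votes.

def quorum_threshold(expected_votes: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    return expected_votes // 2 + 1


def is_quorate(visible_votes: int, expected_votes: int) -> bool:
    """A partition is quorate if the votes it can see reach the threshold."""
    return visible_votes >= quorum_threshold(expected_votes)


def partition_report(partitions, votes_per_node=1):
    """Return (partition, quorate?) for each group of nodes that can
    still talk to each other after a network split."""
    expected = sum(len(p) for p in partitions) * votes_per_node
    return [
        (p, is_quorate(len(p) * votes_per_node, expected))
        for p in partitions
    ]
```

With three nodes the threshold is 2 votes, so
partition_report([["node1"], ["node2"], ["node3"]]) marks every
partition as inquorate, while a 2+1 split leaves the pair quorate
and able to fence the third node.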

So, GFS will continue to run on all three inquorate
nodes. This seems to be acknowledged in
http://www.webservertalk.com/archive391-2006-3-1430956.html

But in the end, that would mean GFS corruption.
What's wrong?

Please advise,
Klaas



> On 12/7/06, Klaas <klaas at klaas.nl> wrote:
>> Hi;
>>
>> We have Cluster software running because we need GFS.
>>
>> Lately, we had a 100% network outage and all three
>> nodes kept on running (of course inquorate)
>> and the GFS stayed available too.
>>
>> No fencing (I understand this is documented behaviour) but
>> this way our GFS will get corrupted if the rest of the
>> software on the nodes keeps accessing the GFS...
>>
>> Please advise,
>> Klaas
>>
>
> http://sources.redhat.com/cluster/faq.html#fence_what
>
> From the FAQ:
>
> #  Can't I just use my own watchdog or manual fencing?
>
> No. Fencing is absolutely required in all production environments.
> That's right. We do not support people using only watchdog timers
> anymore.
>
> Manual fencing is absolutely not supported in any production
> environment, ever, under any circumstances.
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>