[Linux-cluster] Restarting GFS2 without reboot
Digimer
lists at alteeve.ca
Tue Nov 26 15:44:39 UTC 2013
On 26/11/13 07:43, Vladimir Melnik wrote:
> On Tue, Nov 26, 2013 at 12:34:35PM +0000, Steven Whitehouse wrote:
>>> I have to admit that fencing hasn't been enabled in this cluster, 90% of
>>> jobs on these 2 nodes are working with other storage that is accessible
>>> by NFS. So it wouldn't be okay to reboot a node due to any problems with
>>> GFS2.
>> In which case, get fencing configured first. Otherwise the first time
>> there is a problem, you risk data corruption. There is a very good
>> reason that fencing is required. It sounds like your overall config
>> needs a bit of a rethink,
>
> Yes, I'm going to move GFS2 on separate cluster which will have fencing,
> because I understand there's a huge risk to corrupt all the data.
>
> But are there any suggestions on how to remount GFS2 now?
>
> Thank you!
It's not just data corruption risk.
As I understand the mechanics (and Steven would know better):
1. Node fails, peer calls fenced.
2. fenced informs DLM, DLM blocks.
3. fenced loops until the fence succeeds.
4. fenced informs DLM, locks held by the now-fenced node are reaped and
   cleanup/recovery begins.
Without fencing, fenced will enter that loop and never leave it, so DLM
stays blocked and your cluster hangs.
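
To make the sequence above concrete, here is a toy model of it in Python.
This is only a sketch of the control flow as I described it, not real
fenced or DLM code: the function names, the event strings and the
max_attempts cap are all made up for illustration (the real daemon keeps
retrying, the cap just keeps the example finite).

```python
from typing import Callable, List


def handle_node_failure(try_fence: Callable[[], bool],
                        max_attempts: int = 5) -> List[str]:
    """Toy model of the recovery sequence; returns the events in order."""
    events = ["peer calls fenced", "fenced informs DLM, DLM blocks"]
    for attempt in range(1, max_attempts + 1):
        if try_fence():
            events.append(f"fence succeeded on attempt {attempt}")
            events.append("fenced informs DLM, locks of fenced node reaped")
            events.append("cleanup/recovery begins")
            return events
        events.append(f"fence attempt {attempt} failed, retrying")
    # Hypothetical cap reached: in this model the cluster is still blocked.
    events.append("still blocked")
    return events


def no_fencing() -> bool:
    """Stand-in for a cluster with no working fence method (always fails)."""
    return False


def working_fencing() -> bool:
    """Stand-in for a fence method that succeeds on the first try."""
    return True


if __name__ == "__main__":
    for line in handle_node_failure(no_fencing):
        print(line)
```

With no_fencing the model never reaches the "locks reaped" step, which is
the blocked state; with working_fencing it goes straight through to
recovery.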
--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?