[Linux-cluster] Restarting GFS2 without reboot

Vladimir Melnik v.melnik at uplink.ua
Tue Nov 26 11:13:48 UTC 2013


On Tue, Nov 26, 2013 at 09:59:34AM +0000, Steven Whitehouse wrote:
> Looking at the logs, I see that it looks like recovery has got stuck for
> one of the nodes, since the log is complaining that it has taken a long
> time for kslowd to run.
> So that suggests that the other node is currently fenced, and only one
> node is working anyway. If that is not the case then something has got
> rather confused somehow. What kind of fencing is in use here?

Thank you very much, Steven!

I have to say that neither node looks fenced:

Node  Sts   Inc   Joined               Name
   1   M    364   2013-11-11 07:39:22  ***
   2   M    388   2013-11-26 03:43:01  ***

Or shall I check somewhere else? Sorry if this question is a bit dumb.

> I also noticed that gfs2_quotad was complaining too - that tends to be
> the first thing to complain when it cannot make progress. It is used for
> both statfs and quota, so runs periodically even when quotas are not in
> use. So that is just an indicator that things are slow, and the cause is
> most likely to be elsewhere.
> The other question is also what caused the node to try and fence the
> other one in the first place? That is not immediately clear from the
> logs.

It seems to have been caused by some traffic congestion.

> However you may well have to reboot one or more nodes in order to clear
> this condition, depending on exactly what the problem is.

That's what I'd love to avoid. :) I can't reboot the nodes, so I need to
find out how to restart GFS2 without a reboot. Is that possible at all?

I have several processes in the "D" state on both nodes, so I
understand that I can't simply unmount the stalled filesystem.
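
For reference, processes in the "D" (uninterruptible sleep) state can be
listed by reading /proc/<pid>/stat on each node. Below is a minimal sketch
in Python, assuming a Linux /proc; the function names are purely
illustrative and not part of any cluster tool:

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' instead of on whitespace.
    head, _, tail = line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    return int(pid_text), comm, tail.split()[0]


def list_uninterruptible(proc_root="/proc"):
    """Collect (pid, comm) for every process whose state field is 'D'."""
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited or the entry was unreadable
        if state == "D":
            stuck.append((pid, comm))
    return sorted(stuck)
```

Calling list_uninterruptible() on a node returns the PIDs and command
names of the blocked processes; it only reports them and does nothing to
release them.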

> I did spot a note in the logs about the connection to the storage being
> lost, and that would certainly be enough to cause a problem on whichever
> node lost access. Are you running qdisk on that iSCSI storage? It would
> help if you could post your configuration,

No, the storage itself is not part of the cluster; it's just an
iSCSI target for the 2 nodes. Is that a bad idea?

Thank you.

-- 
V.Melnik



