[Linux-cluster] Fwd: GFS volume hangs on 3 nodes after gfs_grow

Fri Sep 26 17:44:39 UTC 2008

----- "Alan A" <alan.zg at gmail.com> wrote:
| Thanks again, Bob.
| 
| No kernel panic on any of the nodes. I had to cold boot all 3 nodes in
| order to get the cluster going (it might have been a fencing issue, but
| I am not 100% sure, since we use only SCSI fencing until we agree on a
| secondary fencing method). What is 'scary' is that the gfs_grow command
| paralyzed that volume on all 3 nodes, and I couldn't access it, unmount
| it, or run gfs_fsck from any of the nodes. We will do more testing on
| this. By the way, do you have a suggested "safe" method of growing and
| shrinking the volume other than what is noted in the 5.2 documentation
| (we followed the RHEL manual)? If the GFS volume hangs, what is the
| best way to try to unmount it from the node? Would 'gfs_freeze' have
| helped?

Hi Alan,

No, gfs_freeze won't help.  In these cases, I think it's probably
best to reboot the node that caused the problem, either with
/sbin/reboot -fin or by throwing the power switch.  I suspect that
clvmd status hung because of the earlier problem.

I'm not aware of any problems in your version of gfs_grow that can
cause this kind of lockup.  It's designed to be run seamlessly while
other processes are using the file system, and that's the kind of
thing we test regularly.

If you figure out how to recreate the lockup, let me know so I
can try it out.  Of course, if this is a production cluster, I
would not take it out of production for a long time to try this.
But if I can recreate the problem here, I'll file a bugzilla
record and get it fixed.

Regards,

Bob Peterson
Red Hat Clustering & GFS