[Linux-cluster] GFS volume hangs on 3 nodes after gfs_grow

Alan A alan.zg at gmail.com
Thu Sep 25 20:11:54 UTC 2008


Hello Bob, and thanks for the reply.

I am using RHEL5 on a 3-node cluster - node2, node3 and node4. Node2 is also
the luci box.
Now that I look at the GFS package versions, I do see some differences. We
are not running the Xen kernel.

The gfs_grow command was run from cluster node4. Here is what I found on each
node - kernel version followed by rpm -qa | grep gfs (a one-liner for
collecting this from all three nodes at once is sketched after the listings):

Node4:
2.6.18-92.1.10.el5 #1 SMP Wed Jul 23 03:55:54 EDT 2008 i686 i686 i386
GNU/Linux

gfs2-utils-0.1.44-1.el5
gfs-utils-0.1.17-1.el5
kmod-gfs-0.1.23-5.el5
kmod-gfs2-1.92-1.1.el5
kmod-gfs2-xen-1.92-1.1.el5
kmod-gfs-xen-0.1.23-5.e

Node 3:
2.6.18-92.1.10.el5 #1 SMP Wed Jul 23 03:55:54 EDT 2008 i686 i686 i386
GNU/Linux

gfs2-utils-0.1.44-1.el5_2.1
gfs-utils-0.1.17-1.el5
kmod-gfs-0.1.23-5.el5
*kmod-gfs-0.1.23-5.el5_2.2*
kmod-gfs2-1.92-1.1.el5
*kmod-gfs2-1.92-1.1.el5_2.2*
kmod-gfs2-xen-1.92-1.1.el5
kmod-gfs2-xen-1.92-1.1.el5_2.2
kmod-gfs-xen-0.1.23-5.el5
kmod-gfs-xen-0.1.23-5.el5_2.2

Node2 - luci node:
2.6.18-92.1.10.el5 #1 SMP Wed Jul 23 03:55:54 EDT 2008 i686 i686 i386
GNU/Linux

gfs2-utils-0.1.44-1.el5_2.1
kmod-gfs-0.1.23-5.el5_2.2
gfs-utils-0.1.17-1.el5
kmod-gfs-0.1.23-5.el5
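
For what it's worth, a quick way to pull the kernel and GFS package versions
from all three nodes in one shot might be something like this (assuming
passwordless ssh and that the hostnames node2, node3 and node4 resolve; just
a sketch, not something I actually ran):

    for n in node2 node3 node4; do
        echo "== $n =="
        ssh "$n" 'uname -r; rpm -qa | grep -i gfs | sort'
    done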

Let me know if you need any additional information. What would be the
suggested path to recovery? I tried gfs_fsck, but I get:
Initializing fsck
Unable to open device: /lvm_test2
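
Looking at that error again, I think I passed gfs_fsck the mount point rather
than the underlying block device, and the filesystem would also need to be
unmounted first. Once I can get the volume unmounted, I would try something
along these lines (the volume group and LV names below are just placeholders,
not my real device path, and I have not run this yet):

    umount /lvm_test2
    # read-only pass first; drop -n to actually repair
    gfs_fsck -n /dev/my_vg/lvm_test2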


On Thu, Sep 25, 2008 at 2:46 PM, Bob Peterson <rpeterso at redhat.com> wrote:

> ----- "Alan A" <alan.zg at gmail.com> wrote:
> | Hi all!
> |
> | I have a 3-node test cluster utilizing SCSI fencing and GFS. I made
> | two GFS logical volumes - lvm1 and lvm2, both utilizing 5GB on 10GB
> | disks. Testing the command-line tools, I did lvextend -L +1G
> | /devicename to bring lvm2 to 6GB. This went fine without any problems.
> | Then I issued the command gfs_grow /mountpoint and the volume became
> | inaccessible. Any command trying to access the volume hangs, and
> | umount returns: /sbin/umount.gfs: /lvm2: device is busy.
> |
> | A few questions - Since I have two volumes on this cluster and lvm1
> | works just fine, are there any suggestions for unmounting lvm2 in
> | order to try and fix it?
> | Is gfs_grow bug free or not (use/do not use)?
> | Is there any other way besides restarting the cluster/nodes to get
> | lvm2 back in an operational state?
> | --
> | Alan A.
>
> Hi Alan,
>
> Did you check in dmesg for kernel messages relating to the hang?
>
> I have seen some bugs in gfs_grow, and there are some fixes that
> haven't made it out to all users yet, but you did not tell us which
> version of the software you're using.  You didn't even say whether
> this is RHEL4/CentOS4 or RHEL5/CentOS5 or another distro.
>
> I'm not aware of any bugs in the most recent gfs_grow that appears
> in the cluster git repository.  These gfs_grow fixes will trickle
> out to various releases if you're not compiling from the source code,
> so you may or may not have the fixed code.
>
> If your software is not recent, it's likely that an interrupted or
> hung gfs_grow will end up corrupting the GFS file system.  There is
> a new, improved version of gfs_fsck that can repair the damage, but
> again, you need a recent version of the software.
>
> Regards,
>
> Bob Peterson
> Red Hat Clustering & GFS
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



-- 
Alan A.