[Linux-cluster] Failed gfs_grow causing corrupt volume

Steffen Plotner swplotner at amherst.edu
Fri Jan 25 14:56:59 UTC 2008


 

> -----Original Message-----
> From: linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Bob Peterson
> Sent: Friday, January 25, 2008 9:28 AM
> To: linux clustering
> Subject: Re: [Linux-cluster] Failed gfs_grow causing corrupt volume
> 
> On Fri, 2008-01-25 at 12:08 +0000, Ben Yarwood wrote:
> > Trying to grow a 15TB file system to 20TB this morning, using RHEL4.4
> > I got an error and the grow failed.  The file system will still mount
> > but when accessed gives the following error and withdraws:
> > 
> > Jan 25 11:32:49 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav.0: fatal: invalid metadata block
> > Jan 25 11:32:49 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav.0:   bh = 465407847 (type: exp=4, found=3)
> > Jan 25 11:32:49 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav.0:   function = gfs_get_meta_buffer
> > Jan 25 11:32:49 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav.0:   file = /builddir/build/BUILD/gfs-kernel-2.6.9-75/smp/src/gfs/dio.c, line = 1223
> > Jan 25 11:32:49 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav.0:   time = 1201260769
> > Jan 25 11:32:49 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav.0: about to withdraw from the cluster
> > Jan 25 11:32:49 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav.0: waiting for outstanding I/O
> > Jan 25 11:32:49 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav.0: telling LM to withdraw
> > Jan 25 11:32:50 jrmedia-c kernel: lock_dlm: withdraw abandoned memory
> > Jan 25 11:32:50 jrmedia-c kernel: GFS: fsid=alpha_cluster:wav.0: withdrawn
> 
> Hi Ben,
> 
> It sounds like you found a bug in gfs_grow.  It should probably have
> cleaned up after itself when it failed.  Can you tell me more about
> the gfs_grow error and possibly open a bugzilla record for it?
> Nobody else has reported a problem like this to my knowledge.
> 
> Unfortunately, as far as your file system is concerned, there is not
> much that can be done.  I tried to put a lot of smarts into gfs_fsck
> to repair weird and damaged RG conditions (thus the 3 levels of RG
> repair).  Unfortunately, gfs_grow throws the normal ("mkfs") rules out
> and can put file system metadata in places that gfs_fsck can't
> reasonably predict.
> 
> (I did my best to remedy that with gfs2 (gfs2_grow) but we can't
> change the on-disk format of gfs1, so we can't change it.)
> 
> Regards,
> 
> Bob Peterson
> Red Hat GFS

My question would be: if Ben had run gfs_fsck before attempting to
grow the GFS file system, what would it have reported? Would that have
prevented the gfs_grow failure?
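
To make the question concrete, here is a rough sketch of the order of
operations I have in mind, written as a small Python helper that only
builds the command lines and does not run anything. The device path and
mount point are made-up placeholders, and I am assuming that gfs_fsck -n
(answer "no" to every prompt, change nothing) and gfs_grow -T (test mode,
do the calculations without writing) behave as their man pages describe.
Whether a clean check would actually have avoided the failure is exactly
what I am asking.

```python
import shlex


def precheck_plan(device, mount_point):
    """Return, in order, the commands for a read-only check and a dry-run grow.

    Nothing is executed here; the caller decides what to do with the plan.
    """
    return [
        # gfs_fsck needs the file system unmounted on every node in the
        # cluster; this umount only covers the local node.
        ["umount", mount_point],
        # -n: answer "no" to all questions, so the check modifies nothing.
        ["gfs_fsck", "-n", device],
        # gfs_grow operates on a mounted file system, so mount it again.
        ["mount", "-t", "gfs", device, mount_point],
        # -T: test mode, perform the calculations but write nothing to disk.
        ["gfs_grow", "-T", mount_point],
    ]


if __name__ == "__main__":
    # Hypothetical device and mount point, for illustration only.
    for cmd in precheck_plan("/dev/vg_example/lv_wav", "/mnt/wav"):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The real gfs_grow would only follow if both the check and the test run
came back clean.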

Steffen



