[Linux-cluster] GFS2 corruption/withdrawal/crash

Tue Jul 28 10:15:40 UTC 2009

Hi,

On Mon, 2009-07-27 at 13:23 -0400, Wendell Dingus wrote:
> This recently happened to us and I'm wondering if there's anything else we can do to prevent it and/or more fully recover from it.
> 
> 3 physical nodes, 64-bit, 2.6.18-128.2.1.el5xen, 3 GFS2 filesystems mounted by each. 
> 
> A zero-byte file in a subdir about 3 levels deep that when touched in any way causes total meltdown. Details below...
> 
I'd be very interested to know about the circumstances which led up to
this file getting into that state. Was it always a zero byte file, or
has it been truncated at some stage from some larger size? Was there a
prior fs crash at some time, or has the fs been otherwise reliable since
mkfs time?

Anything that you can tell us about the history of this file would be
very interesting to know.

> We took the filesystem offline (all nodes) and ran gfs2_fsck against it. The FS is 6.2TB in size, living on a 2Gb/sec Fibre Channel array. It took 9 hours to complete, which was not as bad as I had feared it would be. Afterwards we remounted the filesystem and tried to remove that zero-byte file, and the same thing happened again. So it appears gfs2_fsck did not fix it. Since there are 3 GFS filesystems, the bad part was that access to all 3 went away when one of them had an issue, because GFS itself appears to have crashed. That's the part I don't understand, and I'm pretty sure it was not what should have happened.
> 
> After a full reboot we renamed the directory holding the "bad" zero-byte file to a directory in the root of that GFS filesystem and are simply avoiding it at this point. 
> 
> Thanks...
> 
> Description from a co-worker on what he found from researching this:
> 
> While hitting something on the filesystem, it runs into an invalid metadata block, realizes the error, and attempts to take the FS offline because it's bad (so as not to risk additional corruption).
> 
> Jul 22 04:11:48 srvname kernel: GFS2: fsid=cluname:raid1.1: fatal: invalid metadata block 
> Jul 22 04:11:48 srvname kernel: GFS2: fsid=cluname:raid1.1: bh = 1633350398 (magic number) 
> Jul 22 04:11:48 srvname kernel: GFS2: fsid=cluname:raid1.1: function = gfs2_meta_indirect_buffer, file = /builddir/build/BUILD/gfs2-kmod-1.92/_kmod_build_xen/meta_io.c, line = 334
> Jul 22 04:11:48 srvname kernel: GFS2: fsid=cluname:raid1.1: about to withdraw this file system 
> Jul 22 04:11:48 srvname kernel: GFS2: fsid=cluname:raid1.1: telling LM to withdraw 
> Jul 22 04:11:57 srvname kernel: GFS2: fsid=cluname:raid1.1: withdrawn 
> Jul 22 04:11:57 srvname kernel: 
> 
> Unfortunately... For some reason, when it completes the withdrawal process, gfs crashes... I'm sure it's not supposed to do that... It should continue allowing access to all of the other GFS filesystems, but since the gfs module is dying, it kills access to all gfs filesystems.
> 
Ideally it wouldn't crash. In reality there are cases where what we need
to do in order to recover from an error gracefully cannot be done in the
context in which the error has occurred. The context in this case
usually means the locks which are being held at the time. There is some
ongoing work to try and improve on this, particularly with respect to corrupt
on-disk structures. In some cases we can now just return -EIO to the
user and carry on rather than withdrawing from the cluster.

The interesting thing in this case is that if the file is zero length,
it shouldn't have any indirect blocks at all, so it looks like the inode
height might have become corrupt. If you are able to save the metadata
from this fs, then that is something which we would find very helpful to
have a look at.
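
To illustrate why a zero length file should have no indirect blocks, here
is a rough Python sketch of how the minimum metadata tree height follows
from the file size. The header sizes and the 4096 byte block size are
assumptions about the on-disk format, not values read from this fs, and
the function names are made up for the example. A real inode can
legitimately have a greater height than the minimum for its size (for
instance a small file that has been unstuffed), so this is only an
illustration of the idea, not the kernel's logic.

```python
# Sketch only: the sizes below are assumptions about the GFS2 on-disk
# format, not values taken from the filesystem in this report.
DINODE_SIZE = 232       # assumed size of the on-disk dinode header
META_HEADER_SIZE = 24   # assumed size of the header on an indirect block
POINTER_SIZE = 8        # block pointers are 64-bit


def height_limits(block_size=4096, max_height=10):
    """Largest file size representable at each metadata tree height."""
    diptrs = (block_size - DINODE_SIZE) // POINTER_SIZE       # pointers in the dinode
    inptrs = (block_size - META_HEADER_SIZE) // POINTER_SIZE  # pointers per indirect block
    limits = [block_size - DINODE_SIZE]   # height 0: data stuffed in the dinode block
    limits.append(block_size * diptrs)    # height 1: dinode points directly at data
    while len(limits) < max_height:
        limits.append(limits[-1] * inptrs)  # each extra level multiplies the fan-out
    return limits


def min_height(file_size, block_size=4096):
    """Smallest height whose limit can hold file_size bytes."""
    for height, limit in enumerate(height_limits(block_size)):
        if file_size <= limit:
            return height
    raise ValueError("file size exceeds the heights modelled here")


def zero_length_with_tree(file_size, height):
    """True for the suspicious case: an empty file that claims a metadata tree."""
    return file_size == 0 and height > 0
```

With these assumed sizes, min_height(0) is 0, so an empty file needs no
pointer blocks at all. An inode for which zero_length_with_tree() is true
is the kind of inconsistency that would send the read path looking for an
indirect block that was never valid.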

Steve.