[Linux-cluster] failure during gfs2_grow caused node crash & data loss

bergman at merctech.com bergman at merctech.com
Mon Mar 22 17:38:37 UTC 2010



In the message dated: Mon, 22 Mar 2010 13:18:31 EDT,
The pithy ruminations from Bob Peterson on 
<Re: [Linux-cluster] failure during gfs2_grow caused node crash & data loss> were:
=> ----- bergman at merctech.com wrote:
=> (snip)
=> | Do you mean "di_size"?
=> 
=> Yes.
=>  
=> | => to be a fairly small multiple of 96 then repeat steps 1 through 4.
=> | 
=> | According to "gfs2_edit -p rindex", the initial value of di_size is:
=> | 
=> | 	di_size               8192                0x2000
=> | 
=> | Does that give any indication of an appropriate "fairly small
=> | multiple"?
=> | 
=> | Thanks,
=> | 
=> | Mark
=> 
=> Hi Mark,
=> 
=> The big question is: Was this file system created with mkfs.gfs2
=> originally?  Or was it created with gfs_mkfs (gfs1) and converted to

Nope.

=> gfs2 by gfs2_convert?  If it was created by gfs_mkfs and converted

Yes.

=> then there's not much hope of recovering because fsck.gfs2 isn't
=> currently smart enough to handle oddly-spaced rgrps left behind by gfs1.

Ouch.

=> 
=> Here's the problem: fsck.gfs2 seems to be claiming that there are six
=> rgrps intact, each of which is around 1GB.  Since the file system was
=> originally much bigger, I'd think there would be more.  Each of the
=> rindex entries is 96 bytes, so you could try 6*96 = 576, or in hex 0x240.
=> So basically you could try setting di_size to 0x240 with gfs2_edit, then
=> mount and run gfs2_grow.  Then unmount and run fsck.gfs2.
=> 
=> As I said, if the file system was originally gfs1, this won't work.
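The arithmetic in the suggestion above can be checked with a short sketch. It assumes only what the quoted text states: each rindex entry is 96 bytes and six rgrps appear intact. The helper names are hypothetical and exist only for illustration; nothing here reads or modifies the filesystem.

```python
# Size in bytes of one rindex entry, as stated in the quoted message.
RINDEX_ENTRY_SIZE = 96


def rindex_size(num_rgrps):
    """Return the di_size that covers exactly num_rgrps rindex entries."""
    return num_rgrps * RINDEX_ENTRY_SIZE


def whole_entries(di_size):
    """Return (complete entries, leftover bytes) for a given di_size."""
    return divmod(di_size, RINDEX_ENTRY_SIZE)


size = rindex_size(6)
print(size, hex(size))        # 576 0x240
print(whole_entries(8192))    # (85, 32)
```

By the same arithmetic, the di_size of 8192 (0x2000) reported by "gfs2_edit -p rindex" is not a whole multiple of 96: it covers 85 complete entries with 32 bytes left over.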


The filesystem was originally gfs1.

I converted it to gfs2. fsck.gfs2 then ran successfully, but the node crashed
when I tried to grow the gfs2 filesystem.
 

Questions:

	1. Is there any way to tell (e.g. with "gfs2_edit -p" or "lvs") whether
		a gfs2 volume was originally gfs1? I've got several more
		gfs2 volumes, some of which may have been gfs1 originally.

	2. Is there any safe way to run gfs2_grow on a gfs2 volume that was
		born as gfs1?


=> 
=> If the file system was gfs2 from its conception, hopefully gfs2_grow
=> will rewrite those damaged rgrps starting with the damaged one, and
=> then fsck.gfs2 will take care of finding out what is allocated and not
=> allocated and fix the bitmaps.  If the file system was full before
=> gfs2_grow, you could lose a lot of data.  It's a long shot, really,
=> but I guess you've got nothing more to lose.

OK. I'll proceed with the restore from backups.

Thanks,

Mark

=> 
=> Regards,
=> 
=> Bob Peterson
=> Red Hat File Systems
=> 





