[Linux-cluster] errors with GFS2 and DRBD. Please help..

Bob Peterson rpeterso at redhat.com
Wed Mar 17 13:48:17 UTC 2010


----- "Koustubha Kale" <koustubha_kale at yahoo.com> wrote:
| >Using this new version of fsck.gfs2 I was able to fix a file system
| >restored from the metadata you sent me.
| 
| Hi Bob,
| I am curious as to how this works. The restoremeta option pretty much
| says it will destroy the data on the device. So how did you go about
| restoring the file system with the metadata I sent?
| 
| With warm regards
| Koustubha Kale

Hi Koustubha,

The restoremeta option restores the metadata from a file into a device,
without regard to where things are on disk. Only gfs2 metadata is
restored, so when I restore the metadata on my test system, the files
will all appear to be there in the same locations as your original file
system, but the contents of those files will obviously be trash since
no data is saved; the files will contain whatever happens to be on the
device I'm restoring it to.
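
As a rough illustration only, here is a toy Python model of that
behaviour. It is not how gfs2_edit is implemented: the dict-based
"device", the savemeta()/restoremeta() helpers, and the file name are
all hypothetical stand-ins chosen for the sketch.

```python
# Toy model: a "device" is a dict mapping block number -> (kind, payload).
# Hypothetical helpers for illustration; not the real gfs2_edit logic.

def savemeta(device):
    """Keep only the metadata blocks; data blocks are skipped."""
    return {blk: val for blk, val in device.items() if val[0] == "meta"}

def restoremeta(saved, device):
    """Write saved metadata back to the same block numbers on the target."""
    device.update(saved)

# Original file system: a dinode at 0x1000 pointing to data at 0x1001.
original = {
    0x1000: ("meta", {"name": "report.txt", "data_block": 0x1001}),
    0x1001: ("data", "real file contents"),
}

# A test device that already holds unrelated bytes at block 0x1001.
test_device = {0x1001: ("data", "leftover bytes")}

restoremeta(savemeta(original), test_device)

# The file name and layout survive, but the contents are whatever
# was already on the target device.
dinode = test_device[0x1000][1]
contents = test_device[dinode["data_block"]][1]
print(dinode["name"], "->", contents)
```

The file shows up under its original name and block layout, but reading
it returns whatever was already on the target device.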

So the file names, directories and internal gfs2 data structures are
preserved, but the data blocks are ignored.  If the file system has not
changed AT ALL since the savemeta was done, the restoremeta will
restore the metadata, leaving all the data blocks in their same
locations.  So for users, savemeta/restoremeta is a convenient way
to make a backup of your metadata only, so if fsck.gfs2 makes a fatal
mistake and destroys something, you can immediately do restoremeta and
_sometimes_ get the file system back to its near-original condition
before the fsck.gfs2.  That's not 100% guaranteed.  Take the following
scenario:

1. User does savemeta
2. User runs fsck.gfs2
3. fsck.gfs2 decides a file is damaged and needs to be deleted.
   The file's dinode is at block 0x1000 and points to a data block
   at 0x1001.  Both blocks are marked free.
4. Later, fsck.gfs2 finds a file that is orphaned by a damaged directory.
   As a result, fsck.gfs2 creates a "lost+found" directory by allocating
   some free blocks.  Guess which free blocks it uses?  It may (or may
   not) use the blocks it freed earlier in step 3, 0x1000 and 0x1001.
5. fsck.gfs2 does something stupid and destroys a whole directory
   because of some rare bug.
6. User restores the metadata with restoremeta.

After this sequence of events, the file deleted in step 3 will look
restored, since the dinode block 0x1000 was saved and restored.
However, if block 0x1001 was also used, that file's contents will now
look remarkably like a lost+found directory block.  See what I mean?
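
The sequence can be sketched with a toy Python model. This is only an
illustration under simplifying assumptions: the dict-based "device",
the savemeta()/restoremeta() helpers, and the payload strings are
hypothetical, and step 4 is hard-coded to the unlucky case where
fsck.gfs2 reuses both freed blocks.

```python
# Toy model: a "device" is a dict mapping block number -> (kind, payload).
# Hypothetical helpers for illustration; not the real gfs2 tools.

def savemeta(device):
    """Keep only the metadata blocks; data blocks are skipped."""
    return {blk: val for blk, val in device.items() if val[0] == "meta"}

def restoremeta(saved, device):
    """Write saved metadata back to the same block numbers."""
    device.update(saved)

device = {
    0x1000: ("meta", {"name": "damaged_file", "data_block": 0x1001}),
    0x1001: ("data", "original file contents"),
}

saved = savemeta(device)          # step 1: only block 0x1000 is saved

# Steps 2-3: fsck deletes the file, freeing the dinode and its data block.
del device[0x1000]
del device[0x1001]

# Step 4 (assumed unlucky case): lost+found is built from the freed blocks.
device[0x1000] = ("meta", {"name": "lost+found", "data_block": 0x1001})
device[0x1001] = ("meta", "lost+found directory entries")

restoremeta(saved, device)        # step 6: the old dinode comes back

dinode = device[0x1000][1]
contents = device[dinode["data_block"]][1]
print(dinode["name"], "->", contents)
```

Block 0x1000 holds the old dinode again, but block 0x1001 was a data
block when savemeta ran, so it was never saved and still holds the
lost+found directory block.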

Now granted, if fsck.gfs2 decided to delete the file in step 3, it's
most likely due to irreparable damage, so restoring the metadata won't
fix the file in either case.

So you can't really trust the file contents after a restoremeta unless
nothing but the metadata has changed.  Since savemeta doesn't save
data blocks, any changes to the file system (including by fsck.gfs2)
will most likely result in damage to the files.

So restoremeta helps me solve gfs2 fsck issues because I can exactly
simulate the failing conditions.  But it can't be relied upon for
much else.

This is why backups are so important.

Regards,

Bob Peterson
Red Hat File Systems



