[Linux-cluster] Re: GFS2 corruption/withdrawal/crash

Steven Whitehouse swhiteho at redhat.com
Mon Aug 10 14:09:42 UTC 2009


Hi,

On Mon, 2009-08-10 at 10:05 -0400, Bob Peterson wrote:
> ----- "Steven Whitehouse" <swhiteho at redhat.com> wrote:
> | Hi,
> | 
> | On Sat, 2009-08-08 at 19:19 -0400, Wendell Dingus wrote:
> | > Well, I just ran fsck.gfs2 against this filesystem twice with a
> | > 10-minute pause between them. As such:
> | > # fsck -C -t gfs2 -y /dev/mapper/VGIMG0-LVIMG0
> | > 
> | > Output of second run:
> | > fsck 1.39 (29-May-2006)
> | > Initializing fsck
> | > Recovering journals (this may take a while)...
> | > Journal recovery complete.
> | > Validating Resource Group index.
> | > Level 1 RG check.
> | > (level 1 passed)
> | > Starting pass1
> | > Pass1 complete
> | > Starting pass1b
> | > Pass1b complete
> | > Starting pass1c
> | > Pass1c complete
> | > Starting pass2
> | > Pass2 complete
> | > Starting pass3
> | > Pass3 complete
> | > Starting pass4
> | > Pass4 complete
> | > Starting pass5
> | > Unlinked block found at block 37974707 (0x24372b3), left unchanged.
> | > ..snip about 30 total of these..
> | > Unlinked block found at block 96603710 (0x5c20e3e), left unchanged.
> | > Pass5 complete
> | > Writing changes to disk
> | > gfs2_fsck complete
> | > 
> | > When it was done I remounted the filesystem and tried to
> | > "rm -rf /raid1/bad" which is a subdir in the root of this filesystem
> | > that contains the zero-byte file that was the focal point of this
> | > grief to start with.
> | > 
> | > Results:
> | > 
> | That looks like a bug in fsck at least, as it should be dealing with
> | the unlinked blocks that it finds, not ignoring them. Chances are that
> | the block which is causing the issues belongs to one of the unlinked
> | blocks (inodes, I think it should say).
> | 
> | Steve.
> 
> Hi,
> 
> The "Unlinked block found...left unchanged." messages are harmless.
> This merely means that fsck.gfs2 found some blocks that were
> marked as "unlinked metadata" that should be automatically
> reassigned by gfs2's kernel code when needed.  At some point, we 
> made the decision not to fix the bitmaps for various reasons.  I don't
> remember the details, but I remember discussing it anyway.  Lately
> I've been thinking that we made the wrong decision and I should make
> fsck.gfs2 fix them rather than ignore them.
> 
We should not be ignoring them, certainly. Either they should be checked
just like "normal" inodes, or they should be removed by fsck. I don't
think it matters too much which we do, but I rather suspect that they
are not being checked like the other inodes are.
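
For anyone who wants to tally the "Unlinked block found" messages from a
saved fsck log before deciding what to do with them, a rough sketch
follows. It is not part of fsck.gfs2; the function name and the regular
expression are assumptions based only on the message format quoted above,
and it simply collects the reported block numbers after checking that the
decimal and hex values on each line agree.

```python
import re

# Assumed message format, taken from the fsck output quoted in this thread:
#   Unlinked block found at block 37974707 (0x24372b3), left unchanged.
UNLINKED_RE = re.compile(
    r"Unlinked block found at block (\d+) \(0x([0-9a-fA-F]+)\)"
)


def unlinked_blocks(log_text):
    """Return the decimal block numbers reported as unlinked, in log order."""
    blocks = []
    for line in log_text.splitlines():
        match = UNLINKED_RE.search(line)
        if match is None:
            continue  # not an "unlinked block" message
        decimal = int(match.group(1))
        from_hex = int(match.group(2), 16)
        if decimal != from_hex:
            # The two representations should always agree; flag a garbled log.
            raise ValueError("decimal/hex mismatch in line: %r" % line)
        blocks.append(decimal)
    return blocks


sample = """\
Starting pass5
Unlinked block found at block 37974707 (0x24372b3), left unchanged.
Unlinked block found at block 96603710 (0x5c20e3e), left unchanged.
Pass5 complete
"""

print(unlinked_blocks(sample))
```

The resulting list only says which blocks fsck reported; it does not say
whether they are inodes or whether any of them is the one causing the
withdrawal.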

Steve.




