Fsck takes too long on multiply-claimed blocks
Andreas Dilger
adilger at sun.com
Tue Feb 17 20:47:35 UTC 2009
On Feb 12, 2009 09:19 -0500, Theodore Ts'o wrote:
> On Thu, Feb 12, 2009 at 10:54:40AM +0100, Vegard Svanberg wrote:
> > After a power failure, a ~500G filesystem crashed. Fsck has been running
> > for days. The problem seems to be multiply-claimed blocks. Example:
> >
> > File /directory/file.name/foo (inode #1234567, mod time Tue Feb
> > 10 08:14:40 2008)
> > has 1800000 multiply-claimed block(s), shared with 1 file(s):
> >
> > /directory/file.name/bar
> > (inode #1234567, mod time Wed Dec 1 15:30:00 2008)
> > Clone multiply-claimed blocks? y
> >
> > This takes like forever, probably due to the large number of
> > multiply-claimed blocks.
>
> You are using a version of e2fsprogs/e2fsck newer than 1.28, right?
> If not, there's your problem; upgrade to something newer. Older
> e2fsck's had O(n**2) algorithms that made this very slow, causing this
> pass to be CPU-bound. It could be slow because of memory pressure
> issues; the data structures for keeping track of all of those blocks
> aren't small.
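To illustrate the complexity point in the quoted message: comparing every file's block list against every other file's is quadratic, while a single pass that records the claimants of each block in a hash table is roughly linear in the number of block references. The sketch below is not e2fsck's code; the function names and the `inode_blocks` input format are assumptions made for illustration, and it assumes each inode lists a given block at most once.

```python
from collections import defaultdict


def find_multiply_claimed(inode_blocks):
    """Return {block: [inodes]} for blocks claimed by more than one inode.

    inode_blocks is a hypothetical input: a dict mapping an inode number
    to the list of block numbers that inode claims.
    """
    owners = defaultdict(list)
    # One pass over every block reference: O(total references).
    for ino, blocks in inode_blocks.items():
        for blk in blocks:
            owners[blk].append(ino)
    # Keep only the blocks that have two or more claimants.
    return {blk: inos for blk, inos in owners.items() if len(inos) > 1}


def shared_counts(inode_blocks):
    """Return {inode: number of its blocks that are multiply claimed}."""
    counts = defaultdict(int)
    for inos in find_multiply_claimed(inode_blocks).values():
        for ino in inos:
            counts[ino] += 1
    return dict(counts)
```

Note that the `owners` table holds one entry per claimed block, which is the kind of structure that can cause the memory pressure mentioned above on a large filesystem.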
The "inode badness" patch in the Lustre e2fsprogs handles this case
reasonably well. It automatically marks one or both of these inodes as
"fatally corrupted" and deletes them. That does not happen when only a
handful of blocks are shared, so files are not deleted in cases such as
simple bit flips.
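As a rough sketch of the threshold idea only (this is not the patch's actual scoring logic, which is not described here): an inode whose count of multiply-claimed blocks reaches some cutoff is treated as too damaged to repair, while an inode sharing only a few blocks is left for the normal clone step. The function name, the action labels, and the cutoff value of 1000 are all assumptions made for illustration.

```python
def classify_inodes(shared_counts, threshold=1000):
    """Decide a hypothetical action for each inode with shared blocks.

    shared_counts: dict mapping inode number -> number of its blocks
    that are multiply claimed.
    threshold: invented cutoff; the real patch's criteria may differ.
    """
    actions = {}
    for ino, count in shared_counts.items():
        # Many shared blocks: treat the inode as fatally corrupted.
        # Few shared blocks: clone them, as e2fsck normally offers to.
        actions[ino] = "clear" if count >= threshold else "clone"
    return actions
```

With the numbers from the original report, an inode with 1800000 shared blocks would be cleared under this sketch, while one with three shared blocks would still be cloned.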
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.