ext3 fs errors 3T fs

Andreas Dilger adilger at clusterfs.com
Fri Jan 27 01:03:04 UTC 2006


On Jan 23, 2006  09:09 -0800, Dennis Williams wrote:
> I was able to isolate the problem to 2 different directories repeatedly.
> Both of them were in the lost+found directory.  I ran "stat {path}" in
> debugfs on them but did not see any info that stood out as abnormal.
> When I get access to the system again, I will repost the output.

What would be of interest are the block numbers of the lost+found dir
and of all the files therein, in particular anything with a block
number > 250M (around the 2TB boundary, i.e. 4 billion 512-byte sectors).
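
As a rough aid for picking out suspicious block numbers, the boundary can
be computed from the filesystem block size. This is only a minimal sketch:
it assumes 512-byte sectors and a boundary at exactly 2^41 bytes, the
function names are made up for illustration, and the real block size should
be taken from the filesystem itself (e.g. as reported by "dumpe2fs -h").

```python
SECTOR_SIZE = 512          # assumption: 512-byte sectors
BOUNDARY_BYTES = 2 ** 41   # assumption: boundary at exactly 2 TiB


def boundary_sector():
    """First sector number that lies at or beyond the 2 TiB boundary."""
    return BOUNDARY_BYTES // SECTOR_SIZE


def boundary_block(block_size):
    """First filesystem block number that lies at or beyond 2 TiB."""
    return BOUNDARY_BYTES // block_size


def is_beyond_boundary(block_number, block_size):
    """True if the given filesystem block starts at or past 2 TiB."""
    return block_number >= boundary_block(block_size)
```

Any block number from "stat" in debugfs can then be passed to
is_beyond_boundary() together with the block size to see which side of
the boundary it falls on.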

> > I think debugging it would be easiest if you had a backup and were
> > willing to overwrite the device with a test pattern.
> 
> I would like to debug this situation when I get backup storage.  What
> steps would you recommend to do this?

If possible, it would be desirable to isolate the exact operation that
is causing the corruption.  Since we are fairly sure it is corrupting
the beginning of the filesystem (which likely aliases to just beyond
the 2TB device boundary) we could do a test like the following:

- do a backup of the first, say, 128kB of the device with dd
- read 50MB of data at 2TB offset
- compare this data with the start of the device - it should probably
  not be the same
- write the same 50MB of data back out beyond 2TB
- verify that the first 128kB of data in the device did not change
- do some operation on _one_ file in the lost+found
- verify that the first 128kB of data does not change
- run e2fsck
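
The read/compare/rewrite steps above could be scripted roughly as follows.
This is only a sketch under stated assumptions: the helper names and the
default sizes are illustrative, the boundary is assumed to be at exactly
2^41 bytes, and the path passed in would be the raw device (run as root,
with a backup in hand). Buffered reads may be satisfied from the page cache
rather than the disk, which this sketch does not handle.

```python
import hashlib
import os

HEAD_LEN = 128 * 1024         # first 128kB of the device
TEST_LEN = 50 * 1024 * 1024   # 50MB test region
TWO_TB = 2 ** 41              # assumption: boundary at exactly 2 TiB


def read_region(path, offset, length):
    """Read `length` bytes starting at byte `offset`."""
    with open(path, "rb") as dev:
        dev.seek(offset)
        return dev.read(length)


def write_region(path, offset, data):
    """Write `data` at byte `offset` and flush it to the device."""
    with open(path, "r+b") as dev:
        dev.seek(offset)
        dev.write(data)
        dev.flush()
        os.fsync(dev.fileno())


def digest(data):
    """Short fingerprint of a region, handy for logging between steps."""
    return hashlib.sha256(data).hexdigest()


def head_unchanged(path, saved_head):
    """True if the start of the device still matches the saved copy."""
    return read_region(path, 0, len(saved_head)) == saved_head


def rewrite_and_check(path, offset=TWO_TB, length=TEST_LEN, head_len=HEAD_LEN):
    """Back up the head, rewrite the region at `offset`, re-check the head.

    Returns (aliased, head_ok): `aliased` is True if the region at `offset`
    starts with the same bytes as the head of the device; `head_ok` is True
    if the head is unchanged after the region was written back.
    """
    head = read_region(path, 0, head_len)
    region = read_region(path, offset, length)
    aliased = region[:head_len] == head
    write_region(path, offset, region)
    return aliased, head_unchanged(path, head)
```

After touching one file in lost+found, head_unchanged() can be called
again with the saved 128kB to see whether that operation altered the
start of the device, before running e2fsck.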

I don't have anything more specific than that; it is mostly a matter of
"play around and see what breaks".

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.



