ext3 filesystem corruption - more info

Andreas Dilger adilger at clusterfs.com
Thu Apr 13 05:40:56 UTC 2006


On Apr 12, 2006  19:28 -0400, Sev Binello wrote:
[HTML-only email] - please use plain text, or at least multipart/mixed,
for your email to this list...

> //soon as nfs clients start get a TON of errors like this
> Mar 26 00:07:19 acnlin82 kernel: EXT3-fs error (device sd(8,49)):
> ext3_free_blocks: Freeing blocks not in datazone - block = 3443589120, count = 1
> Mar 26 00:07:19 acnlin82 kernel: EXT3-fs error (device sd(8,49)):
> ext3_free_blocks: Freeing blocks not in datazone - block = 2113834232, count = 1
> Mar 26 00:07:22 acnlin82 kernel: EXT3-fs error (device sd(8,49)):
> ext3_free_blocks: bit already cleared for block 49125

> //interspersed with some of these
> Mar 26 00:10:56 acnlin82 kernel: attempt to access beyond end of device
> Mar 26 00:10:56 acnlin82 kernel: 08:31: rw=0, want=1891463980, limit=1722264358
> Mar 26 00:10:56 acnlin82 kernel: attempt to access beyond end of device
> Mar 26 00:10:56 acnlin82 kernel: 08:31: rw=0, want=1824250576, limit=1722264358
> Mar 26 00:10:56 acnlin82 kernel: attempt to access beyond end of device

These indicate that the kernel ext3 code detected serious corruption of the
metadata on the filesystem.  In cases like this, if the filesystem isn't set
to remount read-only on error (e.g. by mounting it with "-o errors=remount-ro"),
continued writes just make the corruption progressively worse.
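
For a sense of how far out of range the reported values are, here is a
minimal Python sketch that pulls the numbers out of the "beyond end of
device" lines quoted above.  It assumes "want" and "limit" are both in
1 KiB units, which is my understanding of how 2.4 kernels print them; the
function name overshoot() is made up for this example.

```python
import re

# Matches the block-layer message quoted above, e.g.
# "08:31: rw=0, want=1891463980, limit=1722264358"
BEYOND_EOD = re.compile(r"want=(\d+), limit=(\d+)")


def overshoot(line):
    """Return (want, limit, want - limit) for a matching line, else None."""
    m = BEYOND_EOD.search(line)
    if m is None:
        return None
    want, limit = int(m.group(1)), int(m.group(2))
    return want, limit, want - limit


log = [
    "Mar 26 00:10:56 acnlin82 kernel: 08:31: rw=0, want=1891463980, limit=1722264358",
    "Mar 26 00:10:56 acnlin82 kernel: 08:31: rw=0, want=1824250576, limit=1722264358",
]

for line in log:
    want, limit, over = overshoot(line)
    # Assumption: values are in 1 KiB units, so 2**20 of them make 1 GiB.
    print(f"want={want} limit={limit} past end by {over} KiB "
          f"(~{over / 2**20:.1f} GiB)")

# Block numbers from the ext3_free_blocks messages quoted above.
bad_blocks = [3443589120, 2113834232]
device_kib = 1722264358
# ext3 blocks are at least 1 KiB, so the device cannot hold more blocks
# than it has KiB, whatever the actual block size is.
print([b > device_kib for b in bad_blocks])
```

Both requests land well past the end of the device, and both block numbers
in the ext3_free_blocks messages are larger than the device could hold even
with 1 KiB blocks, so these are garbage values rather than off-by-a-few
errors.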

These messages don't point to a root cause, however.
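
On the remount-read-only point above, a rough way to see which ext3
filesystems list "errors=remount-ro" among their mount options is to scan
text in /proc/mounts format.  This is only a sketch: the function name, the
device names and the mount points below are hypothetical, and the error
behaviour can also be set as a default in the superblock (tune2fs -e
remount-ro), in which case it may not appear in the options at all.

```python
def ext3_without_remount_ro(mounts_text):
    """Return mount points of ext3 filesystems whose listed options
    do not include errors=remount-ro."""
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fstype, options = fields[:4]
        if fstype == "ext3" and "errors=remount-ro" not in options.split(","):
            missing.append(mount_point)
    return missing


# Hypothetical sample in /proc/mounts format (not taken from the report).
sample = """\
/dev/sdd1 /export/data1 ext3 rw,errors=remount-ro 0 0
/dev/sde1 /export/data2 ext3 rw 0 0
/dev/sda1 / ext2 rw 0 0
"""

print(ext3_without_remount_ro(sample))
```

On a live system the same function could be fed open("/proc/mounts").read()
instead of the sample string.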

> Would it be a problem if the two 1.8TB systems appeared on one host?

No.  Some of our customers have hundreds of systems, each with two ext3
filesystems of about this size, running on 2.4.21-RHEL3 kernels.  The LUNs
exported from the RAID storage are all under 2TB.  They have never reported
similar problems over several years of usage.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.



