Frequent metadata corruption with ext3 + hard power-off
Mats Ahlgren
mats_a at MIT.EDU
Sun Mar 18 01:42:17 UTC 2007
Hello.
I'm having serious issues with ext3; any insight would be greatly appreciated:
_____ Overview:
I believe ext3 is supposed to be recoverable in the case of a power failure by
replaying the log.
However, on two separate computers (running different operating systems, too),
this has been anything but the case.
_____ Specifics:
Sometimes, my kernel will hard-freeze and I'll have to do a hard reboot. When
this happens, sometimes fsck will insist on running and find some orphaned
inodes, which it will proceed to put in the /lost+found directory.
This is unacceptable: The last time this happened, random files in my
operating system were plucked from the file system and stuffed in lost+found,
corrupting the OS and forcing a reinstall. Another time, files I had moved
just a minute before the crash (a final project) were orphaned and put in
lost+found, effectively destroying the project.
Why should a lost+found folder even be necessary when the file hierarchy is
guaranteed to be consistent?
In response to these problems, I changed the ext3 journaling mode to "journal"
rather than "ordered" (frankly it seems deeply disturbing that "ordered" is
the default). Since then, I've once had to hard-reboot and yet again found
files in the /lost+found folder.
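One thing worth ruling out is that the "journal" mode never actually took
effect (for example on a root filesystem, where the option may need to be
passed at boot). Below is a minimal sketch for checking this. It assumes the
kernel lists the data= option in /proc/mounts, which may not be true on every
kernel version, so a missing entry does not prove anything by itself. The
function name ext3_data_modes is purely illustrative.

```python
import os


def ext3_data_modes(mounts_text):
    """Return {mount_point: data mode or None} for each ext3 entry.

    mounts_text is expected to be in /proc/mounts format:
    device  mount_point  fstype  options  dump  pass
    """
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip malformed lines and anything that is not ext3.
        if len(fields) < 4 or fields[2] != "ext3":
            continue
        mount_point = fields[1]
        mode = None
        for option in fields[3].split(","):
            if option.startswith("data="):
                mode = option[len("data="):]
        # None means no data= option was listed (assumption: the kernel
        # may simply omit it, so this is inconclusive rather than an error).
        modes[mount_point] = mode
    return modes


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as mounts_file:
        for mount_point, mode in ext3_data_modes(mounts_file.read()).items():
            print(mount_point, mode or "(data= not listed)")
```

If a filesystem that was supposed to be in "journal" mode shows up as
"ordered" here, the mode change was probably not applied to that mount.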
Might anyone know why ext3 is not fulfilling its promise of an
always-consistent file system?
_____ Other interacting issues:
I'm running RAID1 (mirroring) on one computer, but I've had the same issues on
another computer without RAID.
(In response to "you shouldn't hard-reboot your computer": I realize that most
computers are not meant to be hard-rebooted, but I don't have a sysrq key and
xmodmapping it has been difficult. I also realize that kernels shouldn't
crash, but what's a person to do if the computer doesn't respond to
ctrl-alt-f1 and doesn't leave any messages in the logs...)
(In response to "maybe your drive is defective": This is not a problem with a
defective drive; I've tried multiple drives.)
(In response to "you should back up your data": Periodic backups clearly help,
but it's ridiculous to restore a system from backup every week because a
hard-freeze corrupted your filesystem...)
Any insight would be greatly appreciated. These problems have been making me
look for other file systems (such as zfs, which unfortunately I can't use to
boot; or reiser4, which also makes a filesystem-is-always-consistent
guarantee); I would prefer to use ext3, but I've never had these sorts of
problems with old Mac OS, OS X, or Windows.
Thank you,
Mats
More information about the Ext3-users mailing list