Forced FSCK on Bad Reboot

Mike McCarty mike.mccarty at sbcglobal.net
Thu Jul 14 17:08:17 UTC 2005


James Wilkinson wrote:

>Tony Nelson wrote:
>
>>[Linux no longer does an extended check after power outage boot]
>>
>>You could always switch back to Ext2 if you want long checks.  This is
>>probably not a good idea.
>
>Mike McCarty wrote:
>  
>
>>Why is this not a good idea? Not that I want to change anything.
>
>OK: this doesn't look like it's been answered.
>
>The problem is that when a computer powers off (or reboots)
>unexpectedly, data in memory is lost, and the operating system "forgets"
>what it was doing.
>
>With ext2, Linux has to take an extended check of the whole filesystem
>to make sure that everything was left in the right place (or at least,
>in a consistent place). It can't always work out exactly which file and
>which filename(s) go together (this is one reason for the lost+found
>directory in each filesystem).
>
>Ext3 is a "journalling" filesystem (as are Reiser, xfs, jfs, and NTFS).
>Linux keeps a journal on disk of what it's doing, and makes sure that
>the journal reaches disk before it makes any changes to the filesystem
>structure on disk.
>
Hmm. So if a power failure occurs during the update of the journal, the disc
is corrupted anyway. It's like a COBOL programmer back in the bad old
days, who claimed that, since he always used databases which had
journals and a separate commit call, his databases could never get
corrupted. I argued and argued with this guy. Sadly, one day he found out
I was correct, and had no recovery plan.

>So when Linux powers up after an unexpected shutdown, it can simply read
>the journal, find out what it was trying to do, and either "roll back"
>the changes or complete them. It *knows* which parts of the filesystem
>to look at, and how to fix them. So this is a much faster operation,
>taking hundredths of a second.
>
So long as the journal is not corrupt.
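
For what it's worth, the mechanism described in the quoted text can be
sketched in a few lines of Python. This is a toy model only, not ext3's
on-disk format: ToyDisk, log_transaction and replay are made-up names,
and the "disk" is just a pair of Python containers. It assumes the
journal itself is still readable after the crash. A transaction whose
commit marker never reached the journal is simply thrown away on replay.

```python
class ToyDisk:
    """Hypothetical stand-in for a disk: a journal plus a block map."""
    def __init__(self):
        self.journal = []   # append-only records, written before any block changes
        self.blocks = {}    # the "filesystem structure" proper


def log_transaction(disk, txid, updates, crash_before_commit=False):
    """Record the intended updates in the journal, then a commit marker."""
    for block, value in updates.items():
        disk.journal.append(("update", txid, block, value))
    if crash_before_commit:
        return  # simulate power loss while the journal is being written
    disk.journal.append(("commit", txid))


def replay(disk):
    """After a crash, apply only transactions whose commit marker is present."""
    committed = {rec[1] for rec in disk.journal if rec[0] == "commit"}
    for rec in disk.journal:
        if rec[0] == "update" and rec[1] in committed:
            _, _, block, value = rec
            disk.blocks[block] = value
    disk.journal.clear()  # journal space can be reused once replay is done
    return committed


disk = ToyDisk()
log_transaction(disk, 1, {"inode 7": "size=4096", "bitmap 2": "block 90 used"})
log_transaction(disk, 2, {"inode 9": "size=512"}, crash_before_commit=True)
replay(disk)
print(disk.blocks)  # only transaction 1's updates are applied
```

Note what the sketch does not cover: a journal that is damaged by bad
hardware rather than merely cut short.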

It seems to me that you left out a lot of details. Reading between the
lines, I'll guess that what you are saying is ext3 uses a lot of disc
cache with write-back rather than write-through policy, and journals
what it has done to the memory copy. Thus unwritten system buffers at
power down don't corrupt the disc.

Frankly, I'd rather use write-through.
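
To be clear about what I mean by the two policies, here is a minimal
sketch in Python. The Cache class and its method names are invented for
illustration and say nothing about how the kernel's buffer cache is
actually implemented.

```python
class Cache:
    """Toy block cache with a selectable write policy (hypothetical)."""
    def __init__(self, write_through):
        self.write_through = write_through
        self.memory = {}    # cached copies, lost at power down
        self.disk = {}      # what has really been written
        self.dirty = set()  # blocks changed in memory but not yet on disk

    def write(self, block, value):
        self.memory[block] = value
        if self.write_through:
            self.disk[block] = value   # write-through: disk updated at once
        else:
            self.dirty.add(block)      # write-back: disk updated later

    def flush(self):
        """Write every dirty block out to disk."""
        for block in self.dirty:
            self.disk[block] = self.memory[block]
        self.dirty.clear()

    def power_loss(self):
        """Drop everything held in memory and return what survived on disk."""
        self.memory.clear()
        self.dirty.clear()
        return dict(self.disk)


wt = Cache(write_through=True)
wt.write("block 5", "new data")
print(wt.power_loss())   # the write survives

wb = Cache(write_through=False)
wb.write("block 5", "new data")
print(wb.power_loss())   # the write is lost; it was never flushed
```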

In any case, I don't see any argument for not using an extended fsck on
a reboot after improper shutdown, which was my original question.

>
>>Hmm. Yet the boot has an option to do an extended check. If the extended
>>check doesn't do any additional useful work, then why is it an option?
>
>In theory, ext3 should never get corrupted. But this theory doesn't
>account for certain factors:
>
Yeah. Like preventing corruption of a hard disc is impossible in
principle, let alone in practice.

>All Software Sucks. All Hardware Sucks. Most computers don't have ECC
>RAM, and are going to get the occasional memory error. Hard drives and
>processors (and cables) aren't perfect. There are always software bugs
>to worry about. And someone has to worry about overclocking, or
>heatsinks and fans falling off processors (we had *that* one at work
>just today).
>
>Given all that, it's a Good Idea to have some way of detecting and
>correcting corruption in a filesystem.
>
So now we're back to my question: Why shouldn't the system do a full
fsck on reboot after improper/incomplete shutdown?

Mike

-- 
p="p=%c%s%c;main(){printf(p,34,p,34);}";main(){printf(p,34,p,34);}
This message made from 100% recycled bits.
I can explain it for you, but I can't understand it for you.
I speak only for myself, and I am unanimous in that!



