Corrupted Drive

Henry Hartley henryhartley at westat.com
Fri Dec 3 19:48:40 UTC 2004


Last January, a server I run (Red Hat 9) was rebooted (I don't remember
why).  When it came back up, it reported problems with one partition and I
was given a shell prompt and told (as best I can remember) to run fsck
manually.  I did, and when it asked if I wanted to fix the problems it found,
I said yes.  Upon exiting and rebooting, the system was effectively dead.  Oh,
it would boot, but many of the files that fsck had "fixed" were in /usr/sbin
and would no longer run.  These included httpd, sendmail and sshd.  It was a pain,
but I decided that this would be as good a time as any to upgrade from
Red Hat 9 to FC1.  I had recently bought a new hard drive, so I installed
FC1 on that, then mounted the old drive and was able to recover
pretty much everything without having to dig into my "real" backups.  Things
went relatively smoothly and I soon had a happily running FC1 server.  Deep
sigh.

This morning I was working near the server and needed to move the UPS that
it's plugged into.  Unfortunately, I pressed the UPS power switch (which
should be harder to press!) and the room went quiet!  Yikes.  Okay, don't
panic.  I turned things back on and the server started up.  I got virtually
the same message as last time, telling me to run fsck manually.  Naturally, I
was a bit worried about doing that but didn't see what choice I had.
Knowing that the worst that could happen was having to go to backups, I went
ahead and let fsck fix the errors it found.  Fortunately for me, the system booted
properly afterwards and everything seems to be running.  There may be
problems I haven't found yet but at least the main things are working.

The two incidents were 11 months apart and on different physical drives
(although the rest of the hardware is the same).  The system was not
rebooted from the time I finished the upgrade in January until today.  Sure,
it is (at least) time to upgrade from FC1 to FC3, but I'm always happier
doing it in MY time and not because I have to.

My question...

Is there something I should be doing to prevent this sort of thing?  On a
system that doesn't get rebooted very often, should fsck be run manually
from time to time?  Or would this just cause the same sort of problem?  Any
suggestions so that I don't have a repeat of this next November would be
appreciated.

-- 
Henry



