2.4.24 I/O error breakage
Alex Bligh
alex at alex.org.uk
Sat Jul 3 11:43:49 UTC 2004
Twice in the past week (after a year of running without problems),
a server has locked up, spewing forth a continuous stream of
ext3 write errors. This is on a bog-standard IDE disk, which is the
only device on the controller.
Nothing EVER hits the logs. Not a single error. Every process that
accesses the disk seems to fail. It looks like ext3 is failing every
I/O request.
If the machine is rebooted, it comes up completely clean.
I am prepared to believe I have a bad disk that occasionally throws
the odd IDE error, but given that it normally copes with pretty intensive
activity (it is a busy mail server) without failing, I don't believe it
is riddled with errors.
I speculate that either
a) After one I/O error, ext3 or the IDE layer is returning I/O errors
for every request (something is not resetting some error condition),
or
b) Something is hanging the IDE bus, which only a reboot cures.
Is (a) a known possibility? If not, is (b) possible?
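To make (a) concrete, here is a toy sketch of the kind of "sticky" error
state I have in mind. It is purely illustrative: the class and function
names are made up for this example, and I am not claiming the ext3 or IDE
code actually works this way.

```python
class StickyErrorDevice:
    """Toy model of a device whose error flag latches (hypothetical)."""

    def __init__(self, bad_requests):
        # Request numbers (1-based) that genuinely fail.
        self.bad_requests = set(bad_requests)
        self.failed = False  # latched on the first error, never cleared
        self.count = 0

    def write(self, data):
        self.count += 1
        if self.count in self.bad_requests:
            self.failed = True
        if self.failed:
            # Every request after the first real error also fails.
            raise IOError("I/O error on request %d" % self.count)
        return len(data)

    def reset(self):
        # Stands in for whatever a reboot clears.
        self.failed = False


def run(dev, n):
    """Issue n writes and record whether each succeeded."""
    results = []
    for _ in range(n):
        try:
            dev.write(b"x")
            results.append("ok")
        except IOError:
            results.append("error")
    return results
```

With a single genuine failure on the third request,
run(StickyErrorDevice([3]), 5) gives ok, ok, error, error, error, and
requests only succeed again after reset() is called. That is the pattern
I am seeing, with the reboot playing the part of reset().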
I am upgrading to 2.4.26-rc2, to see if that fixes things. What further
information should I look for (or post here) to help me (or you) debug this
further?
Alex