2.4.24 I/O error breakage

Alex Bligh alex at alex.org.uk
Sat Jul 3 11:43:49 UTC 2004


Twice in the past week (after a year of running fine), a server has
locked up, spewing forth a continuous stream of ext3 write errors. The
disk is a bog-standard IDE disk and is the only device on its
controller.

Nothing EVER hits the logs. Not a single error. Every process that
accesses the disk seems to fail. It looks like ext3 is failing every
I/O request.

If the machine is rebooted, it comes up completely clean.

I am prepared to believe I have a bad disk that occasionally throws
the odd IDE error, but given that it normally sustains pretty intensive
activity (it is a busy mail server) without failing, I don't believe it
is riddled with errors.

I speculate that either
a) After one I/O error, ext3 or the IDE layer is returning I/O errors
   for every request (something is not resetting some error condition),
   or
b) Something is hanging the IDE bus, which only a reboot cures.

Is (a) a known possibility? If not, is (b) possible?

I am upgrading to 2.4.26-rc2, to see if that fixes things. What further
information should I look for (or post here) to help me (or you) debug this
further?
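
One check that might help separate the ext3 case from the IDE-layer/bus
case the next time it happens is to compare a read through the filesystem
with a read straight from the block device. The following is only a
minimal sketch under assumptions: the function names probe_read and
interpret are made up for illustration, the device and file paths are
guesses for this box, and either read may be satisfied from cache rather
than the disk, so the result is a hint and not a diagnosis.

```python
import errno
import os


def probe_read(path, offset=0, length=4096):
    """Attempt one read from path; return (ok, errno_name_or_None)."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        return False, errno.errorcode.get(exc.errno, str(exc.errno))
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        os.read(fd, length)
        return True, None
    except OSError as exc:
        return False, errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        os.close(fd)


def interpret(fs_result, raw_result):
    """Turn two probe results into a rough hint (hypothetical heuristic)."""
    fs_ok, raw_ok = fs_result[0], raw_result[0]
    if fs_ok and raw_ok:
        return "both reads succeed: no failure observed right now"
    if raw_ok and not fs_ok:
        return "raw device reads, filesystem fails: suspect the filesystem layer"
    if not raw_ok and not fs_ok:
        return "both fail: suspect the IDE layer or the bus"
    return "filesystem reads, raw device fails: inconclusive (cache or permissions?)"
```

It would be run as root from an interpreter started before the failure,
for example interpret(probe_read("/var/spool/mail/somefile"),
probe_read("/dev/hda", offset=1 << 30)), where both paths are
placeholders. An EIO from both probes would point below ext3; an EIO
only from the file would point at ext3 itself.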

Alex




