2.4.24 I/O error breakage

FabF fabian.frederick at skynet.be
Sat Jul 3 13:12:41 UTC 2004

On Sat, 2004-07-03 at 13:43, Alex Bligh wrote:
> Twice in the past week (when things have previously been fine for a
> year), a server has locked up spewing forth a continuous stream of
> ext3 write errors. This is to a bog-standard IDE disk, only thing
> on the controller, etc.
> Nothing EVER hits the logs. Not a single error. Every process that
> accesses the disk seems to fail. It looks like ext3 is failing every
> I/O request.
> If the machine is rebooted, it comes up completely clean.
> I am prepared to believe I have a bad disk that occasionally pops up
> the odd IDE error, but given it takes pretty intensive activity (busy
> mail server) without failing, I don't believe it is riddled with errors.
> I speculate that either
> a) After one I/O error, ext3 or the IDE layer is returning I/O errors
>    for every request (something is not resetting some error condition),
>    or
> b) Something is hanging the IDE bus, which only a reboot cures.
> Is (a) a known possibility? If not, is (b) possible?
> I am upgrading to 2.4.26-rc2, to see if that fixes things. What further
> information should I look for (or post here) to help me (or you) debug this
> further?
> Alex

The output of dumpe2fs -h <working partition> could help ... maybe.
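
A rough sketch of what might be worth pulling out of that output, assuming the field labels that e2fsprogs normally prints in the superblock header. The function name summarize_superblock is made up for illustration, and /dev/hda1 is only a placeholder for the affected partition. This will not tell (a) from (b) by itself; it only shows whether the filesystem has recorded an error and what it is configured to do when one occurs.

```shell
#!/bin/sh
# Hypothetical helper (not part of e2fsprogs): reads `dumpe2fs -h` output
# on stdin and keeps only the superblock fields that relate to error state.
summarize_superblock() {
    grep -E '^(Filesystem state|Errors behavior|Mount count|Maximum mount count|Last checked):'
}

# Example use, as root; /dev/hda1 is a placeholder device name:
#   dumpe2fs -h /dev/hda1 2>/dev/null | summarize_superblock
```

If "Filesystem state" mentions errors, the kernel has flagged the filesystem at some point. "Errors behavior" shows the configured reaction to an error (continue, remount read-only, or panic).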

