Reproducible Filesystem Corruption on FC4 (Long)

Tom Sightler ttsig at tuxyturvy.com
Wed Jun 29 23:37:47 UTC 2005


On Wed, 2005-06-29 at 20:19 -0300, Ben Steeves wrote: 
> On 6/29/05, Tom Sightler <ttsig at tuxyturvy.com> wrote:
> > I decided to reinstall and try again.  This time, immediately after the
> > install I ran fsck and found no errors.  I copied my directories from my
> > backup again, and the corruption also returned.  I repeated again, this
> > time I booted with ide=nodma before restoring my backup, this caused the
> > restore to take so long that I wasn't sure it would ever finish.  I did
> > not get corruption, but the system was far too slow to use with this
> > option.
> 
> This really, really sounds like a hardware problem.  I would check
> your /var/log/messages and see what smartctl has to say about your
> drives.  I'd also check the status of the drivers for your USB
> controller chipset, since if it is a software bug, that's probably
> where the problem lies.

I would agree that it sounds that way, but I simply don't think this is
the case.  For one thing, if it were a hardware problem, the system
wouldn't work with CentOS 4 or FC3 either, but both of those install and
run fine.  I use this system 12-14 hours a day with CentOS 4 and have
never experienced a single glitch.

There were absolutely no errors in /var/log/messages or in dmesg
regarding the hardware.  Everything appeared to be working 100%
correctly; it just silently corrupted the data, time and time again.

I reinstalled CentOS 4, performed the identical steps, and everything
works perfectly.  I can also install FC3 and perform the steps without
issues.  With FC4, however, I get silent corruption every time I restore
my data from the USB device.

I suppose it could be some issue with reading from the USB drive.  I
found some notes claiming that recent improvements in the usb-storage
driver push the hardware harder and can sometimes expose USB chipset
problems that were previously hidden.  I could possibly buy this, but
even if the data read from the source drive is corrupt, that shouldn't
corrupt the drive you're writing to, and in my case it's the internal
IDE drive
that's being corrupted.  I can absolutely hammer this drive for days
with CentOS 4 without even a slight glitch and zero corruption.
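
One way to separate a bad read from the USB source from bad data on the
internal drive would be to checksum every file on both sides after the
restore.  The following is only a rough sketch of that idea, not something
from my actual procedure: the function names (file_digest, compare_trees)
and the two directory arguments are made up for illustration, and it
assumes both trees are mounted and readable.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(source_root, restored_root):
    """List relative paths that are missing or differ under restored_root.

    source_root and restored_root are hypothetical names for the backup
    directory on the USB drive and the restored copy on the IDE drive.
    """
    mismatches = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, source_root)
            dst = os.path.join(restored_root, rel)
            # A file counts as bad if it is absent or its digest differs.
            if not os.path.isfile(dst) or file_digest(src) != file_digest(dst):
                mismatches.append(rel)
    return sorted(mismatches)
```

An empty list from compare_trees would mean the restored files match the
backup byte for byte.  Note that a comparison made right after the copy may
be served from the page cache rather than the disk, so it is probably more
telling to run it after a reboot or a remount.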

Tonight I'm going to try installing FC4 and then replacing the kernel
before doing the restore.  That should give me a good clue.

Thanks,
Tom




