How does ext3 handle drive failures?

We want to run our multi-drive systems in JBOD mode, where each
drive is basically a filesystem to itself.  With the drives we
currently have, we expect multiple failures, primarily
unrecoverable ECC read errors or sometimes the drive just dying.

How does ext[23] handle these two primary conditions?  Using them
in a software RAID mode, I have sometimes seen disk problems hang
all access to the filesystem and even the entire system, but I'm
not sure at what level that's happening (low-level driver?
scsi layer?  raid layer?  filesystem layer?).

If a drive fails and takes out the entire ext3 filesystem, will
I be able to stop using the filesystem (say, my application gets
an error from the fs in whatever system call it made, who cares
which), forcibly unmount the filesystem, and replace the drive?
Or will the system panic?  Or worse, will my application just
enter an uninterruptible sleep, never returning success or error?
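
To be concrete about what "stop using the filesystem" would mean
on the application side, here is a rough Python sketch.  It is
only an illustration: the names read_or_fail, DriveFailed and
FATAL_ERRNOS are made up for this example, and the set of errno
values treated as fatal is an assumption, not something I know
ext3 guarantees.  It also only helps if the kernel actually
returns an error; it does nothing for the uninterruptible-sleep
case.

```python
import errno
import os


class DriveFailed(Exception):
    """Hypothetical signal: stop using the filesystem holding this path."""


# Assumption: these errno values mean the drive or filesystem is unusable.
FATAL_ERRNOS = {errno.EIO, errno.EROFS, errno.ENODEV, errno.ENXIO}


def read_or_fail(path, size=4096):
    """Read up to `size` bytes from `path`.

    Raises DriveFailed for errors in FATAL_ERRNOS and re-raises any
    other OSError unchanged.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
        try:
            return os.read(fd, size)
        finally:
            os.close(fd)
    except OSError as exc:
        if exc.errno in FATAL_ERRNOS:
            raise DriveFailed(path) from exc
        raise
```

On DriveFailed the application would stop touching that drive's
mount point, so the filesystem could then be unmounted and the
drive replaced.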

Obviously, we'll be doing our own testing, but any knowledge of
these scenarios would be most appreciated.


* Philip Molter
* Texas.Net Internet
* http://www.texas.net/
* philip texas net
