EXT3 filesystem on scsi device becoming readonly

Theodore Tso tytso at mit.edu
Mon Aug 28 16:24:22 UTC 2006

On Mon, Aug 28, 2006 at 08:49:36AM -0500, James Bottomley wrote:
> On Mon, 2006-08-28 at 09:31 -0400, Theodore Tso wrote:
> > IMHO the right thing is for the device driver to retry for some amount
> > of time (maybe measured in seconds or perhaps a single digit number of
> > minutes), and in the meantime, pass a signal to the rest of the kernel
> > that any process that attempts to write to the filesystem should be
> > frozen while we wait for the disk to come back.  
> Actually, for this exact case, there's a feature propagating through the
> transport classes called the dev loss timer.  Its job, for pluggable
> transports like FC, is to allow the user time to unplug and replug
> cables before the system declares the device lost and starts erroring
> requests (which is what causes the fs to go read only).  Since the
> original reporter seemed to be using fibre, it sounds like this would
> suit.  Beware:  the dev loss timer shouldn't be much longer than the
> SCSI command timeout (say ~30s) or nasty things may happen.

Yes, that sounds ideal.  Does the dev loss timer need to be
configured, or is it going to be enabled with an appropriate-
for-most-systems default value (such as the SCSI command timeout)?

Also, when did this get added to the various transport classes?  I
assume it's not going to be of much help for the original reporter, since
he needs it to work on a RHEL 3 AS Update 6 kernel, but hopefully it will
be in SLES 10 / RHEL 5?  Or is this something that is just going into
the 2.6 mainline now?

					- Ted

