EXT3 filesystem on scsi device becoming readonly
James Bottomley
James.Bottomley at SteelEye.com
Mon Aug 28 16:44:30 UTC 2006
On Mon, 2006-08-28 at 12:24 -0400, Theodore Tso wrote:
> On Mon, Aug 28, 2006 at 08:49:36AM -0500, James Bottomley wrote:
> > On Mon, 2006-08-28 at 09:31 -0400, Theodore Tso wrote:
> > > IMHO the right thing is for the device driver to retry for some amount
> > > of time (maybe measured in seconds or perhaps a single digit number of
> > > minutes), and in the meantime, pass a signal to the rest of the kernel
> > > that any process that attempts to write to the filesystem should be
> > > frozen while we wait for the disk to come back.
> >
> > Actually, for this exact case, there's a feature propagating through the
> > transport classes called the dev loss timer. Its job, for pluggable
> > transports like FC, is to allow the user time to unplug and replug
> > cables before the system declares the device lost and starts erroring
> > requests (which is what causes the fs to go read only). Since the
> > original reporter seemed to be using fibre, it sounds like this would
> > suit. Beware: the dev loss timer shouldn't be much longer than the
> > SCSI command timeout (say ~30s) or nasty things may happen.
>
> Yes, that sounds ideal. Does the dev loss timer need to be
> configured, or is it going to be enabled with an
> appropriate-for-most-systems default value (such as the SCSI
> command timeout)?
It's configurable via the fc transport class rports
(in /sys/class/fc_rport_class, value dev_loss_tmo); the default value is
60s.
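
A minimal sketch of inspecting and changing that value from user space,
assuming each remote port appears as a directory named rport-* that holds
a dev_loss_tmo attribute in seconds. The directory name fc_remote_ports
below is an assumption (the text above names the class fc_rport_class),
and the helper names are hypothetical, so check the path against your own
kernel before relying on it.

```python
from pathlib import Path

# Assumed sysfs location; the message above calls the class fc_rport_class,
# so adjust this if your kernel exposes a different directory.
DEFAULT_RPORT_DIR = Path("/sys/class/fc_remote_ports")


def read_dev_loss_tmo(rport_dir=DEFAULT_RPORT_DIR):
    """Return {rport name: dev_loss_tmo in seconds} for every remote port found."""
    timeouts = {}
    base = Path(rport_dir)
    if not base.is_dir():
        return timeouts  # FC transport not loaded, or a different sysfs layout
    for attr in sorted(base.glob("rport-*/dev_loss_tmo")):
        try:
            timeouts[attr.parent.name] = int(attr.read_text().strip())
        except (OSError, ValueError):
            continue  # skip ports whose attribute is unreadable or not a number
    return timeouts


def set_dev_loss_tmo(rport_name, seconds, rport_dir=DEFAULT_RPORT_DIR):
    """Write a new timeout for one remote port (needs root on a real system)."""
    (Path(rport_dir) / rport_name / "dev_loss_tmo").write_text(f"{int(seconds)}\n")


if __name__ == "__main__":
    for name, tmo in read_dev_loss_tmo().items():
        print(f"{name}: {tmo}s")
```

Passing rport_dir explicitly lets the helpers be tried against a scratch
directory before they are pointed at a live system.
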
> Also, when did this get added to the various transport classes? I
> assume it's not going to be of much help for the original reporter if he
> needs it to work on a RHEL 3 AS Update 6 kernel, but hopefully it will
> be in SLES 10 / RHEL 5? Or is this something that is just going into
> the 2.6 mainline now?
Erm, pass. It predates git, so it has been there since at least 2.6.12-rc2.
James