EXT3 filesystem on scsi device becoming readonly

Theodore Tso tytso at mit.edu
Mon Aug 28 13:31:04 UTC 2006


On Sun, Aug 27, 2006 at 11:13:10PM -0400, Jayjitkumar Lobhe wrote:
> 1)Create a ext3 file system on HDLM device
>       mkfs -t ext3 /dev/sdn
> 2)Mount the device
>       mount /dev/sddlmaa  /home/<dir. name>
> 3)Execute cp -f command in a loop on mounted device.
>       cp -f /root/install.log  /home/<dir name>/<file name>
> 4)Disconnect the path. (Either by plugpull or by disabling the port)
> 5)When all path become offline device will become read-only.
> 6)Now again connect the path. Though the status of the path becomes
> online, device remains as read-only.

This is a hard problem, and unfortunately we don't have a good
solution, other than "make sure you don't lose your last path".

The fundamental issue is that the kernel has no idea when the path to
the device might come back.  It might be in a few seconds, it might be
in a day, month, year, or never.  That makes this case hard to handle:
when a write to the filesystem fails, what should the filesystem do,
especially when there are outstanding transactions in the journal?

IMHO the right thing is for the device driver to retry for some amount
of time (maybe measured in seconds or perhaps a single digit number of
minutes), and in the meantime, pass a signal to the rest of the kernel
that any process that attempts to write to the filesystem should be
frozen while we wait for the disk to come back.
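
As a rough illustration of the retry-and-freeze idea, here is a toy
user-space model in Python.  It is not kernel code and not how any
existing driver works; the names PathMonitor, retry_window_s,
path_lost and path_restored are invented for this sketch, and a plain
list stands in for the device.

```python
import errno
import threading
import time


class PathMonitor:
    """Toy model: writers are frozen while the last path is down,
    but only until a retry window (started when the path was lost) expires."""

    def __init__(self, retry_window_s):
        self.retry_window_s = retry_window_s  # hypothetical tunable
        self._deadline = 0.0
        self._path_up = threading.Event()
        self._path_up.set()

    def path_lost(self):
        # Start the retry window, then mark the path as down.
        self._deadline = time.monotonic() + self.retry_window_s
        self._path_up.clear()

    def path_restored(self):
        # Wake every writer that is currently frozen.
        self._path_up.set()

    def write(self, data, device):
        if not self._path_up.is_set():
            remaining = max(0.0, self._deadline - time.monotonic())
            # Freeze the caller until the path returns or the window ends.
            if not self._path_up.wait(timeout=remaining):
                raise OSError(errno.EIO, "path did not return in time")
        device.append(data)
        return len(data)
```

A writer that calls write() while the path is down simply blocks; it
sees an error only if the path stays down past the window.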

A much more difficult thing to implement would be kernel functionality
which saves all critical disk blocks that eventually need to be
written back to the device (those where the system call has already
returned "OK" to userspace, so there is an implicit commitment that
the data has been preserved), causes all future writes to the
filesystem to fail with an error, and then, when the path comes back,
writes out the critical disk blocks and allows writes to the
filesystem to succeed again.  Of course, this may end up confusing
applications pretty badly anyway.
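
The bookkeeping this would need can be sketched with another toy
Python model.  Again, this is only an illustration under simplifying
assumptions: the class DegradedWriteback and its methods are invented
here, dictionaries stand in for the device and the in-memory blocks,
and nothing like journal ordering is modelled.

```python
import errno


class DegradedWriteback:
    """Toy model: keep already-acknowledged blocks in memory while the
    last path is down, refuse new writes, and flush on reconnect."""

    def __init__(self):
        self.disk = {}       # block number -> data, stands in for the device
        self.pending = {}    # acknowledged to the caller, not yet on disk
        self.path_up = True

    def write(self, blockno, data):
        if not self.path_up:
            # While degraded, new writes fail instead of piling up.
            raise OSError(errno.EIO, "no path to device")
        # The caller is told "OK" here, before the data reaches the disk.
        self.pending[blockno] = data
        return len(data)

    def flush(self):
        if not self.path_up:
            return 0         # nothing can be written; keep blocks in memory
        count = len(self.pending)
        self.disk.update(self.pending)
        self.pending.clear()
        return count

    def path_lost(self):
        self.path_up = False

    def path_restored(self):
        # Write out the saved blocks first, then accept writes again.
        self.path_up = True
        return self.flush()
```

Even in this simplified form, an application sees writes succeed, then
fail, then succeed again, which is the confusing behaviour mentioned
above.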

All of this is going to require significant development work.
For users of ext3 as it is currently shipped in distributions today,
I'm afraid the only solution when the last path to your disk fails is
either to reboot the system or to unmount the filesystem and then
remount it.  And, of course, try to make very sure that the last path
to the disk doesn't fail, possibly by adding extra paths.
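
To find out which filesystems need that unmount and remount, one
option is to scan the mount table.  The Python sketch below parses
text in the /proc/mounts layout (device, mount point, type, options)
and lists ext3 filesystems whose options include "ro".  It assumes the
kernel reports the read-only state there, which may not hold on every
kernel version; the function name and the /home/data mount point in
the comment are made up for illustration.

```python
def readonly_ext3_mounts(mounts_text):
    """Return (device, mountpoint) pairs for ext3 filesystems whose
    mount options contain "ro", given text in /proc/mounts format."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype, options = fields[:4]
        # Compare whole options so "errors=remount-ro" does not match.
        if fstype == "ext3" and "ro" in options.split(","):
            found.append((device, mountpoint))
    return found


# Typical use on a live system (hypothetical mount point /home/data):
#   with open("/proc/mounts") as f:
#       for device, mountpoint in readonly_ext3_mounts(f.read()):
#           print(device, mountpoint)
```

Each filesystem it reports would then be unmounted with umount and
mounted again once the path is back.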

Is this ideal?  Not hardly.  But solving this is extremely hard, and
requires help and changes both in the filesystem and below it in the
storage stack.

						- Ted



