how does ext3 handle no communication to storage

Sun Feb 18 08:04:24 UTC 2007

On Feb 16, 2007  11:25 -0500, Sev Binello wrote:
> Theodore Tso wrote:
> >On Mon, Aug 28, 2006 at 03:04:26PM -0400, Sev Binello wrote:
> >>Can anyone tell us what the expected behavior is,
> >>in the event that ext3 loses total contact with the storage system ?
> >>
> >>We have found that the file system is put into read only mode,
> >>it is then found to contain errors, and requires an fsck.
> >>Sometimes the fsck finds numerous (some serious looking) errors,
> >>and that running without fsck doesn't seem like a safe option.
> >>
> >>We are trying to understand why exactly this is.
> >>Why do we get errors ?  Why serious ones ?
> >
> >The filesystem should go read-only when you try to modify it.
> >HOWEVER, the problem comes when connectivity is restored.  When an
> >attempt to modify the filesystem fails, the journal is aborted and an
> >I/O error is returned.  However, there may be modified blocks left hanging
> >about in the buffer cache before the kernel realized that connectivity
> >has been lost, and what we need to do is to make sure that all dirty
> >blocks in the buffer cache and page cache are dropped.

In fact, there are a number of other places as well, like the elevator
and the IDE/SCSI/LVM layers, that can be hung up on timeouts and retries
for a long time.  It would be nice if the filesystem could abort all
pending IOs in the underlying layers.

> >Basically, if I'm right, this is a bug, which we need to fix.  That
> >patch would require flushing all modified buffers and page cache pages
> >when the filesystem goes read-only.  The modified buffers are the more
> >important thing, since that's what causes the filesystem corruption,
> >although for correctness's sake we should be flushing any modified
> >page cache pages as well.  I don't have time to code this right now,
> >but I'll try to get a patch out relatively soonish, if you're
> >willing to try it to see if it addresses your observed problem.
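
As a rough illustration of the problem and the proposed fix quoted above,
here is a toy user-space model in C.  It is not kernel code, and every
name in it (toy_fs, toy_abort, toy_writeback, toy_scenario) is
hypothetical and does not correspond to any ext3 or JBD function.  It only
shows why dirty buffers that survive an abort can reach the disk once
connectivity returns, and how discarding them at abort time avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NBUF 8

/* Toy model only: all names are hypothetical, not ext3/JBD APIs. */
struct toy_fs {
    bool dirty[TOY_NBUF];  /* buffer modified in memory, not yet on disk */
    bool read_only;        /* set once the journal has been aborted */
    bool device_online;    /* false while the storage is unreachable */
};

/* Abort after an I/O failure: go read-only and, if drop_dirty is set,
 * discard every modified buffer instead of leaving it in the cache. */
static void toy_abort(struct toy_fs *fs, bool drop_dirty)
{
    assert(fs != NULL);
    fs->read_only = true;
    if (drop_dirty)
        for (size_t i = 0; i < TOY_NBUF; i++)
            fs->dirty[i] = false;
}

/* Background writeback: writes any buffer still marked dirty once the
 * device is reachable.  Returns how many buffers reached the disk. */
static int toy_writeback(struct toy_fs *fs)
{
    int written = 0;

    assert(fs != NULL);
    if (!fs->device_online)
        return 0;
    for (size_t i = 0; i < TOY_NBUF; i++) {
        if (fs->dirty[i]) {
            fs->dirty[i] = false;
            written++;
        }
    }
    return written;
}

/* Scenario: three buffers are dirtied, the link is lost, the journal is
 * aborted, the link comes back, and writeback runs.  The return value is
 * the number of stale buffers written after the abort. */
static int toy_scenario(bool drop_dirty)
{
    struct toy_fs fs = { .device_online = true };

    fs.dirty[0] = fs.dirty[3] = fs.dirty[5] = true;
    fs.device_online = false;      /* connectivity lost */
    toy_abort(&fs, drop_dirty);    /* filesystem goes read-only */
    fs.device_online = true;       /* connectivity restored */
    return toy_writeback(&fs);
}
```

In this model, toy_scenario(false) lets the three stale buffers be written
after the abort, while toy_scenario(true) writes none.  The real buffer
cache and page cache are of course far more involved than this.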

We talked at one time of marking the block device read-only via
set_device_ro().  That would prevent any of the blocks from being
flushed out by the block layer.
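
For anyone who wants to experiment with the read-only flag from user
space, a minimal sketch follows, assuming Linux and root privileges.  It
uses the BLKROSET and BLKROGET ioctls from <linux/fs.h>, which to my
knowledge are what "blockdev --setro" and "blockdev --getro" use.  The
helper names are hypothetical, and the sketch does not demonstrate
whether the flag stops writes that are already queued in lower layers.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>    /* BLKROSET, BLKROGET */
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: set (ro = 1) or clear (ro = 0) the read-only flag
 * of the block device at path.  Returns 0 on success, -1 on failure with
 * errno describing the error. */
int toy_set_blockdev_ro(const char *path, int ro)
{
    int fd, rc, saved;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    rc = ioctl(fd, BLKROSET, &ro);
    saved = errno;           /* close() may overwrite errno */
    close(fd);
    errno = saved;
    return rc < 0 ? -1 : 0;
}

/* Hypothetical helper: query the read-only flag.  Returns 1 if the
 * device is read-only, 0 if it is writable, -1 on failure. */
int toy_get_blockdev_ro(const char *path)
{
    int fd, rc, saved, ro = 0;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    rc = ioctl(fd, BLKROGET, &ro);
    saved = errno;
    close(fd);
    errno = saved;
    if (rc < 0)
        return -1;
    return ro ? 1 : 0;
}
```

A call such as toy_set_blockdev_ro("/dev/sdX", 1), with a placeholder
device name, would mark that device read-only; try this only on a
scratch device.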

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.