how does ext3 handle no communication to storage

Tue Aug 29 08:20:03 UTC 2006

On Aug 28, 2006  18:47 -0400, Sev Binello wrote:
> so there are buffers in the cache that haven't been written out yet,
> so by "flush" drop" you mean get rid of, or you mean make sure they are 
> written out ?
> If the former, how does that prevent file system corruption ?

If buffers are discarded outright (i.e. the first case) then the ext3
journal recovery will handle the interruption at remount time as if
the node had rebooted.  If, on the other hand, some dirty buffers are
sitting in memory but also some of the previous writes completed with
an error, then you can get inconsistencies in the filesystem.

Consider the case where a bunch of journal writes fail, but the journal
commit block sits in memory until the FC link is restored and is then
written out successfully.  Upon remounting the filesystem, the old garbage
that was at the location of the transaction gets copied into the
filesystem because the commit block says "yup, this transaction is
complete and safe to checkpoint to the filesystem".
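
To make that failure concrete, here is a toy model in Python.  It is only
a sketch: the Disk class, the replay() function and the block numbers are
made up for illustration and do not mirror the real jbd code or on-disk
format.  It assumes that a write issued while the link is down is simply
lost, and that recovery trusts the commit block alone.

```python
# Toy model only: a "disk" is a dict of block number -> bytes.
# Nothing here mirrors the real jbd on-disk format.

class Disk:
    def __init__(self):
        self.blocks = {}
        self.link_up = True

    def write(self, blkno, data):
        # A write issued while the link is down never reaches the disk.
        if not self.link_up:
            return False
        self.blocks[blkno] = data
        return True


def replay(disk, journal_blocks, commit_blkno, fs_targets):
    """Naive recovery: trust the commit block, copy journal -> filesystem."""
    if disk.blocks.get(commit_blkno) != b"COMMIT":
        return False  # no commit block, so the transaction is ignored
    for jblk, fsblk in zip(journal_blocks, fs_targets):
        disk.blocks[fsblk] = disk.blocks[jblk]
    return True


def demo():
    disk = Disk()
    # Journal blocks 10 and 11 still hold garbage from an old transaction.
    disk.blocks[10] = b"stale-A"
    disk.blocks[11] = b"stale-B"
    # Filesystem blocks 100 and 101 hold good data.
    disk.blocks[100] = b"good-A"
    disk.blocks[101] = b"good-B"

    # The link drops: the journal data writes fail.
    disk.link_up = False
    disk.write(10, b"new-A")
    disk.write(11, b"new-B")

    # The link comes back and the commit block, still in memory, lands.
    disk.link_up = True
    disk.write(12, b"COMMIT")

    # At remount, recovery sees a commit block and replays the garbage.
    replayed = replay(disk, [10, 11], 12, [100, 101])
    return replayed, disk.blocks[100], disk.blocks[101]
```

Calling demo() shows the replay going ahead and the two filesystem blocks
ending up with the stale journal contents instead of either the old good
data or the new data.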

One solution that we have been looking at is the journal checksum patch
from U. Wisconsin.  This verifies that the journal transaction is complete
before doing any recovery checkpointing, and would prevent such an error.
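
The idea can be sketched in a few lines of Python.  This is not the
Wisconsin patch and makes no claim about how it computes or stores its
checksum; it assumes a CRC32 over the transaction's blocks kept in a
made-up commit record, and the function names are hypothetical.

```python
import zlib


def checksum(blocks):
    # Running CRC32 over all blocks of the transaction, in order.
    crc = 0
    for blk in blocks:
        crc = zlib.crc32(blk, crc)
    return crc


def make_commit(intended_blocks):
    # The commit record carries the checksum of what *should* be on disk.
    return {"magic": "COMMIT", "crc": checksum(intended_blocks)}


def transaction_is_valid(commit, blocks_on_disk):
    # Recovery recomputes the checksum from what is *actually* on disk
    # and refuses to replay the transaction if it does not match.
    if commit is None or commit.get("magic") != "COMMIT":
        return False
    return checksum(blocks_on_disk) == commit["crc"]


intended = [b"bitmap-new", b"inode-new"]
commit = make_commit(intended)

# The second journal write failed, so the old contents are still there.
on_disk_after_failure = [b"bitmap-new", b"inode-old"]

ok_when_complete = transaction_is_valid(commit, intended)
ok_when_stale = transaction_is_valid(commit, on_disk_after_failure)
```

With the check in place, a commit block that reaches the disk late is no
longer enough by itself to trigger a replay: the stale journal block makes
the checksum mismatch and the transaction is skipped.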

Another option (to at least avoid gratuitous damage, though I haven't
tried this yet) is to mark the whole block device read-only when the
journal is aborted, so that the block layer will prevent any writes from
being submitted to the block device.  The jbd code would need to clear
the read-only flag once the filesystem has been unmounted and all the
buffers have been discarded.
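
As a rough illustration of that ordering (again a toy model in Python, not
kernel code; the BlockDevice and Journal classes and their methods are
invented for this sketch), the read-only flag is set on abort, blocks any
late writes, and is cleared only after the dirty buffers are dropped:

```python
class BlockDevice:
    def __init__(self):
        self.read_only = False
        self.blocks = {}

    def submit_write(self, blkno, data):
        # The block layer rejects writes while the device is read-only.
        if self.read_only:
            raise PermissionError("device is read-only")
        self.blocks[blkno] = data


class Journal:
    def __init__(self, dev):
        self.dev = dev
        self.dirty = {}        # block number -> data not yet written
        self.aborted = False

    def abort(self):
        # On journal abort, fence off the whole device.
        self.aborted = True
        self.dev.read_only = True

    def flush(self):
        # Try to write out dirty buffers; count how many got through.
        written = 0
        for blkno, data in list(self.dirty.items()):
            try:
                self.dev.submit_write(blkno, data)
                written += 1
            except PermissionError:
                pass  # the write is refused, the buffer stays in memory
        return written

    def unmount(self):
        # Discard the buffers first, then make the device writable again.
        self.dirty.clear()
        self.dev.read_only = False
```

In this model a commit block that is still dirty when the journal aborts
can never reach the device, because flush() is refused until unmount() has
already thrown the buffer away.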

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.