[dm-devel] dm thin: commit pool's metadata on last close of thin device
Mike Snitzer
snitzer at redhat.com
Thu May 17 04:00:41 UTC 2012
On Wed, May 16 2012 at 8:43pm -0400,
Mikulas Patocka <mpatocka at redhat.com> wrote:
>
>
> On Wed, 16 May 2012, Mike Snitzer wrote:
>
> > Reinstate dm_flush_all and dm_table_flush_all. dm_blk_close will
> > now trigger the .flush method of all targets within a table on the last
> > close of a DM device.
> >
> > In the case of the thin target, the thin_flush method will commit the
> > backing pool's metadata.
> >
> > Doing so avoids a deadlock that has been observed with the following
> > sequence (as can be triggered via "dmsetup remove_all"):
> > - IO is issued to a thin device, thin device is closed
> > - pool's metadata device is suspended before the pool is
> > - because the pool still has outstanding IO we deadlock because the
> > pool's metadata device is suspended
> >
> > Signed-off-by: Mike Snitzer <snitzer at redhat.com>
> > Cc: stable at vger.kernel.org
>
> I'd say --- don't do this sequence.
>
> Device mapper generally expects that devices are suspended top-down ---
> i.e. you should first suspend the thin device and then suspend its
> underlying data and metadata device. If you violate this sequence and
> suspend bottom-up, you get deadlocks.
>
> For example, if dm-mirror is resynchronizing and you suspend the
> underlying leg or log volume and then suspend dm-mirror, you get a
> deadlock.
>
> If dm-snapshot is merging and you suspend the underlying snapshot or
> origin volume and then suspend the snapshot-merge target, you get a deadlock.
>
> These are not bugs in dm-mirror or dm-snapshot, this is expected behavior.
> Userspace shouldn't do any bottom-up suspend sequence.
>
> In the same sense, if you suspend the underlying data or metadata pool and
> then suspend dm-thin, you get a deadlock too. Fix userspace so that it
> doesn't do it.
Yeah, I agree. I told Zdenek it'd be best to just not do it.
'dmsetup remove_all' is a dumb command that invites these deadlocks when
more sophisticated stacking is being used.
But all said, in general I don't have a problem with triggering a target
specific "flush" on device close.
(Though the implementation of thin_flush could be made more
intelligent... as it stands we can see a pretty good storm of redundant
metadata commits if 100s of thin devices are closed simultaneously --
creating pmd->root_lock write lock contention.)