[dm-devel] staged dm_internal_{suspend, resume} related changes for wider review

Fri Nov 7 18:33:21 UTC 2014

On Fri, Nov 07 2014 at 11:20am -0500,
Mikulas Patocka <mpatocka at redhat.com> wrote:

> 
> 
> On Wed, 5 Nov 2014, Mike Snitzer wrote:
> 
> > On Wed, Nov 05 2014 at  9:37am -0500,
> > Mikulas Patocka <mpatocka at redhat.com> wrote:
> > 
> > > The patch series introduces two suspend mechanisms and it is unclear how 
> > > should they interact with each other.
> > 
> > And this point is not correct.  As you know dm_internal_suspend and
> > dm_internal_resume interface predates any of my changes.
> > 
> > That existing interface was extended them to be (mostly) fully formed
> > equivalents of dm_suspend() and dm_resume().
> > 
> > I say "mostly" because dm_internal_resume() doesn't call into the targets'
> > resume hooks because no existing callers (dm-stats or dm-thinp) need
> > to.  But obviously dm_resume() does need to so it passes @resume_targets
> > as true to __dm_resume().
> > 
> > I'm not trying to suggest there is a bug or bugs in this new code (you
> > already pointed out the locking issue across ioctls that I fixed).
> > 
> > But a bug doesn't implicitly mean this is an imperfect way forward --
> > if/when a bug is found we'll deal with it.. so feel free to pour over
> > this code to see if there is a bug or bugs.  I really do welcome your
> > review -- I would just like technical issues to be the focus of any
> > technical review.
> 
> Problems with that patch set:
> 
> 1. You walk all thin targets in pool_presuspend and call internal_suspend 
> on them and then again in pool_resume call internal_resume on them. 
> Between calls to pool_presuspend and pool_resume, dm-thin devices may be 
> added or removed, resulting in unballanced calls.

No, we cannot add/remove or activate/deactivate thin devices while the
pool is suspended.  Those operations should block until the pool is
resumed.  I'll audit/fix accordingly.

> 2. You walk all thin targets and call internal_suspend on them. If two 
> thin targets are in one dm table, it calls internal_suspend twice on the 
> same md device. It also calls internal_suspend twice on the same md device 
> if the device has both active and inactive table with a thin target.

While I see your point, a thin DM table will never have multiple thin
targets in it.  This probably _should_ be enforced by adding
DM_TARGET_SINGLETON to the thin_target's .features (should also probably
have DM_TARGET_IMMUTABLE).

> 3. The device may be suspended internally by pool presuspend and by 
> statistics at the same time - the code doesn't handle that:
> if (WARN_ON(dm_suspended_internally_md(md))) goto out; /* disallow nested 
> internal suspends! */

Shouldn't that be fine?  Just implies the stats won't be as precise...
BUT, the WARN_ON() is clearly overkill and needs to be removed.

> 4. when pool_presuspend is called, md->suspend_lock is alrady held. Taking 
> md->suspend_lock on another device results in lockdep warning.

That should be fixed with proper lockdep training.  As I thought was the
case by the mutex_lock_nested() I added to dm_suspend().  But I'll
recompile my kernel with lockdep enabled and silence lockdep once and
for all if it is still complaining.

> For reasons 1 and 2, I wouldn't really deal with "thin" targets at all - 
> they may be created or deleted independent on pool status. Instead, we 
> should block all active bios inside the pool - the bios are already 
> registered in dm_deferred_set or in the prison, so all you need to do is 
> to set a flag pool's presuspend method that causes all new bios to be 
> queues and the wait until the prison is empty and the counters in 
> deferred_set reach zero.

Given my explanation above, reasons 1 and 2 shouldn't really be a
concern in the end.

Thanks for the review!