[dm-devel] v4.8 dm-mpath
Mike Snitzer
snitzer at redhat.com
Thu Aug 18 01:54:34 UTC 2016
On Wed, Aug 17 2016 at 8:29pm -0400,
Bart Van Assche <bart.vanassche at sandisk.com> wrote:
> On 08/16/2016 07:43 PM, Mike Snitzer wrote:
> >On Tue, Aug 16 2016 at 3:12pm -0400, Mike Snitzer <snitzer at redhat.com> wrote:
> >>On Tue, Aug 16 2016 at 1:32pm -0400, Bart Van Assche <bart.vanassche at sandisk.com> wrote:
> >>>Can you have a look at this?
> >>
> >>I'll get linux-dm.git's 'dm-4.8' branch (v4.8-rc2) loaded on one of my
> >>testbed systems to run the mptest testsuite. It should provide coverage
> >>for simple failover and failback.
> >
> >I successfully ran mptest against that v4.8-rc2 based kernel. Any
> >chance you could try mptest to see if it also passes for you? Or if it
> >is able to trigger the issue for you.
> >
> >mptest's runtest script defaults to setting up scsi-mq and dm-mq.
>
> Hello Mike,
>
> I will run mptest as soon as I have the time.
OK, thanks.
> Earlier today I came up with three patches that avoid the hang in
> truncate_inode_pages_range() that I had reported before. It would be
> appreciated if you could have a look at these patches. Please keep
> in mind that I'm not a dm expert.
I don't recall where you mentioned a hang in
truncate_inode_pages_range().. ah I see it here:
https://patchwork.kernel.org/patch/9202331/
Comments inlined below.
> From 0e8e9f7d7489f5e3b0bf9e0c59257277d08a2ec0 Mon Sep 17 00:00:00 2001
> From: Bart Van Assche <bart.vanassche at sandisk.com>
> Date: Wed, 17 Aug 2016 14:18:36 -0700
> Subject: [PATCH 1/3] block: Avoid that requests get stuck when stopping and
> restarting a queue
>
> If a driver calls blk_start_queue() after a queue has been marked
> "dead", avoid that requests get stuck by invoking __blk_run_queue()
> directly if QUEUE_FLAG_DEAD has been set.
>
> While testing the dm-mpath driver I noticed that sporadically a
> readahead request was generated by the filesystem on top of the dm
> device but that that I/O request was never processed. This happened
> only while the dm core was suspending and resuming I/O.s
>
> The following backtrace (triggered by a WARN_ON_ONCE(true) statement
> that I had inserted just before the __blk_run_queue(q) call) shows
> that this code path is relevant:
>
> WARNING: CPU: 7 PID: 4707 at drivers/md/dm.c:2035 map_request+0x1d5/0x340 [dm_mod]
> Aug 17 15:47:35 ion-dev-ib-ini kernel: CPU: 7 PID: 4707 Comm: kdmwork-254:0 Tainted: G W 4.7.0-dbg+ #2
> Call Trace:
> [<ffffffff81322ee7>] dump_stack+0x68/0xa1
> [<ffffffff81061e06>] __warn+0xc6/0xe0
> [<ffffffff81061ed8>] warn_slowpath_null+0x18/0x20
> [<ffffffffa04282d5>] map_request+0x1d5/0x340 [dm_mod]
> [<ffffffffa042845d>] map_tio_request+0x1d/0x40 [dm_mod]
> [<ffffffff81085fe1>] kthread_worker_fn+0xd1/0x1b0
> [<ffffffff81085e9a>] kthread+0xea/0x100
> [<ffffffff8162857f>] ret_from_fork+0x1f/0x40
>
> References: commit 704605711ef0 ("block: Avoid scheduling delayed work on a dead queue")
> Signed-off-by: Bart Van Assche <bart.vanassche at sandisk.com>
> Cc: Christoph Hellwig <hch at lst.de>
> Cc: Mike Snitzer <snitzer at redhat.com>
> Cc: stable <stable at vger.kernel.org>
> ---
> block/blk-core.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 0d9638e..5f75709 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -355,6 +355,8 @@ void blk_run_queue_async(struct request_queue *q)
> {
> if (likely(!blk_queue_stopped(q) && !blk_queue_dead(q)))
> mod_delayed_work(kblockd_workqueue, &q->delay_work, 0);
> + else
> + __blk_run_queue(q);
> }
> EXPORT_SYMBOL(blk_run_queue_async);
Your else clause would catch the case when the queue is stopped but
not dead... __blk_run_queue() will handle this cleanly but I think the
logic could be: else if (blk_queue_dead(q))
Also, strikes me as odd to have an _async interface resort to a
synchronous call. Why is that needed? Are you sure it is safe for all
callers?
> From 9b103cfb2d9b4cdec73c7d09188b4d42b041d49e Mon Sep 17 00:00:00 2001
> From: Bart Van Assche <bart.vanassche at sandisk.com>
> Date: Wed, 17 Aug 2016 16:26:02 -0700
> Subject: [PATCH 2/3] dm: Change return type of __dm_internal_suspend() to int
>
> ---
> drivers/md/dm.c | 15 ++++++++++-----
> 1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 3689b7f..6eccdf3 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -3379,16 +3379,18 @@ out:
> * It may be used only from the kernel.
> */
>
> -static void __dm_internal_suspend(struct mapped_device *md, unsigned suspend_flags)
> +static int __dm_internal_suspend(struct mapped_device *md,
> + unsigned suspend_flags)
> {
> struct dm_table *map = NULL;
> + int ret = 0;
>
> if (md->internal_suspend_count++)
> - return; /* nested internal suspend */
> + goto out; /* nested internal suspend */
>
> if (dm_suspended_md(md)) {
> set_bit(DMF_SUSPENDED_INTERNALLY, &md->flags);
> - return; /* nest suspend */
> + goto out; /* nest suspend */
> }
>
> map = rcu_dereference_protected(md->map, lockdep_is_held(&md->suspend_lock));
> @@ -3399,10 +3401,13 @@ static void __dm_internal_suspend(struct mapped_device *md, unsigned suspend_fla
> * would require changing .presuspend to return an error -- avoid this
> * until there is a need for more elaborate variants of internal suspend.
> */
> - (void) __dm_suspend(md, map, suspend_flags, TASK_UNINTERRUPTIBLE,
> - DMF_SUSPENDED_INTERNALLY);
> + ret = __dm_suspend(md, map, suspend_flags, TASK_UNINTERRUPTIBLE,
> + DMF_SUSPENDED_INTERNALLY);
>
> dm_table_postsuspend_targets(map);
> +
> +out:
> + return ret;
> }
>
> static void __dm_internal_resume(struct mapped_device *md)
So you're only wanting to have your new __dm_internal_suspend() caller,
__dm_destroy, issue a WARN_ON if __dm_suspend() fails.
Is this _really_ important or did you just do this for debugging
purposes?
> From f6e58e71eec584b3f66fe8367340ca896d29408e Mon Sep 17 00:00:00 2001
> From: Bart Van Assche <bart.vanassche at sandisk.com>
> Date: Wed, 17 Aug 2016 15:26:07 -0700
> Subject: [PATCH 3/3] dm: Avoid that requests get stuck while destroying a dm
> device
>
> ---
> drivers/md/dm.c | 25 ++++++++++++++-----------
> 1 file changed, 14 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 6eccdf3..ae4bfa0 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -2958,10 +2958,13 @@ const char *dm_device_name(struct mapped_device *md)
> }
> EXPORT_SYMBOL_GPL(dm_device_name);
>
> +static int __dm_internal_suspend(struct mapped_device *md,
> + unsigned suspend_flags);
> +
> static void __dm_destroy(struct mapped_device *md, bool wait)
> {
> - struct dm_table *map;
> - int srcu_idx;
> + struct request_queue *q = dm_get_md_queue(md);
> + int res;
>
> might_sleep();
>
> @@ -2970,21 +2973,21 @@ static void __dm_destroy(struct mapped_device *md, bool wait)
> set_bit(DMF_FREEING, &md->flags);
> spin_unlock(&_minor_lock);
>
> - if (dm_request_based(md) && md->kworker_task)
> - flush_kthread_worker(&md->kworker);
Header needs detail. E.g. __dm_suspend() via, __dm_internal_suspend(),
will call flush_kthread_worker(). But that is a nit in the grand scheme
of things. Bigger concerns below.
> + /*
> + * Avoid that dm_make_request() gets called after
> + * __dm_internal_suspend() finished.
> + */
> + spin_lock_irq(q->queue_lock);
> + queue_flag_set(QUEUE_FLAG_DYING, q);
> + spin_unlock_irq(q->queue_lock);
Feels like the case for marking the queue as DYING can be made
independent of changes to the mechanism for flushing IO here.
>
> /*
> * Take suspend_lock so that presuspend and postsuspend methods
> * do not race with internal suspend.
> */
> mutex_lock(&md->suspend_lock);
> - map = dm_get_live_table(md, &srcu_idx);
> - if (!dm_suspended_md(md)) {
> - dm_table_presuspend_targets(map);
> - dm_table_postsuspend_targets(map);
> - }
> - /* dm_put_live_table must be before msleep, otherwise deadlock is possible */
> - dm_put_live_table(md, srcu_idx);
> + res = __dm_internal_suspend(md, DM_SUSPEND_NOFLUSH_FLAG);
> + WARN_ONCE(res, "__dm_internal_suspend() returned %d\n", res);
> mutex_unlock(&md->suspend_lock);
>
> /*
Please be explicit about why switching to using __dm_internal_suspend()
is important. Are you just going for code reuse or is there a more
important reason?
I can infer you probably want the call to dm_stop_queue()...
But I need way more convincing on why __dm_internal_suspend() is needed.
As you can see from the header for commit ffcc3936416, the internal
suspend code is relatively complex. Increasing its use, especially in a
method such as __dm_destroy, needs a good reason.
But that aside, what raises serious concerns for me is that you're using
DM_SUSPEND_NOFLUSH_FLAG where we absolutely _need_ to flush IO (given
we're in __dm_destroy).
At a minimum DM_SUSPEND_NOFLUSH_FLAG must not be used.
Mike
More information about the dm-devel
mailing list