[dm-devel] [PATCH 9/9] dm path selector: Avoid that device removal triggers an infinite loop

Mike Snitzer snitzer at redhat.com
Thu Sep 1 15:26:42 UTC 2016


On Thu, Sep 01 2016 at 11:22am -0400,
Bart Van Assche <bart.vanassche at sandisk.com> wrote:

> On 09/01/2016 08:06 AM, Mike Snitzer wrote:
> >On Thu, Sep 01 2016 at 10:14am -0400,
> >Bart Van Assche <Bart.VanAssche at sandisk.com> wrote:
> >
> >>On 08/31/16 20:29, Mike Snitzer wrote:
> >>>On Wed, Aug 31 2016 at  6:18pm -0400,
> >>>Bart Van Assche <bart.vanassche at sandisk.com> wrote:
> >>>
> >>>>If pg_init_retries is set and a request is queued against a
> >>>>multipath device with all underlying block devices in the "dying"
> >>>>state then an infinite loop is triggered because activate_path()
> >>>>never succeeds and hence never calls pg_init_done(). Fix this by
> >>>>making ql_select_path() skip dying paths.
> >>>
> >>>Assuming DM multipath needs to be sprinkling these dying queue checks so
> >>>deep (which I'm not yet sold on):
> >>>
> >>>Same would be needed in service-time and round-robin right?
> >>
> >>Hello Mike,
> >>
> >>Before addressing service-time and round-robin path selectors I wanted
> >>to make sure that we reach agreement about how to fix the queue length
> >>path selector.
> >>
> >>Do you have a proposal for an alternative approach to fix the infinite
> >>loop that can be triggered during device removal?
> >
> >I'm going to look closer now.  But I'd prefer to see the "dying" state
> >check(s) elevated to DM multipath.  Really would rather the path
> >selectors not have to worry about this state.
> 
> Hello Mike,
> 
> How about making blk_cleanup_queue() invoke a callback function in dm or
> dm-mpath, and using that callback function to keep track of the number of
> paths that are not in the "dying" state? That would allow the dm or
> dm-mpath driver to detect whether or not all paths are in the dying state
> without having to modify every path selector. This is just an idea - there
> might be better alternatives.

Even that seems like overkill.  What about this?  Any chance you could
try the linux-dm.git 'dm-4.9' branch with this patch on top?

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index ac734e5..15db5e9 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -1521,10 +1521,10 @@ static void activate_path(struct work_struct *work)
 {
 	struct pgpath *pgpath =
 		container_of(work, struct pgpath, activate_path.work);
+	struct request_queue *q = bdev_get_queue(pgpath->path.dev->bdev);
 
-	if (pgpath->is_active)
-		scsi_dh_activate(bdev_get_queue(pgpath->path.dev->bdev),
-				 pg_init_done, pgpath);
+	if (pgpath->is_active && !blk_queue_dying(q))
+		scsi_dh_activate(q, pg_init_done, pgpath);
 	else
 		pg_init_done(pgpath, SCSI_DH_DEV_OFFLINED);
 }
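
For anyone who wants to reason about the branch structure outside the
kernel, here is a minimal userspace sketch of what the patch above is
meant to guarantee: the completion callback runs on every path through
activate_path(), including when the underlying queue is dying. All of
the mock_* names, the enum values and run_activation() are hypothetical
stand-ins invented for this illustration; they are not the kernel's
definitions, and mock_dh_activate() only models the healthy-queue case.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for SCSI_DH_OK / SCSI_DH_DEV_OFFLINED. */
enum dh_result { DH_OK, DH_DEV_OFFLINED };

/* Hypothetical stand-in for struct request_queue: only the dying flag. */
struct mock_queue {
	bool dying;
};

/* Hypothetical stand-in for struct pgpath. */
struct mock_pgpath {
	bool is_active;
	struct mock_queue *q;
	bool init_done;		/* set once the completion callback has run */
	enum dh_result result;	/* status handed to the completion callback */
};

/* Models pg_init_done(): records that activation has completed. */
static void mock_pg_init_done(struct mock_pgpath *p, enum dh_result r)
{
	p->init_done = true;
	p->result = r;
}

/* Models scsi_dh_activate() on a healthy queue: completes via callback. */
static void mock_dh_activate(struct mock_pgpath *p)
{
	mock_pg_init_done(p, DH_OK);
}

/* Mirrors the patched branch: a dying queue is treated as offlined. */
static void mock_activate_path(struct mock_pgpath *p)
{
	if (p->is_active && !p->q->dying)
		mock_dh_activate(p);
	else
		mock_pg_init_done(p, DH_DEV_OFFLINED);
}

/* Runs one activation and returns the status seen by the callback. */
static enum dh_result run_activation(bool is_active, bool queue_dying)
{
	struct mock_queue q = { .dying = queue_dying };
	struct mock_pgpath p = { .is_active = is_active, .q = &q };

	mock_activate_path(&p);
	assert(p.init_done);	/* the callback ran on every branch */
	return p.result;
}
```

In this model, run_activation(true, true) reports DH_DEV_OFFLINED, so a
caller waiting on the completion is always released instead of retrying
against a path that can never be activated.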



