[dm-devel] dm-mq and end_clone_request()

Mike Snitzer snitzer at redhat.com
Mon Jul 25 21:23:25 UTC 2016


On Mon, Jul 25 2016 at  1:53pm -0400,
Mike Snitzer <snitzer at redhat.com> wrote:

> On Thu, Jul 21 2016 at  4:58pm -0400,
> Bart Van Assche <bart.vanassche at sandisk.com> wrote:
> 
> > On 07/20/2016 11:33 AM, Mike Snitzer wrote:
> > >Would be interesting to know the error returned from map_request()'s
> > >ti->type->clone_and_map_rq().  Really should just be DM_MAPIO_REQUEUE.
> > >But the stack you've provided shows map_request calling
> > >dm_complete_request(), which implies dm_kill_unmapped_request() is being
> > >called due to ti->type->clone_and_map_rq() returning < 0.
> > 
> > Hello Mike,
> > 
> > Apparently certain requests fail with -EIO because DM_DEV_SUSPEND
> > ioctls are being submitted to the same multipath target. As you know
> > DM_DEV_SUSPEND changes QUEUE_IF_NO_PATH from 1 into 0. A WARN_ON()
> > > statement that I added to the dm-mpath driver showed me that
> > multipathd is submitting these DM_DEV_SUSPEND ioctls. In the output
> > of strace -fp$(pidof multipathd) I found the following:
> > 
> > [pid 13927] ioctl(5, DM_TABLE_STATUS, 0x7fa1000483f0) = 0
> > [pid 13927] write(1, "mpathbe: failed to setup multipa"..., 35) = 35
> > [pid 13927] write(1, "dm-0: uev_add_map failed\n", 25) = 25
> > [pid 13927] write(1, "uevent trigger error\n", 21) = 21
> > [pid 13927] write(1, "sdh: remove path (uevent)\n", 26) = 26
> > [pid 13927] ioctl(5, DM_TABLE_LOAD, 0x7fa1000483f0) = 0
> > [pid 13927] ioctl(5, DM_DEV_SUSPEND, 0x7fa1000483f0) = 0
> > 
> > I'm still analyzing these and other messages.
> 
> The various ioctls you're seeing are just multipathd responding to the
> failures.  Part of reloading a table (with revised path info, etc) is to
> suspend and then resume the device that is being updated.
> 
> But I'm not actually sure of the historical reasoning for why
> queue_if_no_path is disabled (and the active setting saved) on suspend.
> 
> I'll think about this further but maybe others recall why?

I think it dates back to when we queued IO within the multipath target.
Commit e809917735ebf ("dm mpath: push back requests instead of
queueing") obviously changed how we handle the retry.

But regardless, __must_push_back() should catch the case where
queue_if_no_path is cleared during suspend (by checking if current !=
saved).

So I'd be curious to know if your debugging has enabled you to identify
exactly where in the dm-mpath.c code the -EIO return is being
established.  do_end_io() is the likely candidate -- but again the
__must_push_back() check should prevent it and DM_ENDIO_REQUEUE should
be returned.
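
To make the expected behavior concrete, here is a rough userspace model of
that check. It is only a sketch, not the dm-mpath.c code: struct
mpath_model, must_push_back_model() and end_io_model() are hypothetical
names, and the noflush_suspending condition is an assumption about how the
"current != saved" test is gated.

```c
#include <stdbool.h>

/* Hypothetical userspace model; NOT the kernel's struct multipath. */
struct mpath_model {
	bool queue_if_no_path;        /* current setting */
	bool saved_queue_if_no_path;  /* setting saved when suspend cleared it */
	bool noflush_suspending;      /* assumption: a noflush suspend is in progress */
	unsigned int nr_valid_paths;  /* number of usable paths */
};

/* Simplified outcomes for a request that failed on its path. */
enum endio_result { ENDIO_FAIL_EIO, ENDIO_REQUEUE };

/*
 * Push back if queueing is currently enabled, or if it was only cleared
 * by suspend (current != saved) while the assumed noflush suspend runs.
 */
static bool must_push_back_model(const struct mpath_model *m)
{
	return m->queue_if_no_path ||
	       (m->queue_if_no_path != m->saved_queue_if_no_path &&
		m->noflush_suspending);
}

/*
 * Simplified end_io decision for a failed request: only report -EIO
 * when no paths remain and the push-back check does not apply.
 */
static enum endio_result end_io_model(const struct mpath_model *m)
{
	if (m->nr_valid_paths == 0 && !must_push_back_model(m))
		return ENDIO_FAIL_EIO;
	return ENDIO_REQUEUE;
}
```

In this model, a cleared current flag with the saved flag still set only
requeues while noflush_suspending is true; with it false, the no-path case
falls through to the error result.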

Mike



