[dm-devel] dm-mq and end_clone_request()

Hannes Reinecke hare at suse.de
Thu Aug 4 10:09:48 UTC 2016


On 08/04/2016 11:53 AM, Hannes Reinecke wrote:
> On 08/03/2016 06:55 PM, Bart Van Assche wrote:
>> On 08/02/2016 05:40 PM, Mike Snitzer wrote:
>>> But I asked you to run the v4.7 kernel patches I
>>> pointed to _without_ any of your debug patches.
>>
>> I need several patches to fix bugs that are not related to the device
>> mapper, e.g. "sched: Avoid that __wait_on_bit_lock() hangs"
>> (https://lkml.org/lkml/2016/8/3/289).
>>
> Hmm. Can you test with this patch?
> 
> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> index 7790a70..9daed03 100644
> --- a/drivers/md/dm-mpath.c
> +++ b/drivers/md/dm-mpath.c
> @@ -439,8 +439,7 @@ static int must_push_back(struct multipath *m)
>  {
>         return (test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags) ||
>                 ((test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags) !=
> -                 test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags)) &&
> -                dm_noflush_suspending(m->ti)));
> +                 test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags))));
>  }
> 
>  /*
> 
> Reasoning:
> The original check for dm_noflush_suspending() was for bio-based
> drivers, which needed to queue I/O within the device-mapper core.
> So during suspend this I/O would keep a reference to the device-mapper
> core and the table couldn't be swapped.
> For request-based multipathing, however, the I/O is _never_ held within
> the device-mapper core but rather pushed back to the request queue.
> I.e. even for pushback the I/O will never hold a reference to the
> device-mapper core, and the tables can be swapped irrespective of the
> 'dm_noflush_suspending()' setting.
> 
> Or that's the idea, at least :-)
> 
> Yes Mike, I know, it's not going to work with bio-based multipathing.
> But this is just for figuring out where the real issue is.
> 
And indeed.

multipathd is calling DM_SUSPEND _without_ the noflush_suspending flag
(on the grounds that originally it needed to flush all I/O from the
device-mapper core). That will cause I/O errors if any I/O is executed
after ->presuspend has been called.
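
A small userspace model of the predicate, before and after the patch,
may make the failing case easier to see. This is only a sketch under
stated assumptions: struct mpath_model and its three booleans are
hypothetical stand-ins for the two MPATHF_* bits and for the result of
dm_noflush_suspending(m->ti); none of it is kernel code. It also assumes
that ->presuspend clears QUEUE_IF_NO_PATH while keeping the old value in
the SAVED bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; not the kernel's struct multipath. */
struct mpath_model {
    bool queue_if_no_path;       /* models MPATHF_QUEUE_IF_NO_PATH */
    bool saved_queue_if_no_path; /* models MPATHF_SAVED_QUEUE_IF_NO_PATH */
    bool noflush_suspending;     /* models dm_noflush_suspending(m->ti) */
};

/* Predicate as it stands before the patch. */
static bool must_push_back_orig(const struct mpath_model *m)
{
    return m->queue_if_no_path ||
           ((m->queue_if_no_path != m->saved_queue_if_no_path) &&
            m->noflush_suspending);
}

/* Predicate with the dm_noflush_suspending() term dropped. */
static bool must_push_back_patched(const struct mpath_model *m)
{
    return m->queue_if_no_path ||
           (m->queue_if_no_path != m->saved_queue_if_no_path);
}

/* Builds one of the 8 possible flag combinations from a 3-bit value. */
static struct mpath_model model_from_bits(int bits)
{
    struct mpath_model m = {
        .queue_if_no_path = (bits & 1) != 0,
        .saved_queue_if_no_path = (bits & 2) != 0,
        .noflush_suspending = (bits & 4) != 0,
    };
    return m;
}

/* Counts the flag combinations where the two predicates disagree. */
static int count_disagreements(void)
{
    int n = 0;
    for (int bits = 0; bits < 8; bits++) {
        struct mpath_model m = model_from_bits(bits);
        if (must_push_back_orig(&m) != must_push_back_patched(&m))
            n++;
    }
    return n;
}

/* Checks that the patched predicate pushes back whenever the original does. */
static void check_patched_is_superset(void)
{
    for (int bits = 0; bits < 8; bits++) {
        struct mpath_model m = model_from_bits(bits);
        assert(!must_push_back_orig(&m) || must_push_back_patched(&m));
    }
}
```

In this model the two versions disagree for exactly one combination:
queue_if_no_path cleared, the saved value set, and a suspend without
noflush. The original predicate returns false there, so I/O arriving in
that window is not pushed back, which matches the errors seen after
->presuspend.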

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare at suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)



