[dm-devel] [RFC PATCH v3] dm mpath: add a queue_if_no_path timeout

Frank Mayhar fmayhar at google.com
Wed Nov 6 15:43:59 UTC 2013


On Wed, 2013-11-06 at 07:54 +0100, Hannes Reinecke wrote:
> On 11/05/2013 05:02 PM, Frank Mayhar wrote:
> > This is the patch submitted by Jun'ichi Nomura, originally based on
> > Mike's patch with some small changes by me.  Jun'ichi's description
> > follows, along with my changes:
> > 
> > On Tue, 2013-11-05 at 07:18 -0800, Frank Mayhar wrote:
> >> On Fri, 2013-11-01 at 04:17 +0000, Junichi Nomura wrote:
> >>> I slightly modified the patch:
> >>>   - fixed the timeout handler to correctly find
> >>>     clone request and "struct multipath"
> >>>   - the timeout handler just disables "queue_if_no_path"
> >>>     instead of killing the request directly
> >>>   - "dmsetup status" to show the parameter
> >>>   - changed an interface between dm core and target
> >>>   - added some debugging printk (you can remove them)
> >>> and checked the timeout occurs, at least.
> >>>
> >>> I'm not sure if this feature is good or not though.
> >>> (The timer behavior is not intuitive, I think)
> >> Thanks!  I integrated your new patch and tested it.  Sure enough, it
> >> seems to work.  I've made a few tweaks (added a module tunable and
> >> support for setting the timer in multipath_message(), removed your debug
> >> printks) and will submit the modified patch for discussion shortly.
> > 
> > Comments?
> > 
> Yeah. Seems to be my eternal fate; initiating fixes and not getting
> mentioned at all.
> Sigh.
> 
> I dimly remember having sent the original patch for the blk timeout
> function ... hence a short notice would've been nice.

Sorry, I did ding you early on (and I think Mike dinged you as well),
but you were apparently busy with other things at the time.

I also sent email last Thursday, quoted below:


On Wed, 2013-10-30 at 11:43 -0400, Mike Snitzer wrote:
> On Wed, Oct 30 2013 at 11:08am -0400,
> Frank Mayhar <fmayhar at google.com> wrote:
> 
> > On Tue, 2013-10-29 at 21:02 -0400, Mike Snitzer wrote:
> > > Any interest in this or should I just table it for >= v3.14?
> > 
> > Sorry, I've been busy putting out another fire.  Yes, there's
definitely
> > still interest.  I grabbed your revised patch and tested with it.
> > Unfortunately the timeout doesn't actually fire when requests are
queued
> > due to queue_if_no_path; IIRC the block request queue timeout logic
> > wasn't triggering.  I planned to look into it more deeply figure out
why
> > but I had to spend all last week fixing a nasty race and hadn't
gotten
> > back to it yet.
> 
> OK, Hannes, any idea why this might be happening?  The patch in
question
> is here: https://patchwork.kernel.org/patch/3070391/

I got to this today and so far the most interesting I see is that the
cloned request that's queued in multipath has no queue associated with
it when it's queued; a printk reveals:

[  517.610042] map_io: queueing rq ffff8801150e0070 q           (null)

When it's eventually dequeued, it gets a queue from the destination
device (in the pgpath) via bdev_get_queue().

Because of this and from just looking at the code, blk_start_request()
(and therefore blk_add_timer()) isn't being called for those requests,
so there's never a chance that the timeout would happen.

Does this make sense?  Or am I totally off-base?
-- 
Frank Mayhar
310-460-4042




More information about the dm-devel mailing list