[dm-devel] blk-mq request allocation stalls

Jens Axboe axboe at kernel.dk
Mon Jan 12 19:05:10 UTC 2015


On 01/12/2015 11:22 AM, Keith Busch wrote:
> On Mon, 12 Jan 2015, Jens Axboe wrote:
>> On 01/12/2015 10:53 AM, Keith Busch wrote:
>>> Is the nr_active count correct prior to starting the mkfs test? Trying
>>> to see if someone is calling "blk_mq_alloc_tag_set()" twice on the same
>>> set. It might be good to add a WARN if this is detected anyway.
>>
>> That might be a good debug aid, I agree. But the above doesn't look
>> like it's corrupted. If you add the values, you get 60 and 62 for the
>> two cases, which seems to indicate that we did bump the values
>> correctly, but for some reason we never did the decrement on
>> completion. Hence we stabilize around the queue depth of the device,
>> which will be 62 +/- a bit due to the sharing.
>>
>> I'm not familiar with how rq based dm works. We clone the original
>> request (which has the RQ_MQ_INFLIGHT flag set), then we issue the
>> clone(s) to the underlying device(s)? And when that completes, we
>> complete the original? That would work fine with the flag on the
>> original request. Maybe I'm missing something, and I'll let more
>> knowledgeable people discuss that.
>
> Oh, let's look at "__blk_rq_prep_clone". dm calls that after
> blk_get_request() for the blk-mq based multipath types and overrides the
> destination's cmd_flags with the source's even though the source was not
> allocated from a blk-mq based queue, much less a shared tag.

Heh, I suck, I had read that but read it as |=. So yes, that would seem
to back up my missing flag theory.
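
To make the difference between = and |= concrete, here is a toy userspace
model of the accounting. It is only a sketch and not the kernel code: the
toy_* names, the TOY_* flag values and the single nr_active counter are
all made up for illustration. It assumes the counter is bumped at
allocation and dropped at completion only if the in-flight flag is still
set on the request.

```c
#include <assert.h>

/* Hypothetical stand-ins for the in-flight flag and some unrelated flag. */
#define TOY_MQ_INFLIGHT (1u << 0)
#define TOY_OTHER_FLAG  (1u << 1)

struct toy_hctx { int nr_active; };
struct toy_rq { unsigned int cmd_flags; };

/* Allocation from a shared tag set: mark the request, bump the counter. */
static void toy_alloc(struct toy_hctx *h, struct toy_rq *rq)
{
	rq->cmd_flags = TOY_MQ_INFLIGHT;
	h->nr_active++;
}

/* Clone prep: either merge the source flags (|=) or overwrite them (=). */
static void toy_prep_clone(struct toy_rq *dst, const struct toy_rq *src,
			   int merge)
{
	if (merge)
		dst->cmd_flags |= src->cmd_flags;
	else
		dst->cmd_flags = src->cmd_flags;
}

/* Completion: the counter is only dropped if the flag survived. */
static void toy_complete(struct toy_hctx *h, struct toy_rq *rq)
{
	if (rq->cmd_flags & TOY_MQ_INFLIGHT) {
		assert(h->nr_active > 0);
		h->nr_active--;
		rq->cmd_flags &= ~TOY_MQ_INFLIGHT;
	}
}

/* Run n alloc/prep/complete cycles and return what is left in nr_active. */
static int toy_leaked(int n, int merge)
{
	struct toy_hctx h = { 0 };
	/* The source was not allocated from blk-mq, so no in-flight flag. */
	struct toy_rq src = { TOY_OTHER_FLAG };

	for (int i = 0; i < n; i++) {
		struct toy_rq clone;

		toy_alloc(&h, &clone);
		toy_prep_clone(&clone, &src, merge);
		toy_complete(&h, &clone);
	}
	return h.nr_active;
}
```

With the overwrite, toy_leaked(62, 0) leaves all 62 increments in place,
while toy_leaked(62, 1) returns 0. This only illustrates the missing
decrement; it is not a claim about what the right fix is.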


-- 
Jens Axboe



