[dm-devel] blk-mq request allocation stalls

Jens Axboe axboe at kernel.dk
Mon Jan 12 15:42:02 UTC 2015


On 01/12/2015 07:46 AM, Bart Van Assche wrote:
> On 01/10/15 04:10, Mike Snitzer wrote:
>> On Fri, Jan 09 2015 at  8:59pm -0500,
>> Jens Axboe <axboe at kernel.dk> wrote:
>>> Bart, could you try the patch (the -v4) and your DM hang and see if
>>> it solves it for you?
>>
>> Yes, I'm interested to hear from Bart on v4 too.
>
> Hello Mike and Jens,
>
> Sorry but even with v4 applied filesystem creation still takes too long.
> The kernel I have been testing with was generated as follows:
> * Started from Mike's dm-for-3.20-blk-mq branch.
> * Merged v3.19-rc4 with this branch.
> * Applied Jens' blk-mq tag patch and Mike's debug patch on top.
> * Modified Mike's patch to make it print the blk-mq "may_queue" state
>    (hctx_may_queue(hctx, bt)).
>
> Here are the results without multipath:
>
> # systemctl disable multipathd
> # systemctl stop multipathd
> # dmsetup remove_all
> # rmmod dm_service_time
> # rmmod dm_multipath
> # rmmod dm_mod
> # time mkfs.xfs -f /dev/sdc >/dev/null
> real    0m0.037s
> user    0m0.000s
> sys     0m0.020s
> # time mkfs.xfs -f /dev/sdd >/dev/null
> real    0m0.030s
> user    0m0.010s
> sys     0m0.010s
>
> With multipath:
>
> # ls -l /dev/sd[cd]
> brw-rw---- 1 root disk 8, 32 Jan 12 15:09 /dev/sdc
> brw-rw---- 1 root disk 8, 48 Jan 12 15:11 /dev/sdd
> # systemctl start multipathd
> # dmsetup table /dev/dm-0
> 0 256000 multipath 3 queue_if_no_path pg_init_retries 50 0 1 1
> service-time 0 2 2 8:48 1 1 8:32 1 1
> # time mkfs.xfs -f /dev/dm-0 >/dev/null
> real    0m8.845s
> user    0m0.000s
> sys     0m0.020s
> # time mkfs.xfs -f /dev/dm-0 >/dev/null
> real    0m14.905s
> user    0m0.000s
> sys     0m0.020s
>
> What is remarkable is that Mike's debug patch started to report
> "bt_get() returned -1" as soon as multipathd was started. The first of
> many identical call traces printed by this debug patch was as follows:
>
> bt_get: __bt_get() returned -1
> queue_num=2, nr_tags=62, reserved_tags=0, bits_per_word=3
> nr_free=62, nr_reserved=0, may_queue=0
> active_queues=8

Can you add dumping of hctx->nr_active when this fails? Your case is that
the may_queue logic says no-can-do, so it smells like the nr_active
accounting is wonky: you supposedly have no allocated tags (nr_free equals
nr_tags), yet it clearly thinks that you do.
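
For readers following along, here is a rough userspace sketch of the
fair-share check that the may_queue value reflects. It is modelled on how
hctx_may_queue() in block/blk-mq-tag.c appeared to work around v3.19, and
it is an approximation, not the kernel code: the names fair_share_depth()
and may_queue_sketch() are hypothetical, and the shared-tag flag and
queue-active state tests are assumed to have passed already.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: per-queue tag allowance when a tag set is shared.
 * Assumed formula: ceil(nr_tags / users), but never less than 4.
 * The caller must pass users > 0.
 */
static unsigned int fair_share_depth(unsigned int nr_tags, unsigned int users)
{
    unsigned int share = (nr_tags + users - 1) / users;

    return share > 4U ? share : 4U;
}

/*
 * Hypothetical model of the may_queue decision: a hardware queue may
 * allocate another tag only while its own active count (nr_active) is
 * below its fair share. Flag and state checks are omitted (assumption).
 */
static bool may_queue_sketch(unsigned int nr_tags, unsigned int users,
                             unsigned int nr_active)
{
    if (nr_tags == 1 || users == 0)
        return true;    /* nothing to divide, or no active users */

    return nr_active < fair_share_depth(nr_tags, users);
}

/* Plug in the values from the debug output: nr_tags=62, active_queues=8. */
static void check_reported_case(void)
{
    assert(fair_share_depth(62, 8) == 8);
    assert(may_queue_sketch(62, 8, 0));   /* no active requests: allowed */
    assert(!may_queue_sketch(62, 8, 8));  /* at the limit: refused */
}
```

If this model is right, nr_tags=62 with active_queues=8 gives a per-queue
limit of 8, so may_queue=0 would require nr_active to be at least 8 even
though nr_free=62 says nothing is allocated. That mismatch is why the
nr_active value is worth printing.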

-- 
Jens Axboe



