[dm-devel] blk-mq request allocation stalls

Mike Snitzer snitzer at redhat.com
Tue Jan 13 14:17:46 UTC 2015


On Tue, Jan 13 2015 at  7:29am -0500,
Bart Van Assche <bart.vanassche at sandisk.com> wrote:

> On 01/12/15 21:22, Mike Snitzer wrote:
> > FYI, I staged Keith's patch here:
> > https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-for-3.20-blk-mq&id=7004ddf2462df38c6e3232ac020ed6ff655cc07e
> > 
> > Bart, this is the tip of the linux-dm.git "dm-for-3.20-blk-mq" branch.
> > Please test, it should hopefully take care of the stall you've been
> > seeing.
> 
> Hello Mike,
> 
> In the quick test I ran the I/O stalls were indeed gone. Thanks :-)

Good news, followed by a new mole rearing its head ;)
 
> However, I hit another issue while running I/O on top of a multipath
> device (on a kernel with lockdep and SLUB memory poisoning enabled):
>
> NMI watchdog: BUG: soft lockup - CPU#7 stuck for 23s! [kdmwork-253:0:3116]
> CPU: 7 PID: 3116 Comm: kdmwork-253:0 Tainted: G        W      3.19.0-rc4-debug+ #1
> Call Trace:
>  [<ffffffff8118e4be>] kmem_cache_alloc+0x28e/0x2c0
>  [<ffffffff81346aca>] alloc_iova_mem+0x1a/0x20
>  [<ffffffff81342c8e>] alloc_iova+0x2e/0x250
>  [<ffffffff81344b65>] intel_alloc_iova+0x95/0xd0
>  [<ffffffff81348a15>] intel_map_sg+0xc5/0x260
>  [<ffffffffa07e0661>] srp_queuecommand+0xa11/0xc30 [ib_srp]
>  [<ffffffffa001698e>] scsi_dispatch_cmd+0xde/0x5a0 [scsi_mod]
>  [<ffffffffa0017480>] scsi_queue_rq+0x630/0x700 [scsi_mod]
>  [<ffffffff8125683d>] __blk_mq_run_hw_queue+0x1dd/0x370
>  [<ffffffff81256aae>] blk_mq_alloc_request+0xde/0x150
>  [<ffffffff8124bade>] blk_get_request+0x2e/0xe0
>  [<ffffffffa07ebd0f>] __multipath_map.isra.15+0x1cf/0x210 [dm_multipath]
>  [<ffffffffa07ebd6a>] multipath_clone_and_map+0x1a/0x20 [dm_multipath]
>  [<ffffffffa044abb5>] map_tio_request+0x1d5/0x3a0 [dm_mod]
>  [<ffffffff81075d16>] kthread_worker_fn+0x86/0x1b0
>  [<ffffffff81075c0f>] kthread+0xef/0x110
>  [<ffffffff814db42c>] ret_from_fork+0x7c/0xb0

Unfortunate.  Is this still with a 16MB backing device or is it real
hardware?  Can you share the workload so that Keith and/or I can try to
reproduce it?



