[dm-devel] [PATCH 8/9] dm: Fix two race conditions related to stopping and starting queues

Mike Snitzer snitzer at redhat.com
Fri Sep 2 15:12:13 UTC 2016


On Thu, Sep 01 2016 at  8:03pm -0400,
Bart Van Assche <bart.vanassche at sandisk.com> wrote:

> On 09/01/2016 04:48 PM, Mike Snitzer wrote:
> > On Thu, Sep 01 2016 at  7:17pm -0400,
> > Bart Van Assche <bart.vanassche at sandisk.com> wrote:
> >> Sorry that I misread your previous e-mail. After I received your
> >> latest e-mail I rebased my tree on top of the devel.bart branch
> >> mentioned above. My tests still pass. The only two patches in my
> >> tree that are relevant and that are not in the devel.bart branch
> >> have been attached to this e-mail. Did your test involve the sd
> >> driver? If so, do the attached two patches help? If the sd driver
> >> was not involved, can you provide more information about the hang
> >> you ran into? The output and log messages generated by the following
> >> commands after the hang has been reproduced would be very welcome:
> >> * echo w > /proc/sysrq-trigger
> >> * (cd /sys/block && grep -a '' dm*/mq/*/{pending,cpu*/rq_list})
> > 
> > sd is used.  I'll apply those patches and test tomorrow, but I'm pretty
> > skeptical.
> > 
> > Haven't had any problems with these tests for quite a while.  The tests
> > I'm running are just those in the mptest testsuite, see:
> > https://github.com/snitm/mptest
> > 
> > Running them should be as simple as you doing:
> > 
> > git clone git://github.com/snitm/mptest.git
> > cd mptest
> > ./runtest
> > 
> > The default is to use dm-mq on scsi-mq on top of tcmloop.
> > 
> > [ ... ]
> 
> Hello Mike,
> 
> If I run mptest on my setup I can reproduce the hang. But what I see is
> that the service-time path selector is in use when the hang is triggered.
> I will patch that path selector in the same way as I did with the
> queue-length path selector and rerun the test.
> 
> # dmsetup table
> 1Linux_scsi_debug_2000: 0 2097152 multipath 3 retain_attached_hw_handler queue_mode mq 1 alua 3 1 service-time 0 1 2 8:128 1 1 service-time 0 1 2 8:144 1 1 service-time 0 1 2 8:112 1 1 
> mp: 0 2097152 multipath 3 retain_attached_hw_handler queue_mode mq 1 alua 2 1 queue-length 0 2 1 8:96 1 8:112 1 queue-length 0 2 1 8:128 1 8:144 1 
> # (cd /sys/block && grep -a '' dm*/mq/*/{pending,cpu*/rq_list}) | grep -v ':$'
> dm-0/mq/0/pending:      ffff880358610000
> dm-1/mq/0/pending:      ffff880358220200
> dm-1/mq/0/pending:      ffff880358220400

In case I haven't been clear: calling blk_mq_freeze_queue() _after_
you've suspended the DM device will only trigger IO that will get
re-queued to the DM device's blk-mq queue.  So you're creating a
livelock since blk_mq_freeze_queue_wait() will not return, see stack
from:
https://www.redhat.com/archives/dm-devel/2016-September/msg00017.html

FYI, the BLK_MQ_S_STOPPED check that you removed from dm_mq_queue_rq()
in this commit...
http://git.kernel.org/cgit/linux/kernel/git/snitzer/linux.git/commit/?h=devel.bart&id=69eb3e60e099a6117fc754e70eedd504685326ad
...is effectively serving as this when the device is suspended:

diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index b5db523..3bc16dc 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -863,6 +863,9 @@ static int dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
                dm_put_live_table(md, srcu_idx);
        }

+       if (unlikely(test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags)))
+               return BLK_MQ_RQ_QUEUE_BUSY;
+
        if (ti->type->busy && ti->type->busy(ti))
                return BLK_MQ_RQ_QUEUE_BUSY;

Which is comparable to what the old .request_fn DM does (see:
dm_make_request).

Without that type of suspend check the code will go on to call
dm-mpath.c:multipath_clone_and_map() which will only result in
DM_MAPIO_REQUEUE (and dm_mq_queue_rq returning BLK_MQ_RQ_QUEUE_BUSY in
the case when multipath has no paths, via must_push_back_rq()).

Anyway, not what we want in general.  The goal during DM device suspend
is to stop IO from being mapped.  With request-based DM suspend we're
punting requests back to the block layer's request_queue.

So in the case of blk-mq request-based DM: we cannot expect
blk_mq_freeze_queue(), during suspend, to complete if requests are
getting requeued to the blk-mq queue via BLK_MQ_RQ_QUEUE_BUSY.
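
To illustrate that interaction, here is a minimal user-space sketch.  It
is not kernel code: the toy_* names, the "pending" counter and the
max_polls bound are all made up for the illustration, and the real
blk_mq_freeze_queue_wait() has no such bound.  It only models the idea
that a freeze-style wait needs outstanding requests to drain, while a
suspended queue_rq keeps handing them back as busy.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy return codes; they only loosely mirror BLK_MQ_RQ_QUEUE_{OK,BUSY}. */
enum toy_rc { TOY_QUEUE_OK, TOY_QUEUE_BUSY };

struct toy_queue {
	bool suspended;	/* stands in for DMF_BLOCK_IO_FOR_SUSPEND */
	int pending;	/* requests on the queue that have not completed */
};

/* Toy ->queue_rq: refuse to map while suspended, otherwise complete one. */
static enum toy_rc toy_queue_rq(struct toy_queue *q)
{
	if (q->suspended)
		return TOY_QUEUE_BUSY;	/* request stays on the queue */

	q->pending--;			/* mapped and completed */
	return TOY_QUEUE_OK;
}

/*
 * Toy freeze-wait: run the queue until nothing is pending.  max_polls
 * exists only so that the demonstration terminates.
 * Returns true if the queue drained, false if requests are still pending.
 */
static bool toy_freeze_wait(struct toy_queue *q, int max_polls)
{
	assert(max_polls > 0);

	for (int i = 0; i < max_polls && q->pending > 0; i++)
		(void)toy_queue_rq(q);	/* one dispatch attempt per poll */

	return q->pending == 0;
}
```

With suspended == false and three pending requests, toy_freeze_wait()
drains the queue and returns true.  With suspended == true it returns
false however large max_polls is, because every dispatch attempt leaves
the request where it was.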

Mike



