[dm-devel] [PATCH 8/9] dm: Fix two race conditions related to stopping and starting queues

Mike Snitzer snitzer at redhat.com
Thu Sep 1 19:05:05 UTC 2016


On Thu, Sep 01 2016 at  1:59pm -0400,
Bart Van Assche <bart.vanassche at sandisk.com> wrote:

> On 09/01/2016 09:12 AM, Mike Snitzer wrote:
> >On Thu, Sep 01 2016 at 11:50am -0400,
> >Mike Snitzer <snitzer at redhat.com> wrote:
> >
> >>On Thu, Sep 01 2016 at 11:31am -0400,
> >>Bart Van Assche <bart.vanassche at sandisk.com> wrote:
> >>
> >>>On 09/01/2016 08:05 AM, Mike Snitzer wrote:
> >>>>I've staged most of your changes (with slight tweaks), see:
> >>>>https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-4.9
> >>>>
> >>>>Only remaining issue is the queue dying race(s) in dm-multipath.
> >>>
> >>>Thanks Mike! Two minor comments though:
> >>>* In dm_start_queue(), I think that the queue_flag_clear_unlocked()
> >>>  call should be converted into queue_flag_clear() and that it should
> >>>  be protected by the block layer queue lock. Every call of
> >>>  queue_flag_clear_unlocked() after block device initialization has
> >>>  finished is wrong if blk_cleanup_queue() can be called concurrently.
> >>
> >>OK, I'll have a look.
> >
> >Please see/test the dm-4.8 and dm-4.9 branches (dm-4.9 being rebased
> >on top of dm-4.8):
> >https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-4.8
> >https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-4.9
> 
> Hello Mike,
> 
> The result of my tests of the dm-4.9 branch is as follows:
> * With patch "dm mpath: check if path's request_queue is dying in
> activate_path()" I still see every now and then that CPU usage of
> one of the kworker threads jumps to 100%.

So you're saying that the dying queue check is still needed in the path
selector?  It would be useful to know why the 100% CPU usage is occurring.
Can you get a stack trace while it is happening?

> * An "if (!blk_queue_stopped(q))" test needs to be added in
> dm_stop_queue() to avoid the following hang (that test was present
> in my version of the patch that adds the
> blk_mq_{freeze,unfreeze}_queue() calls):
> 
>     sysrq: SysRq : Show Blocked State
>       task                        PC stack   pid father
>     multipathd      D ffff8803c8d37b80     0  3242      1 0x00000000
>     Call Trace:
>      [<ffffffff81627087>] schedule+0x37/0x90
>      [<ffffffff813097e1>] blk_mq_freeze_queue_wait+0x51/0xb0
>      [<ffffffff8130be05>] blk_mq_freeze_queue+0x15/0x20
>      [<ffffffffa034d882>] dm_stop_queue+0x62/0xc0 [dm_mod]
>      [<ffffffffa0342a1b>] dm_swap_table+0x2fb/0x370 [dm_mod]
>      [<ffffffffa0347875>] dev_suspend+0x95/0x220 [dm_mod]
>      [<ffffffffa03480fc>] ctl_ioctl+0x1fc/0x550 [dm_mod]
>      [<ffffffffa034845e>] dm_ctl_ioctl+0xe/0x20 [dm_mod]
>      [<ffffffff811ee27f>] do_vfs_ioctl+0x8f/0x690
>      [<ffffffff811ee8bc>] SyS_ioctl+0x3c/0x70
>      [<ffffffff8162d125>] entry_SYSCALL_64_fastpath+0x18/0xa8

OK, I've adjusted accordingly and pushed dm-4.8 and dm-4.9 again (with
force, sorry about that).
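
For illustration only, here is a small userspace model of the guard discussed
above. It is not the code in the dm-4.8/dm-4.9 branches. The names
model_queue, model_submit(), model_freeze() and model_stop_queue() are
hypothetical, and the assumption that the freeze wait cannot finish while
requests sit on an already-stopped queue is one plausible reading of the
trace, not something confirmed in this thread.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the queue state that matters here. */
struct model_queue {
	bool stopped;      /* models QUEUE_FLAG_STOPPED */
	int  in_flight;    /* requests allocated but not yet completed */
	int  freeze_calls; /* how often the freeze path was entered */
};

/* Models a new request arriving on the queue. */
void model_submit(struct model_queue *q)
{
	q->in_flight++;
}

/*
 * Models waiting for in-flight requests to drain. Returns false where
 * the real wait is assumed to block forever: a stopped queue does not
 * dispatch, so pending requests never complete.
 */
bool model_freeze(struct model_queue *q)
{
	q->freeze_calls++;
	if (q->stopped && q->in_flight > 0)
		return false;
	q->in_flight = 0; /* a running queue drains its requests */
	return true;
}

/* Stop the queue, skipping the freeze if it is already stopped. */
bool model_stop_queue(struct model_queue *q)
{
	if (q->stopped)
		return true; /* the guard: nothing to do, never freeze */
	if (!model_freeze(q))
		return false;
	q->stopped = true;
	return true;
}
```

In this model, a second model_stop_queue() call on a stopped queue returns
immediately instead of entering the freeze path, which corresponds to the
behaviour the "if (!blk_queue_stopped(q))" test is meant to provide.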



