[dm-devel] dm mpath: Fix a dm_blk_ioctl() deadlock

Mike Snitzer snitzer at redhat.com
Tue Jun 28 19:33:03 UTC 2016


On Tue, Jun 28 2016 at  3:16pm -0400,
Bart Van Assche <bart.vanassche at sandisk.com> wrote:

> On 06/28/2016 08:59 PM, Mike Snitzer wrote:
> >Can we go back to what it is you've experienced?  Is it that you have
> >'queue_if_no_path' enabled and are issuing ioctls to an mpath device
> >(while removing underlying paths), and you then experience a live-lock
> >(_not_ deadlock) once no valid paths exist?
> >
> >If that isn't what you're hitting then I'd like to better understand how
> >a request_queue that is "dying" isn't able to keep itself up enough to
> >fail IO issued to it (to allow normal error handling to trap the IO
> >failure).
> 
> Hello Mike,
> 
> Since I started testing kernel v4.7-rc<n> I noticed about twenty
> times that systemd-udevd got stuck in truncate_inode_pages(). I have
> not yet seen this with any older kernel version. queue_if_no_path is
> indeed enabled in my tests. The test I run consists of running fio
> on top of an mpath device and repeatedly removing and restoring the
> underlying devices. The test script is available at
> https://github.com/bvanassche/srp-test/blob/master/tests/02. Please
> let me know if you need more information.

I'm not going to be able to set up this test and chase this in the
near-term.  If you want this fixed soon then I'll need you to continue
chasing this.

Something else must be going on.  I fail to see how avoiding dying
queues, like your 2nd path selectors patch does, should be needed.

A dying queue, and the underlying device that is being torn down, still
needs to complete (fail) any of its outstanding IO -- or IO issued to it
e.g. via __blkdev_driver_ioctl -- right?

Could your driver's queue maybe not be getting torn down like it did in the
past? -- if it lingers in this "dying" state then that could start to
explain why this is happening all of a sudden in v4.7-rc<n>.  Would be
nice to know if that is what is happening.

But you've definitely seen that your path selector patch, that skips
selecting paths with dying queues, avoids this live-lock issue?
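
To be clear about the behaviour being discussed, here is a rough userspace
model of a selector that skips paths whose queue is dying.  It is only a
sketch under assumptions and is not the patch itself: model_path and
model_select_path() are made-up names, and the queue_dying flag merely
stands in for what blk_queue_dying() would report on the path's
request_queue in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a path; not the kernel's data structures. */
struct model_path {
    bool queue_dying;  /* stand-in for blk_queue_dying() on the path's queue */
    bool failed;       /* path already marked failed by multipath */
};

/*
 * Return the first usable path, skipping failed paths and paths whose
 * queue is dying.  Returns NULL when no usable path remains.
 */
static struct model_path *model_select_path(struct model_path *paths, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (paths[i].failed || paths[i].queue_dying)
            continue;
        return &paths[i];
    }
    return NULL;
}
```

In this model, once every path is failed or dying the selector returns
NULL, which is the "no valid paths" case where queue_if_no_path decides
whether the caller queues or fails.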



