[dm-devel] dm bufio: fix deadlock issue with loop device

Tue Jul 9 00:15:29 UTC 2019

On Mon, Jul 08 2019 at  7:54pm -0400,
Junxiao Bi <junxiao.bi at oracle.com> wrote:

> On 7/8/19 7:14 AM, Mike Snitzer wrote:
> 
> >On Fri, Jul 05 2019 at  4:24pm -0400,
> >Junxiao Bi <junxiao.bi at oracle.com> wrote:
> >
> >>Hi Mike,
> >>
> >>Am I making sense on this?
> >No, you haven't made your case for this change.  Sorry.
> >
> >Please refine the patch header to _not_ get into context you have from
> >a vendor kernel.  I know you say this is hard to reproduce, etc.
> Thanks, I will refine it in v2.
> >But
> >you don't even get into the use case where the issue was seen.  Was this
> >DM thinp?  DM cache?  Something else?
> It's the thin-provisioning target. The customer is using Docker.

OK, with loop files? (Really hackish and poorly performing, but because
loopback made it possible to avoid reinstalling or planning ahead, a lot
of people used it... that is, until overlayfs arrived.)

> >Please be as concise and precise as possible.  Saying that shrinker is
> >the same context as loop doesn't imply a whole lot to me (relative to
> >why this is prone to deadlock).
> >
> >To restate my concern: if __GFP_FS isn't set then why does your patch
> >help at all?  If __GFP_FS is set, then that changes things.
> 
> If __GFP_FS isn't set, the behavior is the same with or without this patch.

Yes.

> If it is set and the mutex is already held by someone else, the
> shrinker stops and the deadlock is avoided.

Fine, please explain how that happens in the context of existing
upstream code.  Please make the case for fixing upstream.

Thanks,
Mike
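
The __GFP_FS exchange above can be modelled in userspace. The sketch below
is an illustration only, not the dm-bufio source: every name in it
(SKETCH_GFP_FS, SKETCH_SHRINK_STOP, client_lock, scan_before, scan_after)
is hypothetical, and a pthread mutex stands in for the bufio client lock.
It assumes, as the thread describes, that the unpatched path blocks on the
lock when __GFP_FS is set and only tries the lock otherwise, while the
patched path always tries the lock and stops the shrinker if it is busy.

```c
#include <pthread.h>

/* Hypothetical stand-ins for the kernel's __GFP_FS and SHRINK_STOP. */
#define SKETCH_GFP_FS      0x1u
#define SKETCH_SHRINK_STOP (-1L)

/* Stand-in for the per-client lock the shrinker must take. */
static pthread_mutex_t client_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Model of the behaviour without the patch: with the FS flag set the
 * shrinker waits for the lock. If the lock holder is itself waiting on
 * the allocation that triggered this shrinker, nobody makes progress.
 */
static long scan_before(unsigned int gfp_mask, long nr_to_scan)
{
    if (gfp_mask & SKETCH_GFP_FS)
        pthread_mutex_lock(&client_lock);       /* may wait forever */
    else if (pthread_mutex_trylock(&client_lock) != 0)
        return SKETCH_SHRINK_STOP;              /* busy: give up */

    /* Real code would free buffers here; the sketch only reports. */
    pthread_mutex_unlock(&client_lock);
    return nr_to_scan;
}

/*
 * Model of the behaviour with the patch: never wait, whatever the flags.
 * A busy lock makes the shrinker stop instead of blocking.
 */
static long scan_after(unsigned int gfp_mask, long nr_to_scan)
{
    (void)gfp_mask;                             /* flag no longer matters */
    if (pthread_mutex_trylock(&client_lock) != 0)
        return SKETCH_SHRINK_STOP;

    pthread_mutex_unlock(&client_lock);
    return nr_to_scan;
}
```

With client_lock already held, scan_after(SKETCH_GFP_FS, 16) returns
SKETCH_SHRINK_STOP immediately, whereas scan_before(SKETCH_GFP_FS, 16)
would never return. With the flag clear, both functions behave the same,
which matches the "same with or without this patch" point above.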