[dm-devel] [PATCH 1/2] blk-mq: add callback of .cleanup_rq
Mike Snitzer
snitzer at redhat.com
Fri Jul 19 12:26:54 UTC 2019
On Thu, Jul 18 2019 at 9:35pm -0400,
Ming Lei <ming.lei at redhat.com> wrote:
> On Thu, Jul 18, 2019 at 10:52:01AM -0400, Mike Snitzer wrote:
> > On Wed, Jul 17 2019 at 11:25pm -0400,
> > Ming Lei <ming.lei at redhat.com> wrote:
> >
> > > dm-rq needs to free a request which has been dispatched but not completed
> > > by the underlying queue. However, the underlying queue may have allocated
> > > private data for this request in .queue_rq(), so dm-rq will leak the
> > > request's private part.
> >
> > No, SCSI (and blk-mq) will leak. DM doesn't know anything about the
> > internal memory SCSI uses. That memory is a SCSI implementation detail.
>
> It isn't unrelated to dm-rq: dm-rq frees the request after BLK_STS_*RESOURCE
> is returned from blk_insert_cloned_request(), and in that case the user
> freeing the request has to release the request's private data.
>
> >
> > Please fix header to properly reflect which layer is doing the leaking.
>
> Fine.
>
> >
> > > Add a new callback, .cleanup_rq(), to fix the memory leak issue.
> > >
> > > Another use case is to free request when the hctx is dead during
> > > cpu hotplug context.
> > >
> > > Cc: Ewan D. Milne <emilne at redhat.com>
> > > Cc: Bart Van Assche <bvanassche at acm.org>
> > > Cc: Hannes Reinecke <hare at suse.com>
> > > Cc: Christoph Hellwig <hch at lst.de>
> > > Cc: Mike Snitzer <snitzer at redhat.com>
> > > Cc: dm-devel at redhat.com
> > > Cc: <stable at vger.kernel.org>
> > > Fixes: 396eaf21ee17 ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback")
> > > Signed-off-by: Ming Lei <ming.lei at redhat.com>
> > > ---
> > >  drivers/md/dm-rq.c     |  1 +
> > >  include/linux/blk-mq.h | 13 +++++++++++++
> > >  2 files changed, 14 insertions(+)
> > >
> > > diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
> > > index c9e44ac1f9a6..21d5c1784d0c 100644
> > > --- a/drivers/md/dm-rq.c
> > > +++ b/drivers/md/dm-rq.c
> > > @@ -408,6 +408,7 @@ static int map_request(struct dm_rq_target_io *tio)
> > >  	ret = dm_dispatch_clone_request(clone, rq);
> > >  	if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE) {
> > >  		blk_rq_unprep_clone(clone);
> > > +		blk_mq_cleanup_rq(clone);
> > >  		tio->ti->type->release_clone_rq(clone, &tio->info);
> > >  		tio->clone = NULL;
> > >  		return DM_MAPIO_REQUEUE;
> >
> > Requiring upper layer driver (dm-rq) to explicitly call blk_mq_cleanup_rq()
> > seems wrong. In this instance tio->ti->type->release_clone_rq()
> > (dm-mpath's multipath_release_clone) calls blk_put_request(). Why can't
> > blk_put_request(), or blk_mq_free_request(), call blk_mq_cleanup_rq()?
>
> I did think about doing it in blk_put_request(), but I wanted to avoid
> the small cost in the generic fast path, given that freeing a request
> after dispatch is very unusual: so far only nvme multipath and dm-rq do
> it that way.
>
> However, if no one objects to moving blk_mq_cleanup_rq() into blk_put_request()
> or blk_mq_free_request(), I am fine with doing that in V2.
Think it'd be a less fragile/nuanced way to extend the blk-mq
interface. Otherwise there is potential for future drivers to hit the
same leaks.
> > Not looked at the cpu hotplug case you mention, but my naive thought is
> > it'd be pretty weird to also sprinkle a call to blk_mq_cleanup_rq() from
> > that specific "dead hctx" code path.
>
> It isn't weird; it is exactly what NVMe multipath is doing, please see
> nvme_failover_req(). It is just that nvme doesn't allocate request
> private data.
>
> Wrt. blk-mq cpu hotplug handling: after a hctx is dead, we can't dispatch
> requests to that hctx any more. However, a request has been bound to its
> hctx since allocation, and that association can't be changed (or is quite
> hard to change). Do you have any better idea for dealing with this issue?
No, as I prefaced before "Not looked at the cpu hotplug case you
mention". As such I should've stayed silent ;)
But my point was we should hook off current interfaces rather than rely
on a new primary function call.
Mike