[dm-devel] [PATCH 1/2] blk-mq: add callback of .cleanup_rq

Ming Lei ming.lei at redhat.com
Fri Jul 19 01:35:47 UTC 2019


On Thu, Jul 18, 2019 at 10:52:01AM -0400, Mike Snitzer wrote:
> On Wed, Jul 17 2019 at 11:25pm -0400,
> Ming Lei <ming.lei at redhat.com> wrote:
> 
> > dm-rq needs to free request which has been dispatched and not completed
> > by underlying queue. However, the underlying queue may have allocated
> > private stuff for this request in .queue_rq(), so dm-rq will leak the
> > request private part.
> 
> No, SCSI (and blk-mq) will leak.  DM doesn't know anything about the
> internal memory SCSI uses.  That memory is a SCSI implementation detail.

It isn't noting to do with dm-rq, which frees one request after BLK_STS_*RESOURCE
is returned from blk_insert_cloned_request(), in this case it has to be
the user for releasing the request private data.

> 
> Please fix header to properly reflect which layer is doing the leaking.

Fine.

> 
> > Add one new callback of .cleanup_rq() to fix the memory leak issue.
> > 
> > Another use case is to free request when the hctx is dead during
> > cpu hotplug context.
> > 
> > Cc: Ewan D. Milne <emilne at redhat.com>
> > Cc: Bart Van Assche <bvanassche at acm.org>
> > Cc: Hannes Reinecke <hare at suse.com>
> > Cc: Christoph Hellwig <hch at lst.de>
> > Cc: Mike Snitzer <snitzer at redhat.com>
> > Cc: dm-devel at redhat.com
> > Cc: <stable at vger.kernel.org>
> > Fixes: 396eaf21ee17 ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback")
> > Signed-off-by: Ming Lei <ming.lei at redhat.com>
> > ---
> >  drivers/md/dm-rq.c     |  1 +
> >  include/linux/blk-mq.h | 13 +++++++++++++
> >  2 files changed, 14 insertions(+)
> > 
> > diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
> > index c9e44ac1f9a6..21d5c1784d0c 100644
> > --- a/drivers/md/dm-rq.c
> > +++ b/drivers/md/dm-rq.c
> > @@ -408,6 +408,7 @@ static int map_request(struct dm_rq_target_io *tio)
> >  		ret = dm_dispatch_clone_request(clone, rq);
> >  		if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE) {
> >  			blk_rq_unprep_clone(clone);
> > +			blk_mq_cleanup_rq(clone);
> >  			tio->ti->type->release_clone_rq(clone, &tio->info);
> >  			tio->clone = NULL;
> >  			return DM_MAPIO_REQUEUE;
> 
> Requiring upper layer driver (dm-rq) to explicitly call blk_mq_cleanup_rq() 
> seems wrong.  In this instance tio->ti->type->release_clone_rq()
> (dm-mpath's multipath_release_clone) calls blk_put_request().  Why can't
> blk_put_request(), or blk_mq_free_request(), call blk_mq_cleanup_rq()?

I did think about doing it in blk_put_request(), and I just want to
avoid the little cost in generic fast path, given freeing request after
dispatch is very unusual, so far only nvme multipath and dm-rq did in
that way.

However, if no one objects to move blk_mq_cleanup_rq() to blk_put_request()
or blk_mq_free_request(), I am fine to do that in V2.

> 
> Not looked at the cpu hotplug case you mention, but my naive thought is
> it'd be pretty weird to also sprinkle a call to blk_mq_cleanup_rq() from
> that specific "dead hctx" code path.

It isn't weird, and it is exactly what NVMe multipath is doing, please see
nvme_failover_req(). And it is just that nvme doesn't allocate request
private data.

Wrt. blk-mq cpu hotplug handling: after one hctx is dead, we can't dispatch
request to this hctx any more, however one request has been bounded to its
hctx since its allocation and the association can't(or quite hard to) be
changed any more, do you have any better idea to deal with this issue?


Thanks,
Ming




More information about the dm-devel mailing list