[dm-devel] Kernel v4.1-rc1 + MQ dm-multipath + MQ SRP oops

Mike Snitzer snitzer at redhat.com
Wed Apr 29 13:34:33 UTC 2015


On Wed, Apr 29 2015 at  9:20am -0400,
Christoph Hellwig <hch at lst.de> wrote:

> On Tue, Apr 28, 2015 at 01:52:20PM +0200, Bart Van Assche wrote:
> > Hello,
> >
> > Earlier today I started testing an SRP initiator patch series on top of 
> > Linux kernel v4.1-rc1. Although that patch series works reliably on top of 
> > kernel v4.0, a test during which I triggered scsi_remove_host() + relogin 
> > (for p in /sys/class/srp_remote_ports/*; do echo 1 >$p/delete & done; wait; 
> > srp_daemon -oaec) triggered the following kernel oops:
> 
> Can you try the patch below?  From my cursory reading of the dm code
> it can have tio->clone allocated for a while before it sets up the ->q
> pointer for it:
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index f8c7ca3..ee74764 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1089,7 +1089,7 @@ static void free_rq_clone(struct request *clone)
>  
>  	blk_rq_unprep_clone(clone);
>  
> -	if (clone->q->mq_ops)
> +	if (clone->q && clone->q->mq_ops)
>  		tio->ti->type->release_clone_rq(clone);
>  	else if (!md->queue->mq_ops)
>  		/* request_fn queue stacked on request_fn queue(s) */

I'm seeing the same crash on the completion path (when using your
tcm_loop script).  Bart's stack trace, however, included
dm_requeue_unmapped_original_request().  When that is called from
map_request(), clone->q won't have been initialized yet, given how
__multipath_map() sets up the clone in the old request_fn case.
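
To illustrate why the extra NULL check matters, here is a minimal
user-space sketch of the guard.  It is not the dm.c code: the struct
layouts, pick_free_path(), the enum names and check_examples() are
hypothetical stand-ins, and the final fall-through case is a
simplification of whatever free_rq_clone() does beyond the quoted hunk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (assumption). */
struct request_queue {
	const void *mq_ops;	/* non-NULL when the queue is blk-mq */
};

struct request {
	struct request_queue *q;	/* may still be NULL on an early requeue */
};

enum free_path {
	RELEASE_BY_TARGET,	/* clone belongs to a blk-mq underlying queue */
	FREE_BY_DM,		/* request_fn queue stacked on request_fn queue(s) */
	OTHER_PATH		/* anything else; not modelled here */
};

/*
 * Mirrors the patched condition: clone->q is tested before it is
 * dereferenced, so a clone whose ->q was never set falls through to
 * the second test instead of oopsing.
 */
static enum free_path pick_free_path(const struct request *clone,
				     int top_queue_is_mq)
{
	if (clone->q && clone->q->mq_ops)
		return RELEASE_BY_TARGET;
	else if (!top_queue_is_mq)
		return FREE_BY_DM;
	return OTHER_PATH;
}

/* Small self-check covering the three cases discussed in the thread. */
static void check_examples(void)
{
	static const int dummy_ops = 0;
	struct request_queue mq_queue = { .mq_ops = &dummy_ops };
	struct request_queue rq_fn_queue = { .mq_ops = NULL };
	struct request unmapped = { .q = NULL };
	struct request on_mq = { .q = &mq_queue };
	struct request on_rq_fn = { .q = &rq_fn_queue };

	assert(pick_free_path(&unmapped, 0) == FREE_BY_DM);
	assert(pick_free_path(&on_mq, 0) == RELEASE_BY_TARGET);
	assert(pick_free_path(&on_rq_fn, 0) == FREE_BY_DM);
}
```

The "unmapped" case corresponds to Bart's trace: without the
clone->q test, the first condition would dereference a NULL pointer.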

Long story short: your fix is right for Bart's crash (but not for the
ones I'm seeing with tcm_loop).  I'll get it queued up with a proper
header, attributed to you, and cc stable as needed.

Thanks,
Mike



