[dm-devel] dm-mpath: fix a tiny case which can cause an infinite loop

Mike Snitzer snitzer at redhat.com
Thu Feb 4 03:24:52 UTC 2016


On Wed, Feb 03 2016 at  9:08pm -0500,
jiangyiwen <jiangyiwen at huawei.com> wrote:

> When two processes submit WRTIE SAME bio simultaneously and
> first IO return failed because of INVALID FIELD IN CDB, and
> then second IO can enter into an infinite loop.
> The problem can be described as follows:
> 
> process 1                              process 2
> submit_bio(REQ_WRITE_SAME) and
> wait io completion
>                                        begin submit_bio(REQ_WRITE_SAME)
>                                        -> blk_queue_bio()
>                                          -> dm_request_fn()
>                                            -> map_request()
>                                              -> scsi_request_fn()
>                                                -> blk_peek_request()
>                                                  -> scsi_prep_fn()
> In the moment, IO return and
> sense_key with illegal_request,
> sense_code with 0x24(INVALID FIELD IN CDB).
> In this way it will set no_write_same = 1
> in sd_done() and disable write same
>                                        In sd_setup_write_same_cmnd()
>                                        when find (no_write_same == 1)
>                                        will return BLKPREP_KILL,
>                                        and then in blk_peek_request()
>                                        set error to EIO.
>                                        However, in multipath_end_io()
>                                        when find error is EIO it will
>                                        fail path and retry even if
>                                        device doesn't support WRITE SAME.
> 
> In this situation, when process of multipathd reinstate
> path by using test_unit_ready periodically, it will cause
> second WRITE SAME IO enters into an infinite loop and
> loop never terminates.
> 
> In do_end_io(), when finding device doesn't support WRITE SAME and
> request is WRITE SAME, return EOPNOTSUPP to applications.
> 
> Signed-off-by: Yiwen Jiang <jiangyiwen at huawei.com>
> Reviewed-by: Joseph Qi <joseph.qi at huawei.com>
> Reviewed-by: xuejiufei <xuejiufei at huawei.com>
> ---
>  drivers/md/dm-mpath.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> index cfa29f5..ad1602a 100644
> --- a/drivers/md/dm-mpath.c
> +++ b/drivers/md/dm-mpath.c
> @@ -1269,6 +1269,11 @@ static int do_end_io(struct multipath *m, struct request *clone,
>  	if (noretry_error(error))
>  		return error;
> 
> +	/* do not retry in case of WRITE SAME not supporting */
> +	if ((clone->cmd_flags & REQ_WRITE_SAME) &&
> +			!clone->q->limits.max_write_same_sectors)
> +		return -EOPNOTSUPP;
> +
>  	if (mpio->pgpath)
>  		fail_path(mpio->pgpath);
> 

Did you test this patch?  Looks like it isn't going to make a
difference.  'error' will be -EREMOTEIO, which will be caught by
noretry_error().  So we'll never go on to run your new code.

Please see commit 7eee4ae2db ("dm: disable WRITE SAME if it fails").

I think DM already handles this case properly.  The only thing is it'll
return -EREMOTEIO (rather than -EOPNOTSUPP) further up the stack.




More information about the dm-devel mailing list