[dm-devel] [PATCH] libmultipath: ensure dev_loss_tmo will be update to MAX_DEV_LOSS_TMO if no_path_retry set to queue

Benjamin Marzinski bmarzins at redhat.com
Thu Dec 1 16:44:54 UTC 2016


On Thu, Dec 01, 2016 at 09:06:14AM +0800, peng.liang5 at zte.com.cn wrote:
>    If fast_io_fail_tmo isn't set, select_fast_io_fail will use
>    DEFAULT_FAST_IO_FAIL.
> 
>    So multipath will not apply the 600-second limit to dev_loss_tmo.

Yes, but the kernel will. With your patch installed, if I disable
fast_io_fail_tmo and set no_path_retry to queue, I get this message:

Dec 01 04:19:02 | rport-11:0-0: failed to set dev_loss_tmo to
2147483647, error 22

Because if fast_io_fail_tmo is not set, the kernel itself will bar
dev_loss_tmo from being above 600 seconds. Also, even if you could set
dev_loss_tmo to its maximum without fast_io_fail_tmo set, you would
never want to, because you would break multipath.
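
The rejection described above can be modelled in a few lines of C. This is
only a sketch of the rule as stated in this message, not the kernel's code:
model_rport, model_set_dev_loss_tmo and the MODEL_* constants are
hypothetical names invented for the illustration. The 600-second cap and
error 22 (EINVAL on Linux) are taken from the text and the log line.

```c
#include <errno.h>
#include <limits.h>

/* Hypothetical stand-ins for illustration; not copied from the kernel. */
#define MODEL_FAST_IO_FAIL_OFF   (-1)  /* assumed marker: fast_io_fail_tmo disabled */
#define MODEL_BLOCK_MAX_TIMEOUT  600   /* the 600-second cap mentioned above */

struct model_rport {
	int fast_io_fail_tmo;       /* seconds, or MODEL_FAST_IO_FAIL_OFF */
	unsigned int dev_loss_tmo;  /* seconds */
};

/*
 * Returns 0 on success, or -EINVAL when the request is refused.
 * A refused request leaves the stored dev_loss_tmo unchanged.
 */
static int model_set_dev_loss_tmo(struct model_rport *rport, unsigned long val)
{
	if (val > UINT_MAX)
		return -EINVAL;

	/* Without fast_io_fail_tmo, anything above the cap is rejected. */
	if (rport->fast_io_fail_tmo == MODEL_FAST_IO_FAIL_OFF &&
	    val > MODEL_BLOCK_MAX_TIMEOUT)
		return -EINVAL;

	rport->dev_loss_tmo = (unsigned int)val;
	return 0;
}
```

In this model, asking for 2147483647 with fast_io_fail_tmo disabled returns
-EINVAL, while the same request succeeds once fast_io_fail_tmo holds a
positive value.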

With fast_io_fail_tmo disabled, the scsi device will never pass the
failed IO back up until dev_loss_tmo triggers.  This means that if you
lose a path on your multipath device while doing IO, you won't be able
to resend that IO down another path for 68 years (2147483647 seconds).
Also, all the synchronous checker functions will not return for 68
years. And during all this time these processes will be in uninterruptible
sleep. At that point, there would be no point to even having multiple
paths, because you couldn't ever actually use them if one went down.
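
The 68-year figure follows from simple arithmetic. A minimal sketch,
assuming a year of 365.25 days; seconds_to_years and SECONDS_PER_YEAR are
hypothetical names used only for this illustration:

```c
/* Assumes a Julian year of 365.25 days. */
#define SECONDS_PER_YEAR (365.25 * 24.0 * 60.0 * 60.0)

/* Convert a timeout expressed in seconds into years. */
static double seconds_to_years(double seconds)
{
	return seconds / SECONDS_PER_YEAR;
}
```

Calling seconds_to_years(2147483647.0) gives a little over 68 years, which
matches the number quoted above.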

> 
>    And I think using MP_FAST_IO_FAIL_UNSET as the condition is meaningless
>    after multipath runs select_fast_io_fail, even if it's not set.

This is true in the default case, but we can't rely on the default case.
Since we allow users to turn it off, we need to correctly configure
multipath when it is off.
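
As a rough illustration of what "correctly configure" means here, the
sketch below models the existing clamp on the multipath side: when
fast_io_fail is not in effect, dev_loss is held to the 600-second limit so
that the value written matches what the kernel will accept. The function
name, the boolean parameter and the MODEL_* constant are assumptions made
for this example; this is not the code in libmultipath/discovery.c.

```c
/* Hypothetical stand-in for the 600-second limit discussed in this thread. */
#define MODEL_DEFAULT_DEV_LOSS_TMO 600U

/*
 * Pick the dev_loss value to write for a remote port.
 * fast_io_fail_enabled is non-zero when a fast_io_fail timeout is in effect.
 */
static unsigned int model_effective_dev_loss(unsigned int dev_loss,
					     int fast_io_fail_enabled)
{
	/* With fast_io_fail in effect, large values are allowed through. */
	if (fast_io_fail_enabled)
		return dev_loss;

	/* Otherwise clamp to the limit, regardless of no_path_retry. */
	if (dev_loss > MODEL_DEFAULT_DEV_LOSS_TMO)
		return MODEL_DEFAULT_DEV_LOSS_TMO;

	return dev_loss;
}
```

The model deliberately ignores no_path_retry: skipping the clamp when
queueing is requested would only produce a value the kernel rejects.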

-Ben

>                                     Original Message
>    From: Benjamin Marzinski
>    To: 彭亮10137102;
>    Cc: <dm-devel at redhat.com> 张凯10072500;
>    Date: November 29, 2016 08:30
>    Subject: Re: [dm-devel] [PATCH] libmultipath: ensure dev_loss_tmo will be
>    update to MAX_DEV_LOSS_TMO if no_path_retry set to queue
> 
>    On Fri, Nov 25, 2016 at 02:36:04PM +0800, peng.liang5 at zte.com.cn wrote:
>    > From: PengLiang <peng.liang5 at zte.com.cn>
>    > 
>    > If no_path_retry is set to queue, we should make sure dev_loss_tmo is updated to MAX_DEV_LOSS_TMO.
>    > However, it will be limited to 600 if fast_io_fail_tmo is also set to off or 0.
> 
>    Doesn't the system still limit dev_loss_tmo to 600 if fast_io_fail_tmo isn't set? Multipath
>    was using this limit, since the underlying system uses it.
> 
>    -Ben
> 
>    > 
>    > Signed-off-by: PengLiang <peng.liang5 at zte.com.cn>
>    > ---
>    >  libmultipath/discovery.c | 3 ++-
>    >  1 file changed, 2 insertions(+), 1 deletion(-)
>    > 
>    > diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
>    > index aaa915c..05b0842 100644
>    > --- a/libmultipath/discovery.c
>    > +++ b/libmultipath/discovery.c
>    > @@ -608,7 +608,8 @@ sysfs_set_rport_tmo(struct multipath *mpp, struct path *pp)
>    >                  goto out;
>    >              }
>    >          }
>    > -    } else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO) {
>    > +    } else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO &&
>    > +        mpp->no_path_retry != NO_PATH_RETRY_QUEUE) {
>    >          condlog(3, "%s: limiting dev_loss_tmo to %d, since "
>    >              "fast_io_fail is not set",
>    >              rport_id, DEFAULT_DEV_LOSS_TMO);
>    > -- 
>    > 2.8.1.windows.1