[dm-devel] multipathd ignoring dev_loss_tmo setting
Martin Wilck
mwilck at suse.de
Mon Mar 4 12:09:45 UTC 2019
On Thu, 2019-02-28 at 11:38 +0000, Martins, Bruno O wrote:
> Hello guys,
>
> I am trying to modify /etc/multipath.conf on my system so that the
> parameter 'dev_loss_tmo' is changed from the default value.
>
> My multipath.conf file contains the following:
>
> defaults {
> verbosity 2
> polling_interval 5
> max_polling_interval 10
> multipath_dir "/lib64/multipath"
> path_selector "round-robin 0"
> path_grouping_policy "failover"
> uid_attribute "ID_SERIAL"
> prio "const"
> prio_args ""
> features "0"
> path_checker "directio"
> alias_prefix "mpath"
> failback "manual"
> rr_min_io 1000
> rr_min_io_rq 1
> max_fds "max"
> rr_weight "uniform"
> no_path_retry "fail"
> queue_without_daemon "no"
> checker_timeout 15
> flush_on_last_del "no"
> user_friendly_names "yes"
> fast_io_fail_tmo 5
> dev_loss_tmo 10
> bindings_file "/etc/multipath/bindings"
> wwids_file /etc/multipath/wwids
> log_checker_err always
> retain_attached_hw_handler no
> detect_prio no
> }
>
> However, when checking the value currently in use I am getting the
> wrong value (which is '30') for some of the remote ports:
>
> for f in /sys/class/fc_remote_ports/rport-*/dev_loss_tmo; do
> d=$(dirname $f); echo $(basename $d):$(cat $d/node_name):$(cat $f);
> done
>
> rport-3:0-0:0x5742b0f00007c500:10
> rport-3:0-1:0x5742b0f00007c500:10
> rport-3:0-2:0x5742b0f00007c500:10
> rport-3:0-3:0x5000097408369800:30
> rport-3:0-4:0x500009757804cbff:30
> rport-4:0-0:0x5742b0f00007c500:10
> rport-4:0-1:0x5742b0f00007c500:10
> rport-4:0-2:0x5000097408369800:30
> rport-4:0-3:0x5742b0f00007c500:10
> rport-4:0-4:0x500009757804cbff:30
> rport-5:0-0:0x5742b0f00007c500:10
> rport-5:0-1:0x5742b0f00007c500:10
> rport-5:0-2:0x5742b0f00007c500:10
> rport-5:0-3:0x5000097408369800:30
> rport-5:0-4:0x500009757804cbff:30
> rport-6:0-0:0x5742b0f00007c500:10
> rport-6:0-1:0x5742b0f00007c500:10
> rport-6:0-2:0x5000097408369800:30
> rport-6:0-3:0x5742b0f00007c500:10
> rport-6:0-4:0x500009757804cbff:30
>
> systool is giving me the same information:
>
> systool -c fc_remote_ports -v | grep dev_loss_tmo
>
> dev_loss_tmo = "10"
> dev_loss_tmo = "10"
> dev_loss_tmo = "10"
> dev_loss_tmo = "10"
> dev_loss_tmo = "10"
> dev_loss_tmo = "10"
> dev_loss_tmo = "10"
> dev_loss_tmo = "10"
> dev_loss_tmo = "10"
> dev_loss_tmo = "30"
> dev_loss_tmo = "10"
> dev_loss_tmo = "30"
> dev_loss_tmo = "30"
> dev_loss_tmo = "10"
> dev_loss_tmo = "30"
> dev_loss_tmo = "10"
> dev_loss_tmo = "30"
> dev_loss_tmo = "30"
> dev_loss_tmo = "30"
> dev_loss_tmo = "30"
>
> Where is this value coming from? May this be a bug? I couldn't find
> anything useful on the Internet regarding this.
>
> I am using the following versions:
>
> rpm -qa multipath-tools
> multipath-tools-0.4.9-109.1
>
> uname -a
> Linux mysystem 3.0.101-63-default #1 SMP Tue Jun 23 16:02:31 UTC 2015 (4b89d0c) x86_64 x86_64 x86_64 GNU/Linux
>
> Thanks for your help!
>
> Kind regards,
>
> Bruno

It'd be very helpful if you could upload "multipath -v3" (or multipathd
with verbosity 3) logs somewhere.
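
One way to collect such a log is a small wrapper that stores both stdout and stderr of a command in a single file. This is only a sketch: the helper name capture_log and the output path in the usage note are made up for illustration, and nothing here depends on the format of the multipath output.

```shell
# capture_log OUTFILE CMD [ARGS...]
# Hypothetical helper: runs CMD with ARGS, writes its stdout and stderr
# to OUTFILE, and returns the exit status of CMD.
capture_log() {
    out=$1
    shift
    "$@" >"$out" 2>&1
}
```

For example, "capture_log /tmp/multipath-v3.log multipath -v3" (run as root) would leave the verbose output in /tmp/multipath-v3.log, ready to be uploaded.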
It looks as if you're using some SLE11 variant, so maybe you want to
open a support case?
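
For a support case or a follow-up on the list, a compact list of the remote ports that deviate from the configured value may be easier to read than the full dump. The following is a sketch based on the loop from the original report; the function name list_mismatched_rports and the optional directory argument (which defaults to /sys/class/fc_remote_ports) are assumptions added for illustration.

```shell
# list_mismatched_rports EXPECTED [SYSFS_DIR]
# Hypothetical helper: prints "rport:node_name:dev_loss_tmo" for every
# remote port whose dev_loss_tmo differs from EXPECTED.
list_mismatched_rports() {
    expected=$1
    dir=${2:-/sys/class/fc_remote_ports}
    for f in "$dir"/rport-*/dev_loss_tmo; do
        # Skip the unexpanded glob pattern if no remote ports exist.
        [ -r "$f" ] || continue
        d=$(dirname "$f")
        val=$(cat "$f")
        if [ "$val" != "$expected" ]; then
            echo "$(basename "$d"):$(cat "$d/node_name"):$val"
        fi
    done
}
```

With the values from the report, "list_mismatched_rports 10" should print only the rports with node names 0x5000097408369800 and 0x500009757804cbff.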
Another question would be why you want such a low dev_loss_tmo. It's
not generally recommended, because on the kernel side, removing and re-
adding a device is a lot more complex than disabling and re-enabling
it. The fast_io_fail_tmo should provide you with quick path failover
already. My recommendation is to set dev_loss_tmo to a value which
would, in the given data center, indicate that the device loss is
really not due to a temporary outage but due to a permanently removed
device (e.g. permanent storage configuration change). So basically, the
dev_loss_tmo shouldn't be shorter than the admin's lunch break.
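
As an illustration of this recommendation, the relevant part of the defaults section might look like the fragment below. The fast_io_fail_tmo value is the one from the original report; the dev_loss_tmo value of 3600 seconds is only a placeholder chosen for this sketch, and the right number depends on the site.

```
defaults {
	# quick path failover, as in the original configuration
	fast_io_fail_tmo 5
	# placeholder: long enough to rule out a temporary outage
	dev_loss_tmo 3600
}
```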
Martin