[dm-devel] Re: Re: why command of multipath send reinstate message to all dm's paths

Martin Wilck mwilck at suse.com
Mon Jul 2 07:10:37 UTC 2018


Jiaojianbing,

On Mon, 2018-07-02 at 01:06 +0000, Jiaojianbing wrote:
> > [I've added Hannes, Ben and Douglas to the recipient list to fill in
> > knowledge from the past that I may lack].
> > 
> > tl;dr summary: We've got 3 issues:
> > 
> >  1) Why does multipath, in reinstate_paths(), try to reinstate paths
> >     which are known to be down?
> >  2) rescan-scsi-bus.sh can call "multipath" even if "-m" switch is not
> >     used (that looks like a bug to me).
> >  3) In Jiaojianbing's environment, dead paths that have been removed on
> >     the target and were already marked "offline" may appear as "running"
> >     after rescan-scsi-bus.sh invocation.
> > 
> > Furthermore,
> >  4) perhaps rescan-scsi-bus.sh should replace suboptimal "multipath"
> >     calls with multipathd cli commands (or better even, we
> >     multipath-tools people should eventually finish the "delegate to
> >     multipathd" work).
> 
> It was my mistake: the "multipath" command is not the one called from
> rescan-scsi-bus.sh, but another one, called every five minutes by a
> process "test.sh".
> That is, there are two processes: one is rescan-scsi-bus.sh, the other
> is test.sh, which calls multipath every five minutes.

Please describe your setup bottom-up. You have two scripts running
periodically, one calling "rescan-scsi-bus.sh" and one "multipath", and
they are (can be) running at the same time? How frequently are they
running? Be aware that "multipath" is not a monitoring command, it
basically causes a reconfiguration. It's not recommended to run it
periodically. Of course running "multipath" shouldn't cause a system
hang, but in your case I still think the root problem is that devices
that can't respond to IO are seen in "running" state by the kernel. If
that happens, other processes are allowed, actually supposed, to do
probing on these devices. But it's hard to say more without knowing
what exactly is going on.
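
To see which state the kernel currently reports for each SCSI disk, the
sysfs "state" attribute can be read directly. The following is only a
minimal sketch: it assumes the disks show up as /sys/block/sd*, and the
function name scsi_device_states is made up for illustration, it is not
part of multipath-tools or sg3_utils.

```python
import glob
import os


def scsi_device_states(sys_block="/sys/block"):
    """Return {device_name: state} for every sd* block device found."""
    states = {}
    pattern = os.path.join(sys_block, "sd*", "device", "state")
    for state_file in sorted(glob.glob(pattern)):
        # .../sdX/device/state -> sdX
        name = os.path.basename(os.path.dirname(os.path.dirname(state_file)))
        try:
            with open(state_file) as f:
                states[name] = f.read().strip()
        except OSError:
            # The device may vanish between glob() and open().
            states[name] = "unreadable"
    return states


if __name__ == "__main__":
    for name, state in scsi_device_states().items():
        print(f"{name}: {state}")
```

Running this before and after a rescan-scsi-bus.sh invocation, and
comparing the two outputs, would show whether a path that was "offline"
has flipped back to "running".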

Also, please consider updating to a more recent version of the tools.
dm-devel is a mailing list for discussing upstream issues, and your
versions of both multipath and sg3_utils are rather ancient. I guess
you're using some older distribution, in which case you may want to
engage with your distro's support team.

I don't think we'll make much progress without detailed logs. For the
kernel, please activate scsi logging with
MLCOMPLETE=1|ERROR=4|SCAN_BUS=4. Also run rescan-scsi-bus.sh with the -d
switch, and set the verbosity of both multipathd and the "multipath"
command to 3 at least. Then put the results on a pastebin somewhere and
provide us with links.
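
For reference, the three scsi logging settings above combine into a
single integer. The sketch below shows the arithmetic. It assumes the
3-bit field layout from the kernel's drivers/scsi/scsi_logging.h, and
that "SCAN_BUS" refers to the SCAN field; the helper name
scsi_logging_level is made up for illustration.

```python
# Bit offsets of the 3-bit fields in the SCSI logging level word
# (assumed from drivers/scsi/scsi_logging.h).
SHIFTS = {
    "ERROR": 0,
    "TIMEOUT": 3,
    "SCAN_BUS": 6,  # assumption: SCAN_BUS uses the SCAN field
    "MLQUEUE": 9,
    "MLCOMPLETE": 12,
    "LLQUEUE": 15,
    "LLCOMPLETE": 18,
    "HLQUEUE": 21,
    "HLCOMPLETE": 24,
    "IOCTL": 27,
}


def scsi_logging_level(**levels):
    """Combine per-facility levels (0..7) into one logging level value."""
    value = 0
    for facility, level in levels.items():
        if not 0 <= level <= 7:
            raise ValueError(f"{facility}: level must be in 0..7")
        value |= level << SHIFTS[facility]
    return value


if __name__ == "__main__":
    # (1 << 12) | (4 << 0) | (4 << 6) = 4096 + 4 + 256
    print(scsi_logging_level(MLCOMPLETE=1, ERROR=4, SCAN_BUS=4))  # 4356
```

On kernels built with SCSI logging support, the resulting value is
typically written to /proc/sys/dev/scsi/logging_level.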

> In this scenario, rescan-scsi-bus.sh takes much longer than in the
> scenario without "test.sh". The reason is that all "systemd-udevd"
> processes that send IO to device mapper devices, such as dm-105, are
> in D state.

If that's the case, please also run "udevadm control -l debug" and
provide udev logs. We need to know which udev commands are hanging.
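
To get a quick list of the processes that are stuck in D state, the
"stat" files under /proc can be scanned. This is only a sketch, and the
helper names parse_stat and d_state_processes are made up for
illustration. It shows process names only, not what a udev worker is
executing, so it complements the udev debug logs rather than replacing
them.

```python
import os


def parse_stat(text):
    """Split a /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lpar = text.index("(")
    rpar = text.rindex(")")
    pid = int(text[:lpar])
    comm = text[lpar + 1:rpar]
    state = text[rpar + 1:].split()[0]
    return pid, comm, state


def d_state_processes(proc="/proc"):
    """Return a sorted list of (pid, comm) for processes in D state."""
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or line not parsable
        if state == "D":
            found.append((pid, comm))
    return sorted(found)


if __name__ == "__main__":
    for pid, comm in d_state_processes():
        print(pid, comm)
```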

Martin

-- 
Dr. Martin Wilck <mwilck at suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)



