[dm-devel] [PATCH] multipathd: check and cleanup zombie paths

Martin Wilck mwilck at suse.com
Mon Mar 19 21:42:06 UTC 2018


On Fri, 2018-03-09 at 10:22 -0600, Benjamin Marzinski wrote:
> On Fri, Mar 09, 2018 at 06:47:30AM +0000, Chongyun Wu wrote:
> > On 2018/3/8 23:45, Benjamin Marzinski wrote:
> > > 
> > > If there are multiple routes to the storage, some of them can be down
> > > even if everything is fine on the storage.  This will cause some paths
> > > to be up and some to be down, regardless of the state of the LUN.  In
> > > every other multipath case but this one, there is just one LUN, and not
> > > all the paths have the same state.
> > >
> > > Ideally, there would be a way to determine if a path is a zombie simply
> > > by looking at it alone.  The additional sense code "LOGICAL UNIT NOT
> > > SUPPORTED" that you posted earlier isn't one that I recall seeing for
> > > failed multipathd paths.  I'll check around more, but a quick look makes
> > > it appear that this code is only used when you are accessing a LUN that
> > > really isn't there.  It's possible that the TUR checker could return a
> > > special path state for this, that would cause multipathd to remove the
> > > device.  Also, even if that additional sense code is only supposed to be
> > > used for this condition, we should still make removing a device that
> > > returns it configurable, because I can almost guarantee that there will
> > > be a SCSI device that doesn't follow the standard for this.
> > > 
> > 
> > Hi Ben,
> > You just mentioned *the TUR checker could return a special path state
> > for this*, what is the special path state?  Thanks~
> > 
> 
> We would have to add a new state, like PATH_NOT_SUPPORTED, that the TUR
> checker could return in this case.  multipathd could be configured to
> remove the path if it returned this state.  If it wasn't configured to do
> so, multipathd would just change the state to PATH_DOWN.
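
For illustration only, the behaviour proposed in the quoted text could be
sketched roughly as below. This is not multipath-tools code: the reduced
enums, classify_tur(), handle_state() and the remove_unsupported flag are
all hypothetical names made up for this sketch. The only values taken from
the SCSI standard are sense key 0x05 (ILLEGAL REQUEST) with ASC/ASCQ
0x25/0x00 (LOGICAL UNIT NOT SUPPORTED).

```c
#include <stdbool.h>

/* Hypothetical, reduced set of path states; PATH_NOT_SUPPORTED is the
 * new state proposed in the thread. */
enum path_state { PATH_DOWN, PATH_UP, PATH_NOT_SUPPORTED };

/* Hypothetical actions the daemon could take after a check. */
enum path_action { ACTION_KEEP, ACTION_FAIL, ACTION_REMOVE };

#define SENSE_KEY_ILLEGAL_REQUEST 0x05
#define ASC_LU_NOT_SUPPORTED      0x25
#define ASCQ_LU_NOT_SUPPORTED     0x00

/* Map the result of a TEST UNIT READY plus its sense data to a state. */
static enum path_state classify_tur(bool tur_ok, unsigned char sense_key,
                                    unsigned char asc, unsigned char ascq)
{
	if (tur_ok)
		return PATH_UP;
	if (sense_key == SENSE_KEY_ILLEGAL_REQUEST &&
	    asc == ASC_LU_NOT_SUPPORTED && ascq == ASCQ_LU_NOT_SUPPORTED)
		return PATH_NOT_SUPPORTED;
	return PATH_DOWN;
}

/* Decide what to do with a path. Removal only happens when the
 * (assumed) remove_unsupported option is set; otherwise the state is
 * downgraded to PATH_DOWN, as described in the quoted text. */
static enum path_action handle_state(enum path_state *state,
                                     bool remove_unsupported)
{
	switch (*state) {
	case PATH_UP:
		return ACTION_KEEP;
	case PATH_NOT_SUPPORTED:
		if (remove_unsupported)
			return ACTION_REMOVE;
		*state = PATH_DOWN;
		return ACTION_FAIL;
	default:
		return ACTION_FAIL;
	}
}
```

With remove_unsupported unset, a path reporting LOGICAL UNIT NOT SUPPORTED
is treated exactly like any other failed path, so existing setups would see
no change in behaviour.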

Is it really multipathd's job to remove devices that return "LOGICAL
UNIT NOT SUPPORTED"? To me it sounds like a misconfiguration on the
SCSI/storage level, and I'm unsure if that's a thing multipathd should
mess with.

Martin

-- 
Dr. Martin Wilck <mwilck at suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)



