[dm-devel] [PATCH 14/19] multipath-tools: add ANA support for NVMe device

Martin Wilck mwilck at suse.com
Thu Dec 20 23:45:29 UTC 2018


On Thu, 2018-12-20 at 16:17 +0100, Hannes Reinecke wrote:
> > +
> > +enum {
> > +	ANA_PRIO_OPTIMIZED		= 50,
> > +	ANA_PRIO_NONOPTIMIZED		= 10,
> > +	ANA_PRIO_INACCESSIBLE		= 5,
> > +	ANA_PRIO_PERSISTENT_LOSS	= 1,
> > +	ANA_PRIO_CHANGE			= 0,
> > +	ANA_PRIO_RESERVED		= 0,
> > +	ANA_PRIO_GETCTRL_FAILED		= -1,
> > +	ANA_PRIO_NOT_SUPPORTED		= -2,
> > +	ANA_PRIO_GETANAS_FAILED		= -3,
> > +	ANA_PRIO_GETANALOG_FAILED	= -4,
> > +	ANA_PRIO_GETNSID_FAILED		= -5,
> > +	ANA_PRIO_GETNS_FAILED		= -6,
> > +	ANA_PRIO_NO_MEMORY		= -7,
> > +	ANA_PRIO_NO_INFORMATION		= -8,
> > +};
> 
> Please model the priorities according to the ALUA handler; ANA state
> 'persistent loss' maps onto ALUA 'unavailable' (and hence should have
> a priority of '0'), and ANA state 'inaccessible' is roughly similar to
> ALUA 'standby', hence should have a priority of '1'.

Will do. But please note that, in contrast to what we discussed off-
list, a priority of "0" has no special meaning. In particular,
pathgroup priority "0" (or negative!) doesn't imply that the PG in
question can't be selected for I/O. The only thing that is "special"
about priority 0 is that multipathd assigns this prio to PGs that have
no working paths. Therefore, a PG to which the prioritizer assigns prio
<= 0 will not be *preferred* over such a zero-path PG.
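
To illustrate the point about priority 0, here is a minimal sketch. It
is not the libmultipath code; the struct, field and function names are
hypothetical. It assumes that a PG without working paths gets an
effective priority of 0 and that a later PG is only preferred if its
priority is strictly higher.

```c
#include <stddef.h>

/* Hypothetical, simplified view of a path group (PG). */
struct pg_sketch {
	int prio_from_prioritizer;	/* value returned by the prioritizer */
	int working_paths;		/* number of paths that are usable */
};

/* Assumption: a PG with no working paths is treated as priority 0. */
static int effective_prio(const struct pg_sketch *pg)
{
	return pg->working_paths > 0 ? pg->prio_from_prioritizer : 0;
}

/*
 * Return the index of the preferred PG, or -1 if there is none.
 * The comparison is strict, so a PG with prio <= 0 never displaces
 * an earlier zero-path PG.
 */
static int pick_preferred(const struct pg_sketch *pgs, size_t n)
{
	int best = -1;
	int best_prio = 0;

	for (size_t i = 0; i < n; i++) {
		int p = effective_prio(&pgs[i]);

		if (best < 0 || p > best_prio) {
			best = (int)i;
			best_prio = p;
		}
	}
	return best;
}
```

In this sketch, a healthy PG that the prioritizer rates 0 loses against
an earlier PG that has no working paths at all, and a healthy PG rated
-2 loses against a zero-path PG in any position.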

The only way to prevent the kernel from selecting a particular PG is to
set all paths in the PG to failed state, or to remove it altogether.
multipathd could try to set the PG to "disabled" state, but currently
it doesn't, and if it did, it wouldn't have the expected effect,
because "disabled" really just means "bypassed" in device mapper. A
"bypassed" PG will be selected for I/O if no other PG has healthy
paths. (Side note: "bypassed" might actually be a reasonable PG state
to use for a PG consisting only of GHOST paths, but we don't do that
today).
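
The "bypassed" semantics described above can be sketched as a two-pass
selection. This is a simplified illustration, not the dm-mpath source;
the names are hypothetical, and the assumption is that PGs are scanned
in order and the first one with healthy paths wins.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a device mapper priority group. */
struct dm_pg_sketch {
	bool bypassed;		/* the "disabled" state seen from user space */
	int healthy_paths;	/* number of paths not in failed state */
};

/*
 * Return the index of the PG that would receive I/O, or -1 if no PG
 * has a healthy path. Pass 0 skips bypassed PGs; pass 1 considers
 * them as a last resort.
 */
static int choose_pg_sketch(const struct dm_pg_sketch *pgs, size_t n)
{
	for (int pass = 0; pass < 2; pass++) {
		for (size_t i = 0; i < n; i++) {
			if (pgs[i].healthy_paths <= 0)
				continue;
			if (pass == 0 && pgs[i].bypassed)
				continue;
			return (int)i;
		}
	}
	return -1;
}
```

A bypassed PG is skipped as long as some other PG has a healthy path,
but it is still chosen once every other PG has run out of them. Only a
PG whose paths are all failed is never chosen.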

Therefore, I think that it makes sense to add an "ana path checker" to
multipathd, which would detect NVMe paths in states not suitable for
I/O and fail them in device mapper. We don't want device mapper to try
these paths. I'm not quite sure about "inaccessible" state - your
statement above would imply that "inaccessible" shouldn't be failed. 
But the way I read the ANA spec (8.19.4), simply trying I/O through
"inaccessible" ports would be wrong. Rather, the path should be
monitored for a transition to either "optimized" or "non-optimized"
state. That matches the behavior of the kernel native NVMe multipath
driver, which AFAICS never attempts I/O through any paths which aren't
either "optimized" or "non-optimized", and makes no distinction between
"inaccessible" and "persistent loss" states.
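
A rough sketch of the decision such a checker could make, following
the kernel-native behavior described above. The enum and function
names are hypothetical and not part of multipath-tools; the numeric
ANA state codes are the ones from the NVMe specification. Treating
"change" like the other non-I/O states is an assumption of this
sketch.

```c
/* ANA state codes as defined by the NVMe specification. */
enum ana_state_sketch {
	ANA_SKETCH_OPTIMIZED		= 0x01,
	ANA_SKETCH_NONOPTIMIZED		= 0x02,
	ANA_SKETCH_INACCESSIBLE		= 0x03,
	ANA_SKETCH_PERSISTENT_LOSS	= 0x04,
	ANA_SKETCH_CHANGE		= 0x0f,
};

/* Hypothetical checker result; not the libmultipath path states. */
enum ana_verdict_sketch {
	ANA_VERDICT_USABLE,	/* leave the path active */
	ANA_VERDICT_FAIL,	/* fail the path in device mapper */
};

/*
 * Only "optimized" and "non-optimized" paths are usable for I/O.
 * "Inaccessible" and "persistent loss" are not distinguished, and
 * any unknown or reserved value is failed as well.
 */
static enum ana_verdict_sketch ana_check_sketch(int ana_state)
{
	switch (ana_state) {
	case ANA_SKETCH_OPTIMIZED:
	case ANA_SKETCH_NONOPTIMIZED:
		return ANA_VERDICT_USABLE;
	default:
		return ANA_VERDICT_FAIL;
	}
}
```

A failed path would still have to be re-checked periodically, so that
a transition back to "optimized" or "non-optimized" reinstates it.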

Cheers,
Martin

-- 
Dr. Martin Wilck <mwilck at suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)




