[dm-devel] RFC: one more time: SCSI device identification

Martin Wilck martin.wilck at suse.com
Thu Apr 22 09:07:15 UTC 2021


On Wed, 2021-04-21 at 22:46 -0400, Martin K. Petersen wrote:
> 
> Martin,
> 
> > Hm, it sounds intriguing, but it has issues in its own right. For
> > years to come, user space will have to probe whether these attributes
> > exist, and fall back to the current ones ("wwid", "vpd_pg83")
> > otherwise. So user space can't be simplified any time soon. Speaking
> > for an important user space consumer of WWIDs (multipathd), I doubt
> > that this would improve matters for us. We'd be happy if the kernel
> > could just pick the "best" designator for us. But I understand that
> > the kernel can't guarantee a good choice (user space can't either).
> 
> But user space can be adapted at runtime to pick one designator over the
> other (ha!).

And that's exactly the problem. Effectively, all user space relies on
udev today, because that's where this "adaptation" is taking place. It
happens

 1) either in systemd's scsi_id built-in 
   (https://github.com/systemd/systemd/blob/7feb1dd6544d1bf373dbe13dd33cf563ed16f891/src/udev/scsi_id/scsi_serial.c#L37)
 2) or in the udev rules coming with sg3_utils 
   (https://github.com/hreinecke/sg3_utils/blob/master/scripts/55-scsi-sg3_id.rules)

1) is just as opaque and un-"adaptable" as the kernel, and the logic is
suboptimal. 2) is of course "adaptable", but that's a problem in
practice if udev fails to provide a WWID. multipath-tools go through
various twists for this case to figure out "fallback" WWIDs, guessing
whether that "fallback" matches what udev would have returned if it had
worked.

That's the gist of it - the general frustration about udev among some
of its heaviest users (talk to the LVM2 maintainers).

I suppose 99.9% of users never bother with customizing the udev rules.
IOW, these users might as well just use a kernel-provided value. But
the remaining 0.1% causes headaches for user-space applications, which
can't make solid assumptions about the rules. Thus, in a way, the
flexibility of the rules does more harm than it helps.

> We could do that in the kernel too, of course, but I'm afraid what the
> resulting BLIST changes would end up looking like over time.

That's something we want to avoid, sure.

But we can actually combine both approaches. If "wwid" yields a good
value most of the time (which is true IMO), we could make user space
rely on it by default, and make it possible to set a udev property
(e.g. ENV{ID_LEGACY}="1") to tell udev rules to determine WWID
differently. User-space apps like multipath could check the ID_LEGACY
property to determine whether or not reading the "wwid" attribute would
be consistent with udev. That would simplify matters a lot for us (Ben,
do you agree?), without the need of adding endless BLIST entries.
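
To make the idea concrete, here is a minimal sketch of the check a
user-space consumer could perform. It is an illustration only, not
multipath-tools code: ID_LEGACY is merely the property name proposed
above, the function names are made up, and using ID_SERIAL as the
udev-provided fallback is an assumption.

```python
from pathlib import Path
from typing import Mapping, Optional


def wwid_source(udev_props: Mapping[str, str]) -> str:
    """Decide where to take the WWID from.

    Returns "udev" if the (proposed, hypothetical) ID_LEGACY property is
    set to "1", otherwise "sysfs".
    """
    return "udev" if udev_props.get("ID_LEGACY") == "1" else "sysfs"


def get_wwid(udev_props: Mapping[str, str],
             sysfs_device_dir: Path,
             udev_key: str = "ID_SERIAL") -> Optional[str]:
    """Return the WWID consistent with what the udev rules would use.

    sysfs_device_dir is the SCSI device's sysfs directory, i.e. the one
    containing the "wwid" attribute. udev_key is an assumption about
    which property the legacy rules fill in.
    """
    if wwid_source(udev_props) == "udev":
        # Legacy mode: trust whatever the udev rules determined.
        return udev_props.get(udev_key)
    try:
        # Default mode: read the kernel-provided attribute directly.
        value = (sysfs_device_dir / "wwid").read_text().strip()
    except OSError:
        # Attribute missing or unreadable.
        return None
    return value or None
```

With something like this, the application never has to guess which
logic udev applied: either the property is set and the udev value is
authoritative, or it is not set and the sysfs attribute is.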


> I am also very concerned about changing what the kernel currently
> exports in a given variable like "wwid". A seemingly innocuous change to
> the reported value could lead to a system no longer booting after
> updating the kernel.

AFAICT, no major distribution uses "wwid" for this purpose (yet). I
just recently realized that the kernel's ALUA code refers to it. (*)

In a recent discussion with Hannes, the idea came up that the priority
of "SCSI name string" designators should actually depend on their
subtype. "naa." name strings should map to the respective NAA
descriptors, and "eui.", likewise (only "iqn." descriptors have no
binary counterpart; we thought they should rather be put below NAA,
prio-wise).
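
A rough sketch of what such a subtype-dependent priority could look
like is below. The numeric priorities are invented for illustration and
are not the values used by designator_prio(); the function and table
names are hypothetical as well. The sketch assumes that "naa." is
followed by hex digits whose first digit is the NAA type (2, 3, 5 or
6), and that "eui." is followed by 16, 24 or 32 hex digits (8, 12 or 16
bytes). Placing "iqn." below every NAA type is one possible reading of
"below NAA".

```python
# Illustrative priorities only (higher wins); not the kernel's values.
NAA_PRIO = {
    "6": 80,  # NAA IEEE Registered Extended
    "5": 60,  # NAA IEEE Registered
    "2": 50,  # NAA IEEE Extended
    "3": 20,  # NAA Locally Assigned
}
EUI_PRIO = {
    32: 70,  # EUI-64 based, 16 bytes
    24: 65,  # EUI-64 based, 12 bytes
    16: 40,  # EUI-64 based, 8 bytes
}
IQN_PRIO = 10  # no binary counterpart; ranked below all NAA types here


def name_string_prio(name: str) -> int:
    """Map a SCSI name string to the priority of its binary counterpart."""
    prefix, body = name[:4].lower(), name[4:]
    if prefix == "naa." and body:
        # The first hex digit of the identifier is the NAA type.
        return NAA_PRIO.get(body[0], 0)
    if prefix == "eui.":
        # The number of hex digits tells the EUI-64 variant.
        return EUI_PRIO.get(len(body), 0)
    if prefix == "iqn.":
        return IQN_PRIO
    return 0  # unknown or malformed name string


def best_name_string(names):
    """Pick the highest-priority name string from an iterable."""
    return max(names, key=name_string_prio)
```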

I wonder if you'd agree with a change made that way for "wwid". I
suppose you don't. I'd then propose to add a new attribute following
this logic. It could simply be an additional attribute with a different
name. Or this new attribute could be a property of the block device
rather than the SCSI device, like NVMe does it
(/sys/block/nvme0n2/wwid).

I don't like the idea of having separate sysfs attributes for
designators of different types, that's impractical for user space.

> But taking a step back: Other than "it's not what userland currently
> does", what specifically is the problem with designator_prio()? We've
> picked the priority list once and for all. If we promise never to change
> it, what is the issue?

If the prioritization in kernel and user space were the same, we could
migrate away from udev more easily without risking boot failure.

Thanks,
Martin

(*) which is an argument for using "wwid" in user space too - just to
be consistent with the kernel's internal logic.

-- 
Dr. Martin Wilck <mwilck at suse.com>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer