[dm-devel] RFC: one more time: SCSI device identification

Erwin van Londen erwin at erwinvanlonden.net
Tue Apr 27 03:48:36 UTC 2021



On Mon, 2021-04-26 at 13:16 +0000, Martin Wilck wrote:
> On Mon, 2021-04-26 at 13:14 +0200, Ulrich Windl wrote:
> > > > 
> > > 
> > > While we're at it, I'd like to mention another issue: WWID
> > > changes.
> > > 
> > > This is a big problem for multipathd. The gist is that the device
> > > identification attributes in sysfs only change after rescanning
> > > the
> > > device. Thus if a user changes LUN assignments on a storage
> > > system,
> > > it can happen that a direct INQUIRY returns a different WWID as
> > > in
> > > sysfs, which is fatal. If we plan to rely more on sysfs for
> > > device
> > > identification in the future, the problem gets worse. 
> > 
> > I think many devices rely on the fact that they are identified by
> > Vendor/model/serial_nr, because in most professional SAN storage
> > systems you
> > can pre-set the serial number to a custom value; so if you want a
> > new
> > disk
> > (maybe a snapshot) to be compatible with the old one, just assign
> > the
> > same
> > serial number. I guess that's the idea behind.
> 
> What you are saying sounds dangerous to me. If a snapshot has the
> same
> WWID as the device it's a snapshot of, it must not be exposed to any
> host(s) at the same time with its origin, otherwise the host may
> happily combine it with the origin into one multipath map, and data
> corruption will almost certainly result. 
> 
> My argument is about how the host is supposed to deal with a WWID
> change if it happens. Here, "WWID change" means that a given H:C:T:L
> suddenly exposes different device designators than it used to, while
> this device is in use by a host. Here, too, data corruption is
> imminent, and can happen in a blink of an eye. To avoid this, several
> things are needed:
> 
>  1) the host needs to get notified about the change (likely by an UA
> of
> some sort)
>  2) the kernel needs to react to the notification immediately, e.g.
> by
> blocking IO to the device,
>  3) userspace tooling such as udev or multipathd need to figure out
> how
> to  how to deal with the situation cleanly, and eventually unblock
> it.
> 
> Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
> afaics.
> 
In my view the WWID should never change. If a snapshot is created it
should either obtain a new WWID. An example out of a Hitachi array is

Device Identification VPD page:
Addressed logical unit:
designator type: T10 vendor identification, code set: ASCII
vendor id: HITACHI 
vendor specific: 50403B050709
designator type: NAA, code set: Binary
0x60060e80123b050050403b0500000709

The majority of the naa wwid is tied to the storage subsystem and
identifies the vendor oui, model, serial etc. The last 4 in this
example indicate the LDEV ID (Sorry mainframe heritage here..). When a
snapshot is taken these 4 will change as a new LDEV ID is assigned to
the snapshot. This sort of behaviour should be consistent across all
storage vendors imho.

> Martin
> 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/dm-devel/attachments/20210427/c5727866/attachment.htm>


More information about the dm-devel mailing list