[dm-devel] Antw: [EXT] Re: RFC: one more time: SCSI device identification

Tue Apr 27 20:04:19 UTC 2021

On Tue, 2021-04-27 at 12:52 +0200, Ulrich Windl wrote:
> >>> Hannes Reinecke <hare at suse.de> wrote on 27.04.2021 at 10:21 in message
> <2a6903e4-ff2b-67d5-e772-6971db8448fb at suse.de>:
> > On 4/27/21 10:10 AM, Martin Wilck wrote:
> > > On Tue, 2021‑04‑27 at 13:48 +1000, Erwin van Londen wrote:
> > > > > 
> > > > > Wrt 1), we can only hope that it's the case. But 2) and 3)
> > > > > need work,
> > > > > afaics.
> > > > > 
> > > > 
> > > > In my view the WWID should never change. 
> > > 
> > > In an ideal world, perhaps not. But in the dm-multipath realm, we know
> > > that WWID changes can happen with certain storage arrays. See
> > > https://listman.redhat.com/archives/dm-devel/2021-February/msg00116.html
> > > and follow-ups, for example.
> > > 
> > 
> > And it's actually something which might happen quite easily.
> > The storage array can unmap a LUN, delete it, create a new one, and map
> > that one to the same LUN number as the old one.
> > If we didn't do I/O during that interval, the next I/O will get the
> > dreaded 'Power-On/Reset' sense code.
> > _And nothing else_, due to the arcane rules for sense code generation in
> > SAM.
> > But we end up with a completely different device.
> > 
> > The only way out of it is to do a rescan for every POR sense code, and
> > disable the device, e.g. via DID_NO_CONNECT, whenever we find that the
> > identification has changed. We already have a copy of the original VPD
> > page 0x83 at hand, so that should be reasonably easy.
> 
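
The comparison described above could look roughly like this. It is only a
userspace Python sketch, not the kernel change being proposed:
parse_vpd83() and identity_changed() are hypothetical names, the field
layout follows the Device Identification VPD page (0x83) as I understand
it from SPC, and it assumes the raw page is already available as bytes
(for instance from the vpd_pg83 sysfs attribute, on kernels that provide
it).

```python
from typing import List, Tuple

# (association, designator_type, code_set, designator_bytes)
Designator = Tuple[int, int, int, bytes]


def parse_vpd83(page: bytes) -> List[Designator]:
    """Split a raw VPD page 0x83 buffer into its designation descriptors."""
    if len(page) < 4 or page[1] != 0x83:
        raise ValueError("not a VPD page 0x83 buffer")
    # Bytes 2-3 hold the page length (big-endian), excluding the 4-byte header.
    end = min(len(page), 4 + int.from_bytes(page[2:4], "big"))
    found: List[Designator] = []
    off = 4
    while off + 4 <= end:
        code_set = page[off] & 0x0F
        association = (page[off + 1] >> 4) & 0x03
        dtype = page[off + 1] & 0x0F
        dlen = page[off + 3]
        if off + 4 + dlen > end:
            break  # truncated descriptor, stop parsing
        found.append((association, dtype, code_set,
                      bytes(page[off + 4:off + 4 + dlen])))
        off += 4 + dlen
    return found


def identity_changed(saved_page: bytes, fresh_page: bytes) -> bool:
    """Compare only designators associated with the logical unit (assoc. 0)."""
    def lu_ids(page: bytes) -> List[Designator]:
        return sorted(d for d in parse_vpd83(page) if d[0] == 0)
    return lu_ids(saved_page) != lu_ids(fresh_page)
```

The idea would be to call identity_changed() with the page saved at
discovery time and the page read again after a Power-On/Reset sense code;
a True result would mean the LUN number now refers to a different device.
Restricting the comparison to logical-unit designators is a choice made
for this sketch, so that target-port identifiers do not count as an
identity change.
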
> I don't know the depth of the SCSI or FC protocol, but storage systems
> typically signal such events, maybe either via some unit attention or
> some FC event. Older kernels logged that there was a change, but a manual
> SCSI bus scan was needed, while newer kernels find new devices
> "automagically" for some products. The HP EVA 6000 series worked that
> way, a 3PAR StoreServ 8000 series also seems to work that way, but not
> Pure Storage X70 R3. For the latter you need something like an FC LIP to
> make the kernel detect the new devices (LUNs).
> I'm unsure where the problem is, but in principle the kernel can be
> notified...

There has to be some command on which the Unit Attention status
can be returned. (In a multipath configuration, the path checker
commands may do this.) In the absence of a command, there is no
asynchronous mechanism in SCSI to report the status.

On FC, events related to finding a remote port will trigger a rescan.

-Ewan

> 
> > 
> > I had a rather lengthy discussion with Fred Knight @ NetApp about
> > Power-On/Reset handling, what with him complaining that we don't handle
> > it correctly. So this really is something we should be looking into,
> > even independently of multipathing.
> > 
> > But actually I like the idea from Martin Petersen to expose the parsed
> > VPD identifiers to sysfs; that would allow us to drop sg_inq completely
> > from the udev rules.
> 
> Talking of VPDs: Somewhere in the last 12 years (within SLES 11) there
> was a kernel change regarding trailing blanks in VPD data. That change
> broke several configurations, which could no longer re-recognize their
> devices. In one case the software had even bound a license to a specific
> device by serial number, and that software found "new" devices while
> missing the "old" ones...
> 
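
One way for userspace to be robust against this kind of change is to
normalize the serial before comparing or storing it. The following is a
minimal Python sketch under the assumption that the two values differ
only in trailing space or NUL padding; normalize_serial() and
same_serial() are hypothetical helpers, not part of any existing tool.

```python
def normalize_serial(raw: bytes) -> str:
    """Decode a serial-number field and drop trailing space/NUL padding."""
    return raw.decode("ascii", errors="replace").rstrip(" \x00")


def same_serial(old: bytes, new: bytes) -> bool:
    """True if both fields name the same serial once padding is ignored."""
    return normalize_serial(old) == normalize_serial(new)
```

With this, a serial stored as b"SGH123XYZ   " would still match a later
reading of b"SGH123XYZ". Leading blanks are deliberately left alone here,
since the change mentioned above was about trailing blanks only.
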
> Regards,
> Ulrich
> 
> > 
> > Cheers,
> > 
> > Hannes
> > -- 
> > Dr. Hannes Reinecke		        Kernel Storage Architect
> > hare at suse.de			               +49 911 74053 688
> > SUSE Software Solutions Germany GmbH, 90409 Nürnberg
> > GF: F. Imendörffer, HRB 36809 (AG Nürnberg)
> 
> 
>