[dm-devel] path coalescing once more

Stewart, Sean Sean.Stewart at netapp.com
Thu May 15 16:48:13 UTC 2014


Hi Brian,

On Tue, 2014-05-13 at 16:55 -0700, Brian Bunker wrote:
> We continue to run into problems with the device-mapper presumably
> putting LUNs which do not belong with the dm under its control. Here
> is the latest (I am picking out just dm-2 but there are others in this
> same state):

> 
> I see the following kernel messages in the syslog when the device sdaj
> arrives to when it is put in the wrong place:
> May 13 11:28:27 rb9init4 kernel: scsi 1:0:0:1: Direct-Access     PURE     FlashArray       400B PQ: 0 ANSI: 6
> May 13 11:28:27 rb9init4 kernel: sd 1:0:0:1: [sdaj] 1048576000 512-byte logical blocks: (536 GB/500 GiB)
> May 13 11:28:27 rb9init4 kernel: sd 1:0:0:1: Attached scsi generic sg1 type 0
> May 13 11:28:27 rb9init4 kernel: sd 1:0:0:1: [sdaj] Write Protect is off
> May 13 11:28:27 rb9init4 kernel: sd 1:0:0:1: [sdaj] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
> May 13 11:28:27 rb9init4 kernel: sdaj:
> May 13 11:28:27 rb9init4 kernel: unknown partition table
> May 13 11:28:27 rb9init4 kernel: sd 1:0:0:1: [sdaj] Attached SCSI disk
> May 13 11:28:28 rb9init4 multipathd: sdaj: add path (uevent)
> May 13 11:28:28 rb9init4 multipathd: sdaj path added to devmap 3624a9370c90d0d631ef8783e00010004
> 
I ran through the code a little the other day, and I don't see how it
could be making a mistake here.  It runs scsi_id, places the result in
a 128-character buffer, and then compares that buffer with strncmp
against the WWID of each existing mpath device.
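
To make the comparison concrete, here is a rough sketch of the kind of
lookup I mean.  This is not the multipath-tools source: the function
name, the array layout and the WWID_SIZE macro are made up for
illustration, and only the 128-character buffer and the strncmp call
come from what I saw in the code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define WWID_SIZE 128  /* buffer size mentioned above; macro name is hypothetical */

/*
 * Hypothetical helper: return the index of the first map whose WWID
 * matches path_wwid, or -1 if no map matches.
 */
int find_map_by_wwid(const char *path_wwid,
                     const char maps[][WWID_SIZE], int nmaps)
{
    assert(path_wwid != NULL);

    for (int i = 0; i < nmaps; i++) {
        /* Compare at most WWID_SIZE bytes; strncmp stops at the first NUL. */
        if (strncmp(path_wwid, maps[i], WWID_SIZE) == 0)
            return i;
    }
    return -1;
}
```

With a lookup like this, a path whose WWID ends in 0002 can only land
in the map ending in 0004 if the string handed to the comparison
already ended in 0004, which points back at what scsi_id returned at
that moment.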
> 
> I don’t understand where it gets 3624a9370c90d0d631ef8783e00010004.
> When I run the ‘multipath -v6 -d’ so that it prints out what it wants
> to do but doesn’t do it, I see:

> May 13 16:45:15 | sdaj: getuid = /lib/udev/scsi_id --whitelisted
> --device=/dev/%n (config file default)
> May 13 16:45:15 | sdaj: uid = 3624a9370c90d0d631ef8783e00010002
> (callout)
> May 13 16:45:15 | sdaj: state = running
> May 13 16:45:15 | sdaj: detect_prio = 1 (config file default)
> May 13 16:45:15 | sdaj: prio = const (config file default)
> May 13 16:45:15 | sdaj: const prio = 1
> 
Running this command later makes it do the same thing: it runs scsi_id,
which issues an inquiry to get the WWID.  I would think the only reason
the two values should differ is that the inquiry returned a different
value when multipathd ran it at 11:28 than when you ran it through
multipath at 16:45.

In order to see that, we'd probably need to set verbosity 3 in the
defaults section of multipath.conf, restart the daemon, and try it
again.  Does anyone else have any thoughts on this?
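
For reference, the multipath.conf change would look roughly like the
fragment below.  It assumes you merge the line into your existing
defaults section rather than adding a second one, and that nothing else
in that section needs to change.

```
defaults {
        verbosity 3
}
```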
> 
> So it seems to know that its serial ends in “02” and not “04” like
> where it put the device. I don’t understand how to debug this further,
> so any help would be appreciated.


Thanks,
Sean Stewart





