[dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err

lixiaokeng lixiaokeng at huawei.com
Tue Jan 26 11:14:04 UTC 2021


>> Hi,
>>   Unfortunately the verify_path() called before *and* after domap()
>> in
>> coalesce_paths can't solve this problem. I think it is another way to
>> lead multipath with wrong path, but now I can't find the way from
>> log.
> 
> Can you provide multipathd -v3 logs, and kernel logs? Maybe I'll see
> something.
> 

Logs in first sence:
Jan 26 12:37:47 client1 multipathd[2532437]: reconfigure (operator)
Jan 25 12:37:47 client1 multipathd[2532437]: 3600140584e11eb1818c4afab12c17800: remove multipath map
Jan 25 12:37:47 client1 multipathd[2532437]: sdh: orphan path, map flushed
Jan 25 12:37:47 client1 multipathd[2532437]: BUG: orphaning path sdh that holds hwe of 3600140584e11eb1818c4afab12c17800
Jan 25 12:37:47 client1 multipathd[2532437]: 3600140531f063b3e19349bc82028e0cc: remove multipath map
Jan 25 12:37:47 client1 multipathd[2532437]: sdn: orphan path, map flushed
Jan 25 12:37:47 client1 multipathd[2532437]: BUG: orphaning path sdn that holds hwe of 3600140531f063b3e19349bc82028e0cc
Jan 25 12:37:47 client1 multipathd[2532437]: 36001405ca5165367d67447ea68108e1d: remove multipath map
Jan 25 12:37:47 client1 multipathd[2532437]: sdi: orphan path, map flushed
Jan 25 12:37:47 client1 multipathd[2532437]: BUG: orphaning path sdi that holds hwe of 36001405ca5165367d67447ea68108e1d
Jan 25 12:37:47 client1 multipathd[2532437]: 8:208: cannot find block device
Jan 25 12:37:47 client1 multipathd[2532437]: 3600140531f063b3e19349bc82028e0cc: remove multipath map
Jan 25 12:37:47 client1 multipathd[2532437]: 8:128: cannot find block device
Jan 25 12:37:47 client1 multipathd[2532437]: 36001405ca5165367d67447ea68108e1d: remove multipath map
Jan 25 12:37:47 client1 multipathd[2532437]: 8:112: cannot find block device
Jan 25 12:37:47 client1 multipathd[2532437]: 3600140584e11eb1818c4afab12c17800: remove multipath map
Jan 25 12:37:47 client1 multipathd[2532437]: sda: orphan path, blacklisted
Jan 25 12:37:47 client1 multipathd[2532437]: sdb: orphan path, blacklisted
Jan 25 12:37:47 client1 multipathd[2532437]: 3600140584e11eb1818c4afab12c17800: set ACT_CREATE (map does not exist)
Jan 25 12:37:47 client1 kernel: [1202538.089589] device-mapper: multipath: Reinstating path 8:208.
Jan 25 12:37:47 client1 kernel: [1202538.089606] lxk kernel params 0 1 alua 1 1 service-time 0 1 1 8:48 1
Jan 25 12:37:47 client1 kernel: [1202538.089607] name dm-3; majir:minor 253:3; dm-3
Jan 25 12:37:47 client1 kernel: [1202538.089688] device-mapper: multipath: Reinstating path 8:128.
Jan 25 12:37:47 client1 kernel: [1202538.089762] lxk load_table success 0
Jan 25 12:37:47 client1 kernel: [1202538.089768] device-mapper: multipath: Reinstating path 8:112.
Jan 25 12:37:47 client1 kernel: [1202538.090266] lxk kernel params 0 1 alua 1 1 service-time 0 1 1 8:48 1
Jan 25 12:37:47 client1 kernel: [1202538.090268] name dm-3; majir:minor 253:3; dm-3
Jan 25 12:37:47 client1 multipath[788570]: loading /lib64/multipath/libchecktur.so checker
Jan 25 12:37:47 client1 multipath[788570]: 8:112: cannot find block device
Jan 25 12:37:47 client1 multipath[788570]: 8:112: Empty device name
Jan 25 12:37:47 client1 multipath[788570]: 8:112: Empty device name
Jan 25 12:37:47 client1 multipath[788570]: get_udev_device: failed to look up 8:112 with type 1
Jan 25 12:37:47 client1 multipath[788570]: dm-3: no usable paths found
Jan 25 12:37:47 client1 multipath[788569]: loading /lib64/multipath/libchecktur.so checker
Jan 25 12:37:47 client1 multipath[788571]: loading /lib64/multipath/libchecktur.so checker
Jan 25 12:37:47 client1 multipath[788569]: 8:128: cannot find block device
Jan 25 12:37:47 client1 multipath[788569]: 8:128: Empty device name
Jan 25 12:37:47 client1 multipath[788569]: 8:128: Empty device name
Jan 25 12:37:47 client1 multipath[788569]: get_udev_device: failed to look up 8:128 with type 1
Jan 25 12:37:47 client1 multipath[788569]: dm-6: no usable paths found
Jan 25 12:37:47 client1 multipath[788571]: 8:208: cannot find block device
Jan 25 12:37:47 client1 multipath[788571]: 8:208: Empty device name
Jan 25 12:37:47 client1 multipath[788571]: 8:208: Empty device name
Jan 25 12:37:47 client1 multipath[788571]: get_udev_device: failed to look up 8:208 with type 1
Jan 25 12:37:47 client1 multipath[788571]: dm-4: no usable paths found
Jan 25 12:37:47 client1 multipath[788580]: loading /lib64/multipath/libchecktur.so checker
Jan 25 12:37:47 client1 kernel: [1202538.090340] lxk load_table success 0
Jan 25 12:37:47 client1 kernel: [1202538.128005] device-mapper: multipath: Failing path 8:208.
Jan 25 12:37:47 client1 kernel: [1202538.128937] device-mapper: multipath: Failing path 8:128.
Jan 25 12:37:47 client1 multipathd[2532437]: 3600140584e11eb1818c4afab12c17800: load table [0 20971520 multipath 0 1 alua 1 1 service-time 0 1 1 8:48 1]
Jan 25 12:37:47 client1 multipathd[2532437]: 3600140531f063b3e19349bc82028e0cc: set ACT_CREATE (map does not exist)
Jan 25 12:37:47 client1 kernel: [1202538.132361] scsi 3:0:0:0: alua: Detached
Jan 25 12:37:47 client1 kernel: [1202538.132370] lxk ref put after: name sdh; count 0
Jan 25 12:37:47 client1 kernel: [1202538.133024] EXT4-fs error: 1 callbacks suppressed
Jan 25 12:37:47 client1 kernel: [1202538.133027] EXT4-fs error (device dm-6): ext4_find_entry:1456: inode #131073: comm fsstress: reading directory lblock 0
Jan 25 12:37:47 client1 kernel: [1202538.133031] EXT4-fs: 12 callbacks suppressed
Jan 25 12:37:47 client1 kernel: [1202538.133033] EXT4-fs (dm-6): ext4_commit_super: ec42fbc81d447c94 78c83f8f086afab0 seed is 0xca479c5b
Jan 25 12:37:47 client1 kernel: [1202538.133550] lxk kernel params 0 1 alua 1 1 service-time 0 1 1 8:224 1
Jan 25 12:37:47 client1 kernel: [1202538.133552] name dm-4; majir:minor 253:4; dm-4
Jan 25 12:37:47 client1 kernel: [1202538.133682] lxk load_table success 0
Jan 25 12:37:47 client1 kernel: [1202538.133968] lxk kernel params 0 1 alua 1 1 service-time 0 1 1 8:224 1
Jan 25 12:37:47 client1 kernel: [1202538.133970] name dm-4; majir:minor 253:4; dm-4
Jan 25 12:37:47 client1 kernel: [1202538.134017] lxk load_table success 0
Jan 25 12:37:47 client1 kernel: [1202538.135805] EXT4-fs (dm-6): I/O error while writing superblock
Jan 25 12:37:47 client1 kernel: [1202538.136029] EXT4-fs error (device dm-6): ext4_find_entry:1456: inode #131073: comm fsstress: reading directory lblock 0
Jan 25 12:37:47 client1 kernel: [1202538.136032] EXT4-fs (dm-6): ext4_commit_super: ec42fbc81d447c94 78c83f8f086afab0 seed is 0xca479c5b
Jan 25 12:37:47 client1 kernel: [1202538.142127] EXT4-fs (dm-6): I/O error while writing superblock
Jan 25 12:37:47 client1 kernel: [1202538.143920] sd 4:0:0:0: alua: port group 00 state A non-preferred supports TOlUSNA
Jan 25 12:37:47 client1 kernel: [1202538.144911] EXT4-fs error (device dm-6): ext4_find_entry:1456: inode #131073: comm fsstress: reading directory lblock 0
Jan 25 12:37:47 client1 kernel: [1202538.144915] EXT4-fs (dm-6): ext4_commit_super: ec42fbc81d447c94 78c83f8f086afab0 seed is 0xca479c5b
Jan 25 12:37:47 client1 kernel: [1202538.145373] EXT4-fs (dm-6): I/O error while writing superblock
Jan 25 12:37:47 client1 kernel: [1202538.145960] EXT4-fs error (device dm-6): ext4_find_entry:1456: inode #131073: comm fsstress: reading directory lblock 0
Jan 25 12:37:47 client1 kernel: [1202538.145963] EXT4-fs (dm-6): ext4_commit_super: ec42fbc81d447c94 78c83f8f086afab0 seed is 0xca479c5b
Jan 25 12:37:47 client1 kernel: [1202538.146046] EXT4-fs (dm-6): I/O error while writing superblock
Jan 25 12:37:47 client1 kernel: [1202538.146195] EXT4-fs error (device dm-6): ext4_find_entry:1456: inode #131073: comm fsstress: reading directory lblock 0
Jan 25 12:37:47 client1 kernel: [1202538.146196] EXT4-fs (dm-6): ext4_commit_super: ec42fbc81d447c94 78c83f8f086afab0 seed is 0xca479c5b
Jan 25 12:37:47 client1 kernel: [1202538.148027] EXT4-fs (dm-6): I/O error while writing superblock
Jan 25 12:37:47 client1 kernel: [1202538.148920] EXT4-fs error (device dm-6): ext4_find_entry:1456: inode #131073: comm fsstress: reading directory lblock 0
Jan 25 12:37:47 client1 kernel: [1202538.149466] EXT4-fs error (device dm-6): ext4_find_entry:1456: inode #131073: comm fsstress: reading directory lblock 0
Jan 25 12:37:47 client1 kernel: [1202538.149578] EXT4-fs error (device dm-6): ext4_find_entry:1456: inode #131073: comm fsstress: reading directory lblock 0
Jan 25 12:37:48 client1 kernel: [1202538.149639] EXT4-fs error (device dm-6): ext4_find_entry:1456: inode #131073: comm fsstress: reading directory lblock 0
Jan 25 12:37:48 client1 kernel: [1202538.149673] EXT4-fs error (device dm-6): ext4_find_entry:1456: inode #131073: comm fsstress: reading directory lblock 0
Jan 25 12:37:48 client1 kernel: [1202538.149961] EXT4-fs warning (device dm-6): htree_dirblock_to_tree:995: inode #131073: lblock 0: comm fsstress: error -5 reading directory block
Jan 25 12:37:48 client1 kernel: [1202538.155906] scsi 3:0:0:1: alua: Detached
Jan 25 12:37:48 client1 kernel: [1202538.155914] lxk ref put after: name sdn; count 0
Jan 25 12:37:48 client1 kernel: [1202538.158130] lxk kernel params 0 1 alua 1 1 service-time 0 1 1 8:80 1
Jan 25 12:37:48 client1 kernel: [1202538.158132] name dm-5; majir:minor 253:5; dm-5
Jan 25 12:37:48 client1 kernel: [1202538.159310] dm destory name dm-7; majir:minor 253:7; dm-7
Jan 25 12:37:48 client1 kernel: [1202538.161448] lxk load_table success 0
Jan 25 12:37:48 client1 kernel: [1202538.163417] lxk kernel params 0 1 alua 1 1 service-time 0 1 1 8:64 1
Jan 25 12:37:48 client1 kernel: [1202538.163419] name dm-6; majir:minor 253:6; dm-6
Jan 25 12:37:48 client1 kernel: [1202538.163498] lxk load_table success 0
Jan 25 12:37:48 client1 kernel: [1202538.163972] dm destory name dm-5; majir:minor 253:5; dm-5
Jan 25 12:37:48 client1 kernel: [1202538.196041] sd 4:0:0:3: alua: port group 00 state A non-preferred supports TOlUSNA
Jan 25 12:37:48 client1 kernel: [1202538.200208] scsi 3:0:0:3: alua: Detached
Jan 25 12:37:48 client1 kernel: [1202538.224361] lxk ref put after: name sdi; count 0
Jan 25 12:37:48 client1 kernel: [1202538.225086] lxk kernel params 0 1 alua 1 1 service-time 0 1 1 8:64 1
Jan 25 12:37:48 client1 kernel: [1202538.225088] name dm-6; majir:minor 253:6; dm-6
Jan 25 12:37:48 client1 kernel: [1202538.226914] lxk load_table success 0
Jan 25 12:37:48 client1 kernel: [1202538.311125] sd 4:0:0:0: [sdd] Synchronizing SCSI cache
Jan 25 12:37:48 client1 kernel: [1202538.312397] sd 4:0:0:3: alua: port group 00 state A non-preferred supports TOlUSNA
Jan 25 12:37:48 client1 kernel: [1202538.350483] device-mapper: multipath: Failing path 8:48.
Jan 25 12:37:48 client1 kernel: [1202538.378721] sd 4:0:0:3: [sde] Synchronizing SCSI cache
Jan 25 12:37:48 client1 kernel: [1202538.404214] device-mapper: multipath: Failing path 8:64.
Jan 25 12:37:48 client1 kernel: [1202538.404656] EXT4-fs warning (device dm-6): htree_dirblock_to_tree:995: inode #131073: lblock 0: comm fsstress: error -5 reading directory block
Jan 25 12:37:48 client1 kernel: [1202538.405940] EXT4-fs warning (device dm-6): htree_dirblock_to_tree:995: inode #131073: lblock 0: comm fsstress: error -5 reading directory block
Jan 25 12:37:48 client1 kernel: [1202538.467014] sd 4:0:0:2: [sdf] Synchronizing SCSI cache
Jan 25 12:37:48 client1 kernel: [1202538.568195] scsi 4:0:0:2: alua: Detached
Jan 25 12:37:48 client1 kernel: [1202538.569470] sd 4:0:0:1: [sdo] Synchronizing SCSI cache
Jan 25 12:37:48 client1 kernel: [1202538.621389] scsi host2: iSCSI Initiator over TCP/IP
Jan 25 12:37:48 client1 kernel: [1202538.625805] scsi 2:0:0:0: Direct-Access     LIO-ORG  disk1            4.0  PQ: 0 ANSI: 5
Jan 25 12:37:48 client1 kernel: [1202538.627294] scsi 2:0:0:0: alua: supports implicit and explicit TPGS
Jan 25 12:37:48 client1 kernel: [1202538.627302] scsi 2:0:0:0: alua: device naa.600140584e11eb1818c4afab12c17800 port group 0 rel port 1
Jan 25 12:37:48 client1 kernel: [1202538.627875] sd 2:0:0:0: Attached scsi generic sg2 type 0
Jan 25 12:37:48 client1 kernel: [1202538.630344] scsi 2:0:0:3: Direct-Access     LIO-ORG  disk4            4.0  PQ: 0 ANSI: 5
Jan 25 12:37:48 client1 kernel: [1202538.630507] sd 2:0:0:0: [sdf] 20971520 512-byte logical blocks: (10.7 GB/10.0 GiB)
Jan 25 12:37:48 client1 kernel: [1202538.630655] sd 2:0:0:0: [sdf] Write Protect is off
Jan 25 12:37:48 client1 kernel: [1202538.630787] scsi 2:0:0:3: alua: supports implicit and explicit TPGS
Jan 25 12:37:48 client1 kernel: [1202538.630793] scsi 2:0:0:3: alua: device naa.6001405ca5165367d67447ea68108e1d port group 0 rel port 1
Jan 25 12:37:48 client1 kernel: [1202538.630914] sd 2:0:0:0: [sdf] Write cache: enabled, read cache: enabled, supports DPO and FUA
Jan 25 12:37:48 client1 kernel: [1202538.631010] sd 2:0:0:3: Attached scsi generic sg3 type 0
Jan 25 12:37:48 client1 kernel: [1202538.632201] scsi 2:0:0:2: Direct-Access     LIO-ORG  disk3            4.0  PQ: 0 ANSI: 5
Jan 25 12:37:48 client1 kernel: [1202538.632388] sd 2:0:0:0: [sdf] Optimal transfer size 33488896 bytes
Jan 25 12:37:48 client1 kernel: [1202538.632590] scsi 2:0:0:2: alua: supports implicit and explicit TPGS
Jan 25 12:37:48 client1 kernel: [1202538.632595] scsi 2:0:0:2: alua: device naa.6001405b7679bd96b094bccbf971bc90 port group 0 rel port 1
Jan 25 12:37:48 client1 kernel: [1202538.632798] sd 2:0:0:2: Attached scsi generic sg4 type 0
Jan 25 12:37:48 client1 kernel: [1202538.633665] scsi 2:0:0:1: Direct-Access     LIO-ORG  disk2            4.0  PQ: 0 ANSI: 5
Jan 25 12:37:48 client1 kernel: [1202538.633869] sd 2:0:0:3: [sdh] 20971520 512-byte logical blocks: (10.7 GB/10.0 GiB)
Jan 25 12:37:48 client1 kernel: [1202538.634858] scsi 2:0:0:1: alua: supports implicit and explicit TPGS
Jan 25 12:37:48 client1 kernel: [1202538.634864] scsi 2:0:0:1: alua: device naa.600140531f063b3e19349bc82028e0cc port group 0 rel port 1
Jan 25 12:37:48 client1 kernel: [1202538.635089] sd 2:0:0:1: Attached scsi generic sg5 type 0
Jan 25 12:37:48 client1 kernel: [1202538.635295] sd 2:0:0:3: [sdh] Write Protect is off
Jan 25 12:37:48 client1 kernel: [1202538.635851] sd 2:0:0:2: [sdi] 20971520 512-byte logical blocks: (10.7 GB/10.0 GiB)
Jan 25 12:37:48 client1 kernel: [1202538.636726] sd 2:0:0:2: [sdi] Write Protect is off
Jan 25 12:37:48 client1 kernel: [1202538.636763] sd 2:0:0:0: alua: port group 00 state A non-preferred supports TOlUSNA
Jan 25 12:37:48 client1 kernel: [1202538.636988] sd 2:0:0:2: [sdi] Write cache: enabled, read cache: enabled, supports DPO and FUA
Jan 25 12:37:48 client1 kernel: [1202538.638092] sd 2:0:0:1: [sdk] 20971520 512-byte logical blocks: (10.7 GB/10.0 GiB)
Jan 25 12:37:48 client1 kernel: [1202538.638188] sd 2:0:0:2: [sdi] Optimal transfer size 33488896 bytes
Jan 25 12:37:48 client1 kernel: [1202538.638271] sd 2:0:0:1: [sdk] Write Protect is off
Jan 25 12:37:48 client1 kernel: [1202538.638454] sd 2:0:0:1: [sdk] Write cache: enabled, read cache: enabled, supports DPO and FUA
Jan 25 12:37:48 client1 kernel: [1202538.638618] sd 2:0:0:1: [sdk] Optimal transfer size 33488896 bytes
Jan 25 12:37:48 client1 kernel: [1202538.639700] sd 2:0:0:3: [sdh] Write cache: enabled, read cache: enabled, supports DPO and FUA
Jan 25 12:37:48 client1 kernel: [1202538.640445] sd 2:0:0:3: alua: port group 00 state A non-preferred supports TOlUSNA
Jan 25 12:37:48 client1 kernel: [1202538.640798] sd 2:0:0:3: [sdh] Optimal transfer size 33488896 bytes
Jan 25 12:37:48 client1 kernel: [1202538.648538] sd 2:0:0:1: alua: port group 00 state A non-preferred supports TOlUSNA
Jan 25 12:37:48 client1 kernel: [1202538.648550] sd 2:0:0:2: alua: port group 00 state A non-preferred supports TOlUSNA
Jan 25 12:37:48 client1 kernel: [1202538.808805] sd 2:0:0:1: [sdk] Attached SCSI disk
Jan 25 12:37:48 client1 kernel: [1202538.814309] sd 2:0:0:2: [sdi] Attached SCSI disk
Jan 25 12:37:48 client1 kernel: [1202538.820568] sd 2:0:0:3: [sdh] Attached SCSI disk
Jan 25 12:37:48 client1 kernel: [1202538.825745] sd 2:0:0:0: [sdf] Attached SCSI disk
Jan 25 12:37:48 client1 kernel: [1202539.030796] EXT4-fs warning (device dm-6): htree_dirblock_to_tree:995: inode #131073: lblock 0: comm fsstress: error -5 reading directory block
Jan 25 12:37:48 client1 kernel: [1202539.033019] EXT4-fs warning (device dm-6): htree_dirblock_to_tree:995: inode #131073: lblock 0: comm fsstress: error -5 reading directory block
Jan 25 12:37:48 client1 kernel: [1202539.033378] EXT4-fs warning (device dm-6): htree_dirblock_to_tree:995: inode #131073: lblock 0: comm fsstress: error -5 reading directory block
Jan 25 12:37:48 client1 kernel: [1202539.039261] EXT4-fs warning (device dm-6): htree_dirblock_to_tree:995: inode #131073: lblock 0: comm fsstress: error -5 reading directory block
Jan 25 12:37:48 client1 kernel: [1202539.039337] EXT4-fs warning (device dm-6): htree_dirblock_to_tree:995: inode #131073: lblock 0: comm fsstress: error -5 reading directory block
Jan 25 12:37:48 client1 kernel: [1202539.039377] EXT4-fs warning (device dm-6): htree_dirblock_to_tree:995: inode #131073: lblock 0: comm fsstress: error -5 reading directory block
Jan 25 12:37:48 client1 kernel: [1202539.040013] device-mapper: multipath: Failing path 8:224.
Jan 25 12:37:48 client1 kernel: [1202539.116181] EXT4-fs warning (device dm-6): htree_dirblock_to_tree:995: inode #131073: lblock 0: comm fsstress: error -5 reading directory block
Jan 25 12:37:48 client1 multipath[788595]: loading /lib64/multipath/libchecktur.so checker
Jan 25 12:37:48 client1 multipathd[2532437]: 3600140531f063b3e19349bc82028e0cc: load table [0 20971520 multipath 0 1 alua 1 1 service-time 0 1 1 8:224 1]
Jan 25 12:37:48 client1 multipathd[2532437]: 36001405b7679bd96b094bccbf971bc90: set ACT_CREATE (map does not exist)
Jan 25 12:37:48 client1 multipathd[2532437]: 36001405b7679bd96b094bccbf971bc90: load table [0 20971520 multipath 0 1 alua 1 1 service-time 0 1 1 8:80 1]
Jan 25 12:37:48 client1 multipathd[2532437]: 36001405ca5165367d67447ea68108e1d: set ACT_CREATE (map does not exist)
Jan 25 12:37:48 client1 multipathd[2532437]: 36001405ca5165367d67447ea68108e1d: load table [0 20971520 multipath 0 1 alua 1 1 service-time 0 1 1 8:64 1]



(1)multipath -r: The sdf is found as a path of 36001405b7679bd96b094bccbf971bc90
(iscsi node is 4:0:0:2)

(2)iscsi logout: The sdf is removed in iscsi in system time [1202538.467014].

(3)iscsi login: The sdf appears in iscsi in system time [1202538.825745].
It is a path of 3600140584e11eb1818c4afab12c17800 (iscsi node 2:0:0:0)

Here I have a doubt. When I stop in domap using gdb and iscsi log out/in,
the sdf will not  be used again becasue the disk refcount is not zero. I
add a print if the disk refcount is zero in put_disk_and_module (for
example lxk ref put after: name sdi; count 0), but there is not this print
about sdf.

(4)domap successd.

Do you have any advice about this?

Regards;
Lixiaokeng





More information about the dm-devel mailing list