[dm-devel] BUG: unable to handle kernel paging request

Igor Druzhinin igor.druzhinin at citrix.com
Tue Jan 10 19:54:37 UTC 2017


Hi,

During a multipath failover we are getting a crash like the one below.
The host boots from an FCoE device, so the root disk is a remote LUN on
the network. The failover is intentional: we disconnect all the network
links for a short period of time while still queuing all the outstanding
requests during the outage. In most cases, we only receive one or two
messages like this one:

[  100.888455] blk_update_request: I/O error, dev dm-0, sector 55432

which is normal. Occasionally, though, we also get the following
output:

[  100.888479] Buffer I/O error on device dm-4, logical block 6673

which usually precedes the crash. (dm-0 is the root multipath device,
and dm-4 is an LVM device on top of it.) We are currently running a 4.4
kernel.

I'm looking for some clues to understand what is actually happening
there and would appreciate any advice. I can also provide any additional
information if it helps.

Apologies if this has already been fixed; I couldn't find any related
issue.

Thanks,
Igor

[  100.888401] Buffer I/O error on device dm-4, logical block 26030
[  100.888429] Buffer I/O error on device dm-4, logical block 26031
[  100.888455] blk_update_request: I/O error, dev dm-0, sector 55432
[  100.888479] Buffer I/O error on device dm-4, logical block 6673
[  100.888535] blk_update_request: I/O error, dev dm-0, sector 32648
[  100.888556] Buffer I/O error on device dm-4, logical block 3825
[  100.888583] Buffer I/O error on device dm-4, logical block 3826
[  100.888712] BUG: unable to handle kernel paging request at
ffffc900400b3048
[  100.888737] IP: [<ffffffffa00680e4>] map_request+0x34/0x230 [dm_mod]
[  100.888815] PGD 788c0067
[  100.888815] device-mapper: multipath: Failing path 8:192.
[  100.888861] PUD 18f09b067 PMD 18f09c067 PTE 0
[  100.888875] Oops: 0000 [#1] [  100.888914] device-mapper: multipath:
Failing path 8:16.

[  100.888938] SMP
[  100.888955] Modules linked in: openvswitch nf_defrag_ipv6 ipt_REJECT
nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_tcpudp xt_multiport
xt_conntrack nf_conntrack iptable_filter ipmi_devintf
x86_pkg_temp_thermal coretemp crc32_pclmul aesni_intel aes_x86_64
ablk_helper cryptd ipmi_si sg psmouse lrw hpilo ipmi_msghandler gf128mul
tpm_tis glue_helper sb_edac edac_core tpm wmi hed lpc_ich shpchp
i2c_i801 mfd_core nls_utf8 isofs nfsd auth_rpcgss oid_registry nfs_acl
lockd grace sunrpc ip_tables x_tables dm_service_time 8021q garp stp llc
mrp sd_mod uhci_hcd serio_raw bnx2x(O) xhci_pci ehci_pci mdio xhci_hcd
ehci_hcd vxlan ip6_udp_tunnel udp_tunnel hpsa(O) ptp scsi_transport_sas
pps_core libcrc32c dm_mirror dm_region_hash dm_log bnx2fc(O) cnic(O) uio
fcoe libfcoe libfc scsi_transport_fc scsi_dh_rdac scsi_dh_hp_sw
scsi_dh_emc scsi_dh_alua dm_multipath scsi_mod dm_mod ipv6 autofs4
[  100.889542] CPU: 2 PID: 701 Comm: kdmwork-253:0 Tainted: G
O    4.4.0+2 #1
[  100.889556] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 09/12/2016
[  100.889573] task: ffff88018b009c00 ti: ffff880183080000 task.ti:
ffff880183080000
[  100.889622] RIP: e030:[<ffffffffa00680e4>]  [<ffffffffa00680e4>]
map_request+0x34/0x230 [dm_mod]
[  100.889645] RSP: e02b:ffff880183083e30  EFLAGS: 00010286
[  100.889691] RAX: 0000000000000001 RBX: ffff880183136df0 RCX:
ffff880183137190
[  100.889702] RDX: ffff88018baa8000 RSI: ffff880183136e60 RDI:
ffff880183136df0
[  100.889713] RBP: ffff880183083e68 R08: ffff88018acf7800 R09:
000000018020000e
[  100.889764] R10: 000000008b914201 R11: ffffea00062e4500 R12:
ffff880183136c80
[  100.889774] R13: ffff88018baa8000 R14: ffffc900400b3040 R15:
ffff88018b009c00
[  100.889798] FS:  0000000000000000(0000) GS:ffff88018f840000(0000)
knlGS:ffff88018f840000
[  100.889842] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[  100.889851] CR2: ffffc900400b3048 CR3: 0000000181d2b000 CR4:
0000000000042660
[  100.889867] Stack:
[  100.889904]  ffff8801fffffffb 0000000000000000 ffff88018baa8000
ffff880183136c80
[  100.889921]  ffff88018b009c00 ffff88018b009c00 ffff88018b009c00
ffff880183083e88
[  100.889937]  ffffffffa0068302 ffff88018baa83c8 ffff88018b009c00
ffff880183083ec8
[  100.889989] Call Trace:
[  100.890014]  [<ffffffffa0068302>] map_tio_request+0x22/0x40 [dm_mod]
[  100.890065]  [<ffffffff8108e0ef>] kthread_worker_fn+0xcf/0x160
[  100.890076]  [<ffffffff8108e020>] ? kthread_create_on_node+0x180/0x180
[  100.890089]  [<ffffffff8108ddf5>] kthread+0xd5/0xe0
[  100.890132]  [<ffffffff8108dd20>] ? kthread_stop+0x110/0x110
[  100.890152]  [<ffffffff815a0a4f>] ret_from_fork+0x3f/0x70
[  100.890167]  [<ffffffff8108dd20>] ? kthread_stop+0x110/0x110
[  100.890207] Code: 41 57 41 56 41 55 49 89 d5 41 54 49 89 f4 53 48 89
fb 48 83 ec 10 48 8b 77 18 4c 8b 77 08 48 c7 45 d0 00 00 00 00 48 85 f6
74 32 <49> 8b 4e 08 48 8d 57 48 48 89 75 d0 4c 89 f7 ff 51 40 83 f8 01
[  100.890436] RIP  [<ffffffffa00680e4>] map_request+0x34/0x230 [dm_mod]
[  100.890492]  RSP <ffff880183083e30>
[  100.890500] CR2: ffffc900400b3048
[  100.890519] ---[ end trace 92251d486ed850c9 ]---
[  100.898324] ERST: [Firmware Warn]: Firmware does not respond in time.
[  100.898614] sd 2:0:0:0: alua: port group 3e8 state A non-preferred
supports TolUsNA
[  102.281054] libfcoe: host1: FIP Fibre-Channel Forwarder MAC
0e:fc:00:00:00:00 deselected
[  102.281090] libfcoe: host1: FIP selected Fibre-Channel Forwarder MAC
00:2a:6a:46:69:aa
[  102.281112] host1: Assigned Port ID d10204
[  104.587926] sd 1:0:1:0: alua: port group 3e8 state A non-preferred
supports TolUsNA
[  104.589831] sd 1:0:0:0: alua: port group 3e8 state A non-preferred
supports TolUsNA
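
One detail in the trace above that may help narrow things down: the
faulting address in CR2 (ffffc900400b3048) sits 8 bytes past the value
in R14 (ffffc900400b3040), and the address falls in what is normally
the vmalloc/ioremap range on x86_64. The following is only a minimal
sketch of that arithmetic, not a diagnosis. The register values are
copied from the trace; the helper names are hypothetical, and the
range bounds are an assumption taken from the standard x86_64 layout
described in Documentation/x86/x86_64/mm.txt for kernels of this era.

```python
# Values copied verbatim from the oops output above.
CR2 = 0xffffc900400b3048  # faulting address
R14 = 0xffffc900400b3040  # R14 from the register dump

# Assumption: standard x86_64 vmalloc/ioremap range for a 4.4-era kernel
# (see Documentation/x86/x86_64/mm.txt); not taken from the trace itself.
VMALLOC_START = 0xffffc90000000000
VMALLOC_END = 0xffffe8ffffffffff


def fault_offset(fault_addr: int, base_reg: int) -> int:
    """Distance in bytes from a base register value to the faulting address."""
    return fault_addr - base_reg


def in_vmalloc_range(addr: int) -> bool:
    """True if addr lies inside the assumed vmalloc/ioremap range."""
    return VMALLOC_START <= addr <= VMALLOC_END


offset = fault_offset(CR2, R14)
print(f"CR2 - R14 = {offset:#x}")
print(f"CR2 in assumed vmalloc range: {in_vmalloc_range(CR2)}")
```

If the assumptions hold, this suggests map_request() read a field at
offset 8 of a pointer held in R14 whose backing memory was no longer
mapped (the page-table walk in the oops ends with "PTE 0").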



