[dm-devel] [RFC PATCH] dm-rq: fix double free of blk tag set in dev remove after table load fails

Benjamin Block bblock at linux.ibm.com
Thu Apr 29 21:37:00 UTC 2021


When loading a device-mapper table for a request-based mapped device,
and the allocation/initialization of the blk-mq tag set for the device
fails, a following device remove will cause a double free.

E.g. (dmesg):
  device-mapper: core: Cannot initialize queue for request-based dm-mq mapped device
  device-mapper: ioctl: unable to set up device queue for new table.
  Unable to handle kernel pointer dereference in virtual kernel address space
  Failing address: 0305e098835de000 TEID: 0305e098835de803
  Fault in home space mode while using kernel ASCE.
  AS:000000025efe0007 R3:0000000000000024
  Oops: 0038 ilc:3 [#1] SMP
  Modules linked in: ... lots of modules ...
  Supported: Yes, External
  CPU: 0 PID: 7348 Comm: multipathd Kdump: loaded Tainted: G        W      X    5.3.18-53-default #1 SLE15-SP3
  Hardware name: IBM 8561 T01 7I2 (LPAR)
  Krnl PSW : 0704e00180000000 000000025e368eca (kfree+0x42/0x330)
             R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
  Krnl GPRS: 000000000000004a 000000025efe5230 c1773200d779968d 0000000000000000
             000000025e520270 000000025e8d1b40 0000000000000003 00000007aae10000
             000000025e5202a2 0000000000000001 c1773200d779968d 0305e098835de640
             00000007a8170000 000003ff80138650 000000025e5202a2 000003e00396faa8
  Krnl Code: 000000025e368eb8: c4180041e100       lgrl    %r1,25eba50b8
             000000025e368ebe: ecba06b93a55       risbg   %r11,%r10,6,185,58
            #000000025e368ec4: e3b010000008       ag      %r11,0(%r1)
            >000000025e368eca: e310b0080004       lg      %r1,8(%r11)
             000000025e368ed0: a7110001           tmll    %r1,1
             000000025e368ed4: a7740129           brc     7,25e369126
             000000025e368ed8: e320b0080004       lg      %r2,8(%r11)
             000000025e368ede: b904001b           lgr     %r1,%r11
  Call Trace:
   [<000000025e368eca>] kfree+0x42/0x330
   [<000000025e5202a2>] blk_mq_free_tag_set+0x72/0xb8
   [<000003ff801316a8>] dm_mq_cleanup_mapped_device+0x38/0x50 [dm_mod]
   [<000003ff80120082>] free_dev+0x52/0xd0 [dm_mod]
   [<000003ff801233f0>] __dm_destroy+0x150/0x1d0 [dm_mod]
   [<000003ff8012bb9a>] dev_remove+0x162/0x1c0 [dm_mod]
   [<000003ff8012a988>] ctl_ioctl+0x198/0x478 [dm_mod]
   [<000003ff8012ac8a>] dm_ctl_ioctl+0x22/0x38 [dm_mod]
   [<000000025e3b11ee>] ksys_ioctl+0xbe/0xe0
   [<000000025e3b127a>] __s390x_sys_ioctl+0x2a/0x40
   [<000000025e8c15ac>] system_call+0xd8/0x2c8
  Last Breaking-Event-Address:
   [<000000025e52029c>] blk_mq_free_tag_set+0x6c/0xb8
  Kernel panic - not syncing: Fatal exception: panic_on_oops

When allocation/initialization of the blk-mq tag set fails in
`dm_mq_init_request_queue()`, it is uninitialized/freed, but the pointer
is not reset to NULL; so when `dev_remove()` later gets into
`dm_mq_cleanup_mapped_device()` it sees the pointer and tries to
uninitialized and free it again.

Fix this by also setting the pointer to NULL in
`dm_mq_init_request_queue()` after error-handling.

Cc: <stable at vger.kernel.org> # 4.6+
Fixes: 1c357a1e86a4 ("dm: allocate blk_mq_tag_set rather than embed in mapped_device")
Signed-off-by: Benjamin Block <bblock at linux.ibm.com>
---
 drivers/md/dm-rq.c | 1 +
 1 file changed, 1 insertion(+)

Hey,

I got this report from internal testing for a distribution kernel (you
see the version information in the dmesg output), but I checked the code
for 5.12, and it looks like nothing has change really, and the same
crash would happen there as well under the same circumstances.

I'm not sure why exactly the tag set allocation/initialization failed..
like you see in the dmesg output, the stack of `table_load()` already
unwound and the IOCTL returned, so I can't check the error-code or the
tag set state. But the crash seems obvious enough to me.

I wrote a small inject to test this with 5.12, because I didn't know how
to trigger this otherwise. It just jumps to `out_tag_set` in
`dm_mq_init_request_queue()` after the tag set is allocated, but
before `blk_mq_init_allocated_queue()` could run.

Here the kernel with inject and patch applied:

  ...
  [   52.291458] sd 0:0:0:1075265560: alua: port group 00 state A preferred supports tolusnA
  [   52.291815] sd 0:0:0:1075265560: alua: port group 00 state A preferred supports tolusnA
  [  158.242153] device-mapper: core: Cannot initialize queue for request-based dm mapped device
  [  158.242233] device-mapper: ioctl: unable to set up device queue for new table.
  [  158.761525] sd 0:0:0:1075134488: alua: port group 00 state A preferred supports tolusnA
  [  158.761839] sd 0:0:0:1075134488: alua: port group 00 state A preferred supports tolusnA
  ...

>From multipath (the userspace tool) perspective, it looks like this:

  Apr 29 23:13:03 | ds8k31_err_40184014_npiv: addmap [0 20971520 multipath 1 queue_if_no_path 1 alua 1 1 service-time 0 2 1 8:0 1 8:64 1]
  Apr 29 23:13:03 | libdevmapper: ioctl/libdm-iface.c(1923): device-mapper: reload ioctl on ds8k31_err_40184014_npiv  failed: Cannot allocate memory
  Apr 29 23:13:03 | dm_addmap: libdm task=0 error: Success
  Apr 29 23:13:03 | ds8k31_err_40184014_npiv: ignoring map
  Apr 29 23:13:03 | ds8k31_err_40184015_npiv: addmap [0 20971520 multipath 1 queue_if_no_path 1 alua 1 1 service-time 0 2 1 8:16 1 8:80 1]
  create: ds8k31_err_40184015_npiv (36005076309ffd4300000000000001815) undef IBM,2107900
  size=10G features='1 queue_if_no_path' hwhandler='1 alua' wp=undef
  `-+- policy='service-time 0' prio=50 status=undef
    |- 0:0:0:1075134488 sdb 8:16  undef ready running
    `- 1:0:1:1075134488 sdf 8:80  undef ready running
  ...

Recreating the exact same crash as in the report without the patch (but
with inject) is actually not all that easy on s390x; the double free
doesn't necessarily end up touching unmapped memory, and the crashes I
got where all over the place.. so its actually quite lucky I got this
clear report.

I don't really know the device-mapper code by heart, so thats why I
marked it as RFC.


 - Benjamin

diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 13b4385f4d5a..4583c5d6885f 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -569,6 +569,7 @@ int dm_mq_init_request_queue(struct mapped_device *md, struct dm_table *t)
 	blk_mq_free_tag_set(md->tag_set);
 out_kfree_tag_set:
 	kfree(md->tag_set);
+	md->tag_set = NULL;
 
 	return err;
 }
-- 
2.30.2




More information about the dm-devel mailing list