[dm-devel] linux-next: Boot hangs 3 minutes with device mapper on s390
Mike Snitzer
snitzer at redhat.com
Fri Mar 1 18:30:50 UTC 2019
On Fri, Mar 01 2019 at 12:33pm -0500,
Michael Holzheu <holzheu at linux.ibm.com> wrote:
> Hi Mike,
>
> On Fedora 29, the following "linux-next" commit introduced a regression on s390:
>
> commit 1efa3bb79d3de8ca1b7f6770313a1fc0bebe25c7
> Author: Mike Snitzer <snitzer at redhat.com>
> Date: Fri Feb 22 11:23:01 2019 -0500
>
> dm: must allocate dm_noclone for stacked noclone devices
>
> Otherwise various lvm2 testsuite tests fail because the lower layers of
> the stacked noclone device aren't updated to allocate a new 'struct
> dm_clone' that reflects the upper layer bio that was issued to it.
>
> Fixes: 97a89458020b38 ("dm: improve noclone bio support")
> Reported-by: Mikulas Patocka <mpatocka at redhat.com>
> Signed-off-by: Mike Snitzer <snitzer at redhat.com>
>
> With this commit the boot hangs three minutes on a z/VM system with the
> following device mapper setup:
>
> # dmsetup ls --tree
> mpathe (252:5)
> ├─ (8:128)
> └─ (8:144)
> mpathd (252:4)
> ├─ (8:96)
> └─ (8:112)
> mpathc (252:3)
> ├─ (8:64)
> └─ (8:80)
> mpathb (252:2)
> ├─ (8:32)
> └─ (8:48)
> mpatha1 (252:1)
>  └─mpatha (252:0)
>     ├─ (8:16)
>     └─ (8:0)
>
> On the console we get messages like the following:
>
> [   10.116863] sd 3:0:0:1083719813: [sdj] Write Protect is off
> [   10.117170] sd 3:0:0:1083719813: [sdj] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [   10.130562] sd 3:0:0:1083719813: [sdj] Attached SCSI disk
> A start job is running for udev Wai Device Initialization (6s / 3min)
> A start job is running for udev Wai Device Initialization (6s / 3min)
> A start job is running for udev Wai Device Initialization (7s / 3min)
> A start job is running for udev Wai Device Initialization (7s / 3min)
> A start job is running for udev Wai Device Initialization (8s / 3min)
> ...
>
> After three minutes the boot process continues and the system comes up.
>
> As kernel config we used "performance_defconfig" (make performance_defconfig).
I'm struggling to see why this particular change would cause such a boot
stall -- but then resolve itself.
Can you provide the output from 'dmsetup table'?
Multipath defaults to blk-mq (request-based). The change you've called
into question is related to bio-based DM. So there must be some
bio-based DM layer on top of the multipath devices. Is mpatha1 a
dm-linear device layered on top of multipath?
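
For reference, a rough sketch of how the target type of each device could be read out of 'dmsetup table' output, assuming the usual "name: start length target args" line format. The sample table text and the helper names (target_types, non_multipath_devices) are made up for illustration and are not taken from the reporter's system; any device whose target is something other than multipath (e.g. linear) would be a candidate for the bio-based layer in question.

```python
# Sketch only: classify DM devices by target type from 'dmsetup table' text.
# SAMPLE_TABLE is hypothetical output, not data from the affected system.
SAMPLE_TABLE = """\
mpatha: 0 20971520 multipath 0 0 1 1 service-time 0 2 2 8:0 1 1 8:16 1 1
mpatha1: 0 2097152 linear 252:0 2048
"""


def target_types(table_text):
    """Map each device name to the list of target types in its table."""
    result = {}
    for line in table_text.splitlines():
        if not line.strip():
            continue
        # Each line is assumed to look like: "<name>: <start> <length> <target> ..."
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or len(fields) < 3:
            continue  # skip lines that do not match the assumed format
        result.setdefault(name.strip(), []).append(fields[2])
    return result


def non_multipath_devices(table_text):
    """Return the sorted names of devices that use any non-multipath target."""
    return sorted(
        name
        for name, types in target_types(table_text).items()
        if any(t != "multipath" for t in types)
    )


if __name__ == "__main__":
    print(non_multipath_devices(SAMPLE_TABLE))
```

On a real system the text would come from running 'dmsetup table' as root instead of the hard-coded sample.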
Thanks,
Mike