[dm-devel] linux-next: Boot hangs 3 minutes with device mapper on s390

Mike Snitzer snitzer at redhat.com
Fri Mar 1 18:30:50 UTC 2019


On Fri, Mar 01 2019 at 12:33pm -0500,
Michael Holzheu <holzheu at linux.ibm.com> wrote:

> Hi Mike,
> 
> On Fedora 29, the following "linux-next" commit introduced a regression on s390:
> 
>   commit 1efa3bb79d3de8ca1b7f6770313a1fc0bebe25c7
>   Author: Mike Snitzer <snitzer at redhat.com>
>   Date:   Fri Feb 22 11:23:01 2019 -0500
> 
>     dm: must allocate dm_noclone for stacked noclone devices
>     
>     Otherwise various lvm2 testsuite tests fail because the lower layers of
>     the stacked noclone device aren't updated to allocate a new 'struct
>     dm_clone' that reflects the upper layer bio that was issued to it.
>     
>     Fixes: 97a89458020b38 ("dm: improve noclone bio support")
>     Reported-by: Mikulas Patocka <mpatocka at redhat.com>
>     Signed-off-by: Mike Snitzer <snitzer at redhat.com>
> 
> With this commit the boot hangs for three minutes on a z/VM system with
> the following device mapper setup:
> 
> # dmsetup ls --tree
> mpathe (252:5)
>  ├─ (8:128)
>  └─ (8:144)
> mpathd (252:4)
>  ├─ (8:96)
>  └─ (8:112)
> mpathc (252:3)
>  ├─ (8:64)
>  └─ (8:80)
> mpathb (252:2)
>  ├─ (8:32)
>  └─ (8:48)
> mpatha1 (252:1)
>  └─mpatha (252:0)
>     ├─ (8:16)
>     └─ (8:0)
> 
> On the console we get messages like the following:
> 
>    10.116863 sd 3:0:0:1083719813: sdj Write Protect is off 
>    10.117170 sd 3:0:0:1083719813: sdj Write cache: enabled, read cache: enabled, doesn't support DPO or FUA 
>    10.130562 sd 3:0:0:1083719813: sdj Attached SCSI disk 
>  A start job is running for udev Wai   Device Initialization (6s / 3min)
>  A start job is running for udev Wai   Device Initialization (6s / 3min)
>  A start job is running for udev Wai   Device Initialization (7s / 3min)
>  A start job is running for udev Wai   Device Initialization (7s / 3min)
>  A start job is running for udev Wai   Device Initialization (8s / 3min)
>  ...
> 
> After three minutes the boot process continues and the system comes up.
> 
> As kernel config we used "performance_defconfig" (make performance_defconfig).

I'm struggling to see why this particular change would cause such a boot
stall -- but then resolve itself.

Can you provide the output from 'dmsetup table'?

Multipath defaults to blk-mq (request-based).  The change you've called
into question is related to bio-based DM.  So there must be some
bio-based DM layer on top of the multipath devices.  Is mpatha1 a
dm-linear device layered on top of multipath?
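
For reference, here is a rough sketch of how the 'dmsetup table' output
could be scanned for the target type of each device.  It assumes the
usual "<name>: <start> <length> <target_type> <args...>" line format.
The helper name parse_dmsetup_table and the SAMPLE text are hypothetical
and illustrative only; they are not output from the reporter's system.

```python
def parse_dmsetup_table(text):
    """Map each DM device name to the list of target types in its table.

    Assumes lines of the form "<name>: <start> <length> <target_type> ...".
    Device names containing ':' are not handled by this sketch.
    """
    targets = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        name, rest = line.split(":", 1)
        fields = rest.split()
        # Need at least start sector, length, and target type.
        if len(fields) < 3:
            continue
        targets.setdefault(name.strip(), []).append(fields[2])
    return targets


# Hypothetical sample; sizes and target arguments are made up.
SAMPLE = """\
mpatha: 0 20971520 multipath 0 0 1 1 service-time 0 2 2 8:16 1 1 8:0 1 1
mpatha1: 0 20969472 linear 252:0 2048
"""

if __name__ == "__main__":
    for name, types in sorted(parse_dmsetup_table(SAMPLE).items()):
        print(name, ",".join(types))
```

Any device whose target type is something other than "multipath" (for
example "linear") would be a candidate for the bio-based layer in
question.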

Thanks,
Mike



