[dm-devel] limits->max_sectors is getting set to 0, why/where? [was: Re: dm: kernel oops by divide error on v4.16+]

Linus Torvalds torvalds at linux-foundation.org
Mon Apr 9 22:11:55 UTC 2018


On mobile, sorry for html crud and top posting, but here:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e9092d0d97961146655ce51f43850907d95f68c3

 Should fix it.

     Linus




On Mon, Apr 9, 2018, 14:56 Jens Axboe <axboe at kernel.dk> wrote:

> On 4/9/18 3:26 PM, Jens Axboe wrote:
> > On 4/9/18 1:32 PM, Jens Axboe wrote:
> >> On 4/9/18 12:38 PM, Mike Snitzer wrote:
> >>> On Mon, Apr 09 2018 at 11:51am -0400,
> >>> Mike Snitzer <snitzer at redhat.com> wrote:
> >>>
> >>>> On Sun, Apr 08 2018 at 12:00am -0400,
> >>>> Ming Lei <ming.lei at redhat.com> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> The following kernel oops(divide error) is triggered when running
> >>>>> xfstest(generic/347) on ext4.
> >>>>>
> >>>>> [  442.632954] run fstests generic/347 at 2018-04-07 18:06:44
> >>>>> [  443.839480] divide error: 0000 [#1] PREEMPT SMP PTI
> >>>>> [  443.840201] Dumping ftrace buffer:
> >>>>> [  443.840692]    (ftrace buffer empty)
> >>> ...
> >>>>> [  443.845756] CPU: 1 PID: 29607 Comm: dmsetup Not tainted 4.16.0_f605ba97fb80_master+ #1
> >>>>> [  443.846968] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-2.fc27 04/01/2014
> >>>>> [  443.848147] RIP: 0010:pool_io_hints+0x77/0x153 [dm_thin_pool]
> >>>
> >>> ...
> >>>
> >>>> I was able to reproduce (in my case RIP was pool_io_hints+0x45)
> >>>>
> >>>> Which on my kernel, is:
> >>>>
> >>>> crash> dis -l pool_io_hints+0x45
> >>>> /root/snitm/git/linux/drivers/md/dm-thin.c: 2748
> >>>> 0xffffffffc0765165 <pool_io_hints+69>:  div    %rdi
> >>>>
> >>>> Which is drivers/md/dm-thin.c:is_factor()'s return
> >>>> !sector_div(block_size, n);
> >>>>
> >>>> SO looking at pool_io_hints() it would seem limits->max_sectors is 0
> >>>> for this xfstests device... why would that be!?
> >>>>
> >>>> Clearly pool_io_hints() could stand to be more defensive with a
> >>>> !limits->max_sectors negative check but is it ever really valid for
> >>>> max_sectors to be 0?
> >>>>
> >>>> Pretty sure the ultimate bug is outside DM (but not seeing an obvious
> >>>> place where block core would set max_sectors to 0, all blk-settings.c
> >>>> uses min_not_zero(), etc).
> >>>
> >>> I successfully ran this test against the linux-dm.git
> >>> "for-4.17/dm-changes" tag that Linus merged after the block changes:
> >>>  git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git tags/for-4.17/dm-changes
> >>>
> >>> # ./check tests/generic/347
> >>> FSTYP         -- ext4
> >>> PLATFORM      -- Linux/x86_64 thegoat 4.16.0-rc5.snitm
> >>> MKFS_OPTIONS  -- /dev/mapper/test-xfstests_scratch
> >>> MOUNT_OPTIONS -- -o acl,user_xattr /dev/mapper/test-xfstests_scratch /scratch
> >>>
> >>> generic/347      65s
> >>> Ran: generic/347
> >>> Passed all 1 tests
> >>>
> >>> SO this would seem to implicate some regression in the 4.17 block layer
> >>> changes.
> >>
> >> No immediate ideas come to mind, we didn't have a lot of changes and I
> >> don't see anything that looks problematic. Maybe you can try and
> >> bisect it and see what you come up with?
> >
> > I ran it, problematic commit is:
> >
> > commit 3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91
> > Author: Kees Cook <keescook at chromium.org>
> > Date:   Fri Mar 30 18:52:36 2018 -0700
> >
> >     kernel.h: Retain constant expression output for max()/min()
> >
>
> The fun continues. I figured I'd try a userspace repro, expecting it
> to be difficult to reproduce, but try the attached min.c, which just
> copies all the bits from include/linux/kernel.h
>
> axboe at x1:~ $ gcc -Wall -O2 -o min min.c
> axboe at x1:~ $ ./min 128 256
> min_not_zero(128, 256) = 0
>
> --
> Jens Axboe
>
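The defensive check suggested above for pool_io_hints() can be sketched in
userspace. This is only an illustration, not the dm-thin code: the name
is_factor_guarded() is hypothetical, and a plain 64-bit modulo stands in for
the kernel's sector_div(). The thread only tells us that is_factor() returns
!sector_div(block_size, n), so a zero n reaches the div instruction directly.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace stand-in for the is_factor() check discussed in the thread.
 * Assumption: the modulo below plays the role of sector_div(), which
 * yields the remainder of block_size / n. The n == 0 guard is the
 * hypothetical defensive addition; without it the division traps.
 */
static bool is_factor_guarded(uint64_t block_size, uint32_t n)
{
	if (n == 0)
		return false;	/* a zero divisor is never a factor */
	return (block_size % n) == 0;
}
```

A guard like this would only hide the symptom; as noted in the thread,
max_sectors being 0 in the first place is the real bug.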
>
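For comparison with the min_not_zero(128, 256) = 0 output above, here is a
minimal reference for what min_not_zero() is meant to compute: the smaller of
the two values, ignoring a zero unless both are zero. It is written as a plain
function with a hypothetical name (min_not_zero_ref) and is not the macro from
include/linux/kernel.h or the contents of the attached min.c.

```c
/*
 * Reference semantics only, NOT the kernel macro.
 * Returns the minimum of x and y that is not zero;
 * returns 0 only when both inputs are zero.
 */
static unsigned int min_not_zero_ref(unsigned int x, unsigned int y)
{
	if (x == 0)
		return y;
	if (y == 0)
		return x;
	return x < y ? x : y;
}
```

With these semantics the inputs 128 and 256 give 128, so a result of 0 from
the macro version points at the macro expansion rather than at its callers.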