[linux-lvm] Regression with FALLOC_FL_PUNCH_HOLE in 3.5-rc kernel
zkabelac at redhat.com
Sun Jul 1 12:57:52 UTC 2012
Dne 1.7.2012 01:10, Hugh Dickins napsal(a):
> On Sat, 30 Jun 2012, Zdenek Kabelac wrote:
>> Dne 30.6.2012 21:55, Hugh Dickins napsal(a):
>>> On Sat, 30 Jun 2012, Zdenek Kabelac wrote:
>>>> When I've used 3.5-rc kernels - I've noticed kernel deadlocks.
>>>> Ooops log included. After some experimenting - reliable way to hit this
>>>> is to run lvm test suite for 10 minutes. Since 3.5 merge window does not
>>>> included anything related to this oops I've went for bisect.
>>> Thanks a lot for reporting, and going to such effort to find
>>> a reproducible testcase that you could bisect on.
>>>> Game result is commit: 3f31d07571eeea18a7d34db9af21d2285b807a17
>>>> mm/fs: route MADV_REMOVE to FALLOC_FL_PUNCH_HOLE
>>> But this leaves me very puzzled.
>>> Is the "lvm test suite" what I find at git.fedorahosted.org/git/lvm2.git
>>> under tests/ ?
>> Yes - that's it -
>> as root:
>> cd test
>> make check_local
>> (inside test subdirectory should be enough, if not - just report any problem)
>>> If you have something else running at the same time, which happens to use
>>> madvise(,,MADV_REMOVE) on a filesystem which the commit above now enables
>>> it on (I guess ext4 from the =y in your config), then I suppose we should
>>> start searching for improper memory freeing or scribbling in its holepunch
>>> support: something that might be corrupting the dm_region in your oops.
>> What the test is doing - it creates file in LVM_TEST_DIR (default is /tmp)
>> and using loop device to simulate device (small size - it should fit bellow
>> Within this file second layer through virtual DM devices is created and
>> simulates various numbers of PV devices to play with.
> This sounds much easier to set up than I was expecting:
> thanks for the info, I'll try it later on today.
>> So since everything now support TRIM - such operations should be passed
>> down to the backend file - which probably triggers the path.
> What filesystem do you have for /tmp?
> If tmpfs, then it will make much more sense if we assume your bisection
> endpoint was off by one. Your bisection log was not quite complete;
> and even if it did appear to converge on the commit you cite, you might
> have got (un)lucky when testing the commit before it, and concluded
> "good" when more attempts would have said "bad".
> The commit before, 83e4fa9c16e4af7122e31be3eca5d57881d236fe
> "tmpfs: support fallocate FALLOC_FL_PUNCH_HOLE", would be a
> much more likely first bad commit if your /tmp is on tmpfs:
> that does indeed wire up loop to pass TRIM down to tmpfs by
> fallocate - that indeed played a part in my own testing.
> Whereas if your /tmp is on ext4, loop has been passing TRIM down
> with fallocate since v3.0. And whichever, madvise(,,MADV_REMOVE)
> should be completely irrelevant.
While I've been aware of the fact that tmpfs was enhanced with trim support -
I've not tried to run on real ext4 filesystem since for my tests I'm using
tmpfs for quite some time to safe rewrites of SSD :)
So now I've checked with real ext4 - and the bug is there as well
so I've went back - it crashes on 3.4, 3.3 and 3.2 as well.
3.1 is the first kernel which does survive (checked 5 repeated runs)
And you are correct, the first commit which causes crash really is
83e4fa9c16e4af when I use tmpfs as backend storage - the problem why I've
missed to properly identify this commit in my bisect is that crash usually
happens on the second pass of the lvm test suite 'make check_local' execution
- and I've been running test just once. To be sure I've run 5 run passes on
3.4.0-08568-gec9516f - which is OK, but 3.4.0-08569-g83e4fa9 is crashing
usually on second run, with commit 3f31d07571e the crash always happens in
the first pass.
I've also checked some rawhide kernel vmlinuz-3.5.0-0.rc2.git0.1.fc18.x86_64
and it's crashing as well - so it's probably not uniqueness of my config.
So is there any primary suspect in 3.2 which is worth to check - or I need
another day to play another bisect game ?
>>> I'll be surprised if that is the case, but it's something that you can
>>> easily check by inserting a WARN_ON(1) in mm/madvise.c madvise_remove():
>>> that should tell us what process is using it.
>> I could try that if that will help.
> That would help, if you're very sure of your bisection endpoint;
> but if your /tmp is on tmpfs, then I do think it's more likely
> that you've actually found a bug in the commit before.
the only thing which could be tricky is udev support
(by default it's not enabled ---enable-udev_sync)
However Debian based distros are distributing their own rules, which are not
100% compatible with upstream and create some unpredictable issues,
where the slowness is the least problem.)
If you have Fedora Rawhide with latest lvm2 installed - you should get pretty
well configured system for running test dir (unfortunately there is no way
to virtualize udev...)
More information about the linux-lvm