[dm-devel] [4.7.0rc6] Page Allocation Failures with dm-crypt

Matthias Dahl ml_linux-kernel at binary-island.eu
Mon Jul 11 14:47:17 UTC 2016


Hello Mike...

On 2016-07-11 15:30, Mike Snitzer wrote:

> But that is expected given you're doing an unbounded buffered write to
> the device.  What isn't expected, to me anyway, is that the mm subsystem
> (or the default knobs for buffered writeback) would be so aggressive
> about delaying writeback.

Ok. But, and please correct me if I am wrong, I was under the impression
that only the file caches/buffers should be affected. What I am seeing
here is different: if I use free to monitor the memory usage, it is the
used memory that grows until it consumes all memory, not the
buffers/file caches.

Also, if I use dd directly on the device w/o dm-crypt in-between, there
is no problem. Sure, buffers increase hugely also... but only those.
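
To make the used vs. buffers/cache distinction concrete, here is a minimal
Python sketch that splits /proc/meminfo-style text into "used" and
buffers/page cache, roughly the way free reports it. The function names
and the sample numbers are made up for illustration and are not taken
from the affected machine; the exact formula free uses also differs
between procps versions, so treat the result as an approximation.

```python
# Made-up sample in /proc/meminfo format (values in KiB), NOT real data.
SAMPLE = """\
MemTotal:       32000000 kB
MemFree:         1000000 kB
Buffers:          500000 kB
Cached:          2000000 kB
SReclaimable:     100000 kB
Dirty:            300000 kB
Writeback:         50000 kB
"""


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def summarize(fields):
    """Approximate free's split: used = total - free - buffers/cache."""
    cache = (fields.get("Buffers", 0)
             + fields.get("Cached", 0)
             + fields.get("SReclaimable", 0))
    used = fields["MemTotal"] - fields["MemFree"] - cache
    return {
        "used_kib": used,
        "buff_cache_kib": cache,
        "dirty_kib": fields.get("Dirty", 0),
        "writeback_kib": fields.get("Writeback", 0),
    }


print(summarize(parse_meminfo(SAMPLE)))
```

On a live system the same functions can be fed the contents of
/proc/meminfo once per second while dd runs, to see which of the two
numbers is actually growing.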

> Why are you doing this test anyway?  Such a large buffered write doesn't
> seem to accurately model any application I'm aware of (but obviously it
> should still "work").

It is not a test per se. I simply wanted to fill the partition with noise.
And doing it this way is faster than using urandom or anything. ;-) That is
why I stumbled over this issue in the first place.

> Now that is weird.  Are you (or the distro you're using) setting any mm
> subsystem tunables to really broken values?

You can see those in my initial mail. I attached the kernel warnings, all
sysctl tunables and more. Maybe that helps.
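
For anyone who wants to compare their own settings, the writeback-related
subset of those tunables can be collected with a small Python sketch like
the one below. It only reads the standard vm.dirty_* files under
/proc/sys/vm; the helper name and the choice of which tunables to list are
my own assumptions, not something taken from the attached output.

```python
import os

# Writeback-related sysctls exposed under /proc/sys/vm on Linux.
WRITEBACK_TUNABLES = (
    "dirty_ratio",
    "dirty_background_ratio",
    "dirty_bytes",
    "dirty_background_bytes",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def read_vm_tunables(base="/proc/sys/vm", names=WRITEBACK_TUNABLES):
    """Return {name: int or None}; None means missing or unreadable."""
    values = {}
    for name in names:
        path = os.path.join(base, name)
        try:
            with open(path) as f:
                values[name] = int(f.read().strip())
        except (OSError, ValueError):
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_vm_tunables().items():
        print(f"vm.{name} = {value}")
```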

> What is your raid10's full stripesize?

4 disks in RAID10, with a stripe size of 64k.

> Is your dd IO size of 512K somehow triggering excess R-M-W cycles which
> is exacerbating the problem?

The partitions are properly aligned. And as you can see, with that stripe
size, there is no issue.
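
The arithmetic behind that can be sketched in a few lines of Python. This
assumes that the 64k figure is the per-disk chunk size and that the array
keeps two copies of each chunk (the usual RAID10 arrangement); both are
assumptions for the sake of the example, and the function names are
hypothetical. The simple model only handles a disk count that is a
multiple of the number of copies.

```python
KIB = 1024


def full_stripe_bytes(disks, copies, chunk_bytes):
    """Data held by one full stripe: distinct chunks per stripe * chunk size."""
    if disks % copies != 0:
        raise ValueError("this simple model needs disks to be a multiple of copies")
    return (disks // copies) * chunk_bytes


def covers_whole_stripes(io_bytes, stripe_bytes):
    """True if an aligned I/O of io_bytes spans a whole number of stripes."""
    return io_bytes % stripe_bytes == 0


# Assumed geometry: 4 disks, 2 copies, 64 KiB chunk.
stripe = full_stripe_bytes(disks=4, copies=2, chunk_bytes=64 * KIB)
print(stripe // KIB)                              # 128
print(covers_whole_stripes(512 * KIB, stripe))    # True
print((512 * KIB) // stripe)                      # 4
```

Under those assumptions each 512K dd block covers exactly four full
stripes, so an aligned write never touches a partial stripe.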

In the meantime I did some further tests: I created an ext2 on the
partition as well as a 60GiB container image on it. I used that image
with dm-crypt, same parameters as before. No matter what I do here, I
cannot trigger the same behavior.

Maybe it is an interaction issue between dm-crypt and the s/w RAID. But
at this point, I have no idea how to further diagnose/test it. If you
can point me in any direction that would be great...

With Kind Regards from Germany
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu
  services: custom software [desktop, mobile, web], server administration
