[linux-lvm] when bringing dm-cache online, consumes all memory and reboots

Scott Mcdermott scott at smemsh.net
Mon Mar 23 22:02:24 UTC 2020


On Mon, Mar 23, 2020 at 2:57 AM Zdenek Kabelac <zkabelac at redhat.com> wrote:
> Dne 23. 03. 20 v 9:26 Joe Thornber napsal(a):
> > On Sun, Mar 22, 2020 at 10:57:35AM -0700, Scott Mcdermott wrote:
> > > have a 931.5 GibiByte SSD pair in raid1 (mdraid) as cache LV for a
> > > data LV on 1.8 TebiByte raid1 (mdraid) pair of larger spinning disk.
>
> Users should be performing some benchmarking of the 'useful' sizes of
> their hotspot areas - using nearly 1T of cache for 1.8T of origin doesn't
> look like the right ratio for caching.
> (i.e. as if your CPU cache were half the size of your DRAM)

the 1.8T origin will be upgraded over time with larger/more spinning
disks, but the cache will remain as it is.  hopefully it can perform
well whether the ratio is 1:2 cache:data as now, or 1:10+ later.

> Too big a 'cache size' usually leads to way too big caching chunks
> (since we try to limit the number of 'chunks' in the cache to 1 million -
> you can raise this limit - but it will consume a lot of your RAM space as
> well).  So IMHO I'd recommend using at most 512K chunks - which gives you
> about 256GiB of cache size - but still users should be benchmarking what
> is best for them...
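
for reference, I can check what lvm picked on my setup with something
along these lines (names below are placeholders, with my real VG/LV
substituted), which should show the chunk size and the number of cache
blocks it works out to:

    # placeholder names: vg = my VG, vg/data = the cached LV
    # -a also lists the hidden cache-pool sub-LVs
    lvs -a -o lv_name,lv_size,chunk_size,cache_total_blocks vg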

how do I raise this limit?  since I'm low on RAM this is a problem, but
why are large chunks an issue, besides memory usage?  do they cause
unnecessary I/O through an amplification effect?  if my system doesn't
have enough memory for this job I will have to find a host board with
more RAM.
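
from a quick read of lvm.conf(5), allocation/cache_pool_max_chunks and
allocation/cache_pool_chunk_size look like the relevant knobs, and the
chunk size can apparently also be forced when the cache is attached,
with something like this (VG/LV names are placeholders for mine):

    # placeholder names: vg/cachepool = cache pool LV, vg/data = origin LV
    lvconvert --type cache --cachepool vg/cachepool --chunksize 512k vg/data

is overriding one of those the intended way, or am I looking at the
wrong thing?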

> Another hint - lvm2 introduced support for the new dm-writecache target as well.

this won't work for me since a lot of my data is reads, and I'm low on
memory with large numbers of files.  rsync of large trees is the main
workload; the existing algorithm is not working fantastically well, but
it's nonetheless giving a nice boost to my rsync completion times over
the uncached times.
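
(for completeness, if I'm reading lvmcache(7) right, the writecache
variant would be attached with something along the lines of

    # placeholder names: fastlv = SSD cache LV, vg/data = origin LV
    lvconvert --type writecache --cachevol fastlv vg/data

but since dm-writecache only caches writes, it wouldn't help a
read-mostly rsync workload like mine.)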
