[linux-lvm] when bringing dm-cache online, consumes all memory and reboots

Scott Mcdermott scott at smemsh.net
Mon Mar 23 21:35:45 UTC 2020

On Mon, Mar 23, 2020 at 1:26 AM Joe Thornber <thornber at redhat.com> wrote:
> On Sun, Mar 22, 2020 at 10:57:35AM -0700, Scott Mcdermott wrote:
> > [system crashed, uses all memory when brought online...]
> > parameters were set with: lvconvert --type cache
> > --cachemode writeback --cachepolicy smq
> > --cachesettings migration_threshold=10000000
> If you crash then the cache assumes all blocks are dirty and performs
> a full writeback.  You have set the migration_threshold extremely high
> so I think this writeback process is just submitting far too much io at once.
> Bring it down to around 2048 and try again.

the device wasn't visible in "dmsetup table" prior to activation, so I tried:

  lvchange -ay raidbak4/bakvol4
  dmsetup message raidbak4-bakvol4 0 migration_threshold 204800

but this still crashed; apparently the value in effect at activation
time is already enough to crash the system.  instead using:

  lvchange --cachesettings migration_threshold=204800 raidbak4/bakvol4
  lvchange -ay raidbak4/bakvol4

it worked, and the disk bandwidth used was much lower (which I don't
want, but a functioning system is needed for the thing to work at
all).  after some time doing a lot of I/O, it went quiet and has
presumably finished flushing; it seems to be in working order, thanks.
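
to confirm which value the running target actually has, something like
the sketch below might help.  it assumes the dm-cache status line lists
its core arguments as "migration_threshold <value>", which is how I read
the kernel documentation; the helper name is made up for illustration.

```shell
# print the migration_threshold value found in a dm-cache status line
# read from stdin (hypothetical helper, not part of lvm2 or dmsetup)
cache_migration_threshold() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "migration_threshold") { print $(i + 1); exit }
    }'
}

# usage on a live system (needs root; device name as used in this thread):
#   dmsetup status raidbak4-bakvol4 | cache_migration_threshold
```

if nothing is printed, the status line had no such field, e.g. because
the device is not an active cache target.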

so do I have to experiment to find the highest migration_threshold
value that won't crash my system with OOM?  I don't want any
restriction on cache bandwidth; it should saturate and use all
available bandwidth to promote aggressively (in my most frequent case
the working set would actually fit entirely in cache, but it's ok if
the cache learns this slowly from usage).

seems like I should be able to use a value that means "use all
available bandwidth" without taking down my system with OOM.  even if
I tune the value by testing, some pathological circumstance might push
it beyond what I tested and crash my system again.  is there some safe
calculation I can use to determine the maximum value?
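
for scale, here is the arithmetic on the values mentioned in this
thread.  this assumes migration_threshold is counted in 512-byte
sectors, which is my reading of the kernel's dm-cache documentation;
the function name is made up.  it does not answer the question of a
safe maximum, it only shows how far apart the values are.

```shell
# convert a migration_threshold value (assumed to be 512-byte sectors)
# to whole MiB, rounding down
sectors_to_mib() {
    echo $(( $1 * 512 / 1024 / 1024 ))
}

sectors_to_mib 2048       # value suggested in the reply
sectors_to_mib 204800     # value that worked here
sectors_to_mib 10000000   # value set originally
```

under that assumption the three values come to 1 MiB, 100 MiB and
roughly 4.8 GiB respectively.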
