[dm-devel] [PATCH 0/2] dm kcopyd: dm snapshot: Fix bugs causing excessive memory usage and workqueue stalls

Nikos Tsironis ntsironis at arrikto.com
Wed Oct 31 21:53:07 UTC 2018


This patchset fixes two kcopyd and dm-snapshot related issues:

  1. If kcopyd is not used properly, the kcopyd job slab cache can grow
     without limit, causing excessive memory usage that can lead to
     user processes being killed by the OOM killer.

  2. The kcopyd thread can hog the CPU while processing the submitted
     jobs, resulting in workqueue stalls and performance degradation.

The root cause of issue (1) is the way dm-snapshot uses kcopyd: there
is no explicit or implicit limit on the maximum number of in-flight COW
jobs.
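
To illustrate what a limit on in-flight jobs means, here is a minimal
userspace sketch of a counter-based throttle. It is not the code from
the patch. The names (cow_throttle, cow_throttle_try_start, and so on)
and the limit value are hypothetical, and the sketch is single-threaded,
so it omits the locking and sleeping that kernel code would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical throttle state; not a structure from dm-snapshot. */
struct cow_throttle {
	size_t in_flight;	/* jobs submitted but not yet completed */
	size_t limit;		/* upper bound on in_flight (assumed value) */
};

void cow_throttle_init(struct cow_throttle *t, size_t limit)
{
	t->in_flight = 0;
	t->limit = limit;
}

/*
 * Account for a new job if the limit has not been reached.
 * Returns false when the caller has to wait for a completion first.
 */
bool cow_throttle_try_start(struct cow_throttle *t)
{
	if (t->in_flight >= t->limit)
		return false;
	t->in_flight++;
	return true;
}

/* Called once per finished job; frees a slot for the next submitter. */
void cow_throttle_complete(struct cow_throttle *t)
{
	assert(t->in_flight > 0);
	t->in_flight--;
}
```

With such a bound, the number of job objects allocated at any time
cannot exceed the limit, so the slab cache backing them stays bounded as
well. A submitter that gets false back would have to wait until a
completion frees a slot.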

Issue (2) is caused by:

  a. Again, the lack of an upper limit to the number of in-flight kcopyd
     jobs created by dm-snapshot.

  b. The fact that one can keep inserting new completed jobs through
     dm_kcopyd_do_callback() or submitting copy jobs with a source size
     of 0, while the kcopyd thread is processing these jobs. This can
     lead to the kcopyd thread running completed jobs indefinitely.
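
The pattern in (b) can be illustrated with a small userspace sketch. A
worker that keeps popping from a queue until it is empty never returns
if callbacks keep re-queueing work. One way to bound the work done per
wakeup is to detach the jobs present at entry and process only those.
This is a sketch under assumptions, not the code from the patch: the
job and job_queue types and all function names are hypothetical, and
there is no locking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical job and queue types; not the kcopyd structures. */
struct job {
	struct job *next;
	int done;
};

struct job_queue {
	struct job *head;
	struct job *tail;
};

/* Append a job to the tail of the queue. */
void queue_push(struct job_queue *q, struct job *j)
{
	assert(j != NULL);
	j->next = NULL;
	if (q->tail)
		q->tail->next = j;
	else
		q->head = j;
	q->tail = j;
}

/* Detach everything currently queued and leave the queue empty. */
struct job *queue_take_all(struct job_queue *q)
{
	struct job *batch = q->head;

	q->head = NULL;
	q->tail = NULL;
	return batch;
}

/* Example callback that immediately re-queues the job it was given. */
void requeue_callback(struct job_queue *q, struct job *j)
{
	queue_push(q, j);
}

/*
 * One worker wakeup: process only the jobs that were queued at entry.
 * Jobs pushed by callbacks stay on the queue until the next wakeup.
 * Returns the number of jobs processed.
 */
size_t worker_run_once(struct job_queue *q,
		       void (*callback)(struct job_queue *, struct job *))
{
	struct job *j = queue_take_all(q);
	size_t n = 0;

	while (j) {
		struct job *next = j->next;	/* callback may re-link j */

		j->done = 1;
		if (callback)
			callback(q, j);
		n++;
		j = next;
	}
	return n;
}
```

With requeue_callback, a loop of the form "while the queue is not empty,
pop and run" would never terminate. worker_run_once() returns after the
jobs that were present at entry, so the worker gets a chance to yield
the CPU between batches.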

Relevant dmesg outputs are provided per patch.

These issues were discovered using the device mapper test suite [1] and
the fixes were also tested with it. The relevant tests used are
documented per patch.

[1] https://github.com/jthornber/device-mapper-test-suite

Nikos Tsironis (2):
  dm snapshot: Fix excessive memory usage and workqueue stalls
  dm kcopyd: Fix bug causing workqueue stalls

 drivers/md/dm-kcopyd.c | 19 ++++++++++++++-----
 drivers/md/dm-snap.c   | 22 ++++++++++++++++++++++
 2 files changed, 36 insertions(+), 5 deletions(-)

-- 
2.11.0



