[dm-devel] dm-cache performance behaviour

Zdenek Kabelac zkabelac at redhat.com
Tue Apr 5 08:36:12 UTC 2016


On 5.4.2016 at 09:12, Andreas Herrmann wrote:
> Hi,
>
> I've recently looked at performance behaviour of dm-cache and bcache.
> I've repeatedly observed very low performance with dm-cache in
> different tests. (Similar tests with bcache showed no such oddities.)
>
> To rule out user errors that might have caused this, I briefly describe
> what I've done and observed.
>
> - tested kernel version: 4.5.0
>
> - backing device: 1.5 TB spinning drive
>
> - caching device: 128 GB SSD (used for metadata and cache; the size
>    of the metadata part was calculated based on
>    https://www.redhat.com/archives/dm-devel/2012-December/msg00046.html)
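>
>    (For reference, the sizing rule there and in the kernel docs is
>    roughly 4 MiB plus 16 bytes per cache block, so the metadata part
>    stays small; a rough worked example for this setup:
>
>      nr_blocks ~ 128 GiB / 256 KiB = 524288
>      metadata  ~ 4 MiB + 16 B * 524288 = 4 MiB + 8 MiB = 12 MiB)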
>
> - my test procedure consisted of a sequence of fio runs with
>    different data sets; randread results (bandwidth and IOPS) were
>    compared. fio was invoked using something like
>
>    fio --directory=/cached-device --rw=randread --name=fio-1 \
>      --size=50G --group_reporting --ioengine=libaio \
>      --direct=1 --iodepth=1 --runtime=40 --numjobs=1
>
>    I've iterated over 10 runs for each of numjobs=1,2,3 and varied the
>    name parameter to operate on different data sets (the driver loop
>    is sketched below).
>
>    This procedure implied that with 3 jobs the underlying data set for
>    the test consisted of 3 files of 50G each, which exceeds the size
>    of the caching device.
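>
>    A sketch of the driver loop (the real script differed; names are
>    illustrative):
>
>      for jobs in 1 2 3; do
>        for run in $(seq 1 10); do
>          fio --directory=/cached-device --rw=randread \
>            --name=fio-set-${jobs} --size=50G --group_reporting \
>            --ioengine=libaio --direct=1 --iodepth=1 \
>            --runtime=40 --numjobs=${jobs}
>        done
>      done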
>
> - Between some tests I've tried to empty the cache. For dm-cache I did
>    this by unmounting the "compound" cache device, switching to the
>    cleaner target, zeroing the metadata part of the caching device,
>    recreating the caching device and finally recreating the compound
>    cache device (during this procedure I kept the backing device
>    unmodified).
>
>    I used dmsetup status to check for success of this operation
>    (checking for #used_cache_blocks).
>    If there is an easier way to do this, please let me know -- if it's
>    documented, I've missed it.
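>
>    Roughly, the sequence was (a sketch; device names are placeholders,
>    and that wiping only the superblock is enough to get the metadata
>    reformatted is my assumption):
>
>      dmsetup suspend cached
>      dmsetup reload cached --table \
>        "0 <origin-sectors> cache <meta-dev> <fast-dev> <slow-dev> 512 0 cleaner 0"
>      dmsetup resume cached
>      # poll 'dmsetup status cached' until the dirty count drops to 0
>      dmsetup remove cached
>      dd if=/dev/zero of=<meta-dev> bs=4096 count=1  # zero metadata superblock
>      # ... then recreate the caching and compound devices as before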
>
> - dm-cache parameters:
>    * cache_mode: writeback
>    * block size: 512 sectors
>    * migration_threshold 2048 (default)
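>
>    For completeness, the table was along these lines (a sketch with
>    placeholder device names; migration_threshold is left at its
>    default, so it is not passed explicitly):
>
>      dmsetup create cached --table \
>        "0 $(blockdev --getsz <slow-dev>) cache <meta-dev> <fast-dev> <slow-dev> 512 1 writeback smq 0"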
>
> I've observed two oddities:
>
>    (1) Only fio tests with the first data set created (and thus
>    initially occupying the cache) showed decent performance
>    results. Subsequent fio tests with other data sets showed poor
>    performance. I think this indicates that the SMQ policy does not
>    properly promote/demote data to/from the caching device in my
>    tests.
>
>    (2) I've seen results where performance was actually below "native"
>    (w/o caching) performance of the backing device. I think that this
>    should not happen. If a data access falls back to the backing device
>    due to a cache miss, I would have expected to see almost the
>    performance of the backing device. Maybe this points to a
>    performance issue in SMQ -- spending too much time in policy code
>    before falling back to the backing device.
>
> I've tried to figure out what actually happened in the SMQ code in
> these cases - but eventually set that aside. Next I want to check
> whether there might be a flaw in my test setup/dm-cache configuration.

Hi

dm-cache with the SMQ/MQ policies is a 'slow-moving' hot-spot cache.

So before a block is 'promoted' to the cache there needs to be a reason
for it - and a plain single read is not reason enough.

So if the other cache (bcache) promotes a block to its cache after a
single access, you may observe different performance.

dm-cache is not targeted at 'quick' promotion of read blocks into the
cache - rather at 'slow' migration of often-used blocks.
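
If the goal is to measure the cached case, an access pattern with real
re-use may be closer to dm-cache's design point - e.g. fio's skewed
random distribution, which keeps re-reading hot blocks so SMQ has a
reason to promote them. An untested sketch:

   fio --directory=/cached-device --rw=randread --name=hotspot \
     --size=50G --ioengine=libaio --direct=1 --iodepth=1 \
     --runtime=40 --random_distribution=zipf:1.2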

I'm not sure how that fits your testing environment and what you are
actually trying to test.

Regards

PS: a 256K dm-cache block size is quite large - the right value really
depends upon the workload. The minimum supported size is 32K - lvm2
defaults to 64K...
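
With lvm2 that would be something like (illustrative VG/LV/PV names):

   lvcreate --type cache-pool -L 100G --chunksize 64k -n cpool vg /dev/<fast_pv>
   lvconvert --type cache --cachepool vg/cpool vg/origin_lv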

Zdenek



