[dm-devel] Severe performance degradation for dm-cache mq since c86c3070

Andrey Korolyov andrey at xdel.ru
Tue Sep 22 11:24:59 UTC 2015


On Fri, Sep 18, 2015 at 6:30 PM, Joe Thornber <thornber at redhat.com> wrote:
> On Fri, Sep 11, 2015 at 11:32:38PM +0300, Andrey Korolyov wrote:
>> Please take a look at the
>> attached results - at a given cache size smq outperforms all other
>> algorithms, but both the new mq and smq perform poorly when the cache
>> fills up and starts issuing demotions.
>
> Having looked at your results again I think there are two issues here:
>
> i) You're expecting dm-cache to be a writeback cache that streams
> _all_ writes to the SSD, and then updates the spindle in the
> background.  It's not, it's a slow moving cache that promotes specific
> regions of the spindle to the SSD.
>
> ii) You're doing random, small, sync IO, which evenly hits all areas.
> In this case the cache has a really hard time improving over the
> performance of the spindle.  In your tests I think you have around 6G
> of SSD and 48G of active data.  So only 1 in 8 IOs are going to hit the
> SSD; any benefits will be marginal.  You use the same SSD size when
> running with an 8G work load, so nearly every IO hits the SSD.  Hence
> better performance.
>
> Does this benchmark really reflect your use case?

Not really; I tried to reproduce the worst possible case for a writeback
cache. The main problem with faithfully representing the production SDS
workload is the lack of hot read/write spots and an apparent lack of
pattern differences - writes tend to be sequential and cluster around
a few spots at most, while reads are mostly scattered. Despite the
relatively small difference between the 3.10 and 3.18 policies in the
fio test above, the real-world workload with the described r/w patterns
differs more heavily, but I'm afraid that nobody on the list would want
to set up a Ceph/Gluster cluster and put a couple of VMs on it to
reproduce that.

For a random worst-case randrw I assume that the current SMQ *could*
be outperformed by a modified version which delays block eviction in
the same manner as 3.10 did. If I am missing a case where delayed
eviction would perform worse, please point it out.
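
For reference, the "1 in 8" estimate quoted above can be reproduced with
a back-of-the-envelope model. This is only a sketch under stated
assumptions: IO is uniformly random over the working set, the cache is
full, and the policy's promotion/demotion behaviour is ignored. The
function name is hypothetical, and the sizes are the approximate figures
from the quoted message ("around 6G" of SSD, 48G and 8G of active data).

```python
GIB = 1 << 30  # bytes per GiB


def expected_hit_ratio(cache_bytes, working_set_bytes):
    """Rough steady-state hit ratio for uniformly random IO.

    Assumption: every block of the working set is equally likely to be
    accessed, so the chance that an IO lands on a cached block is simply
    the fraction of the working set that fits in the cache.
    """
    if working_set_bytes <= 0:
        raise ValueError("working set size must be positive")
    return min(1.0, cache_bytes / working_set_bytes)


if __name__ == "__main__":
    ssd = 6 * GIB  # approximate SSD cache size from the thread
    for working_set_gib in (48, 8):
        ratio = expected_hit_ratio(ssd, working_set_gib * GIB)
        print(f"{working_set_gib}G working set: "
              f"~{ratio:.1%} of IOs expected to hit the SSD")
```

With a hit ratio this low for the 48G case, most IOs still go to the
spindle regardless of policy, so under this model differences between
policies would mostly show up in the cost of promotions and demotions
rather than in the hit rate itself.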



