[dm-devel] [PATCH 2/2] dm-writecache

Dan Williams dan.j.williams at intel.com
Tue Feb 13 22:07:26 UTC 2018


On Tue, Feb 13, 2018 at 2:00 PM, Mikulas Patocka <mpatocka at redhat.com> wrote:
>
>
> On Fri, 8 Dec 2017, Dan Williams wrote:
>
>> > > > when we write to
>> > > > persistent memory using cached write instructions and use dax_flush
>> > > > afterwards to flush cache for the affected range, the performance is about
>> > > > 350MB/s. It is practically unusable - worse than low-end SSDs.
>> > > >
>> > > > On the other hand, the movnti instruction can sustain performance of one
>> > > > 8-byte write per clock cycle. We don't have to flush cache afterwards, the
>> > > > only thing that must be done is to flush the write-combining buffer with
>> > > > the sfence instruction. Movnti has much better throughput than dax_flush.
>> > >
>> > > What about memcpy_flushcache?
>> >
>> > but
>> >
>> > - using memcpy_flushcache is overkill if we need just one or two 8-byte
>> > writes to the metadata area. Why not use movnti directly?
>> >
>>
>> The driver performs so many 8-byte moves that the cost of the
>> memcpy_flushcache() function call significantly eats into your
>> performance?
>
> I've measured it on Skylake i7-6700 - and the dm-writecache driver has 2%
> lower throughput when it uses memcpy_flushcache() to update its metadata
> instead of explicitly coded "movnti" instructions.
>
> I've created this patch - it doesn't change API in any way, but it
> optimizes memcpy_flushcache for 4, 8 and 16-byte writes (that is what my
> driver mostly uses). With this patch, I can remove the explicit "asm"
> statements from my driver. Would you consider committing this patch to the
> kernel?
>
> Mikulas
>
>

Yes, this looks good to me. You can send it to the x86 folks with my:

Reviewed-by: Dan Williams <dan.j.williams at intel.com>

...or let me know and I can chase it through the -tip tree. Either way
works for me.
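
For readers following the thread, here is a rough user-space sketch of the
idea discussed above: special-casing 4, 8 and 16-byte copies so they go out
as non-temporal stores (movnti), with a separate sfence to drain the
write-combining buffers. It is not the patch under review and not the
kernel's memcpy_flushcache(). The names copy_nt_small() and
nt_write_barrier() are hypothetical, and the plain memcpy() fallback is an
assumption made only to keep the example portable; it does not flush caches.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__)
#include <emmintrin.h>  /* _mm_stream_si32, _mm_stream_si64, _mm_sfence */
#endif

/*
 * Hypothetical helper: copy len bytes, using non-temporal stores for the
 * small fixed sizes mentioned in the thread (4, 8 and 16 bytes).
 * Other sizes, unaligned destinations and non-x86-64 builds fall back to
 * memcpy(), which does NOT bypass or flush the CPU cache.
 */
static inline void copy_nt_small(void *dst, const void *src, size_t len)
{
#if defined(__x86_64__)
    uintptr_t d = (uintptr_t)dst;

    if (len == 4 && (d & 3) == 0) {
        int v;
        memcpy(&v, src, sizeof(v));          /* safe load from any src */
        _mm_stream_si32((int *)dst, v);      /* 32-bit movnti */
        return;
    }
    if (len == 8 && (d & 7) == 0) {
        long long v;
        memcpy(&v, src, sizeof(v));
        _mm_stream_si64((long long *)dst, v); /* 64-bit movnti */
        return;
    }
    if (len == 16 && (d & 7) == 0) {
        long long v[2];
        memcpy(v, src, sizeof(v));
        _mm_stream_si64((long long *)dst, v[0]);
        _mm_stream_si64((long long *)dst + 1, v[1]);
        return;
    }
#endif
    memcpy(dst, src, len);
}

/*
 * Hypothetical helper: order earlier non-temporal stores before later
 * stores. On x86-64 this is sfence; elsewhere it is a no-op here.
 */
static inline void nt_write_barrier(void)
{
#if defined(__x86_64__)
    _mm_sfence();
#endif
}
```

A caller would update a metadata word with something like
copy_nt_small(&entry, &value, 8) and issue nt_write_barrier() once after a
batch of such writes, rather than after every store.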



