[dm-devel] [RFC PATCH 0/4] dm mpath: vastly improve blk-mq IO performance

Johannes Thumshirn jthumshirn at suse.de
Fri Apr 1 13:37:36 UTC 2016


[ +Cc Hannes ]

On 2016-04-01 15:22, Mike Snitzer wrote:
> On Fri, Apr 01 2016 at  4:12am -0400,
> Johannes Thumshirn <jthumshirn at suse.de> wrote:
> 
>> On 2016-03-31 22:04, Mike Snitzer wrote:
>> >I developed these changes some weeks ago but have since focused on
>> >regression and performance testing on larger NUMA systems.
>> >
>> >For regression testing I've been using mptest:
>> >https://github.com/snitm/mptest
>> >
>> >For performance testing I've been using a null_blk device (with
>> >various configuration permutations, e.g. pinning memory to a
>> >particular NUMA node, and varied number of submit_queues).
>> >
>> >By eliminating multipath's heavy use of the m->lock spinlock in the
>> >fast IO paths serious performance improvements are realized.
>> 
>> Hi Mike,
>> 
>> Are these the patches you pointed Hannes to?
>> 
>> If yes, please add my
>> Tested-by: Johannes Thumshirn <jthumshirn at suse.de>
> 
> No they are not.
> 
> Hannes seems to have last pulled in my DM mpath changes that (ab)used
> RCU.  I ended up dropping those changes and this patchset is the
> replacement.

Now that you mention it, I do remember some inspiring RCU usage in the 
patches.

> So please retest with this patchset (I know you guys have a large setup
> that these changes are very relevant for).  If you could actually share
> _how_ you've tested that'd help me understand how these changes are
> holding up.  So far all looks good for me...

The test itself is actually quite simple: we're testing with fio against 
a Fibre Channel array (all SSDs, but I was very careful to only write 
into the cache).

Here's my fio job file:
[mq-test]
iodepth=128
numjobs=40
group_reporting
direct=1
ioengine=libaio
size=3G
filename=/dev/dm-0
filename=/dev/dm-1
filename=/dev/dm-2
filename=/dev/dm-3
filename=/dev/dm-4
filename=/dev/dm-5
filename=/dev/dm-6
filename=/dev/dm-7
name="MQ Test"

and the test runner:
#!/bin/sh

for rw in 'randread' 'randwrite' 'read' 'write'; do
         for bs in '4k' '8k' '16k' '32k' '64k'; do
                 fio mq-test.fio --bs="${bs}" --rw="${rw}" \
                         --output="fio-${bs}-${rw}.txt"
         done
done
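
For anyone reproducing this, the runner above walks a 4 x 5 matrix of
access patterns and block sizes, i.e. 20 fio runs. A minimal dry-run
sketch that only prints the planned invocations (plan_runs is a
made-up helper name for illustration, not part of fio; fio itself is
never invoked here):

```shell
#!/bin/sh
# Dry-run sketch: print the fio command lines the runner would issue.
# plan_runs is a hypothetical helper, used only for illustration.
plan_runs() {
        for rw in randread randwrite read write; do
                for bs in 4k 8k 16k 32k 64k; do
                        # Same options and output naming as the runner above
                        echo "fio mq-test.fio --bs=${bs} --rw=${rw} --output=fio-${bs}-${rw}.txt"
                done
        done
}

plan_runs
```

Piping the output through "wc -l" is a quick way to confirm the size
of the matrix before committing the array to a full run.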

The initiator has 40 CPUs on 4 NUMA nodes (no HT) and 64GB RAM. I'm not 
sure how many of the numbers from the old patchset I can share (will 
ask Hannes on Monday), but I'm aware I'll have to once I've retested with 
your new patches and we want to compare the results.

Byte,
    Johannes



