[dm-devel] [RFC PATCH 0/4] dm mpath: vastly improve blk-mq IO performance
Johannes Thumshirn
jthumshirn at suse.de
Fri Apr 1 13:37:36 UTC 2016
[ +Cc Hannes ]
On 2016-04-01 15:22, Mike Snitzer wrote:
> On Fri, Apr 01 2016 at 4:12am -0400,
> Johannes Thumshirn <jthumshirn at suse.de> wrote:
>
>> On 2016-03-31 22:04, Mike Snitzer wrote:
>> >I developed these changes some weeks ago but have since focused on
>> >regression and performance testing on larger NUMA systems.
>> >
>> >For regression testing I've been using mptest:
>> >https://github.com/snitm/mptest
>> >
>> >For performance testing I've been using a null_blk device (with
>> >various configuration permutations, e.g. pinning memory to a
>> >particular NUMA node, and varied number of submit_queues).
>> >
>> >By eliminating multipath's heavy use of the m->lock spinlock in the
>> >fast IO paths serious performance improvements are realized.
>>
>> Hi Mike,
>>
>> Are these the patches you pointed Hannes to?
>>
>> If yes, please add my Tested-by: Johannes Thumshirn
>> <jthumshirn at suse.de>
>
> No they are not.
>
> Hannes seems to have last pulled in my DM mpath changes that (ab)used
> RCU.
> I ended up dropping those changes and this patchset is the replacement.
Now that you mention it, I do remember some inspiring RCU usage in those
patches.
> So please retest with this patchset (I know you guys have a large setup
> that these changes are very relevant for). If you could actually share
> _how_ you've tested that'd help me understand how these changes are
> holding up. So far all looks good for me...
The test itself is actually quite simple: we're testing with fio against
a Fibre Channel array (all SSDs, but I was very careful to only write
into the cache).
Here's my fio job file:
[mq-test]
iodepth=128
numjobs=40
group_reporting
direct=1
ioengine=libaio
size=3G
filename=/dev/dm-0
filename=/dev/dm-1
filename=/dev/dm-2
filename=/dev/dm-3
filename=/dev/dm-4
filename=/dev/dm-5
filename=/dev/dm-6
filename=/dev/dm-7
name="MQ Test"
and the test runner:
#!/bin/sh
# Run every access pattern against every block size, one fio run each.
for rw in 'randread' 'randwrite' 'read' 'write'; do
    for bs in '4k' '8k' '16k' '32k' '64k'; do
        fio mq-test.fio --bs="${bs}" --rw="${rw}" \
            --output="fio-${bs}-${rw}.txt"
    done
done
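Since the runner covers 4 access patterns times 5 block sizes, a complete
run should leave 20 output files behind. As a minimal sketch (the
expected_outputs helper is hypothetical and not part of the setup
described here; it only assumes the runner was started in the current
directory), a run could be checked for completeness like this before
comparing results:

```shell
#!/bin/sh
# Hypothetical helper, not part of the original test setup.
# Prints the output file names the runner is expected to produce,
# using the same rw/bs lists and naming scheme as the runner.
expected_outputs() {
    for rw in randread randwrite read write; do
        for bs in 4k 8k 16k 32k 64k; do
            echo "fio-${bs}-${rw}.txt"
        done
    done
}

# Report every expected file that is missing or empty.
missing=0
for f in $(expected_outputs); do
    if [ ! -s "$f" ]; then
        echo "missing or empty: $f"
        missing=$((missing + 1))
    fi
done
total=$(expected_outputs | wc -l | tr -d ' ')
echo "${missing} of ${total} result files missing or empty"
```

This only checks that each run produced non-empty output; it does not
interpret the fio results themselves.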
The initiator has 40 CPUs on 4 NUMA nodes (no HT) and 64GB RAM. I'm not
sure how much in terms of numbers I can share from the old patchset (will
ask Hannes on Monday), but I'm aware I'll have to once I've retested with
your new patches and we want to compare the results.
Byte,
Johannes