[dm-devel] [PATCH 1/1] dm mpath: add IO affinity path selector

Mike Christie michael.christie at oracle.com
Wed Oct 28 16:01:17 UTC 2020


On 10/27/20 7:55 AM, Mike Snitzer wrote:
> On Thu, Oct 22 2020 at  8:27pm -0400,
> Mike Christie <michael.christie at oracle.com> wrote:
> 
>> This patch adds a path selector that selects paths based on a CPU to
>> path mapping the user passes in and the CPU we are executing on. The
>> primary user for this PS is where the app is optimized to use specific
>> CPUs, so other PSs undo the app's handiwork, and the storage and its
>> transport are not a bottleneck.
>>
>> For these io-affinity PS setups a path's transport/interconnect
>> perf is not going to fluctuate a lot and there are no major differences
>> between paths, so QL/HST smarts do not help and RR always messes up
>> what the app is trying to do.
>>
>> On a system with 16 cores, where you have a job per CPU:
>>
>> fio --filename=/dev/dm-0 --direct=1 --rw=randrw --bs=4k \
>> --ioengine=libaio --iodepth=128 --numjobs=16
>>
>> and a dm-multipath device setup where each CPU is mapped to one path:
>>
>> // When in mq mode I had to set dm_mq_nr_hw_queues=$NUM_PATHS.
> 
> OK, the modparam was/is a means to an end but the default of 1 is very
> limiting (especially in that it becomes one-size-fits-all across every
> dm-multipath device in the system, when one size doesn't really fit all).
> 
> If you have any ideas for what a sane heuristic would be for
> dm_mq_nr_hw_queues I'm open to suggestions.  But DM target <-> DM core
> <-> early block core interface coordination is "fun". ;)
I do not have any good ideas.
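
Setting the current modparam is at least simple, though. A minimal
sketch, assuming dm_mod is built as a module and the 16-path setup from
the fio run above (if dm_mod is built in, the same value can go on the
kernel command line as dm_mod.dm_mq_nr_hw_queues=16):

    # one hw queue per path; set before the multipath device is created
    modprobe dm_mod dm_mq_nr_hw_queues=16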


> 
>> // Bio mode also showed similar results.
>> 0 16777216 multipath 0 0 1 1 io-affinity 0 16 1 8:16 1 8:32 2 8:64 4
>> 8:48 8 8:80 10 8:96 20 8:112 40 8:128 80 8:144 100 8:160 200 8:176
>> 400 8:192 800 8:208 1000 8:224 2000 8:240 4000 65:0 8000
>>
>> we can see an IOPS increase of 25%.
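
For anyone wanting to try the selector by hand: each per-path argument
in that table is just a hex cpumask of the CPUs that should use that
path. A minimal 4-path sketch of loading such a table with dmsetup
(device size and minor numbers reused from the table above, masks
covering CPUs 0-3, and the dm device name is arbitrary):

    dmsetup create mpath-ioa --table \
      "0 16777216 multipath 0 0 1 1 io-affinity 0 4 1 8:16 1 8:32 2 8:48 4 8:64 8"
    # 8:16 serves IO issued on CPU 0 (mask 0x1), 8:32 CPU 1 (0x2),
    # 8:48 CPU 2 (0x4), 8:64 CPU 3 (0x8).
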
> 
> Great. What utility/code are you using to extract the path:cpu affinity?
> Is it array specific?  Which hardware pins IO like this?

It is not specific to an array.

We use it for iscsi. We have fast networks and arrays, but to better
utilize them you need to use multiple iscsi sessions (tcp
connections/sockets). So you typically set it up like nvme/tcp does its
connections/queues by default, where that driver creates a TCP
connection per CPU and then maps each connection to a hw queue/ctx. For
iscsi, we set the session's IO xmit thread's affinity, then set up
networking so the socket/connection's IO is routed to the same CPU. We
then create N sessions and do multipath over them.
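
As a rough illustration of the networking side (not part of this patch,
and heavily setup dependent), steering one session's connection to one
CPU can look something like the sketch below. eth0, IRQ 120, the local
port 50001 and CPU 0 are all made-up examples, and the knob for pinning
the session's xmit thread itself depends on the transport, so it is not
shown:

    # steer RX for the session whose initiator-side TCP port is 50001
    # to NIC RX queue 0 (needs ntuple filter support on the NIC)
    ethtool -U eth0 flow-type tcp4 dst-port 50001 action 0
    # run RX queue 0's IRQ on CPU 0 so completions land on the same CPU
    echo 1 > /proc/irq/120/smp_affinity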

> 
> Will you, or others, be enhancing multipath-tools to allow passing such
> io-affinity DM multipath tables?

Yeah, I am working on that.



