[dm-devel] Improve processing efficiency for addition and deletion of multipath devices

Zdenek Kabelac zkabelac at redhat.com
Mon Nov 28 11:51:49 UTC 2016


On 28.11.2016 at 11:42, Hannes Reinecke wrote:
> On 11/28/2016 11:06 AM, Zdenek Kabelac wrote:
>> On 28.11.2016 at 03:19, tang.junhui at zte.com.cn wrote:
>>> Hello Christophe, Ben, Hannes, Martin, Bart,
>>> I am a member of the host-side software development team of the ZXUSP
>>> storage project at ZTE Corporation. To meet market demand, our team
>>> has decided to write code next month to improve multipath efficiency.
>>> The whole idea is in the mail below. We hope to participate in and
>>> make progress with the open source community, so any suggestions and
>>> comments would be welcome.
>>>
>>
>>
>> Hi
>>
>> First - we are aware of these issues.
>>
>> The solution proposed in this mail would surely help - but there is
>> likely a bigger issue to be solved first.
>>
>> The core trouble is to avoid executing the 'blkid' disk identification.
>> Recent versions of multipath already mark a plain table 'RELOAD'
>> operation (which should not change disk content) with an extra DM bit,
>> so udev rules currently skip 'pvscan' - we would also like to extend
>> this functionality to skip more rules and to reimport existing
>> 'symlinks' from the udev database (so they do not get deleted).
>>
>> I believe the processing of udev rules is 'relatively' quick as long
>> as it does not need to read/write ANYTHING from real disks.
>>
> Hmm. Are you sure this is an issue?
> We definitely need to skip uevent handling when a path goes down (but I
> think we do that already), but for 'add' events we absolutely need to
> call blkid to figure out if the device has changed.
> There are storage arrays out there that use a 'path down/path up' cycle
> to inform initiators about any device layout change.
> So we wouldn't be able to handle those properly if we didn't call blkid
> here.

The core trouble is this:

With a multipath device, you ONLY want to 'scan' the device (with blkid)
when the initial first member device of the multipath appears.

So you start the multipath (resume -> CHANGE uevent) - that should be the
ONLY place to run the 'blkid' test (which really goes through over 3/4 MB
of disk reads, just to check that there is no ZFS somewhere).
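
As a rough illustration of the cost difference (plain command-line blkid
standing in for udev's built-in probe; the map name 'mpatha' is made up):

    # full low-level superblock probe - the expensive path, reads
    # signature areas directly from the device
    blkid -p /dev/mapper/mpatha

    # cache-based lookup - cheap, no disk I/O when the cache is warm
    blkid /dev/mapper/mpatha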

Then any further disk that is a member of the multipath (recognized by
'multipath -c') should NOT be scanned - but as far as I can tell, the
current order is the opposite: first 'blkid' runs (rule 60) and only then
does rule 62 recognize an mpath_member.

Thus every disk 'add' event fires a very lengthy blkid scan.
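
A hypothetical sketch of fixing that ordering (the rule file name and the
reuse of 'multipath -c' with the DM_MULTIPATH_DEVICE_PATH variable are my
illustration, not the current rules): mark path members before rule 60
runs, so the blkid import can be skipped there:

    # e.g. 59-multipath-member.rules - flag multipath path devices
    # before 60-persistent-storage.rules gets to run blkid
    ACTION=="add|change", KERNEL=="sd*", PROGRAM=="/sbin/multipath -c $devnode", ENV{DM_MULTIPATH_DEVICE_PATH}="1"

    # 60-persistent-storage.rules could then bail out early:
    #   ENV{DM_MULTIPATH_DEVICE_PATH}=="1", GOTO="persistent_storage_end"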

Of course, I'm not an expert on the dm multipath rules here, so I'm
passing this on to prajnoha@ - but I'd guess this is the primary source
of the slowdowns.

There should be exactly ONE blkid run for a single multipath device - as
long as a 'RELOAD' only adds/removes paths, there is no reason to scan
the component devices.
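
A minimal sketch of honouring that in the rules - assuming the reload bit
is exported as DM_SUBSYSTEM_UDEV_FLAG0 (the flag multipath-tools uses for
its reload marking; the 'mpath_end' label is illustrative):

    # skip the heavyweight scanning rules when a map is merely reloaded
    ENV{DM_SUBSYSTEM_UDEV_FLAG0}=="1", GOTO="mpath_end"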

>
>> So while aggregating 'uevents' in multipath would 'shorten' the queue
>> processing of events - it would still not speed up the scan itself.
>>
>> We need to drastically reduce unnecessary disk re-scanning.
>>
>> Also note - if you have a lot of disks, it might be worth checking
>> whether udev picks the 'right' number of udev workers.
>> There is heuristic logic to avoid system overload - but it might be
>> worth checking whether, on your system with your amount of
>> CPU/RAM/disks, the computed number is the best for scaling - i.e. if
>> you double the number of workers, do you get any better performance?
>>
> That doesn't help, as we only have one queue (within multipath) to
> handle all uevents.

This was meant for systems with many different multipath devices.
Obviously it would not help with a single multipath device.
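
For that experiment, the worker limit can be raised at runtime with
standard udevadm (the value 64 is only an example for testing scaling):

    # raise the parallel uevent worker limit, then re-measure
    udevadm control --children-max=64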

Regards

Zdenek




