[dm-devel] Improve processing efficiency for addition and deletion of multipath devices

tang.junhui at zte.com.cn tang.junhui at zte.com.cn
Thu Dec 1 01:16:22 UTC 2016


Hello Ben, Hannes 

I'm sorry for the late reply.

> You can't just get the wwid with no work (look at all work uev_add_path
> does, specifically alloc_path_with_pathinfo). Now you could reorder
> this, but there isn't much point, since it is doing useful things, like
> checking if this is a spurious uevent, and necessary things, like
> figuring out the device type and using that that the configuration to
> figure out HOW to get the wwid.

IMO, WWID can be geted from uevent, since it has a ID_SERIAL filed in 
uevent body.

> It seems like what you want to do is to
> call uev_add_path multiple times, but defer most of the work that
> ev_add_path does (creating or updating the multipath device), until
> you've processed all that paths.

Not exactly, the input parameter "struct uevent *uev" is a batch of 
merged uevents, so uev_add_path() is called once to process the merged 
uevent.

> split off uev_add_path() and
> ev_add_path().
> Then uev_add_path could generate a list of fully-formed paths which
> ev_add_path() would process.
> IE generalize coalesce_paths() to work on a passed-in list rather than
> the normal vecs->pathvec.

Hannes, I think my thoughts are close to your idea now, In uev_add_path(), 

we get all information of the merged paths, and then call ev_add_path() 
to create or update the multipath device. Maybe the different between us 
is "processing all the same type uevents(add, etc.) in ev_add_path()" or 
"processing all the same type uevents(add, etc.) which came from the 
same LUN in ev_add_path()". I think your idea can also reach the goal, and
reduce the DM reload times. So we will try to code as your idea, and 
in list_merger_uevents(&uevq_tmp) only merger the same type(add, etc.) 
uevents,
add stop mergering when another type uevents are occured.

Thanks all,
Tang
 



发件人:         Hannes Reinecke <hare at suse.de>
收件人:         Benjamin Marzinski <bmarzins at redhat.com>, 
tang.junhui at zte.com.cn, 
抄送:   dm-devel at redhat.com, zhang.kai16 at zte.com.cn, Martin Wilck 
<mwilck at suse.com>, Bart Van Assche <bart.vanassche at sandisk.com>
日期:   2016/11/28 23:46
主题:   Re: [dm-devel] Improve processing efficiency for addition and 
deletion of multipath devices
发件人: dm-devel-bounces at redhat.com



On 11/28/2016 04:25 PM, Benjamin Marzinski wrote:
> On Mon, Nov 28, 2016 at 10:19:15AM +0800, tang.junhui at zte.com.cn wrote:
>>    Hello Christophe, Ben, Hannes, Martin, Bart,
>>    I am a member of host-side software development team of ZXUSP 
storage
>>    project
>>    in ZTE Corporation. Facing the market demand, our team decides to 
write
>>    code to
>>    promote multipath efficiency next month. The whole idea is in the 
mail
>>    below.We
>>    hope to participate in and make progress with the open source 
community,
>>    so any
>>    suggestion and comment would be welcome.
> 
> Like I mentioned before, I think this is a good idea in general, but the
> devil is in the details here.
> 
>>
>>    Thanks,
>>    Tang
>>
>> 
------------------------------------------------------------------------------------------------------------------------------
>> 
------------------------------------------------------------------------------------------------------------------------------
>>    1.        Problem
>>    In these scenarios, multipath processing efficiency is low:
>>    1) Many paths exist in each multipath device,
>>    2) Devices addition or deletion during iSCSI login/logout or FC link
>>    up/down.
> 
> <snip>
> 
>>    4.        Proposal
>>    Other than processing uevents one by one, uevents which coming from 
the
>>    same LUN devices can be mergered to one, and then uevent processing
>>    thread only needs to process it once, and it only produces one DM 
addition
>>    uevent which could reduce system resource consumption.
>>
>>    The example in Chapter 2 is continued to use to explain the 
proposal:
>>    1) Multipath receives block device addition uevents from udev:
>>    UDEV  [89068.806214] add 
>>     /devices/platform/host3/session44/target3:0:0/3:0:0:0/block/sdc 
(block)
>>    UDEV  [89068.909457] add 
>>     /devices/platform/host3/session44/target3:0:0/3:0:0:2/block/sdg 
(block)
>>    UDEV  [89068.944956] add 
>>     /devices/platform/host3/session44/target3:0:0/3:0:0:1/block/sde 
(block)
>>    UDEV  [89068.959215] add 
>>     /devices/platform/host5/session46/target5:0:0/5:0:0:0/block/sdh 
(block)
>>    UDEV  [89068.978558] add 
>>     /devices/platform/host5/session46/target5:0:0/5:0:0:2/block/sdk 
(block)
>>    UDEV  [89069.004217] add 
>>     /devices/platform/host5/session46/target5:0:0/5:0:0:1/block/sdj 
(block)
>>    UDEV  [89069.486361] add 
>>     /devices/platform/host4/session45/target4:0:0/4:0:0:1/block/sdf 
(block)
>>    UDEV  [89069.495194] add 
>>     /devices/platform/host4/session45/target4:0:0/4:0:0:0/block/sdd 
(block)
>>    UDEV  [89069.511628] add 
>>     /devices/platform/host4/session45/target4:0:0/4:0:0:2/block/sdi 
(block)
>>    UDEV  [89069.716292] add 
>>     /devices/platform/host6/session47/target6:0:0/6:0:0:0/block/sdl 
(block)
>>    UDEV  [89069.748456] add 
>>     /devices/platform/host6/session47/target6:0:0/6:0:0:1/block/sdm 
(block)
>>    UDEV  [89069.789662] add 
>>     /devices/platform/host6/session47/target6:0:0/6:0:0:2/block/sdn 
(block)
>>
>>    2) Multipath merges these 12 uevents to 3 internal uvents
>>    UEVENT add sdc sdh sdd sdl
>>    UEVENT add sde sdj sdf sdm
>>    UEVENT add sdg sdk sdi sdn
>>
>>    3) Multipath process these 3 uevents one by one, and only produce 3
>>    addition
>>    DM uvents, no dm change uevent exists.
>>    KERNEL[89068.899614] add      /devices/virtual/block/dm-2 (block)
>>    KERNEL[89068.955364] add      /devices/virtual/block/dm-3 (block)
>>    KERNEL[89069.018903] add      /devices/virtual/block/dm-4 (block)
> 
> Just because I'm pedantic: There will, of cource, be dm change events.
> Without them, you couldn't have a multipath device. Whenever you load a
> table in a dm device (including during the initial creation), you get a
> change event.
> 
>>    4) Udev process these uevents above, and transfer it to multipath
>>    UDEV  [89068.926428] add      /devices/virtual/block/dm-2 (block)
>>    UDEV  [89069.007511] add      /devices/virtual/block/dm-3 (block)
>>    UDEV  [89069.098054] add      /devices/virtual/block/dm-4 (block)
> 
> multipathd ignores add events for dm devices (look at uev_trigger). A dm
> device isn't set up until it's initial change event happens.
> 
>>    5) Multipath processes these uevents above, and finishes the 
creation of
>>    multipath
>>    devices.
>>
>>    5.        Coding
>>    After taking over uevents form uevent listening thread, uevent 
processing
>>    thread can
>>    merger these uevents before processing��
>>    int uevent_dispatch(int (*uev_trigger)(struct uevent *, void *
>>    trigger_data),
>>                void * trigger_data)
>>    {
>>        ...
>>        while (1) {
>>            ...
>>            list_splice_init(&uevq, &uevq_tmp);
>>            ...
>>            list_merger_uevents(&uevq_tmp);
>>            service_uevq(&uevq_tmp);
>>        }
>>        ...
>>    }
>>    In structure of ��struct uevent�� , an additional member of 
��char
>>    wwid[WWID_SIZE]�� is
>>    added to record each device WWID for addition or change uevent to 
identify
>>    whether
>>    these uevents coming from the same LUN, and an additional member of
>>    ��struct list_head merger_node�� is added to record the list of 
uevents
>>    which having been
>>    merged with this uevent:
>>    struct uevent {
>>        struct list_head node;
>>        struct list_head merger_node;
>>        char wwid[WWID_SIZE]
>>        struct udev_device *udev;
>>        ...
>>    };
> 
> You can't just get the wwid with no work (look at all work uev_add_path
> does, specifically alloc_path_with_pathinfo). Now you could reorder
> this, but there isn't much point, since it is doing useful things, like
> checking if this is a spurious uevent, and necessary things, like
> figuring out the device type and using that that the configuration to
> figure out HOW to get the wwid. It seems like what you want to do is to
> call uev_add_path multiple times, but defer most of the work that
> ev_add_path does (creating or updating the multipath device), until
> you've processed all that paths.
> 
Which is kinda what I've proposed; split off uev_add_path() and
ev_add_path().
Then uev_add_path could generate a list of fully-formed paths which
ev_add_path() would process.
IE generalize coalesce_paths() to work on a passed-in list rather than
the normal vecs->pathvec.

Or consider using vecs->pathvec to hold 'orphan' paths when processing
uevents; then we could add the paths to the pathvec in uev_add_path()
and have another thread going through the list of paths and trying to
call coalesce_paths() for the changed mpps.
Will be a tad tricky, as we'd need to identify which mpps needs to be
updated; but nothing too awkward.

Meanwhile I'm working on converting vecs->pathvec and vecs->mpvec to RCU
lists; that should get rid of the need for most locks, which I suspect
being the main reason for the scaling issues.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                                 Teamlead Storage & 
Networking
hare at suse.de +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel at redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/dm-devel/attachments/20161201/1b8a1267/attachment.htm>


More information about the dm-devel mailing list