<font size=2>Hello Ben, Hannes </font>
<br>
<br><font size=2>I'm sorry for the late reply.</font>
<br>
<br><font size=2>> You can't just get the wwid with no work (look at
all work uev_add_path<br>
> does, specifically alloc_path_with_pathinfo). Now you could reorder<br>
> this, but there isn't much point, since it is doing useful things,
like<br>
> checking if this is a spurious uevent, and necessary things, like<br>
> figuring out the device type and using that that the configuration
to<br>
> figure out HOW to get the wwid.</font>
<br>
<br><font size=2>IMO, WWID can be geted from uevent, since it has a ID_SERIAL
filed in uevent body.</font>
<br>
<br><font size=2>> It seems like what you want to do is to<br>
> call uev_add_path multiple times, but defer most of the work that<br>
> ev_add_path does (creating or updating the multipath device), until<br>
> you've processed all that paths.</font>
<br>
<br><font size=2>Not exactly, the input parameter "struct uevent *uev"
is a batch of </font>
<br><font size=2>merged uevents, so uev_add_path() is called once to process
the merged </font>
<br><font size=2>uevent.</font>
<br>
<br><font size=2>> split off uev_add_path() and<br>
> ev_add_path().<br>
> Then uev_add_path could generate a list of fully-formed paths which<br>
> ev_add_path() would process.<br>
> IE generalize coalesce_paths() to work on a passed-in list rather
than<br>
> the normal vecs->pathvec.</font>
<br>
<br><font size=2>Hannes, I think my thoughts are close to your idea now,
In uev_add_path(), </font>
<br><font size=2>we get all information of the merged paths, and then call
ev_add_path() </font>
<br><font size=2>to create or update the multipath device. Maybe the different
between us </font>
<br><font size=2>is "processing all the same type uevents(add, etc.)
in ev_add_path()" or </font>
<br><font size=2>"processing all the same type uevents(add, etc.)
which came from the </font>
<br><font size=2>same LUN in ev_add_path()". I think your idea can
also reach the goal, and</font>
<br><font size=2>reduce the DM reload times. So we will try to code as
your idea, and </font>
<br><font size=2 face="Times New Roman">in list_merger_uevents(&uevq_tmp)</font><font size=2>
only merger the same type(add, etc.) uevents,</font>
<br><font size=2>add stop mergering when another type uevents are occured.</font>
<br>
<br><font size=2>Thanks all,</font>
<br><font size=2>Tang</font>
<br><tt><font size=2> </font></tt>
<br>
<br>
<br>
<br><font size=1 color=#5f5f5f face="sans-serif">发件人:
</font><font size=1 face="sans-serif">Hannes Reinecke
<hare@suse.de></font>
<br><font size=1 color=#5f5f5f face="sans-serif">收件人:
</font><font size=1 face="sans-serif">Benjamin Marzinski
<bmarzins@redhat.com>, tang.junhui@zte.com.cn, </font>
<br><font size=1 color=#5f5f5f face="sans-serif">抄送:
</font><font size=1 face="sans-serif">dm-devel@redhat.com,
zhang.kai16@zte.com.cn, Martin Wilck <mwilck@suse.com>, Bart Van
Assche <bart.vanassche@sandisk.com></font>
<br><font size=1 color=#5f5f5f face="sans-serif">日期:
</font><font size=1 face="sans-serif">2016/11/28
23:46</font>
<br><font size=1 color=#5f5f5f face="sans-serif">主题:
</font><font size=1 face="sans-serif">Re: [dm-devel]
Improve processing efficiency for addition and deletion of multipath devices</font>
<br><font size=1 color=#5f5f5f face="sans-serif">发件人:
</font><font size=1 face="sans-serif">dm-devel-bounces@redhat.com</font>
<br>
<hr noshade>
<br>
<br>
<br><tt><font size=2>On 11/28/2016 04:25 PM, Benjamin Marzinski wrote:<br>
> On Mon, Nov 28, 2016 at 10:19:15AM +0800, tang.junhui@zte.com.cn wrote:<br>
>> Hello Christophe, Ben, Hannes, Martin, Bart,<br>
>> I am a member of host-side software development team
of ZXUSP storage<br>
>> project<br>
>> in ZTE Corporation. Facing the market demand, our
team decides to write<br>
>> code to<br>
>> promote multipath efficiency next month. The whole
idea is in the mail<br>
>> below.We<br>
>> hope to participate in and make progress with the
open source community,<br>
>> so any<br>
>> suggestion and comment would be welcome.<br>
> <br>
> Like I mentioned before, I think this is a good idea in general, but
the<br>
> devil is in the details here.<br>
> <br>
>><br>
>> Thanks,<br>
>> Tang<br>
>><br>
>> ------------------------------------------------------------------------------------------------------------------------------<br>
>> ------------------------------------------------------------------------------------------------------------------------------<br>
>> 1. Problem<br>
>> In these scenarios, multipath processing efficiency
is low:<br>
>> 1) Many paths exist in each multipath device,<br>
>> 2) Devices addition or deletion during iSCSI login/logout
or FC link<br>
>> up/down.<br>
> <br>
> <snip><br>
> <br>
>> 4. Proposal<br>
>> Other than processing uevents one by one, uevents
which coming from the<br>
>> same LUN devices can be mergered to one, and then
uevent processing<br>
>> thread only needs to process it once, and it only
produces one DM addition<br>
>> uevent which could reduce system resource consumption.<br>
>><br>
>> The example in Chapter 2 is continued to use to explain
the proposal:<br>
>> 1) Multipath receives block device addition uevents
from udev:<br>
>> UDEV [89068.806214] add <br>
>> /devices/platform/host3/session44/target3:0:0/3:0:0:0/block/sdc
(block)<br>
>> UDEV [89068.909457] add <br>
>> /devices/platform/host3/session44/target3:0:0/3:0:0:2/block/sdg
(block)<br>
>> UDEV [89068.944956] add <br>
>> /devices/platform/host3/session44/target3:0:0/3:0:0:1/block/sde
(block)<br>
>> UDEV [89068.959215] add <br>
>> /devices/platform/host5/session46/target5:0:0/5:0:0:0/block/sdh
(block)<br>
>> UDEV [89068.978558] add <br>
>> /devices/platform/host5/session46/target5:0:0/5:0:0:2/block/sdk
(block)<br>
>> UDEV [89069.004217] add <br>
>> /devices/platform/host5/session46/target5:0:0/5:0:0:1/block/sdj
(block)<br>
>> UDEV [89069.486361] add <br>
>> /devices/platform/host4/session45/target4:0:0/4:0:0:1/block/sdf
(block)<br>
>> UDEV [89069.495194] add <br>
>> /devices/platform/host4/session45/target4:0:0/4:0:0:0/block/sdd
(block)<br>
>> UDEV [89069.511628] add <br>
>> /devices/platform/host4/session45/target4:0:0/4:0:0:2/block/sdi
(block)<br>
>> UDEV [89069.716292] add <br>
>> /devices/platform/host6/session47/target6:0:0/6:0:0:0/block/sdl
(block)<br>
>> UDEV [89069.748456] add <br>
>> /devices/platform/host6/session47/target6:0:0/6:0:0:1/block/sdm
(block)<br>
>> UDEV [89069.789662] add <br>
>> /devices/platform/host6/session47/target6:0:0/6:0:0:2/block/sdn
(block)<br>
>><br>
>> 2) Multipath merges these 12 uevents to 3 internal
uvents<br>
>> UEVENT add sdc sdh sdd sdl<br>
>> UEVENT add sde sdj sdf sdm<br>
>> UEVENT add sdg sdk sdi sdn<br>
>><br>
>> 3) Multipath process these 3 uevents one by one,
and only produce 3<br>
>> addition<br>
>> DM uvents, no dm change uevent exists.<br>
>> KERNEL[89068.899614] add /devices/virtual/block/dm-2
(block)<br>
>> KERNEL[89068.955364] add /devices/virtual/block/dm-3
(block)<br>
>> KERNEL[89069.018903] add /devices/virtual/block/dm-4
(block)<br>
> <br>
> Just because I'm pedantic: There will, of cource, be dm change events.<br>
> Without them, you couldn't have a multipath device. Whenever you load
a<br>
> table in a dm device (including during the initial creation), you
get a<br>
> change event.<br>
> <br>
>> 4) Udev process these uevents above, and transfer
it to multipath<br>
>> UDEV [89068.926428] add /devices/virtual/block/dm-2
(block)<br>
>> UDEV [89069.007511] add /devices/virtual/block/dm-3
(block)<br>
>> UDEV [89069.098054] add /devices/virtual/block/dm-4
(block)<br>
> <br>
> multipathd ignores add events for dm devices (look at uev_trigger).
A dm<br>
> device isn't set up until it's initial change event happens.<br>
> <br>
>> 5) Multipath processes these uevents above, and finishes
the creation of<br>
>> multipath<br>
>> devices.<br>
>><br>
>> 5. Coding<br>
>> After taking over uevents form uevent listening thread,
uevent processing<br>
>> thread can<br>
>> merger these uevents before processing��<br>
>> int uevent_dispatch(int (*uev_trigger)(struct uevent
*, void *<br>
>> trigger_data),<br>
>> void *
trigger_data)<br>
>> {<br>
>> ...<br>
>> while (1) {<br>
>> ...<br>
>> list_splice_init(&uevq,
&uevq_tmp);<br>
>> ...<br>
>> list_merger_uevents(&uevq_tmp);<br>
>> service_uevq(&uevq_tmp);<br>
>> }<br>
>> ...<br>
>> }<br>
>> In structure of ��struct uevent�� , an additional
member of ��char<br>
>> wwid[WWID_SIZE]�� is<br>
>> added to record each device WWID for addition or
change uevent to identify<br>
>> whether<br>
>> these uevents coming from the same LUN, and an additional
member of<br>
>> ��struct list_head merger_node�� is added
to record the list of uevents<br>
>> which having been<br>
>> merged with this uevent:<br>
>> struct uevent {<br>
>> struct list_head node;<br>
>> struct list_head merger_node;<br>
>> char wwid[WWID_SIZE]<br>
>> struct udev_device *udev;<br>
>> ...<br>
>> };<br>
> <br>
> You can't just get the wwid with no work (look at all work uev_add_path<br>
> does, specifically alloc_path_with_pathinfo). Now you could reorder<br>
> this, but there isn't much point, since it is doing useful things,
like<br>
> checking if this is a spurious uevent, and necessary things, like<br>
> figuring out the device type and using that that the configuration
to<br>
> figure out HOW to get the wwid. It seems like what you want to do
is to<br>
> call uev_add_path multiple times, but defer most of the work that<br>
> ev_add_path does (creating or updating the multipath device), until<br>
> you've processed all that paths.<br>
> <br>
Which is kinda what I've proposed; split off uev_add_path() and<br>
ev_add_path().<br>
Then uev_add_path could generate a list of fully-formed paths which<br>
ev_add_path() would process.<br>
IE generalize coalesce_paths() to work on a passed-in list rather than<br>
the normal vecs->pathvec.<br>
<br>
Or consider using vecs->pathvec to hold 'orphan' paths when processing<br>
uevents; then we could add the paths to the pathvec in uev_add_path()<br>
and have another thread going through the list of paths and trying to<br>
call coalesce_paths() for the changed mpps.<br>
Will be a tad tricky, as we'd need to identify which mpps needs to be<br>
updated; but nothing too awkward.<br>
<br>
Meanwhile I'm working on converting vecs->pathvec and vecs->mpvec
to RCU<br>
lists; that should get rid of the need for most locks, which I suspect<br>
being the main reason for the scaling issues.<br>
<br>
Cheers,<br>
<br>
Hannes<br>
-- <br>
Dr. Hannes Reinecke
Teamlead Storage & Networking<br>
hare@suse.de
+49 911 74053 688<br>
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg<br>
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton<br>
HRB 21284 (AG Nürnberg)<br>
<br>
--<br>
dm-devel mailing list<br>
dm-devel@redhat.com<br>
</font></tt><a href="https://www.redhat.com/mailman/listinfo/dm-devel"><tt><font size=2>https://www.redhat.com/mailman/listinfo/dm-devel</font></tt></a>
<br>