[dm-devel] [PATCH RFC 0/3] multipath-tools: coalesce heterogenous paths by referencing method

Mon Jul 24 07:41:57 UTC 2017

Hello Martin,
Thanks for your comments. Please find my comments inline.

On 2017/7/21 18:21, Martin Wilck wrote:
> Hello Guan,
> 
> On Fri, 2017-07-21 at 13:07 +0800, Guan Junxiong wrote:
>> These three patches support coalescing heterogeneous paths by referencing
>> another path identifier. This is useful when migrating data between
>> heterogeneous arrays without interrupting upper-layer transactions.
> 
> Maybe I'm completely misunderstanding the intention of your patch, but
> from what I gathered I think this is a *very dangerous thing* to
> begin with.
> 
> Can you please explain "migrating data for heterogeneous arrays" in
> more detail. What exactly is happening here? What does the admin do, in
> what order? What's happening on the storage side?
> 

Suppose Server S is running upper-layer transactions on an old Storage B, and
the admin wants to migrate the data from Storage B to a new Storage A without
pausing those transactions. The workflow is like this:

First, on the storage side, the admin connects the new array (denoted as
Storage A) to the old array (denoted as Storage B). In this connection,
Storage A acts as a host/initiator and Storage B as a target, so A knows
B's information, such as size, UID, SN and so on.

Second, the administrator modifies the coalescing rules of multipath by some
method (such as altering multipath.conf or udev rules, or executing a simple
multipath CLI command if one exists) to prepare for the newly added paths (of
Storage A), which should be coalesced with the old paths (of Storage B).

Third, the admin connects Storage A to Server S, and the new path shows up on S.
With the modified coalescing rules, the new path is coalesced into the multipath
map of Storage B, so that IO can go through the new path, via Storage A, to
Storage B. That means Storage A acts as a temporary proxy to keep the
transactions running.

Finally, Storage A starts to migrate data from Storage B and decides whether
each IO request from Server S should be dispatched to A or B. When the job is
done, the admin can remove the path of Storage B.

> Say you have two different disks sda and sdd, and you use 
> 
>> uid_reference  "sd[a-c]  sdd"
> 
> With your patch applied, these devices show up with different WWIDs in
> the system, but multipathd pretends that sda has the same WWID as sdd
> and coalesces the two into one map.
> 
> Under normal conditions, this would inevitably cause data corruption,
> unless the disks are really actually the same (but if that's the case,
> why do they show different WWIDs?), or some entity (the storage array?)
> mirrors the data between the disks behind the scenes.
> 
> I'd like to understand what's going on and how you are preventing data
> corruption in this scenario.
>

Yes, a certain entity (Storage A) mirrors data between the disks. In the above
scenario, it migrates data from source B to A and "routes" IO requests from the
server. The entity knows which data has been migrated and which has not, so data
corruption can be avoided.
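
To make the routing idea concrete, here is a minimal sketch in Python. It is
only an illustration under my own assumptions, not how any particular array
implements migration: the class name MigrationProxy, the per-block tracking and
the "A"/"B" return values are all hypothetical, and handling of writes that
arrive while a block is being copied is left out.

```python
from dataclasses import dataclass, field


@dataclass
class MigrationProxy:
    """Hypothetical model of Storage A acting as a proxy for Storage B."""

    total_blocks: int
    # Block numbers that have already been copied from B to A.
    migrated: set = field(default_factory=set)

    def _check(self, block: int) -> None:
        if not 0 <= block < self.total_blocks:
            raise ValueError(f"block {block} out of range")

    def migrate(self, block: int) -> None:
        """Record that one block has been copied from B to A."""
        self._check(block)
        self.migrated.add(block)

    def route(self, block: int) -> str:
        """Return the array that should serve an IO request for this block."""
        self._check(block)
        return "A" if block in self.migrated else "B"

    def done(self) -> bool:
        """True once every block lives on A, so B's path can be removed."""
        return len(self.migrated) == self.total_blocks
```

Because Server S only ever talks to one multipath map, the decision between A
and B stays inside the storage side and is invisible to the host.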

> Am I understanding correctly that this would be a temporary situation
> during some sort of data migration procedure? If yes, what happens
> after the migration is finished? If this is actually a transient
> condition, I don't think using multipath.conf for this is a good idea.

This data migration procedure is temporary (it may take about 1 hour to finish),
but we need to keep the coalescing rules persistent across reboots. If they are
not persistent, a new mapper device will be created when the server is rebooted.
With multipath.conf or udev rules files, the coalescing rules stay persistent
across reboots.

> Multipath.conf is normally used to store persistent system state.
> Suppose someone adds a uid_reference to multipath.conf, migrates data,
> and forgets to remove the uid_reference from multipath.conf (and
> restart multipathd) afterwards. If disks are added later, and
> multipathd uses the uid_reference mapping for two unrelated disks, data
> corruption will occur. *This is dangerous*. It'd be safer if you'd use
> mapping by WWID ("pretend that WWID x is actually WWID y") rather than
> mapping by device name.

Agreed. It is better to use the WWID to avoid dangerous situations, but we really
need an automatic way to figure out which WWID multipath uses for the old path.
(You know, uid_attribute can be ID_SERIAL, ID_UID or ID_WWN, depending on the
device.)

I will send out an updated patch that uses WWIDs once we have discussed this enough.
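
As a rough sketch of what mapping by WWID ("pretend that WWID x is actually
WWID y") would mean for coalescing, consider the Python below. It is not the
patch and not multipathd code; the function coalesce(), the wwid_alias table
and the WWID strings are hypothetical names used only for illustration.

```python
from collections import defaultdict


def coalesce(paths, wwid_alias):
    """Group path names into maps keyed by their effective WWID.

    paths:      dict of path name -> WWID reported by the device
    wwid_alias: dict of WWID x -> WWID y, meaning "treat x as y"
    """
    maps = defaultdict(list)
    for name, wwid in sorted(paths.items()):
        # A path keeps its own WWID unless an alias explicitly overrides it.
        effective = wwid_alias.get(wwid, wwid)
        maps[effective].append(name)
    return dict(maps)


# Hypothetical example: sda/sdb are paths to Storage B, sdd is the new
# path to Storage A, and sde is an unrelated disk.
paths = {"sda": "wwid-B", "sdb": "wwid-B", "sdd": "wwid-A", "sde": "wwid-C"}
alias = {"wwid-A": "wwid-B"}
maps = coalesce(paths, alias)
```

Here sdd joins the map of Storage B, while sde is untouched. The rule is tied
to the identity of the LUN, so a disk that is added later and happens to get
the name sdd is not affected, which is the safety advantage over mapping by
device name.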

> Maybe I'm missing something important here, therefore I'm asking for
> more explanation.
> 
> Anyway, I'm wondering if multipath configuration is the right place to
> apply a "fake" configuration like this. Have you considered doing this
> on the udev level? Udev has the advantage that udev immediately sees
> added or modified rules files. So if you want to pretend temporarily
> that sda is indeed the same disk as sdd, you could insert appropriate
> temporary udev rules to do that. This would have the additional benefit
> that not only multipathd sees the mangled WWID but also the rest of the
> system (IOW, multipathd's view of the system would be consistent with
> the world outside multipathd).

"udev level " solution is a good thing. Thanks.
I still have a doubt whether a udev rules is enough to meet this fake configuration.
IOW ,in addition to udev rules , do we still have to write a C program to fetch an set
sth . I will write a demo of this to see whether it has problems or not.

Regards,
Guan Junxiong

> Postponing a detailed technical patch review until these questions are
> clarified.
> 
> Regards,
> Martin
>