[dm-devel] reinstate path not working

Tejaswini Poluri tejaswinipoluri3 at gmail.com
Tue Jun 23 10:18:26 UTC 2015


Hi Ben,

This is regarding the "add map" issue I have been discussing. I am posting
the issue again as a reminder.

*Case 1 : remove and add map.*
root at x86-generic-64:~# multipathd -k'show maps'
name    sysfs uuid
dmpath0 dm-0  1IET_00010001
root at x86-generic-64:~# multipathd -k'remove map dmpath0'
ok
root at x86-generic-64:~# multipathd -k'show maps'
root at x86-generic-64:~# multipathd -k'add map dmpath0'
ok
root at x86-generic-64:~# multipathd -k'show maps'
root at x86-generic-64:~#
Once a map is removed, it can be re-added only with the #multipath command
and not through the multipathd tools.

I have fixed the problem with two approaches. I would like you to review
both.

*Patch 1*: makes 'remove map dmpath0' remove only the map from multipathd's
internal state and not the device itself. I have added the new functions
discard_map and dm_remove_map so as not to interfere with the existing
code.

*Patch 2*: the approach you suggested, i.e. getting the WWID from the map
name and running coalesce_paths. I have simply moved the following code
from ev_add_map into cli_add_map:

        r = get_refwwid(dev, DEV_DEVMAP, vecs->pathvec, &refwwid);

        if (refwwid) {
                r = coalesce_paths(vecs, NULL, refwwid, 0);
                dm_lib_release();
        }

with dev changed to param, the map name passed to the CLI handler.
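For context, the resulting handler looks roughly like this (again only a
sketch following the existing cli handler conventions; the attached
0.5.0_add_map_patch2 is the authoritative change, and the log message and
cleanup here are approximate):

        /* cli_add_map: resolve the WWID from the map name given on the
         * command line and create the map if it does not already exist */
        int
        cli_add_map (void * v, char ** reply, int * len, void * data)
        {
                struct vectors * vecs = (struct vectors *)data;
                char * param = get_keyparam(v, MAP);
                char * refwwid = NULL;
                int r = 1;

                condlog(2, "%s: add map (operator)", param);

                r = get_refwwid(param, DEV_DEVMAP, vecs->pathvec, &refwwid);
                if (refwwid) {
                        /* creates the map when it is missing */
                        r = coalesce_paths(vecs, NULL, refwwid, 0);
                        dm_lib_release();
                }
                FREE(refwwid);
                return r;
        }

Since coalesce_paths is the same code path the multipath command uses to
create maps, "add map" now creates the device when it is missing instead
of only monitoring an existing one.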

I have tested this on all three versions: 0.4.8, 0.4.9, and 0.5.0. It would
be great if you could review it to make sure it does not introduce any
unwanted side effects.

I believe Patch 2 follows the approach you suggested in your previous
mail. Please review it and share your views.

Regards,
Tejaswini


On Fri, Jun 12, 2015 at 2:21 AM, Benjamin Marzinski <bmarzins at redhat.com>
wrote:

> On Wed, Jun 10, 2015 at 11:46:51AM +0530, Tejaswini Poluri wrote:
> >
> >    >    We are testing the multipathd tools with all the possible
> >    >    options, and the following fails.
> >    >
> >    >    Case 1 : remove and add map.
> >    >    root at x86-generic-64:~# multipathd -k'show maps'
> >    >    name    sysfs uuid
> >    >    dmpath0 dm-0  1IET_00010001
> >    >    root at x86-generic-64:~# multipathd -k'remove map dmpath0'
> >    >    ok
> >    >    root at x86-generic-64:~# multipathd -k'show maps'
> >    >    root at x86-generic-64:~# multipathd -k'add map dmpath0'
> >    >    ok
> >    >    root at x86-generic-64:~# multipathd -k'show maps'
> >    >    root at x86-generic-64:~#
> >    >    Once a map is removed, we are able to add it only using the
> >    >    #multipath command and not using multipathd tools.
> >
> >    It is working the way it was designed, but possibly it would make
> >    sense to change the design.
> >
> >    You have mentioned that it would make sense to change the design of
> >    "add map". Are there plans to change the design?
> >    I am trying to understand the code flow in order to change the
> >    design. Can you guide me on whether we should stop removing the
> >    device in the "remove map" code flow, or start adding the device and
> >    the map in the "add map" code flow?
> >
> >    I have tried to understand the "remove map" code flow of multipathd
> >    in the 0.4.8 code.
>
> I think that we want multipath to actually remove the map (instead of
> just not monitoring it) when you call "remove map <map>". We just want
> "add map <map>" to try to create the map if it doesn't exist.  To do
> that, you would need to first figure out what WWID is associated with
> <map>. Presumably, <map> could either be an alias, wwid, or even the
> name of a path in the map. Once you found the map, you would have to
> call the code to create the map.
>
> Also, to answer your IRC question: no, the 0.4.8 code is no longer being
> developed upstream.  All upstream patches only go against the current
> head. There are no other upstream branches.
>
> -Ben
> >
> >    ev_remove_map (char * devname, struct vectors * vecs)
> >        flush_map(mpp, vecs);
> >            dm_flush_map(mpp->alias, DEFAULT_TARGET);
> >                if (!dm_map_present(mapname))
> >                        return 0;
> >                if (dm_type(mapname, type) <= 0)
> >                        return 1;
> >                if (dm_remove_partmaps(mapname))
> >                        return 1;
> >                if (dm_get_opencount(mapname)) {
> >                        condlog(2, "%s: map in use", mapname);
> >                        return 1;
> >                }
> >                r = dm_simplecmd(DM_DEVICE_REMOVE, mapname);
> >                if (r) {
> >                        condlog(4, "multipath map %s removed", mapname);
> >                        return 0;
> >                }
> >        orphan_paths(vecs->pathvec, mpp);
> >        remove_map(mpp, vecs, stop_waiter_thread, 1);
> >
> >    Is removing the line below the right step to stop the device from
> >    being removed?
> >    r = dm_simplecmd(DM_DEVICE_REMOVE, mapname);
> >
> >    Regards,
> >
> >    Tejaswini
> >
> >    On Mon, Jun 8, 2015 at 11:15 AM, Tejaswini Poluri
> >    <tejaswinipoluri3 at gmail.com> wrote:
> >
> >      Thanks a lot, Ben, for the quick and detailed reply. I have been
> >      struggling to understand and resolve the issues with multipath, as
> >      I am the only one from my team working on this. Your inputs help
> >      me a lot. Thanks again.
> >      Regards,
> >      Tejaswini
> >      On Sat, Jun 6, 2015 at 3:36 AM, Benjamin Marzinski
> >      <bmarzins at redhat.com> wrote:
> >
> >        On Fri, Jun 05, 2015 at 02:31:20PM +0530, Tejaswini Poluri wrote:
> >        >    Hi Ben,
> >        >
> >        >    We are testing the multipathd tools with all the possible
> >        >    options, and the following fails.
> >        >
> >        >    Case 1 : remove and add map.
> >        >    root at x86-generic-64:~# multipathd -k'show maps'
> >        >    name    sysfs uuid
> >        >    dmpath0 dm-0  1IET_00010001
> >        >    root at x86-generic-64:~# multipathd -k'remove map dmpath0'
> >        >    ok
> >        >    root at x86-generic-64:~# multipathd -k'show maps'
> >        >    root at x86-generic-64:~# multipathd -k'add map dmpath0'
> >        >    ok
> >        >    root at x86-generic-64:~# multipathd -k'show maps'
> >        >    root at x86-generic-64:~#
> >        >    Once a map is removed, we are able to add it only using the
> >        >    #multipath command and not using multipathd tools.
> >
> >        It is working the way it was designed, but possibly it would
> >        make sense to change the design.  The "remove map" command not
> >        only stops multipathd from monitoring the multipath device, it
> >        removes it from the system as well.  The "add map" command makes
> >        multipathd monitor an already existing multipath device that it
> >        wasn't previously monitoring.
> >        These commands do this for historical reasons.  multipathd wasn't
> >        originally in charge of creating multipath devices, multipath
> >        was.  Once it had created the device, it ran
> >
> >        multipathd -k"add map <MAP>"
> >
> >        to make multipathd start monitoring it.  However, things haven't
> >        worked this way since RHEL4, so possibly "add map" should
> >        actually create the device if it doesn't currently exist.
> >        >    Case 2 : Active paths test case
> >        >    # while true ; do sleep 3 ; multipathd -k'remove path sdb' ;
> >        >    multipathd -k'add path sdb' ; multipathd -k'show maps status' ; done
> >        >    ok
> >        >    ok
> >        >    name failback queueing paths dm-st
> >        >    dmpath0 - - 1 active // It should be 2.
> >
> >        This is simply a timing issue.  What you are seeing is the number
> >        of active paths.  These are paths that the kernel can use.  The
> >        "add path" command doesn't update the kernel state.  This happens
> >        later, in response to the kernel reloading the device table.  So,
> >        in a second or two, this will say 2, as expected.
> >
> >        >    We would like to know if the test cases are valid, and if
> >        >    these are bugs or design issues.
> >        >
> >        >    Case 3 : Fail path and reinstate path
> >        >    root at x86-generic-64:~# multipathd -k"fail path sdc";
> >        >    multipathd -k'reinstate path sdc'; multipathd -k"show paths";
> >        >    >    [ 3962.708523] device-mapper: multipath: Failing path 8:32.
> >        >    >    ok
> >        >    >    ok
> >        >    >    hcil    dev dev_t pri dm_st   chk_st   next_check
> >        >    >    4:0:0:1 sdc 8:32  1   [active][faulty] .......... 1/20 <==CHECK
> >        >    >    5:0:0:1 sdd 8:48  1   [active][ready]  XX........ 4/20
> >        >    The sdc path becomes [active][ready] only after the polling
> >        >    interval, but not immediately after the reinstate path
> >        >    command.
> >        >    You have answered that this is a design issue, but we have
> >        >    heard from our test team that the same test case works in
> >        >    RHEL6. Did you observe it?
> >        >    I am also finding that the test cases fail because we are
> >        >    trying to run multiple commands in one shot.  Please share
> >        >    your thoughts so that they can help me debug the issues
> >        >    further.
> >        >
> >
> >        It's totally possible that the checker state is immediately
> >        updated in RHEL6.  Like I said before, what it currently does,
> >        although correct, is confusing, and perhaps we need a different
> >        checker state for paths where the "fail path" command has been
> >        used.
> >
> >        -Ben
> >        >    Regards,
> >        >    Tejaswini
> >        >    On Tue, May 19, 2015 at 5:37 PM, Tejaswini Poluri
> >        >    <tejaswinipoluri3 at gmail.com> wrote:
> >        >
> >        >      Thanks a lot, Ben. I will look into it more.
> >        >      On Mon, May 18, 2015 at 9:57 PM, Benjamin Marzinski
> >        >      <bmarzins at redhat.com> wrote:
> >        >
> >        >        On Mon, May 18, 2015 at 02:09:27PM +0530, Tejaswini
> >        >        Poluri wrote:
> >        >        >    Hi,
> >        >        >    We are trying to test a multipath setup on our
> >        >        >    target and tried the various commands of the
> >        >        >    multipathd daemon, and we find the following error:
> >        >        >    root at x86-generic-64:~# multipathd -k"fail path sdc";
> >        >        >    multipathd -k'reinstate path sdc'; multipathd
> >        >        >    -k"show paths";
> >        >        >    [ 3962.708523] device-mapper: multipath: Failing path 8:32.
> >        >        >    ok
> >        >        >    ok
> >        >        >    hcil    dev dev_t pri dm_st   chk_st   next_check
> >        >        >    4:0:0:1 sdc 8:32  1   [active][faulty] .......... 1/20 <<<=== CHECK
> >        >        >    5:0:0:1 sdd 8:48  1   [active][ready]  XX........ 4/20
> >        >        >    The sdc path becomes [active][ready] only after the
> >        >        >    polling interval, not immediately after the
> >        >        >    reinstate path command.
> >        >        >    I am observing this with the latest multipath-tools
> >        >        >    on an Ubuntu machine as well.
> >        >        >    Please let me know if it's a known issue or if I am
> >        >        >    doing something wrong.
> >        >        >    Regards,
> >        >        >    Tejaswini
> >        >
> >        >        The reinstate command is supposed to reinstate the
> >        >        device with the kernel, and it does that.  The checker
> >        >        state doesn't change until the next time that the path
> >        >        is checked.  I agree that it's odd that the checker
> >        >        state switches to faulty as soon as you fail the path,
> >        >        but doesn't switch back until the next check after you
> >        >        reinstate it.
> >        >
> >        >        The issue is that multipathd needs to override the
> >        >        checker output, so that a failed path won't be
> >        >        immediately reinstated.  Once the path comes back,
> >        >        multipathd wants to record the switch in the checker
> >        >        thread, so that it can refresh path information that
> >        >        wasn't automatically done when the path was reinstated.
> >        >        However, it may make more sense to have a different
> >        >        checker state for when the device is in the failed
> >        >        state, so that it's obvious that the checker state is
> >        >        being overruled.
> >        >
> >        >        -Ben
> >        >
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0.4.8_add_map_patch1
Type: application/octet-stream
Size: 2760 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/dm-devel/attachments/20150623/8228ca91/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0.5.0_add_map_patch2
Type: application/octet-stream
Size: 959 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/dm-devel/attachments/20150623/8228ca91/attachment-0001.obj>

