<div dir="ltr"><br><div class="gmail_quote"><br><br><div dir="ltr"><div><div class="h5"><pre style="white-space:pre-wrap"><span style="color:black">Hello,</span></pre><pre style="white-space:pre-wrap">I'm working on device-mapper multipath (dm-multipath).<br></pre><pre style="white-space:pre-wrap"><span style="color:black">This patch set adds a new hook to device-mapper for deciding the health of a</span></pre><pre style="white-space:pre-wrap"><span style="color:black">multipath path, which helps deliver deterministic application I/O throughput.</span></pre><pre style="white-space:pre-wrap"><span style="color:black"> </span></pre><pre style="white-space:pre-wrap"><span style="color:black">This patch set has had preliminary testing on active-active storage with 2 paths.</span></pre><pre style="white-space:pre-wrap"><span style="color:black">The patch set still needs work and is not ready for inclusion.</span></pre><pre style="white-space:pre-wrap"><span style="color:black">I'm posting it because I'd like to get comments on the high-level</span></pre><pre style="white-space:pre-wrap"><span style="color:black">design before going into further detail.</span></pre><pre style="white-space:pre-wrap"><span style="color:black"> </span></pre><pre style="white-space:pre-wrap"><span style="color:black"> </span></pre><pre style="white-space:pre-wrap"><span style="color:black"> </span></pre><pre style="white-space:pre-wrap"><span style="color:black">This patch set should be applied on top of 3.10.0 #18</span></pre><pre style="white-space:pre-wrap"><span style="color:black"> </span></pre><pre style="white-space:pre-wrap"><span style="color:black"> </span></pre><pre style="white-space:pre-wrap"><span style="color:black">==============================<wbr>==============================<wbr>========</span></pre><pre style="white-space:pre-wrap"><span style="color:black">Background</span></pre><pre style="white-space:pre-wrap"><span style="color:black">=-=-=-=-=-=</span></pre><pre 
style="white-space:pre-wrap"><span style="color:black"> </span></pre><pre style="white-space:pre-wrap;margin-left:0.5in"><span style="font-family:arial,sans-serif;color:black">•<span style="font-variant-numeric:normal;font-stretch:normal;font-size:7pt;line-height:normal;font-family:"times new roman"">        </span></span><span style="color:black">“Sick but not Dead” MPIO Path</span></pre><pre style="white-space:pre-wrap;margin-left:1in"><span style="font-family:arial,sans-serif;color:black">‒<span style="font-variant-numeric:normal;font-stretch:normal;font-size:7pt;line-height:normal;font-family:"times new roman"">       </span></span><span style="color:black">A path goes into the Failed state because of a path I/O error seen by the DM driver</span></pre><pre style="white-space:pre-wrap;margin-left:1in"><span style="font-family:arial,sans-serif;color:black">‒<span style="font-variant-numeric:normal;font-stretch:normal;font-size:7pt;line-height:normal;font-family:"times new roman"">       </span></span><span style="color:black">When the multipath daemon issues a TUR (Test Unit Ready) command and finds the failed path healthy, it puts the same path back into the Active state</span></pre><pre style="white-space:pre-wrap;margin-left:1in"><span style="font-family:arial,sans-serif;color:black">‒<span style="font-variant-numeric:normal;font-stretch:normal;font-size:7pt;line-height:normal;font-family:"times new roman"">       </span></span><span style="color:black">The path repeatedly toggles between the Failed and Active states</span></pre><pre style="white-space:pre-wrap;margin-left:1.5in"><span style="font-family:arial,sans-serif;color:black">•<span style="font-variant-numeric:normal;font-stretch:normal;font-size:7pt;line-height:normal;font-family:"times new roman"">        </span></span><span style="color:black">DM I/O is retried on a path that keeps hitting errors.</span></pre><pre style="white-space:pre-wrap;margin-left:1.5in"><span style="font-family:arial,sans-serif;color:black">•<span 
style="font-variant-numeric:normal;font-stretch:normal;font-size:7pt;line-height:normal;font-family:"times new roman"">        </span></span><span style="color:black">This causes erratic (non-deterministic) application I/O throughput</span></pre><pre style="white-space:pre-wrap"><span style="color:black"> </span></pre><pre style="white-space:pre-wrap"><font color="#000000">The existing DM layer does not consider the number of errors when deciding the health of a path.</font></pre><pre style="white-space:pre-wrap"><span style="color:black">Because a failed path becomes active again as soon as the TUR command succeeds, the end user</span></pre><pre style="white-space:pre-wrap"><span style="color:black">assumes that all paths are in a good state.</span></pre><pre style="white-space:pre-wrap"><font color="#000000">When we ran field tests with this scenario, we saw non-deterministic I/O throughput.</font></pre><pre style="white-space:pre-wrap"><span style="color:black"> </span></pre><pre style="white-space:pre-wrap"><span style="color:black"> </span></pre><pre style="white-space:pre-wrap"><span style="color:black"> </span></pre><pre style="white-space:pre-wrap"><span style="color:black"> </span></pre><pre style="white-space:pre-wrap"><span style="color:black">==============================<wbr>==============================<wbr>=========</span></pre><pre style="white-space:pre-wrap"><span style="color:black">Design Overview</span></pre><pre style="white-space:pre-wrap"><span style="color:black">=-=-=-=-=-=-=-=-=</span></pre><pre style="white-space:pre-wrap"><span style="color:black"> </span></pre><pre style="white-space:pre-wrap;margin-left:0.5in"><span style="font-family:arial,sans-serif;color:black">•<span style="font-variant-numeric:normal;font-stretch:normal;font-size:7pt;line-height:normal;font-family:"times new roman"">        </span></span><span style="color:black">Deterministically bring the path to the “Faulty” state 
</span></pre><pre style="white-space:pre-wrap;margin-left:1in"><span style="font-family:arial,sans-serif;color:black">‒<span style="font-variant-numeric:normal;font-stretch:normal;font-size:7pt;line-height:normal;font-family:"times new roman"">       </span></span><span style="color:black">Configure per-DM device data with</span></pre><pre style="white-space:pre-wrap;margin-left:1.5in"><span style="font-family:arial,sans-serif;color:black">•<span style="font-variant-numeric:normal;font-stretch:normal;font-size:7pt;line-height:normal;font-family:"times new roman"">        </span></span><span style="color:black">IO error threshold and time window for the error threshold to be hit</span><b style="font-family:arial,sans-serif"><i><span style="font-size:11pt;font-family:calibri,sans-serif;color:rgb(31,73,125)"> </span></i></b></pre><pre style="white-space:pre-wrap;margin-left:1in"><span style="font-family:arial,sans-serif;color:black">‒<span style="font-variant-numeric:normal;font-stretch:normal;font-size:7pt;line-height:normal;font-family:"times new roman"">       </span></span><span style="color:black">Declare a path Faulty when error threshold is hit within the configured time window </span></pre><pre style="white-space:pre-wrap;margin-left:1in"><span style="font-family:arial,sans-serif;color:black">‒<span style="font-variant-numeric:normal;font-stretch:normal;font-size:7pt;line-height:normal;font-family:"times new roman"">       </span></span><span style="color:black">Place the path in the failed state for a predefined time configured by the administrator</span></pre><pre style="white-space:pre-wrap;margin-left:1in"><span style="color:black"> using the config file</span></pre><pre style="white-space:pre-wrap;margin-left:1in"><span style="font-family:arial,sans-serif;color:black">‒<span style="font-variant-numeric:normal;font-stretch:normal;font-size:7pt;line-height:normal;font-family:"times new roman"">       </span></span><span style="color:black">Even though 
the multipath daemon validates the path with a TUR command that succeeds</span></pre><pre style="white-space:pre-wrap;margin-left:1in"><font color="#000000"> and then tries to reinstate the path, the reinstate request is ignored for a predefined time once the error threshold has been hit.</font></pre><pre style="white-space:pre-wrap;margin-left:0.5in"><span style="font-family:arial,sans-serif;color:black">•<span style="font-variant-numeric:normal;font-stretch:normal;font-size:7pt;line-height:normal;font-family:"times new roman"">        </span></span><span style="color:black">Give the administrator time to fix the “Sick but not Dead” path and bring it back to Active</span></pre><pre style="white-space:pre-wrap;margin-left:0.5in"><span style="font-family:arial,sans-serif;color:black">•<span style="font-variant-numeric:normal;font-stretch:normal;font-size:7pt;line-height:normal;font-family:"times new roman"">        </span></span><span style="color:black">Automatically return a Faulty path to the Active state after a fixed duration (configured per DM device)</span></pre><pre style="white-space:pre-wrap;margin-left:1in"><span style="font-family:arial,sans-serif;color:black">‒<span style="font-variant-numeric:normal;font-stretch:normal;font-size:7pt;line-height:normal;font-family:"times new roman"">       </span></span><span style="color:black">The admin can enable this deterministic MPIO behavior on a per-DM-device basis</span></pre><pre style="white-space:pre-wrap;margin-left:73.2pt"><span style="font-family:calibri,sans-serif;color:black">-<span style="font-variant-numeric:normal;font-stretch:normal;font-size:7pt;line-height:normal;font-family:"times new roman"">         </span></span><span style="color:black">This means the failed path is reinstated either by the admin or when the timeout expires.</span></pre><pre style="white-space:pre-wrap;margin-left:0.5in"><span style="font-family:arial,sans-serif;color:black">•<span 
style="font-variant-numeric:normal;font-stretch:normal;font-size:7pt;line-height:normal;font-family:"times new roman"">        </span></span><span style="color:black">The above settings will persist across server reboots</span></pre><pre style="white-space:pre-wrap"><span style="color:black"> </span></pre><pre style="white-space:pre-wrap"><span style="color:black">Expected benefits:</span></pre><pre style="white-space:pre-wrap"><span style="color:black">- Deterministic application I/O throughput</span><span style="color:rgb(31,73,125)">.</span><span style="color:black"></span></pre><pre style="white-space:pre-wrap"><span style="color:black">- The administrator gets time to analyze the path failure and recover the path.</span></pre><pre style="white-space:pre-wrap"><span style="font-size:11pt;font-family:calibri,sans-serif;color:rgb(31,73,125)">-</span><span style="color:black"> User-space tools need only minimal changes.</span></pre><pre style="white-space:pre-wrap"><span style="font-size:11pt;font-family:calibri,sans-serif;color:rgb(31,73,125)"> </span></pre></div></div><pre style="white-space:pre-wrap"><span style="font-size:11pt;font-family:calibri,sans-serif;color:rgb(31,73,125)">The above feature is enabled only if the corresponding variables are defined in multipath.conf.</span></pre><span class=""><pre style="white-space:pre-wrap"><span style="color:black"> </span></pre><pre style="white-space:pre-wrap"><span style="color:black">Since these changes are independent of the underlying algorithms used in the DM layer,</span></pre><pre style="white-space:pre-wrap"><span style="color:black">they are applied in dm.c and dm-mpath.c.</span></pre><pre style="white-space:pre-wrap"><span style="color:black"> </span></pre><pre style="white-space:pre-wrap"><span style="color:black">alloc_dev(), reinstate_path(), parse_path() and fail_path() are the functions that will be changed.</span></pre><pre 
style="white-space:pre-wrap"><span style="color:black"> </span></pre><pre style="white-space:pre-wrap"><span style="color:black"> </span></pre><pre style="white-space:pre-wrap"><span style="color:black">We would welcome more comments on this; we have started testing and the results look deterministic.</span></pre><pre style="white-space:pre-wrap"><span style="color:black"><br></span></pre><pre style="white-space:pre-wrap"><span style="color:black"><br></span></pre></span><pre style="white-space:pre-wrap"><span style="color:black">Regards,</span></pre><pre style="white-space:pre-wrap"><span style="color:black">Muneendra.</span></pre></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Dec 15, 2016 at 3:00 PM, muneendra kumar <span dir="ltr">&lt;<a href="mailto:muneendra737@gmail.com" target="_blank">muneendra737@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><pre><span style="color:black">Hello,<span></span></span></pre><pre><span style="color:black">This is the area I am currently working on.</span></pre></div><div class="m_451764365367411905HOEnZb"><div class="m_451764365367411905h5"><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Dec 5, 2016 at 9:35 PM, muneendra kumar <span dir="ltr">&lt;<a href="mailto:muneendra737@gmail.com" target="_blank">muneendra737@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Thanks a lot for sharing the info.<div>I will discuss the problem in detail in a follow-up mail.</div><div><br></div><div>Regards,</div><div>Muneendra.</div></div><div class="m_451764365367411905m_2417041120066841534HOEnZb"><div class="m_451764365367411905m_2417041120066841534h5"><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Dec 5, 2016 at 5:45 PM, Zdenek Kabelac <span dir="ltr">&lt;<a href="mailto:zkabelac@redhat.com" target="_blank">zkabelac@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 5.12.2016 at 07:29, muneendra kumar wrote:<span><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi,<br>
This is a general question.<br>
If I make changes to both the multipath tools and the dm driver (kernel),<br>
how do I push my changes upstream?<br>
Can someone explain the process to me? It would help me a lot.<br>
<br>
</blockquote>
<br>
<br></span>
Hi<br>
<br>
You propose your changes here on the list - you get a review and<br>
if the patches are found useful - the maintainer of the dm subsystem<br>
will accept them.<br>
<br>
Note - it's usually better to ask and discuss 'ahead' what your problem is<br>
and how you want to improve/fix it.<br>
So you avoid losing time on implementing an unacceptable patch.<br>
<br>
Regards<span class="m_451764365367411905m_2417041120066841534m_981175539199487597HOEnZb"><font color="#888888"><br>
<br>
Zdenek<br>
<br>
<br>
<br>
</font></span></blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></div><br></div>
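<div dir="ltr"><pre style="white-space:pre-wrap"><span style="color:black">The policy described under Design Overview in the first message of this thread can be sketched as a small user-space model. This is only an illustration in Python, not the kernel patch: the class PathHealth and the names err_threshold, err_window and hold_time are hypothetical and are not taken from dm-mpath.c or multipath.conf. It also assumes that errors are counted in a sliding time window, which the mail does not specify.</span></pre>
<pre style="white-space:pre-wrap">
```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class PathHealth:
    """Toy model of the proposed per-path error policy (all names hypothetical)."""

    err_threshold: int   # number of I/O errors that marks the path Faulty
    err_window: float    # length of the counting window, in seconds
    hold_time: float     # how long a Faulty path stays failed, in seconds
    errors: deque = field(default_factory=deque)  # timestamps of recent errors
    held_until: float = 0.0  # reinstate requests are ignored before this time

    def record_error(self, now: float) -> bool:
        """Record one I/O error; return True if the path becomes Faulty."""
        self.errors.append(now)
        # Forget errors that have fallen out of the sliding window.
        while now - self.errors[0] > self.err_window:
            self.errors.popleft()
        if len(self.errors) >= self.err_threshold:
            # Threshold hit inside the window: hold the path in the failed state.
            self.held_until = now + self.hold_time
            self.errors.clear()
            return True
        return False

    def may_reinstate(self, now: float, by_admin: bool = False) -> bool:
        """Decide whether a reinstate request should be honoured."""
        if by_admin:
            # The administrator can always bring the path back manually.
            self.held_until = 0.0
            return True
        # A daemon request (e.g. after a successful TUR) is ignored while held.
        return now >= self.held_until
```
</pre>
<pre style="white-space:pre-wrap"><span style="color:black">In this sketch, record_error() would be called wherever a path I/O error is seen, and may_reinstate() would be checked before honouring a reinstate request from the multipath daemon. With err_threshold=3, err_window=10 and hold_time=60, three errors within ten seconds keep the path failed for the next sixty seconds unless the administrator reinstates it by hand.</span></pre></div>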