[dm-devel] dm-multipath - IO queue dispatch based on FPIN Congestion/Latency notifications.

Erwin van Londen erwin at erwinvanlonden.net
Wed Mar 31 08:12:57 UTC 2021


Hello Hannes,

Thanks for responding.

On Wed, 2021-03-31 at 09:25 +0200, Hannes Reinecke wrote:
> Hi Erwin,
> 
> On 3/31/21 2:22 AM, Erwin van Londen wrote:
> > Hello Muneendra, Benjamin,
> > 
> > The FPIN options that are developed do have a whole plethora of
> > options and do not mainly trigger paths being in a marginal state.
> > The mpio layer could utilise the various triggers like congestion and
> > latency and not just use a marginal state as a decisive point. If a
> > path is somewhat congested the amount of IO's dispersed over these
> > paths could just be reduced by a flexible margin depending on how
> > often and which FPINs are actually received. If for instance an FPIN
> > is received that an upstream port is throwing physical errors you may
> > exclude it entirely from queueing IO's to it. If it is a latency
> > related problem where credit shortages come into play you may just
> > need to queue very small IO's to it. The SCSI CDB will tell the size
> > of the IO. Congestion notifications may just be used for potentially
> > adding an artificial delay to reduce the workload on these paths and
> > schedule them on another.
> > 
> As correctly noted, FPINs come with a variety of options.
> And I'm not certain we can handle everything correctly; a degraded
> path is simple, but for congestion there is only _so_ much we can do.
> The typical cause for congestion is, say, a 32G host port talking to a
> 16G (or even 8G) target port _and_ a 32G target port.
Congestion can also be caused by a change in workload characteristics
where, for example, read and write workloads start interfering with
each other. The funnel principle would not apply in that case.
> 
> So the host cannot 'tune down' its link to 8G; doing so would impact
> performance on the 32G target port.
> (And we would suffer reverse congestion whenever that target port
> sends frames).
> 
> And throttling things on the SCSI layer only helps _so_ much, as the
> real congestion is due to the speed with which the frames are
> sequenced onto the wire. Which is not something we from the OS can
> control.
If you can interleave IOs with an artificial delay depending on the
type of FPINs and the frequency at which they arrive, you would be able
to prevent latency buildup in the SAN.
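
To make the idea a bit more concrete, here is a rough sketch of how such
a delay could be derived from the type and frequency of recent FPINs. It
is only an illustration: the class name, the FPIN type labels, the
per-event weights, the window length and the cap are all assumptions made
up for this example and are not taken from any driver or from
dm-multipath.

```python
from collections import deque

# Hypothetical delay contribution per received FPIN, in microseconds.
# Real values would have to come out of testing.
FPIN_WEIGHT_US = {"congestion": 50, "peer_congestion": 25}


class FpinDelay:
    """Derive an artificial per-IO delay for one path from recent FPINs."""

    def __init__(self, window_s=10.0, max_delay_us=1000):
        self.window_s = window_s          # how long an FPIN keeps counting
        self.max_delay_us = max_delay_us  # upper bound on the added delay
        self.events = deque()             # (timestamp, fpin_type), oldest first

    def record(self, now, fpin_type):
        """Remember that an FPIN of the given type arrived at time `now`."""
        self.events.append((now, fpin_type))

    def delay_us(self, now):
        """Return the delay to insert before the next IO on this path."""
        # Forget FPINs that fell out of the sliding window.
        while self.events and now - self.events[0][0] > self.window_s:
            self.events.popleft()
        # More frequent (and heavier) FPINs add up to a longer delay.
        total = sum(FPIN_WEIGHT_US.get(t, 0) for _, t in self.events)
        return min(total, self.max_delay_us)
```

The delay decays by itself once the FPINs stop arriving, so the path
returns to normal dispatch without any extra bookkeeping.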
> 
> From another POV this is arguably a fabric mis-design; so it _could_
> be alleviated by separating out the ports with lower speeds into its
> own zone (or even on a separate SAN); that would trivially make the
> congestion go away.
The entire FPIN concept was designed to give clients the option to
respond and react to changing behaviours in SANs. A misdesign is often
not really the cause; ongoing changes and continuous provisioning are
the main contributors.
> 
> But for that the admin first should be _alerted_, and this really is
> my primary goal: having FPINs showing up in the message log, to alert
> the admin that his fabric is not performing well.
I think the FC drivers already have facilities to do that, or will have
them shortly. dm-multipath is not really required to handle the
notifications, but it would be useful there once actions are taken
based on FPINs.
> 
> A second step will be massaging FPINs into DM multipath, and have it
> influence the path priority or path status. But how this could be
> integrated best is currently under discussion.
OK
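
For what it is worth, a minimal sketch of how FPIN-derived state could
influence path status and priority, along the lines of what I described
earlier in this thread (exclude a path with physical errors, send only
small IOs to a path with credit shortages, lower the priority of a
congested path). All names, fields and thresholds are hypothetical and
only there for illustration; nothing here reflects existing dm-multipath
code.

```python
from dataclasses import dataclass

# Hypothetical cut-off for what counts as a "very small" IO.
SMALL_IO_BYTES = 64 * 1024


@dataclass
class Path:
    name: str
    priority: int = 10           # base priority, higher is preferred
    link_errors: bool = False    # FPIN reported physical errors upstream
    credit_stalled: bool = False # FPIN hinted at credit shortage / latency
    congestion_events: int = 0   # congestion FPINs seen recently


def effective_priority(path, io_bytes):
    """Return 0 if the path must not be used for this IO, else a priority."""
    if path.link_errors:
        return 0  # exclude the path entirely
    if path.credit_stalled and io_bytes > SMALL_IO_BYTES:
        return 0  # only very small IOs go to a credit-starved path
    # Each congestion FPIN lowers the priority, but never below 1.
    return max(1, path.priority - path.congestion_events)


def pick_path(paths, io_bytes):
    """Pick the usable path with the highest effective priority, or None."""
    scored = [(effective_priority(p, io_bytes), p) for p in paths]
    usable = [(prio, p) for prio, p in scored if prio > 0]
    if not usable:
        return None
    return max(usable, key=lambda item: item[0])[1]
```

The IO size passed in here would be the transfer length that the SCSI
CDB already carries.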
> 
> > Not really sure what the possibilities are from a DM-Multipath
> > viewpoint, but I feel if the OS options are not properly aligned
> > with what the FC protocol and HBA drivers are able to provide we may
> > miss a good opportunity to optimize the dispersion of IO's and
> > improve overall performance.
> > 
> Looking at the size of the commands is one possibility, but at this
> time this presumes too much on how we _think_ FPINs will be generated.
> I'd rather do some more tests to figure out under which circumstances
> we can expect which type of FPINs, and then start looking for ways on
> how to integrate them.
The FC protocol only describes the framework and not the values that
need to be adhered to. That depends on the end devices and their
capabilities. 
> 
> Cheers,
> 
> Hannes