[libvirt] [RFC] [PATCH 3/3 v2] vepa+vsi: Some experimental code for 802.1Qbh

Dave Allan dallan at redhat.com
Sun May 23 01:24:59 UTC 2010


On Sat, May 22, 2010 at 12:17:05PM -0700, Scott Feldman wrote:
> On 5/22/10 11:34 AM, "Dave Allan" <dallan at redhat.com> wrote:
> 
> > On Sat, May 22, 2010 at 11:14:20AM -0400, Stefan Berger wrote:
> >> On Fri, 2010-05-21 at 23:35 -0700, Scott Feldman wrote:
> >>> On 5/21/10 6:50 AM, "Stefan Berger" <stefanb at linux.vnet.ibm.com> wrote:
> >>> 
> >>>> This patch may get 802.1Qbh devices working. I am adding some code to
> >>>> poll for the status of an 802.1Qbh device and loop for a while until the
> >>>> status indicates success. This part for sure needs more work and
> >>>> testing...
> >>> 
> >>> I think we can drop this patch 3/3.  For bh, we don't want to poll for
> >>> status because it may take a while before a status other than
> >>> in-progress is indicated.  Link UP on the eth is the async notification
> >>> of status=success.
> >> 
> >> The idea was to find out whether the association actually worked and,
> >> if not, either fail the start of the VM or skip hotplugging the
> >> interface. If we don't do that, the user may end up with a VM that has
> >> no connectivity (depending on how the switch handles an un-associated
> >> VM) and start debugging all kinds of things... Really, I would like to
> >> know if something went wrong. How long would we have to wait for the
> >> status to change? How does a switch handle traffic from a VM if the
> >> association failed? At least for 802.1Qbg we were going to get a
> >> failure notification.
> > 
> > I tend to agree that we should try to get some indication of whether
> > the associate request succeeded or failed.  Is the time that we would
> > have to poll bounded by anything, or is it reasonably short?
> 
> It's difficult to put an upper bound on how long to poll.  In most cases,
> status would be available in a reasonably short period of time, but the
> upper bound depends on activity external to the host.

That makes sense.  The timeout should be a configurable value.  What
do you think is a reasonable default?
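For the record, the kind of loop I'd expect, as a minimal sketch --
qbhWaitForAssociation() and qbhGetPortStatus() are hypothetical names,
and the stub stands in for whatever query mechanism we end up with:

#include <stdio.h>
#include <unistd.h>

enum qbhStatus {
    QBH_STATUS_IN_PROGRESS = 0,
    QBH_STATUS_SUCCESS     = 1,
    QBH_STATUS_FAILURE     = 2,
};

/* Stub for illustration: a real implementation would query the
 * switch for the current association state, e.g. over netlink. */
static enum qbhStatus
qbhGetPortStatus(const char *ifname)
{
    (void)ifname;
    return QBH_STATUS_IN_PROGRESS;
}

/* Poll once a second until the switch reports something other than
 * in-progress, giving up after timeout_s seconds (the configurable
 * value discussed above). */
static int
qbhWaitForAssociation(const char *ifname, unsigned int timeout_s)
{
    unsigned int waited;

    for (waited = 0; waited < timeout_s; waited++) {
        switch (qbhGetPortStatus(ifname)) {
        case QBH_STATUS_SUCCESS:
            return 0;                 /* associated, proceed */
        case QBH_STATUS_FAILURE:
            fprintf(stderr, "%s: association failed\n", ifname);
            return -1;                /* fail VM start / hotplug */
        case QBH_STATUS_IN_PROGRESS:
            sleep(1);
            break;
        }
    }

    fprintf(stderr, "%s: no final status after %u seconds\n",
            ifname, timeout_s);
    return -1;
}

int main(void)
{
    return qbhWaitForAssociation("eth0", 5) == 0 ? 0 : 1;
}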

> > Mostly I'm concerned about the failure case: how would the user know
> > that something has gone wrong, and where would information to debug
> > the problem appear?
> 
> Think of it as equivalent to waiting for link UP after plugging a
> physical cable into a physical switch port.  In some cases negotiation of
> the link may take on the order of seconds; it depends on the physical
> media, of course.  A user can check for link UP using ethtool or the ip
> cmd.  Similarly, a user can check for association status using the ip
> cmd, once we extend it to know about virtual ports (patch for the ip cmd
> coming soon).

That's the way I was thinking about it as well.  The difference I see
between an actual physical cable and what we're doing here is that if
you're in the data center and you plug in a cable, you're focused on
whether the link comes up.  Here, the actor is likely to be an
automated process, and users will simply be presented with a VM with
no or incorrect connectivity, and they will have no idea what
happened.  It's just not supportable to provide them with no
indication of what failed or why.
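If nothing else, whatever does the plugging could check the link state
the way you describe and log something actionable.  A minimal sketch,
reading the kernel's operstate file (the same information "ip link"
shows); eth0 and the message text are placeholders:

#include <stdio.h>
#include <string.h>

/* Return 1 if the interface's operstate is "up", 0 if it is not,
 * -1 if the interface can't be read. */
static int
linkIsUp(const char *ifname)
{
    char path[256];
    char state[32];
    FILE *fp;

    snprintf(path, sizeof(path), "/sys/class/net/%s/operstate", ifname);

    if (!(fp = fopen(path, "r")))
        return -1;

    if (!fgets(state, sizeof(state), fp)) {
        fclose(fp);
        return -1;
    }
    fclose(fp);

    return strncmp(state, "up", 2) == 0;
}

int main(void)
{
    const char *dev = "eth0";         /* placeholder device name */

    if (linkIsUp(dev) == 1)
        printf("%s: link is up\n", dev);
    else
        printf("%s: link is not up - check association status\n", dev);
    return 0;
}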

Dave



