[libvirt] [Fwd: [PATCH v9] add 802.1Qbh and 802.1Qbg handling]

Thu May 27 13:59:33 UTC 2010

On Thu, 2010-05-27 at 15:37 +0200, Arnd Bergmann wrote:
> On Thursday 27 May 2010, Stefan Berger wrote:
> > > 
> > > But this still goes down the IFLA_PORT_SELF route because you pass PORT_SELF_VF
> > > even for nltarget_kernel==false, where it makes no sense.
> > > Maybe make the above
> > > 
> > >       if (vf == PORT_SELF_VF && nltarget_kernel)
> > > 
> > 
> > So I'll pass vf = 0 and be done with it? I don't find this check for
> > nltarget_kernel particularly intuitive. Why should the function create
> > different messages for a kernel driver versus a user space daemon? Is
> > this a technology specific thing that would prevent this type of message
> > from being sent. Obviously it does work for Scott's 802.1Qbh part.
> 
> The difference is that enic (and likely any other driver implementing
> Qbg or Qbh in the firmware) has access to the underlying data channel.
> IFLA_PORT_SELF essentially means that we ask a logical device to associate
> itself with the switch, which is very unlike the case where we ask a master
> device to associate a slave to the switch.
> 
> Neither the PORT_SELF nor the VF_PORTS list are exactly what we really
> want, but the VF_PORTS stuff is closer. PORT_SELF does not work to start
> with, because it does not allow to query information about more than one
> VF.
> VF_PORTS is a bit fishy because we don't actually have VFs but an arbitrary
> number of software defined ports, but I think it's close enough.
> We could also invent another list for software ports that are identified
> by uuid instead of VF, but that would require defining attributes in the
> kernel that are only used in user space.

Time is fleeting ...

> 
> > > > +    if (vf != PORT_SELF_VF) {
> > > > +        /* end nesting of vfports */
> > > > +        vfports->rta_len = (char *)nlm + nlm->nlmsg_len - (char *)vfports;
> > > > +    }
> > > 
> > > Here too.
> > > 
> > > > +    if (nltarget_kernel) {
> > > > +        if (nlComm(nlm, &recvbuf, &recvbuflen) < 0)
> > > > +            return -1;
> > > > +    } else {
> > > > +        if (nlCommWaitSuccess(nlm, RTMGRP_LINK, &recvbuf, &recvbuflen,
> > > > +                              5 * MICROSEC_PER_SEC) < 0)
> > > > +            return -1;
> > > > +    }
> > > 
> > > I don't understand this part yet. Do we need this difference?
> > 
> > From my experience with experiments that I have made I can only say that
> > we do need it. The sending part is a little different and the receiveing
> > part needs to filter out noise and only pick the response to the request
> > that was previously sent.
> 
> I'm even more confused now. Why should the response be different
> from the response we get from the kernel? What's different on the
> sending side other than the PID?
> Also, what is the RTMGRP_LINK argument used for? I thought we don't
> need multicast any more because we don't target kernel and lldpad
> in the same message but only one of them.

Fact is that if I set RTMGRP_LINK to 0 here on libvirt only side, the
dummy server doesn't get a message. If I set it to 0 on both side, the
dummy server also doesn't get a message. I think it's necessary for
user-space-to-user-space communication, but I am also only learning
about these types of sockets while I 'go'.

> 
> > > > +    while (--repeats >= 0) {
> > > > +        rc = link_dump(nltarget_kernel, NULL, ifindex, tb, &recvbuf);
> > > > +        if (rc)
> > > > +            goto err_exit;
> > > > +        rc = getPortProfileStatus(tb, vf, &status);
> > > > +        if (rc == 0) {
> > > > +            if (status == PORT_PROFILE_RESPONSE_SUCCESS ||
> > > > +                status == PORT_VDP_RESPONSE_SUCCESS) {
> > > > +                break;
> > > > +            } else if (status == PORT_PROFILE_RESPONSE_INPROGRESS) {
> > > > +                // keep trying...
> > > > +            } else {
> > > > +                virReportSystemError(EINVAL,
> > > > +                    _("error %d during port-profile setlink on ifindex %d"),
> > > > +                    status, ifindex);
> > > > +                rc = 1;
> > > > +                break;
> > > > +            }
> > > 
> > > Hmm, we seem to be missing an INPROGRESS status for Qbg. Any suggestions
> > 
> > You mean that's not defined in the (pre-)standard?
> 
> Yes, the Qbg wire protocol has no need for this, because the
> messages are only sent after the state has changed, we never
> see a VDP message with an incomplete status in there, so there
> is no need to specify it in Qbg, but we need something in the netlink
> protocol.

Yes, I would suggest to mimic 802.1Qbh in this case.

> 
> > > what we should return there? Should we possibly just leave out
> > > IFLA_PORT_RESPONSE in order to signal INPROGRESS, as in not clear yet?
> > 
> > We want to be able to signal failure in case the switch setup failed so
> > that we don't have a malfunctioning network interface where the user
> > then doesn't know what to debug. In case of failure we simply wouldn't
> > start the VM or hotplug the interface and return an indication that
> > something went wrong during the negotiation with the switch.
> 
> So leaving out IFLA_PORT_RESPONSE would work for that, right?

We need to poll for the current status. At least in 802.1Qbh case using
an RTM_GETLINK type of message. Don't know how we could leave out
IFLA_PORT_RESPONSE...

> 
> The cases we need to care about are:
> 
> getPortProfileStatus returns without IFLA_PORT_RESPONSE -> keep trying

Why not just mimic 802.1Qbh and return an INPROGRESS status for as long
as it's not success ?

> getPortProfileStatus returns success from IFLA_PORT_RESPONSE -> done
> getPortProfileStatus returns failure from IFLA_PORT_RESPONSE -> give up

> > > > +    rc = doPortProfileOpCommon(nltarget_kernel,
> > > > +                               physdev_ifname, physdev_ifindex,
> > > > +                               macaddr,
> > > > +                               vlanid,
> > > > +                               NULL,
> > > > +                               &portVsi,
> > > > +                               virtPort->u.virtPort8021Qbg.instanceID,
> > > > +                               NULL,
> > > > +                               PORT_SELF_VF,
> > > > +                               op);
> > > 
> > > This is where we pass PORT_SELF_VF together with nltarget_kernel=false,
> > > as mentioned above.
> > > 
> > 
> > nltarget_kernel is in fact false here. So now if I change the
> > PORT_SELF_VF to '0', then I suppose it will be ok? The vf in the netlink
> > message to lldpad will then show 0 rather than PORT_SELF_VF (-1). I
> > guess 0 wouldn't have the meaning of the 0-st virtual function, but
> > counting would start at 1?
> 

Actually what I said above was quite right. PORT_SELF_VF (-1) will not
be shown in the request, but in the -1 case the request simply won't
have the IFLA_PORT_VF attribute.

> That's something we need to define now. You already put vf=0 into the
> IFLA_VF_MAC and IFLA_VF_VLAN attributes, and you leave out the IFLA_PORT_VF

No, I just copy whatever vf holds in there, which also may be
VF_PORT_SELF.

> attribute for Qbg, which is probably the best approximation we can get.
> I think we should just mandate this for the request, and let lldpad
> decide on the fake vf number, which it returns to getPortProfileStatus.
> 

Just be a bit more clear about the exact parameter to pass.

   Stefan

> 	Arnd