[libvirt] [Fwd: [PATCH v9] add 802.1Qbh and 802.1Qbg handling]

Thu May 27 15:43:09 UTC 2010

On Thursday 27 May 2010, Stefan Berger wrote:
> On Thu, 2010-05-27 at 15:37 +0200, Arnd Bergmann wrote:

> > I'm even more confused now. Why should the response be different
> > from the response we get from the kernel? What's different on the
> > sending side other than the PID?
> > Also, what is the RTMGRP_LINK argument used for? I thought we don't
> > need multicast any more because we don't target kernel and lldpad
> > in the same message but only one of them.
> 
> Fact is that if I set RTMGRP_LINK to 0 here on libvirt only side, the
> dummy server doesn't get a message. If I set it to 0 on both side, the
> dummy server also doesn't get a message. I think it's necessary for
> user-space-to-user-space communication, but I am also only learning
> about these types of sockets while I 'go'.

AFAICT, RTMGRP_LINK makes it a multicast message, which means that
anyone can receive it, which a unicast message will only be sent
to the task that has a matching pid.

> > > You mean that's not defined in the (pre-)standard?
> > 
> > Yes, the Qbg wire protocol has no need for this, because the
> > messages are only sent after the state has changed, we never
> > see a VDP message with an incomplete status in there, so there
> > is no need to specify it in Qbg, but we need something in the netlink
> > protocol.
> 
> Yes, I would suggest to mimic 802.1Qbh in this case.

Like what? The first 256 result numbers in the netlink protocol
are meant to be the same as the ones in the wire protocol, the
next 256 are for the Qbh protocol.
We could of course define yet another range for the inprogress
result and possibly future extensions, starting at 512 if you
insist on requiring the IFLA_PORT_RESPONSE.

I originally wanted to defer the response until we hear back
from the switch, but that may take a really long time
(over a minute with all the VDP timeouts). That would be
cleaner IMHO, but you may not want to wait that long in libvirt.

> > > > what we should return there? Should we possibly just leave out
> > > > IFLA_PORT_RESPONSE in order to signal INPROGRESS, as in not clear yet?
> > > 
> > > We want to be able to signal failure in case the switch setup failed so
> > > that we don't have a malfunctioning network interface where the user
> > > then doesn't know what to debug. In case of failure we simply wouldn't
> > > start the VM or hotplug the interface and return an indication that
> > > something went wrong during the negotiation with the switch.
> > 
> > So leaving out IFLA_PORT_RESPONSE would work for that, right?
> 
> We need to poll for the current status. At least in 802.1Qbh case using
> an RTM_GETLINK type of message. Don't know how we could leave out
> IFLA_PORT_RESPONSE...

The response can simply contain all attributes of the request
but not contain an IFLA_PORT_RESPONSE attribute inside of the IFLA_VF_PORT.

> > The cases we need to care about are:
> > 
> > getPortProfileStatus returns without IFLA_PORT_RESPONSE -> keep trying
> 
> Why not just mimic 802.1Qbh and return an INPROGRESS status for as long
> as it's not success ?

because we have not defined such a value yet. We can either define
one now and change the if_link.h file or define INPROGRESS to be
the absence of a IFLA_PORT_RESPONSE attribute.

> > getPortProfileStatus returns success from IFLA_PORT_RESPONSE -> done
> > getPortProfileStatus returns failure from IFLA_PORT_RESPONSE -> give up
> 
> > > > > +    rc = doPortProfileOpCommon(nltarget_kernel,
> > > > > +                               physdev_ifname, physdev_ifindex,
> > > > > +                               macaddr,
> > > > > +                               vlanid,
> > > > > +                               NULL,
> > > > > +                               &portVsi,
> > > > > +                               virtPort->u.virtPort8021Qbg.instanceID,
> > > > > +                               NULL,
> > > > > +                               PORT_SELF_VF,
> > > > > +                               op);
> > > > 
> > > > This is where we pass PORT_SELF_VF together with nltarget_kernel=false,
> > > > as mentioned above.
> > > > 
> > > 
> > > nltarget_kernel is in fact false here. So now if I change the
> > > PORT_SELF_VF to '0', then I suppose it will be ok? The vf in the netlink
> > > message to lldpad will then show 0 rather than PORT_SELF_VF (-1). I
> > > guess 0 wouldn't have the meaning of the 0-st virtual function, but
> > > counting would start at 1?
> > 
> 
> Actually what I said above was quite right. PORT_SELF_VF (-1) will not
> be shown in the request, but in the -1 case the request simply won't
> have the IFLA_PORT_VF attribute.

Right.

> > That's something we need to define now. You already put vf=0 into the
> > IFLA_VF_MAC and IFLA_VF_VLAN attributes, and you leave out the IFLA_PORT_VF
> 
> No, I just copy whatever vf holds in there, which also may be
> VF_PORT_SELF.

Ah, that should be fine as well.

> > attribute for Qbg, which is probably the best approximation we can get.
> > I think we should just mandate this for the request, and let lldpad
> > decide on the fake vf number, which it returns to getPortProfileStatus.
> > 
> 
> Just be a bit more clear about the exact parameter to pass.

I'd suggest doing a request with

vfinfo_attr[IFLA_VF_MAC] = { .vf = -1u, .mac = OUR_MAC };
vfinfo_attr[IFLA_VF_VLAN] = { .vf = -1u, vlan = OUT_VLAN };
vf_port_attr[IFLA_PORT_VF] = NULL; /* don't send this at all */
vf_port_attr[IFLA_PORT_RESPONSE] = NULL; /* also don't send this */
vf_port_attr[IFLA_PORT_*] = WHATEVER_WE_NEED;

and waiting for the receiver to choose a free VF and ack this message with

vfinfo_attr[IFLA_VF_MAC] = { .vf = CHOSEN_VF, .mac = OUR_MAC };
vfinfo_attr[IFLA_VF_VLAN] = { .vf = CHOSEN_VF, vlan = OUT_VLAN };
vf_port_attr[IFLA_PORT_VF] = { CHOSEN_VF }; /* to identify the vf_port_attr */
vf_port_attr[IFLA_PORT_RESPONSE] = NULL; /* no response yet */
vf_port_attr[IFLA_PORT_RESPONSE] = PORT_RESPONSE_INPROGRESS; /* alternative,
							 if you prefer */
vf_port_attr[IFLA_PORT_*] = WHATEVER_WE_REQUESTED;

I don't care which of the two responses we put in there, we just need to
make a decision here since we forgot to define one before.

When waiting for the request to complete, we can then check for the
specific vf returned from lldpad (or the kernel, if we ever need to go there).

	Arnd