[libvirt] <interface type='direct'>

Laine Stump laine at laine.org
Sat Sep 3 19:49:44 UTC 2016


On 09/03/2016 11:08 AM, Moshe Levi wrote:
>
>> -----Original Message-----
>> From: sendmail [mailto:justsendmailnothingelse at gmail.com] On Behalf Of
>> Laine Stump
>> Sent: Thursday, September 01, 2016 5:59 PM
>> To: Libvirt <libvir-list at redhat.com>
>> Cc: Moshe Levi <moshele at mellanox.com>; Edan David
>> <edand at mellanox.com>
>> Subject: Re: [libvirt] <interface type='direct'>
>>
>> On 09/01/2016 04:05 AM, Moshe Levi wrote:
>>> Hi,
>>>
>>> In OpenStack we have a port type macvtap.
>>> A macvtap port is just a tap device connected to a VF.
>>>
>>> In Libvirt the guest xml looks like
>>> <interface type='direct'>
>>>     <mac address='fa:16:3e:b1:06:4e'/>
>>>     <source dev='p1p6' mode='passthrough'/>
>>>     <target dev='macvtap1'/>
>>>     <model type='virtio'/>
>>>     <driver name='vhost'/>
>>>     <alias name='net0'/>
>>>     <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
>>> </interface>
>>>
>>>
>>> In the hypervisor we can see that the MAC of the VF, which is
>>> fa:16:3e:f3:9b:e8, is set by OpenStack (see [1]):
>>> 9: ens3f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
>>>       link/ether 7c:fe:90:29:24:4e brd ff:ff:ff:ff:ff:ff
>>>       vf 0 MAC 00:00:00:00:00:00, spoof checking off, link-state disable
>>>       vf 1 MAC 00:00:00:00:00:00, spoof checking off, link-state disable
>>>       vf 2 MAC fa:16:3e:f3:9b:e8, vlan 48, spoof checking on, link-state enable
>>>       vf 3 MAC fa:16:3e:f6:02:c8, vlan 48, spoof checking on, link-state enable
>>> 41: ens3f4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
>>>       link/ether fa:16:3e:f6:02:c8 brd ff:ff:ff:ff:ff:ff
>>> 42: macvtap0@ens3f4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 500
>>>       link/ether fa:16:3e:f6:02:c8 brd ff:ff:ff:ff:ff:ff
>>>
>>> The netdevice of the VF, ens3f4, also has the same MAC. This MAC is
>>> set when using Libvirt 1.2.2 (Ubuntu 14.04), but when we tested with
>>> newer Libvirt versions >= 1.2.17 (Fedora 23/Ubuntu 16.04) the MAC of
>>> the VF's netdevice (ens3f4) is not set.
>>> This change in Libvirt prevents the guest from getting DHCP in OpenStack.
>>> Do you know why the behavior changed in newer releases?
>> The MAC address is now set with a netlink command to set the VFINFO of
>> the particular VF# of the PF. This change was made in response to a bug
>> report stating that once the MAC address had been set for a hostdev
>> assignment of a VF (in which case this method is required), it was no longer
>> possible to set the MAC address for macvtap passthrough (the VF driver
>> would complain "MAC has been administratively set", on Intel igbvf at least).
>> Unfortunately I recently found that when you set the MAC address in this
>> manner, it doesn't take effect on the actual device; it's only saved in
>> memory to be applied the *next time* the host driver is rebound to the VF.
> Are you saying that the change was to update the MAC of the VF?
> If so, I don't understand how this affects the issue of the VF netdevice MAC not getting set.

Look at the explanation in commit cb3fe38c and also
https://bugzilla.redhat.com/show_bug.cgi?id=1113474

That commit switched from using a simple ioctl(SIOCSIFHWADDR) to the 
VF's netdev name, to using a netlink RTM_SETLINK message to the netdev 
of the *PF* for the given VF.

This was done because the latter is the *only* way you can set the MAC 
address for a VF that you're going to assign to the guest with vfio 
device assignment, and once you've set the MAC address that way, future 
attempts to set the MAC address with ioctl(SIOCSIFHWADDR) result in 
failure and a kernel log like this:

kernel: igb 0000:0e:00.1: VF 1 attempted to override administratively set MAC address
kernel: Reload the VF driver to resume operations

Looking into the kernel, it appears that once the MAC address for a VF 
has been set via RTM_SETLINK, the igb driver (and I believe also the 
ixgbe driver, not sure about others) doesn't allow it to be changed via 
ioctl until the PF driver is reloaded (which can't realistically be done 
on an active system).
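
As a rough illustration of the two paths described above (not libvirt's 
actual code, which issues the ioctl and the netlink message directly), 
the iproute2 command lines that target the same two objects can be 
sketched as below. The function names are made up for this example; the 
interface names, VF number and MAC are taken from the "ip link" output 
quoted earlier in the thread. The sketch only builds argument lists and 
does not run anything.

```python
# Illustrative only: builds iproute2 command lines, does not execute them.
# Function names are hypothetical; values come from the thread's example.


def set_mac_on_vf_netdev(vf_ifname, mac):
    """Set the MAC directly on the VF's own netdev.

    This targets the same device as ioctl(SIOCSIFHWADDR) on the VF's
    netdev name (the old behavior).
    """
    return ["ip", "link", "set", "dev", vf_ifname, "address", mac]


def set_mac_via_pf(pf_ifname, vf_num, mac, vlan=None):
    """Set the VF's MAC (and optionally vlan) in the PF's VF table.

    This is the "ip link set $PF vf $VF# ..." form, i.e. the request is
    addressed to the PF's netdev and names the VF by number.
    """
    cmd = ["ip", "link", "set", "dev", pf_ifname, "vf", str(vf_num), "mac", mac]
    if vlan is not None:
        cmd += ["vlan", str(vlan)]
    return cmd


if __name__ == "__main__":
    print(" ".join(set_mac_on_vf_netdev("ens3f4", "fa:16:3e:f6:02:c8")))
    print(" ".join(set_mac_via_pf("ens3f0", 3, "fa:16:3e:f6:02:c8", vlan=48)))
```

The first form needs only the VF's netdev name; the second needs the PF 
name plus the VF index.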


But recently there was another report of the MAC address not getting set 
properly for macvtap passthrough mode when the device is an SRIOV VF (I 
can't find it in bugzilla, so it must have been an email to one of the 
lists), and when I tried it myself I found they were correct. In the 
output of "ip link show" the MAC address shown in the list of VFs under 
the PF is correctly modified, but it's not set properly in the VF's 
netdev instance. Apparently the MAC addresses in the VF list aren't set 
in the VF's netdev immediately; they're just saved to be set *the next 
time the VF is re-bound to the VF netdev driver*. I think in the past 
the interface may have been in promiscuous mode so it didn't matter, but 
now it isn't? I'm not sure, as I haven't had much time to investigate.
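
One way to check for the mismatch described above is to compare the MAC 
listed for the VF under the PF with the MAC on the VF's own netdev. The 
following is a minimal sketch that parses "ip link show" text; the 
helper names are invented for this example, and the regular expression 
assumes the "vf N MAC ..." layout quoted earlier in the thread (it also 
accepts a "vf N link/ether ..." layout, on the assumption that some 
iproute2 versions print that instead).

```python
import re

MAC = r"[0-9a-fA-F]{2}(?::[0-9a-fA-F]{2}){5}"
# Matches a per-VF line under the PF, e.g. "vf 3 MAC fa:16:3e:f6:02:c8, vlan 48, ..."
VF_RE = re.compile(r"^\s*vf\s+(\d+)\s+(?:MAC|link/ether)\s+(" + MAC + ")")
# Matches the netdev's own address line, e.g. "link/ether fa:16:3e:f6:02:c8 brd ..."
ETHER_RE = re.compile(r"^\s*link/ether\s+(" + MAC + ")")


def vf_table(pf_output):
    """Return {vf_number: mac} parsed from 'ip link show <PF>' output."""
    table = {}
    for line in pf_output.splitlines():
        m = VF_RE.match(line)
        if m:
            table[int(m.group(1))] = m.group(2).lower()
    return table


def netdev_mac(link_output):
    """Return the first link/ether address in 'ip link show <dev>' output."""
    for line in link_output.splitlines():
        m = ETHER_RE.match(line)
        if m:
            return m.group(1).lower()
    return None


def vf_mac_applied(pf_output, vf_num, vf_output):
    """True if the MAC in the PF's VF list is also on the VF's netdev."""
    wanted = vf_table(pf_output).get(vf_num)
    return wanted is not None and wanted == netdev_mac(vf_output)
```

Feeding it the output of "ip link show" for the PF and for the VF's 
netdev would report False in the situation described above, where the VF 
list was updated but the netdev was not.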

Does that make any more sense now?

>> Since I don't see a reasonably efficient way to get this to work, I need to
>> make a patch to revert to the old behavior, and we'll then just have to tell
>> people "If you do hostdev device assignment of VFs, then you can't later re-
>> use the same device for macvtap passthrough mode".
>>
>> (actually, I *think* an alternative would be to unbind/rebind the host driver
>> to the VF after setting the VF MAC address, but that seems a bit
>> disruptive/extreme to work around a problem that is probably only seen in
>> QE labs, but not in the real world; realistically, production systems likely use
>> either hostdev or macvtap, and don't switch back and forth between them.)
>>
>> A question - I notice you have the vlan set for the VF. Does *that* properly
>> take effect? (it's set in the same manner as the MAC address, via a netlink
>> command to set the VFINFO)
> I am not sure what you mean, but we set the vlan in OpenStack after we create the guest xml.

What command do you use to set it? Do you use "ip link set $PF vf $VF# 
vlan $VLANID" ? I think that's what it's showing here:

https://review.openstack.org/#/c/364121/1/nova/network/linux_net.py

(I don't know my way around openstack code, but arrived at that page via 
clicking on links from a google search)

> In OpenStack we set the MAC of the VF and the vlan using iproute2.
> I just want to know whether setting the mac/vlan should be Libvirt's job, or
> whether Libvirt just creates the macvtap interface and we should set the mac/vlan.

libvirt *should* do it.

>>> We have a WIP patch in OpenStack for also setting the MAC of the VF's
>>> netdevice [2]. Just wanted to know whether this is the correct
>>> approach.
> Can you confirm that setting the VF netdevice mac in OpenStack is a reasonable workaround for the newer Libvirt versions?

If libvirt isn't getting the job done, and you can set it yourself, then 
that's a workaround. I don't know that I'd call it "reasonable" though. 
If everybody puts in special code to work around bugs in libvirt (which 
is apparently what's been done) rather than actually reporting the bug 
(what you're doing now - Thanks!) then we are tricked into thinking that 
either the code works, or that nobody is using it so it doesn't matter 
if it's broken.

The *best* way of overcoming this problem is to fix libvirt so it does 
what it's supposed to do.

It's possible we can make it work by adding some operation after we send 
the RTM_SETLINK (maybe unbind the VF from its netdev driver, then 
re-bind, but that seems so drastic and time consuming!), or maybe we'll 
have to revert to using ioctl(SIOCSIFHWADDR), but of course that will 
fail if the interface has been used for hostdev assignment since the 
last host reboot.
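
For reference, the unbind/re-bind idea mentioned above could look 
roughly like the sketch below, using the kernel's generic sysfs 
bind/unbind files for PCI drivers. This is only an assumption about how 
such a step might be done, not something libvirt does; the function 
names are hypothetical, "igbvf" is the VF driver mentioned earlier in 
the thread, and the PCI address in the usage note is a placeholder.

```python
from pathlib import Path

SYSFS_PCI = Path("/sys/bus/pci")


def rebind_steps(pci_addr, driver):
    """Return the (sysfs file, value) writes that unbind then re-bind a device.

    pci_addr is the full PCI address of the VF, e.g. "0000:0e:10.1"
    (placeholder); driver is the VF's netdev driver name.
    """
    drv_dir = SYSFS_PCI / "drivers" / driver
    return [
        (drv_dir / "unbind", pci_addr),  # detach the VF from its netdev driver
        (drv_dir / "bind", pci_addr),    # attach it again so the driver re-probes
    ]


def rebind(pci_addr, driver):
    """Perform the writes; requires root and tears down the VF's netdev briefly."""
    for path, value in rebind_steps(pci_addr, driver):
        path.write_text(value)
```

Calling rebind("0000:0e:10.1", "igbvf") after the RTM_SETLINK would be 
the drastic step in question: the VF's netdev disappears and reappears, 
which is why it looks expensive for something done at every guest start.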

It's interesting that openstack is apparently using the RTM_SETLINK 
method to set the mac address (afaik, that's what is used by the "ip 
link set $pf_ifname vf $vf_num mac $mac_addr vlan $vlanid" command 
that's shown in the bit of code from nova/network/linux_net.py at the 
link I posted above).
