[libvirt] <interface type='hostdev'>vf configuration cleanup when VM is delete

Laine Stump laine at laine.org
Wed Dec 16 17:22:53 UTC 2015


On 12/16/2015 07:56 AM, Moshe Levi wrote:
>
> To clean up the VF I use
>
> ip link set dev p4p2 vf 0 mac 0 and it works
>

Now *that* is interesting...

> 24: enp3s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq 
> master ovs-system state UP mode DEFAULT group default qlen 1000
>
>     link/ether e4:1d:2d:a5:f1:22 brd ff:ff:ff:ff:ff:ff
>
>     vf 0 MAC 00:00:00:00:00:00, spoof checking off, link-state auto
>
>     vf 1 MAC 00:00:00:00:00:00, spoof checking off, link-state auto
>
>     vf 2 MAC 00:00:00:00:00:b1, vlan 190, spoof checking off, 
> link-state enable
>
>     vf 3 MAC aa:bb:cc:00:00:12, vlan 190, spoof checking off, 
> link-state enable
>
> [root at r-ufm160 devstack]# ip link set dev enp3s0f0 vf 3 mac 0
>
> [root at r-ufm160 devstack]# ip link show enp3s0f0
>
> 24: enp3s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq 
> master ovs-system state UP mode DEFAULT group default qlen 1000
>
>     link/ether e4:1d:2d:a5:f1:22 brd ff:ff:ff:ff:ff:ff
>
>     vf 0 MAC 00:00:00:00:00:00, spoof checking off, link-state auto
>
>     vf 1 MAC 00:00:00:00:00:00, spoof checking off, link-state auto
>
>     vf 2 MAC 00:00:00:00:00:b1, vlan 190, spoof checking off, 
> link-state enable
>
>     vf 3 MAC 00:00:00:00:00:b1, vlan 190, spoof checking off, 
> link-state enable
>
> It just sets the address to 00:00:00:00:00:b1, and I don’t know why, but 
> as I remember, Intel cards behave the same way (I think it is related 
> to iproute)
>

I just tried this with the igb driver on both 2.6.32 and 4.1 kernels, 
and a plain "0" is successful for me too. But, as you've experienced, it 
doesn't actually set the MAC address to 00:00:00:00:00:00, but instead 
puts random numbers in the final two bytes :-/

So I investigated further, and found that if I use:

   ip link set dev p4p2 vf 0 mac 00:00:00:00:00 <-- note 5 bytes, not 6

then all bytes except the *final* byte are 0, and the final byte is two 
seemingly random hex digits. But if I re-run the same command many times I 
find that it just rotates between 10 or so different values; not so 
random. (When I give "0" or "00:00:00:00" to ip link set, the 2nd to 
last byte is always the *exact same* value.)

So I looked in the source for the ip utility (in the iproute package) 
and I found that the function parsing mac addresses from the commandline 
just creates the buffer on the stack, doesn't initialize it, then parses 
in as many digits as you specify, leaving the rest with whatever 
happened to be sitting on the stack at the time :-O.

In other words, it's just a happy coincidence of a bug in iproute's mac 
address parser that "ip link set .... mac 0" happens to be successful 
(and that bytes 2-4 are 0 and 5-6 are non-0).
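
To make the failure mode concrete, here is a minimal Python sketch of the 
effect. It is not the iproute code (which is C); parse_mac_prefix, 
format_mac and the "stale" bytes are all made up for illustration, and the 
stale bytes merely stand in for whatever happens to be on the stack:

```python
def parse_mac_prefix(text, buf):
    """Parse colon-separated hex octets from text into buf, in place.

    Only the octets present in text are written. The rest of buf keeps
    whatever it held before, which imitates an uninitialized stack buffer.
    Returns the number of octets actually parsed.
    """
    parts = text.split(":")
    if len(parts) > len(buf):
        raise ValueError("too many octets")
    for i, part in enumerate(parts):
        value = int(part, 16)
        if not 0 <= value <= 0xFF:
            raise ValueError("octet out of range")
        buf[i] = value
    return len(parts)


def format_mac(buf):
    """Render a 6-byte buffer in the usual aa:bb:cc:dd:ee:ff form."""
    return ":".join(f"{b:02x}" for b in buf)


# Hypothetical leftover stack contents; real garbage varies per run.
stale = bytes([0x7F, 0x00, 0x00, 0x00, 0x3C, 0xB1])

buggy = bytearray(stale)      # buffer is never cleared before parsing
parse_mac_prefix("0", buggy)
print(format_mac(buggy))      # only the first byte came from the user

fixed = bytearray(6)          # zero-filled buffer
n = parse_mac_prefix("0", fixed)
print(format_mac(fixed), n)   # all zero, but only 1 octet was parsed
```

With the unzeroed buffer, "0" turns into a non-zero address, which is why 
the request is accepted. A stricter parser would also check the returned 
octet count and reject anything other than 6.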

I really don't know where to start / what to do with this information. 
There is obviously a bug in iproute that should be fixed, but if it is 
fixed before all the places in the kernel are adjusted to allow an all-0 
MAC, then users will be complaining that their script which was working 
for years and years (although probably not doing exactly what they 
believed) is suddenly broken. And who knows what Hell-fury will be 
unleashed by some unknown bit of code in the kernel if a 0 mac address 
suddenly shows up for the first time ever. Sigh.

(BTW, Cisco's enic driver, on the other hand, doesn't support setting VF 
MAC addresses via a netlink message to the PF *at all* (so libvirt has 
to make special accommodations), but it happily accepts requests to 
directly set the MAC address to 00:00:00:00:00:00 via 
ioctl(SIOCSIFHWADDR) (and the interface MAC address really does get set 
to all 0's). There is a script for ovirt that uses a MAC address of all 
0's to recognize that an interface is unused, and can thus be included 
in a pool of interfaces in a libvirt network. That won't work with any 
other SRIOV drivers though, because even if they initialize their VF 
macs to 0 (e.g. mlx and *new* (3.10+) igb (but *not* 2.6.32 igb!)), they 
can't be set back to 0 when they are once again unused. Again sigh.)
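
The "all-0 MAC means unused" convention can be sketched like this. This is 
not the ovirt script; is_zero_mac, unused_vfs and the regular expression 
are my own assumptions, based only on the "ip link show" output format 
quoted above:

```python
import re

# Matches lines such as "    vf 3 MAC aa:bb:cc:00:00:12, vlan 190, ..."
VF_LINE = re.compile(r"^\s*vf (\d+) MAC ([0-9a-fA-F:]{17})\b")


def is_zero_mac(mac):
    """True if every octet of a colon-separated MAC string is zero."""
    return all(int(octet, 16) == 0 for octet in mac.split(":"))


def unused_vfs(ip_link_output):
    """Return the VF numbers whose MAC is all zeros."""
    unused = []
    for line in ip_link_output.splitlines():
        m = VF_LINE.match(line)
        if m and is_zero_mac(m.group(2)):
            unused.append(int(m.group(1)))
    return unused


sample = """\
    vf 0 MAC 00:00:00:00:00:00, spoof checking off, link-state auto
    vf 1 MAC 00:00:00:00:00:00, spoof checking off, link-state auto
    vf 2 MAC 00:00:00:00:00:b1, vlan 190, spoof checking off, link-state enable
    vf 3 MAC aa:bb:cc:00:00:12, vlan 190, spoof checking off, link-state enable
"""
print(unused_vfs(sample))
```

Note that a VF "cleaned" with "mac 0" (vf 2 in the sample) ends up with a 
non-zero final byte, so a check like this would not treat it as unused.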

> I used fedora 2.1 with kernel 4.1.13-100.
>
> The most annoying part is that in OpenStack  if I use an SR-IOV VF 
> (interface hostdev) for VM and delete it I can’t reuse it for macvtap 
> (interface direct) so I have to clean the mac
>
> by running ip link set dev p4p2 vf 0 mac 0
>
> I guess I will need to workaround it in OpenStack.
>
> *From:*sendmail [mailto:justsendmailnothingelse at gmail.com] *On Behalf 
> Of *Laine Stump
> *Sent:* Tuesday, December 15, 2015 9:45 PM
> *To:* Libvirt <libvir-list at redhat.com>
> *Cc:* Moshe Levi <moshele at mellanox.com>; vyasevic at redhat.com
> *Subject:* Re: [libvirt] <interface type='hostdev'>vf configuration 
> cleanup when VM is delete
>
> On 12/15/2015 01:34 PM, Laine Stump wrote:
>
>     On 12/13/2015 10:51 AM, Moshe Levi wrote:
>
>         Hi,
>
>         I have a setup with libvirt 1.3.0 and OpenStack trunk.
>
>         Before launching the VM, the ip link command shows the following
>         VF mac/vlan configuration [1]
>
>         When I launch a VM with <interface type='hostdev'> via
>         openstack api (OpenStack direct port)
>
>         I can see that the VF gets the mac/vlan according to the libvirt
>         xml [2] and ip link command [3], but when I delete the VM the
>         mac/vlan config is still shown as in [3] and not restored to [1]
>
>         Shouldn’t libvirt restore the mac/vlan to [1]?
>
>         The same problem exists when using <interface type='direct'>
>         (OpenStack macvtap port)  but just for the MAC configuration
>         of the VF.
>
>
>     What libvirt does is to restore the MAC address to whatever it was
>     before we set it up for use with a guest. Although there are some
>     sriov net drivers that (for some unfathomable reason) think it's
>     cool to assign a random MAC address to each VF at boot time, the
>     "normal" thing is for the VFs to have a MAC address of all 0's to
>     start with. So libvirt should be saving 00:00:00:00:00:00 (it will
>     be in the file /var/run/libvirt/hostdevmgr/$ifname_vf$vfnum) then
>     setting the MAC to use; when done, libvirt will read the
>     00:00:00:00:00:00 and use netlink to set the MAC address, but this
>     is apparently failing.
>
>     I checked on my Fedora 22 system with the igb driver, and found
>     that if the MAC address was originally set to something other than
>     0's, it was restored properly by libvirt, but if it was set to all
>     0's originally, the attempt to set it back to 0 would fail.
>
>     I then tried doing the same thing with the "ip" utility:
>
>         # ip link set dev p4p2 vf 0 mac 00:00:00:00:00:00
>
>     and I get the following response:
>
>         RTNETLINK answers: Invalid argument
>
>     So it appears that either the kernel or the NIC driver is refusing
>     to set the MAC address to all 0's. I'm reasonably certain this is
>     a regression in the kernel,
>
>
> Sigh. It appears that this has "always" been the case - I just checked 
> on a 2.6.32-573 RHEL kernel, a 3.10.x RHEL7.2 kernel, and 4.1 
> (Fedora 22), and all of them refuse to set the MAC address to 
> 00:00:00:00:00:00. I'm not sure if this limitation is in the NIC 
> driver or some basic code in the kernel.
>
>
>
>     although I can't say how long it's been there, as I don't normally
>     pay attention to this (and as I said, many SRIOV NIC drivers don't
>     default their VFs to 0 MAC addresses)
>
>     What distro and kernel are you using for your tests?
>
>
>
>         [1]  - 24: enp3s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu
>         1500 qdisc mq master ovs-system state UP mode DEFAULT group
>         default qlen 1000
>
>             link/ether e4:1d:2d:a5:f1:22 brd ff:ff:ff:ff:ff:ff
>
>             vf 0 MAC 00:00:00:00:00:00, spoof checking off, link-state
>         auto
>
>             vf 1 MAC 00:00:00:00:00:00, spoof checking off, link-state
>         auto
>
>             vf 2 MAC 00:00:00:00:00:00, spoof checking off, link-state
>         auto
>
>             vf 3 MAC 00:00:00:00:00:00, spoof checking off, link-state
>         auto
>
>         [2] - <interface type='hostdev' managed='yes'>
>
>           <mac address='fa:16:3e:11:af:fe'/>
>
>           <driver name='kvm'/>
>
>           <source>
>
>             <address type='pci' domain='0x0000' bus='0x02' slot='0x00'
>         function='0x7'/>
>
>           </source>
>
>           <vlan>
>
>             <tag id='190'/>
>
>           </vlan>
>
>           <alias name='hostdev0'/>
>
>           <address type='pci' domain='0x0000' bus='0x00' slot='0x04'
>         function='0x0'/>
>
>         </interface>
>
>         [3] 24: enp3s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
>         qdisc mq master ovs-system state UP mode DEFAULT group default
>         qlen 1000
>
>             link/ether e4:1d:2d:a5:f1:22 brd ff:ff:ff:ff:ff:ff
>
>             vf 0 MAC 00:00:00:00:00:00, spoof checking off, link-state
>         auto
>
>             vf 1 MAC 00:00:00:00:00:00, spoof checking off, link-state
>         auto
>
>             vf 2 MAC 00:00:00:00:00:00, spoof checking off, link-state
>         auto
>
>             vf 3 MAC fa:16:3e:11:af:fe, vlan 190, spoof checking off,
>         link-state enable
>
>
>
>
>         --
>
>         libvir-list mailing list
>
>         libvir-list at redhat.com <mailto:libvir-list at redhat.com>
>
>         https://www.redhat.com/mailman/listinfo/libvir-list
>
>
>
