[libvirt] PCI passthrough/SR-IOV on Cavium cn889x

Ciprian Barbu Ciprian.Barbu at enea.com
Wed Mar 21 15:46:01 UTC 2018


Hello,

In the context of running OpenStack on a cluster of Cavium ThunderX cn8890 aarch64 servers, we are trying to attach virtual functions to a VM.

First, some background. This Cavium SoC handles Virtual Functions differently from x86 NICs: VFs are always enabled, and there are two types of VFs plus *one single* PF, as follows:
- primary VFs - these are assigned by the system to the physical ports of the server, e.g. em2p1s0f1, em2p1s0f3 etc. below.
- secondary VFs - their main purpose is to provide additional HW queues under SW control (usually for DPDK applications); they are bound automatically to the physical port that needs them.
- one single "physical" function, device 0002:01:00.0 below, which to the best of my knowledge acts merely as a stub and cannot be assigned an interface name.

Below is the output of "dpdk-devbind.py -s" which provides some useful information.

Network devices using DPDK-compatible driver
============================================
0002:01:00.2 'Device a034' drv=vfio-pci unused=nicvf

Network devices using kernel driver
===================================
0000:01:10.0 'THUNDERX BGX (Common Ethernet Interface)' if= drv=thunder-BGX unused=thunder_bgx,vfio-pci
0000:01:10.1 'THUNDERX BGX (Common Ethernet Interface)' if= drv=thunder-BGX unused=thunder_bgx,vfio-pci
0002:01:00.0 'THUNDERX Network Interface Controller' if= drv=thunder-nic unused=nicpf,vfio-pci
0002:01:00.1 'Device a034' if=em2p1s0f1 drv=thunder-nicvf unused=nicvf,vfio-pci
0002:01:00.3 'Device a034' if=em2p1s0f3 drv=thunder-nicvf unused=nicvf,vfio-pci
0002:01:00.4 'Device a034' if=em2p1s0f4 drv=thunder-nicvf unused=nicvf,vfio-pci
0002:01:00.5 'Device a034' if=em2p1s0f5 drv=thunder-nicvf unused=nicvf,vfio-pci
0002:01:00.6 'Device a034' if= drv=thunder-nicvf unused=nicvf,vfio-pci
0002:01:00.7 'Device a034' if= drv=thunder-nicvf unused=nicvf,vfio-pci
0002:01:01.0 'Device a034' if= drv=thunder-nicvf unused=nicvf,vfio-pci
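
The if= column in this listing can be cross-checked directly against sysfs. The following is only a minimal Python sketch, assuming the usual Linux layout in which /sys/bus/pci/devices/<address>/net/ contains one entry per interface name; the function name netdevs_by_pci_address and the sysfs_root parameter are made up for illustration and are not part of dpdk-devbind.py or libvirt.

```python
import os


def netdevs_by_pci_address(sysfs_root="/sys"):
    """Map every PCI address to the interface names bound to it.

    Hypothetical helper for illustration only. sysfs_root is a
    parameter so the function can be pointed at a fake tree.
    """
    devices_dir = os.path.join(sysfs_root, "bus", "pci", "devices")
    result = {}
    for addr in sorted(os.listdir(devices_dir)):
        net_dir = os.path.join(devices_dir, addr, "net")
        # A device without a netdev has no net/ directory at all,
        # which shows up as an empty "if=" column in the listing.
        if os.path.isdir(net_dir):
            result[addr] = sorted(os.listdir(net_dir))
        else:
            result[addr] = []
    return result
```

Calling netdevs_by_pci_address() on the host and looking up "0002:01:00.0" should, under the assumptions above, give an empty list for the PF stub, while the primary VFs should list their em2p1s0fN names.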

Now for the problem. I don't have a domain definition because libvirt fails to start the domain, but I might be able to find what nova generates. What it tries to do is pass through em2p1s0f3, at address 0002:01:00.3:
<interface type='hostdev' managed='yes'>
  <source>
    <address type='pci' domain='0x0002' bus='0x1' slot='0x0' function='0x3'/>
  </source>
</interface>

You can find attached a trimmed libvirtd.log where the main error is:
43236: error : virPCIGetVirtualFunctionInfo:2927 : internal error: The PF device for VF /sys/bus/pci/devices/0002:01:00.3 has no network device name

I have spent a few days trying some hacks and learning more. The main issue is that virPCIGetVirtualFunctionInfo fails to find the network device name of the physical function for the virtual function at address 0002:01:00.3; as I explained in the introduction, this Cavium SoC does not assign one.
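
To make the failing lookup concrete, here is a rough Python approximation of it. This is not libvirt's code, only a sketch under the assumption that the VF directory in sysfs has a physfn symlink to its PF and that a PF with a netdev exposes it under net/. The name pf_netdev_for_vf and the sysfs_root parameter are hypothetical.

```python
import os


def pf_netdev_for_vf(vf_addr, sysfs_root="/sys"):
    """Return the interface name of the PF that owns the given VF.

    Illustrative approximation only, not libvirt's implementation.
    Raises LookupError if the device is not a VF or if its PF has
    no network device name.
    """
    vf_dir = os.path.join(sysfs_root, "bus", "pci", "devices", vf_addr)
    physfn = os.path.join(vf_dir, "physfn")
    if not os.path.exists(physfn):
        raise LookupError(f"{vf_addr} has no physfn link, so it is not a VF")

    # Follow the symlink to reach the PF's own sysfs directory.
    pf_dir = os.path.realpath(physfn)
    net_dir = os.path.join(pf_dir, "net")
    names = sorted(os.listdir(net_dir)) if os.path.isdir(net_dir) else []
    if not names:
        # This is the situation described above for the ThunderX PF stub.
        raise LookupError(
            f"PF {os.path.basename(pf_dir)} for VF {vf_addr} "
            "has no network device name"
        )
    return names[0]
```

On this platform I would expect pf_netdev_for_vf("0002:01:00.3") to raise, because the PF at 0002:01:00.0 has no net/ entry, whereas on a typical x86 SR-IOV NIC it would return the PF's interface name.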

Looking further downstream, almost all of the helper functions need a linkdev for the physical function, so making libvirt work on this system would require some heavy refactoring; one solution would be to use the sysfs path rather than the interface name.
From what I've seen, this will not work 100%: at least virNetDevGetVfConfig uses netlink to save the admin MAC (as part of virNetDevSaveNetConfig), and netlink needs the ifname.

So I'm quite stuck on finding a workaround/fix for this platform that could potentially be upstreamed, so that we, ENEA, are not burdened with maintaining an ugly hack. Right now we are using libvirt 3.5.0, but we can upgrade to something newer if needed.

My questions are thus:
1. Is this problem known in the libvirt community?
2. Is there any plan to make it work?
3. Can you give some pointers on an approach to adapt libvirt to this system?
4. Maybe it's worth changing the kernel to assign a sort of dummy interface to the physical function?

Thanks and sorry for the long email,
/Ciprian
-------------- next part --------------
A non-text attachment was scrubbed...
Name: libvirtd_fragment.log
Type: application/octet-stream
Size: 3144 bytes
Desc: libvirtd_fragment.log
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20180321/4b3d141c/attachment-0001.obj>

