[libvirt] RFC: managing "pci passthrough" usage of sriov VFs via a new network forward type

Daniel P. Berrange berrange at redhat.com
Tue Aug 23 10:50:56 UTC 2011


On Mon, Aug 22, 2011 at 05:17:25AM -0400, Laine Stump wrote:
> For some reason beyond my comprehension, the designers of SRIOV
> ethernet cards decided that the virtual functions (VF) of the card
> (each VF corresponds to an ethernet device, e.g. "eth10") should
> each be given a new+different+random MAC address each time the
> hardware is rebooted. 

[...snip...]

> This makes using SRIOV VFs via PCI passthrough very unpalatable. The
> problem can be solved by setting the MAC address of the ethernet
> device prior to assigning it to the guest, but of course the
> <hostdev> element used to assign PCI devices to guests has no place
> to specify a MAC address (and I'm not sure it would be appropriate
> to add something that function-specific to <hostdev>).

In discussions at the KVM forum, other related problems were
noted too. Specifically, when using an SRIOV VF with VEPA/VNLink
we need to be able to set the port profile on the VF before
assigning it to the guest, to lock down what the guest can
do. We also likely need to specify a VLAN tag on the NIC.
The VLAN tag is actually something we need to be able to do
for normal non-PCI passthrough usage of SRIOV networks too.

>                                                         Dave Allan
> and I have discussed a different possible method of eliminating this
> problem (using a new forward type for libvirt networks) that I've
> outlined below. Please let me know what you think - is this
> reasonable in general? If so, what about the details? If not, any
> counter-proposals to solve the problem?

The issue I see is that if an application wants to know what
PCI devices have been assigned to a guest, it can no longer
just look at <hostdev> elements. It also needs to look at
<interface> elements. If we follow this proposed model in other
areas, we could end up with PCI devices appearing as <disks>,
<controllers> and who knows what else. I think this is not
very desirable for applications, and it is also not good for
our internal code that manages PCI devices, i.e. the security
drivers now have to look at many different places to find
what PCI devices need labelling.
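
To make the concern concrete, here is a rough sketch of what an
application does today to list assigned PCI devices: it walks the
<hostdev> elements of the domain XML and nothing else. The sample
domain, the PCI address and the network name 'sriov-pool' are made up
for illustration, and the function names are not part of any libvirt
API.

```python
import xml.etree.ElementTree as ET

# Illustrative domain XML: one explicit PCI <hostdev>, plus one
# <interface> that points at a libvirt network (name is hypothetical).
DOMAIN_XML = """
<domain type='kvm'>
  <name>demo</name>
  <devices>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x06' slot='0x10' function='0x0'/>
      </source>
    </hostdev>
    <interface type='network'>
      <source network='sriov-pool'/>
    </interface>
  </devices>
</domain>
"""


def pci_hostdevs(domain_xml):
    """Return (domain, bus, slot, function) for each PCI <hostdev>."""
    root = ET.fromstring(domain_xml)
    keys = ("domain", "bus", "slot", "function")
    path = "./devices/hostdev[@type='pci']/source/address"
    return [tuple(int(addr.get(k), 16) for k in keys)
            for addr in root.findall(path)]


def network_backed_interfaces(domain_xml):
    """Return the network names referenced by <interface type='network'>."""
    root = ET.fromstring(domain_xml)
    path = "./devices/interface[@type='network']/source"
    return [src.get("network") for src in root.findall(path)]


if __name__ == "__main__":
    print("PCI hostdevs:", pci_hostdevs(DOMAIN_XML))
    print("network interfaces:", network_backed_interfaces(DOMAIN_XML))
```

If a VF were handed out through the <interface>, pci_hostdevs() would
never report it; the caller would also have to inspect every interface
and resolve its network to find out whether a PCI device sits behind it.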

> One problem this doesn't solve is that when a guest is migrated, the
> PCI info for the allocated ethernet device on the destination host
> will almost surely be different. Is there any provision for dealing
> with this in the device passthrough code? If not, then migration
> will still not be possible.

Migration is irrelevant with PCI passthrough, since we reject any
attempt to migrate a guest with assigned PCI devices. A management
app must explicitly hot-unplug all PCI devices before doing any
migration, and plug back in new ones after migration finishes.
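
As a rough sketch of that sequence from the management app's side,
assuming 'dom' and 'dest_conn' are virDomain / virConnect objects from
the libvirt Python bindings (only XMLDesc, detachDevice, migrate and
attachDevice are called on them). The helper names and the
'new_hostdevs' argument are illustrative, and a real application would
also have to wait for each unplug to actually complete:

```python
import xml.etree.ElementTree as ET


def pci_hostdev_xml(domain_xml):
    """Return the XML of every PCI <hostdev> in a domain description."""
    root = ET.fromstring(domain_xml)
    return [ET.tostring(dev, encoding="unicode")
            for dev in root.findall("./devices/hostdev[@type='pci']")]


def migrate_without_pci(dom, dest_conn, migrate_flags, new_hostdevs):
    """Unplug assigned PCI devices, migrate, then plug in new ones.

    new_hostdevs is a list of <hostdev> XML strings describing devices
    on the destination host, chosen by the caller (hypothetical input).
    """
    # 1. Hot-unplug every assigned PCI device from the running guest.
    #    Assumption: the unplug may finish asynchronously; this sketch
    #    does not wait for it.
    for dev_xml in pci_hostdev_xml(dom.XMLDesc(0)):
        dom.detachDevice(dev_xml)

    # 2. Migrate; arguments are (dconn, flags, dname, uri, bandwidth).
    new_dom = dom.migrate(dest_conn, migrate_flags, None, None, 0)

    # 3. Plug destination-side devices back into the migrated guest.
    for dev_xml in new_hostdevs:
        new_dom.attachDevice(dev_xml)
    return new_dom
```

Note that the devices attached in step 3 will in general have different
PCI addresses from the ones removed in step 1, so the app has to pick
them itself.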

> Although I realize that many people are predisposed to not like the
> idea of PCI passthrough of ethernet devices (including me), it seems
> that it's going to be used, so we may as well provide the management
> tools to do it in a sane manner.

Reluctantly, I think we need to provide the necessary information
underneath the <hostdev> element. Fortunately we already have an
XML schema for port profile and such things, that we share between
the <interface> device element and the <network> schema.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|



