Re: [libvirt] [RFC PATCH] hostdev: add support for "managed='detach'"

On Tue, Mar 15, 2016 at 02:21:35PM -0400, Laine Stump wrote:
> On 03/15/2016 01:00 PM, Daniel P. Berrange wrote:
> >On Mon, Mar 14, 2016 at 03:41:48PM -0400, Laine Stump wrote:
> >>Suggested by Alex Williamson.
> >>
> >>If you plan to assign a GPU to a virtual machine, but that GPU happens
> >>to be the host system console, you likely want it to start out using
> >>the host driver (so that boot messages/etc will be displayed), then
> >>later have the host driver replaced with vfio-pci for assignment to
> >>the virtual machine.
> >>
> >>However, in at least some cases (e.g. Intel i915) once the device has
> >>been detached from the host driver and attached to vfio-pci, attempts
> >>to reattach to the host driver only lead to "grief" (ask Alex for
> >>details). This means that simply using "managed='yes'" in libvirt
> >>won't work.
> >>
> >>And if you set "managed='no'" in libvirt then either you have to
> >>manually run virsh nodedev-detach prior to the first start of the
> >>guest, or you have to have a management application intelligent enough
> >>to know that it should detach from the host driver, but never reattach
> >>to it.
> >>
> >>This patch makes it simple/automatic to deal with such a case - it
> >>adds a third "managed" mode for assigned PCI devices, called
> >>"detach". It will detach ("unbind" in driver parlance) the device from
> >>the host driver prior to assigning it to the guest, but when the guest
> >>is finished with the device, will leave it bound to vfio-pci. This
> >>allows re-using the device for another guest, without requiring
> >>initial out-of-band intervention to unbind the host driver.
> >You say that managed=yes causes pain upon re-attachment and that
> >apps should use managed=detach to avoid it, but how do management
> >apps know which devices are going to cause pain? Libvirt isn't
> >providing any info on whether a particular device id needs to
> >use managed=yes vs managed=detach, and we don't want to be asking
> >the user to choose between modes in openstack/ovirt IMHO. I think
> >that's a fundamental problem with inventing a new value for managed
> >here.
> My suspicion is that in many/most cases users don't actually need for the
> device to be re-bound to the host driver after the guest is finished with
> it, because they're only going to use the device to assign to a different
> guest anyway. But because managed='yes' is what's supplied and is the
> easiest way to get it set up for assignment to a guest, that's what they use.
> As a matter of fact, all this extra churn of changing the driver back and
> forth for devices that are only actually used when they're bound to vfio-pci
> just wastes time, and makes it more likely that libvirt and its users will
> reveal and get caught up in the effects of some strange kernel driver
> loading/unloading bug (there was recently a bug reported like this;
> unfortunately the BZ record had customer info in it, so it's not publicly
> accessible :-( )
> So beyond making this behavior available only when absolutely necessary, I
> think it is useful in other cases, at the user's discretion (and as I
> implied above, I think that if they understood the function and the
> tradeoffs, most people would choose to use managed='detach' rather than
> managed='yes')

IIUC, in managed=yes mode we explicitly track whether the device was
originally attached to a host device driver, i.e. we only re-attach
the device to the host driver when the guest shuts down if it was
attached to the host driver at guest startup.

We already have a virNodeDeviceDettach() API (sic, that is how the
public symbol is spelled) that can be used to detach a device from
the host driver explicitly.

So applications can in fact already achieve what you describe in
terms of managed=detach, by simply calling virNodeDeviceDettach()
before starting a guest with cold-plugged PCI devices, or before
hotplugging the PCI device.

IOW, even if we think applications should be using managed=detach,
they can already do so via existing libvirt APIs.
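
To make that concrete, here is a rough sketch using the Python
bindings. It assumes "conn" behaves like a libvirt-python virConnect
object; the helper name and the device/guest names in the comment are
made up for illustration. Note that the binding method is spelled
dettach(), following the C symbol virNodeDeviceDettach().

```python
def start_with_detached_device(conn, nodedev_name, domain_name):
    """Detach a host PCI device from its driver, then start the guest.

    Hypothetical helper: `conn` is assumed to be a libvirt-python
    virConnect (or anything exposing the same methods).
    """
    # Look up the host device by its node device name.
    dev = conn.nodeDeviceLookupByName(nodedev_name)

    # Explicitly unbind the device from the host driver. libvirt-python
    # raises libvirtError on failure, so we never reach create() in
    # that case.
    dev.dettach()

    # Start the (already defined) guest that references the device.
    dom = conn.lookupByName(domain_name)
    dom.create()
    return dom


# Example call, assuming a local QEMU/KVM host and made-up names:
#   import libvirt
#   conn = libvirt.open("qemu:///system")
#   start_with_detached_device(conn, "pci_0000_00_02_0", "guest1")
```

Since the device is no longer attached to a host driver by the time
the guest starts, nothing in this sequence asks for it to be
re-attached when the guest shuts down.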

