[libvirt] [RFC PATCH] hostdev: add support for "managed='detach'"

John Ferlan jferlan at redhat.com
Tue Mar 22 23:03:22 UTC 2016



On 03/14/2016 03:41 PM, Laine Stump wrote:
> Suggested by Alex Williamson.
> 
> If you plan to assign a GPU to a virtual machine, but that GPU happens
> to be the host system console, you likely want it to start out using
> the host driver (so that boot messages/etc will be displayed), then
> later have the host driver replaced with vfio-pci for assignment to
> the virtual machine.
> 
> However, in at least some cases (e.g. Intel i915) once the device has
> been detached from the host driver and attached to vfio-pci, attempts
> to reattach to the host driver only lead to "grief" (ask Alex for
> details). This means that simply using "managed='yes'" in libvirt
> won't work.
> 
> And if you set "managed='no'" in libvirt then either you have to
> manually run virsh nodedev-detach prior to the first start of the
> guest, or you have to have a management application intelligent enough
> to know that it should detach from the host driver, but never reattach
> to it.
> 
> This patch makes it simple/automatic to deal with such a case - it
> adds a third "managed" mode for assigned PCI devices, called
> "detach". It will detach ("unbind" in driver parlance) the device from
> the host driver prior to assigning it to the guest, but when the guest
> is finished with the device, will leave it bound to vfio-pci. This
> allows re-using the device for another guest, without requiring
> initial out-of-band intervention to unbind the host driver.
> ---
> 
> I'm sending this with the "RFC" tag because I'm concerned it might be
> considered "feature creep" by some (although I think it makes at least
> as much sense as "managed='yes'") and also because, even once (if) it
> is ACKed, I wouldn't want to push it until abologna is finished
> hacking around with the driver bind/unbind code - he has enough grief
> to deal with without me causing a bunch of merge conflicts :-)
> 

[...]
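(For reference, as I read the patch, the new mode would show up in the
domain XML something like this; the PCI address below is only a
placeholder:)

  <hostdev mode='subsystem' type='pci' managed='detach'>
    <!-- unbind from the host driver before the guest starts, but leave
         the device bound to vfio-pci after the guest is done with it -->
    <source>
      <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </source>
  </hostdev>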

Rather than burying this in one of the 3 other conversations that have
been taking place, I'll respond at the top level. I have been following
the conversation, but not at any great depth... I'm going to leave the
"should we have managed='detach'?" conversation alone (at least for now).

Not that I necessarily want to dip my toes into these shark-infested
waters; however, one thing keeps gnawing at my fingers as I hang onto
the "don't get involved in this one" ledge ;-)... That one thing is that
the problem, to me, is less libvirt's ability to manage whether the
devices are or aren't detached from the host, and more that the lower
layers (e.g. the kernel) aren't happy with the frequency of such
requests. Thrashing isn't fun for any system, but it's a lot easier to
tell someone else to stop doing it because it hurts when they do it.
It's tough to find a happy medium between forcing the user to detach
manually (rather than letting libvirt manage it) and letting libvirt be
the culprit causing angst for the lower layers.

So, would it work to have some intermediary handle this thrashing by
creating some sort of "blob" that will accept responsibility for
reattaching the device to the host "at some point in time" as long as no
one else has requested to use it?

Why not add an attribute to the device (e.g. delay='n'), valid only when
managed='yes', which means that rather than immediately reattaching the
device to the host when the guest is destroyed, libvirt will delay the
reattach by 'n' seconds. That way, someone who knows they're going to
have a device used by multiple guests, one that could be thrashing
heavily in the detach -> reattach -> detach -> reattach -> etc. loop,
would be able to make use of an optimization of sorts: the device is
simply placed back on the inactive list (as if it had been detached
manually), and a thread is started that reawakens when a timer fires to
handle the reattach. The thread would be destroyed in the event that
something comes along and uses the device (e.g. places it back on the
active list).
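Just to make that concrete, here is a rough sketch of what I'm imagining
in the domain XML (the delay attribute is purely hypothetical, nothing
like it exists today, and the PCI address is only a placeholder):

  <hostdev mode='subsystem' type='pci' managed='yes' delay='30'>
    <!-- hypothetical: after the guest is destroyed, wait 30 seconds
         before reattaching the device to its host driver; cancel the
         reattach if another guest claims the device first -->
    <source>
      <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </source>
  </hostdev>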

The caveats that come quickly to mind are that devices that were being
managed could be left on the inactive list if the daemon dies or is
restarted, though I think that is detectable at restart, so it may not
be totally bad. Also, a failure to reattach would happen in some thread
that has nowhere to report other than the libvirtd log files. Both would
have to be noted in any description of this feature.

Not all devices will want this delay logic, and I think it's been
pointed out that there is a "known list" of them. In the long run it
gives the user some control over deciding how much rope they'd like to
have to hang themselves.

John

I'm not sure whether Gerd, Alex, and Martin are regular libvir-list
readers, so I CC'd them just in case, to make it easier for them to
respond if they so desire, since they were part of the discussions in
this thread.



