[libvirt] virNodeDevice APIs and VFIO

Tue Apr 23 16:34:05 UTC 2013

On 04/23/2013 11:46 AM, Daniel P. Berrange wrote:
> On Tue, Apr 23, 2013 at 10:52:09AM -0400, Laine Stump wrote:
>> Yesterday for the first time I consciously noticed the
>> virNodeDeviceDettach and virNodeDeviceReAttach APIs, and found that they
>> are hardcoded to bind to/unbind from the pci-stub driver for qemu, and
>> the "pciback" driver for Xen. If we want these APIs to be useful for
>> VFIO, they will need to bind to the vfio driver instead, but there is
>> currently no method in those APIs to specify which driver to bind to.
>>
>> I guess we could do this with new virNodeDeviceDetachFlags() and
>> virNodeDeviceReAttachFlags() APIs which have a flag to indicate vfio,
>> but before going to that trouble I'd like to know if these APIs are
>> actually used or if they are deprecated (they don't seem to be of any
>> use if the hostdev devices you're assigning have "managed='yes'" - as
>> far as I can see, setting managed='yes' just makes the bind/unbind from
>> the stub driver an automatic part of assigning/un-assigning the device
>> to a guest).
> The rationale for managed="no", is that in more security paranoid/locked
> down environments, admins will not want libvirtd screwing around with
> kernel module drivers. The admin would manually setup "pciback" binding
> ahead of time when configuring the host. I think we need to continue to
> support this scenario in the VFIO world.

Okay. But then in that case, since they don't want libvirtd screwing
around with kernel module drivers when starting a domain, they also
wouldn't want libvirtd screwing around with kernel module drivers at any
other time either (i.e. via virNodeDeviceDetach(), right?

(I suppose once we have more fine-grained authorization of the libvirt
API, it might be possible for one user to call virNodeDeviceDetach() and
another could only start a domain with <hostdev managed='no'>, but is
the "fine grainedness" really going to be down to the level of which
attributes are allowed in certain domain XML elements?)

>
> If we go down the route of extending <hostdev> to have support for
>
>     <driver name="vfio|qemu"/>

Hehe. That's *almost" exactly the patchset I'm working on (I was
thinking of sending out just the config-change patch now to get it
reviewed early and avoid the last minute rush before the freeze).
However, since the legacy backend in this case is handled by the kernel
KVM module, I had named them "vfio" and "kvm" (I originally thought to
use "qemu", but then considered it might confuse people into thinking
that it was something done by qemu in usermode).

>
> Then, this says that the virNodeDevice{Detach,ReAttach} methods need
> to have new variants which take a 'const char *drivername' parameter.
>
> NB, the values of 'drivername' would match those used in the domain
> XML <driver> element - they would *not* be kernel module names.
>
> So we'd call
>
>    virNodeDeviceDetachDriver(dev, "qemu");
>    virNodeDeviceDetachDriver(dev, "vfio");

(of course with a "flags" argument at the end :-)

That's a bit confusing, since "detach" in this case really means "detach
it from whatever driver it's currently bound to, and bind it to the
'vfio' driver". But just from reading the function name and args, it
*seems* to mean "detach this device from the 'vfio' driver". On the
other hand, I don't see a better alternative.

And then what about the converse - virNodeDeviceAttachDriver(dev,
"vfio") - that really means "detach it from vfio, and attach it to....
well actually *nowhere*! (that's what virNodeDeviceReAttach does - it's
not necessarily the inverse of virNodeDeviceDettach)

Maybe we need a totally generic virNodeDeviceAttach() that allows us to
specify a stub driver name, a "real" driver name, or NULL (in which case
it would just be unbound from the current driver)? But there again you
get back to the problem of specifying the actual name of a driver vs. an
abstracted name like "kvm" or "vfio" (the actual device to bind to for
vfio is called "vfio-pci").

Oh, and to top it all off, vfio has a concept of device "groups"; some
groups of devices must either be all assigned to the same guest, or if
not all are wanted, the others must be held in isolation (to preclude
their use by other guests or even by the host). I guess we could handle
that with a flag.