[vfio-users] Passing arbitrary IRQ to guest?

Micah Morton mortonm at chromium.org
Fri Jan 31 23:18:00 UTC 2020


On Wed, Dec 11, 2019 at 3:09 PM Alex Williamson
<alex.williamson at redhat.com> wrote:
>
> On Wed, 11 Dec 2019 14:40:56 -0800
> Micah Morton <mortonm at chromium.org> wrote:
>
> > On Wed, Dec 11, 2019 at 10:44 AM Alex Williamson
> > <alex.williamson at redhat.com> wrote:
> > >
> > > On Wed, 11 Dec 2019 09:37:57 -0800
> > > Micah Morton <mortonm at chromium.org> wrote:
> > >
> > > > On Tue, Dec 10, 2019 at 4:00 PM Alex Williamson
> > > > <alex.williamson at redhat.com> wrote:
> > > > >
> > > > > On Mon, 9 Dec 2019 14:18:50 -0800
> > > > > Micah Morton <mortonm at chromium.org> wrote:
> > > > >
> > > > > > On Thu, Sep 5, 2019 at 12:22 PM Micah Morton <mortonm at chromium.org> wrote:
> > > > > > >
> > > > > > > On Wed, Aug 28, 2019 at 3:22 PM Alex Williamson
> > > > > > > <alex.williamson at redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Wed, 28 Aug 2019 09:39:57 -0700
> > > > > > > > Micah Morton <mortonm at chromium.org> wrote:
> > > > > > > >
> > > > > > > > > On Mon, Aug 5, 2019 at 11:14 PM Gerd Hoffmann <kraxel at redhat.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, Aug 05, 2019 at 12:50:00PM -0700, Micah Morton wrote:
> > > > > > > > > > > On Thu, Aug 1, 2019 at 10:36 PM Gerd Hoffmann <kraxel at redhat.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > >   Hi,
> > > > > > > > > > > >
> > > > > > > > > > > > > From my perspective, as a low-speed device where we don't really need
> > > > > > > > > > > > > the benefits of an IOMMU, I'd be more inclined to look at why it
> > > > > > > > > > > > > doesn't work with evdev.  We already have a tablet device in QEMU,
> > > > > > > > > > > > > what's it take to connect that to evdev?  Cc'ing Gerd as maybe he's
> > > > > > > > > > > > > > already thought about touchpad support.  Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > It's not clear why the touchpad doesn't work.  Possibly using libinput
> > > > > > > > > > > > helps, https://git.kraxel.org/cgit/qemu/log/?h=sirius/display-drm has
> > > > > > > > > > > > some code.  Wiring up to input-linux isn't done yet though, only the
> > > > > > > > > > > > drm ui uses libinput support so far.
> > > > > > > > > > >
> > > > > > > > > > > > To be clear, are you saying that it's a known issue that the touchpad
> > > > > > > > > > > doesn't work in VM guest with QEMU and evdev?
> > > > > > > > > >
> > > > > > > > > > > There are other reports of touchpad problems.  I don't know whether
> > > > > > > > > > that is a general problem or specific to some devices.
> > > > > > > > > >
> > > > > > > > > > libinput knows quirks for lots of input devices.  When passing through
> > > > > > > > > > the evdev to the guest as virtio device libinput can't see the device
> > > > > > > > > > identity and thus can't apply quirks.  Which might be the reason the
> > > > > > > > > > touchpad doesn't work.  Using libinput on the host side might fix this.
> > > > > > > > > >
> > > > > > > > > > cheers,
> > > > > > > > > >   Gerd
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > I was able to get physical passthrough of the touchpad working in the
> > > > > > > > > VM guest by forwarding the IRQ to the guest using the kvm/qemu/vfio
> > > > > > > > > framework.
> > > > > > > > >
> > > > > > > > > So basically I wrote extensions to kvm/qemu/vfio to allow for
> > > > > > > > > forwarding arbitrary IRQs to the guest (the IRQ doesn't have to be
> > > > > > > > > associated with any vfio-pci or vfio-platform device). I could clean
> > > > > > > > > up the patches and upstream them (or think about it) if you folks
> > > > > > > > > think anyone else might want to use this functionality? Then again as
> > > > > > > > > Alex said before you still need to communicate to the VM which IRQ to
> > > > > > > > > use for this device (in my case I did this by modifying ACPI stuff in
> > > > > > > > > SeaBIOS, not sure how it could be incorporated into vfio).
> > > > > > > >
> > > > > > > > This seems like something that's not too difficult to hack together,
> > > > > > > > but quite a lot harder to generalize into something that's useful
> > > > > > > > beyond this specific hardware.  There's a path to do so via the vfio
> > > > > > > > API, using a device specific interrupt to expose this IRQ and a
> > > > > > > > capability to convey how that IRQ is associated so that QEMU could
> > > > > > > > automatically create some AML.  Defining that interaction is far from
> > > > > > >
> > > > > > > Yeah, seems like the user being willing to modify the virtual bios to
> > > > > > > add AML info is a pretty fringe use case.
> > > > > > >
> > > > > > > > trivial, but before we even approach that, how does vfio-pci learn to
> > > > > > > > associate this IRQ with a device without growing a full software stack
> > > > > > > > specific to the PCI device, or class of PCI devices?  We have some
> > > > > > >
> > > > > > > I guess I was envisioning a command line argument to qemu, something
> > > > > > > like `-device vfio-pci,host=00:0X.0,irq-passthrough=X`. Then again the
> > > > > > > command line would at least need to specify level vs edge triggered I
> > > > > > > think, and maybe other things I haven't thought of (in addition to
> > > > > > > telling the guest these things through AML).
> > > > > > >
> > > > > > > > hacks in vfio, but they're usually for devices that can work on any
> > > > > > > > system, not specific devices on specific systems.  I wouldn't be
> > > > > > > > willing to support that unless it's at least got some obvious
> > > > > > > > extensibility to work elsewhere.  Thanks,
> > > > > >
> > > > > > Hi Alex,
> > > > > >
> > > > > > What about the possibility of having some blob of ASL/AML be an input
> > > > > > to qemu, that way one could do something like "qemu-system-x86_64 ...
> > > > > > -device vfio-pci,host=01:00.0,aml-file=/path/to/asl"?
> > > > > >
> > > > > > Qemu already has to set up a bunch of ASL for the guest, so this blob
> > > > > > could just be added on as another piece that is passed to the virtual
> > > > > > BIOS, and if there is an "Interrupt" field in the ASL then qemu calls
> > > > > > the VFIO ioctl for setting up IRQ forwarding for the IRQ associated
> > > > > > with the platform device that hangs off of the PCI bus controller. I
> > > > > > believe this solution would avoid needing any device-specific hacks in
> > > > > > VFIO.
> > > > >
> > > > > So you're asking QEMU to not only pass the ASL to the guest, but parse
> > > > > it itself to learn about the interrupt objects it might contain, which
> > > > > > it would then set up through vfio.  But then vfio-pci in the kernel
> > > >
> > > > Yes. There are probably ways parsing could be avoided with some extra args
> > > >
> > > > > itself needs to become an i2c driver in order to probe the i2c bus
> > > > > downstream of the PCI endpoint to find these platform devices and then
> > > > > involve the ACPI subsystem to retrieve the IRQ information for this
> > > > > downstream device, which then becomes a device specific IRQ on the
> > > > > exposed PCI endpoint... sure, that's not a hack at all :-\
> > > >
> > > > I might be missing something, but I was envisioning QEMU passing all
> > > > the info that VFIO needs to emulate the interrupt for the guest (e.g.
> > > > irq number, edge vs level triggered, active low vs high, etc). Are you
> > > > saying you don't think this is feasible and VFIO will always have to
> > > > go to the ACPI subsystem for certain info? Maybe there's more info
> > > > related to interrupts that I'm not considering?
> > >
> > > "QEMU passing all the info that VFIO needs to emulate the interrupt".
> > > What is the physical source of this interrupt?  Is QEMU asking the host
> >
> > A physical device with an IRQ line hooked up to the IOAPIC in the
> > host. This device being on a bus whose controller (a PCI endpoint)
> > gets passed to the guest.
>
> Right, so as far as vfio knows this is a completely arbitrary IRQ, as
> you asked for in Subject:
>
> > > kernel, via vfio, to expose an arbitrary IRQ to userspace, or is this
> > > maybe only adding the ACPI glue in the guest to associate an already
> > > exposed IRQ of the assigned PCI device (INTx?) as an ACPI object for
> > > the contained platform device?  The former is what I'm afraid we're
> > > trying to do (we can't simply trust userspace to give it access to
> > > arbitrary physical resources), the latter seems far more feasible in the
> > > scope you're suggesting.
> >
> > I was proposing the former, but hadn't thought about it from the angle
> > of trusting userspace being problematic -- I was just thinking of a
> > way it could be done with minimal logic in VFIO. I see what you mean
> > from the previous message about how it would take a bunch more
> > functionality in VFIO to verify that the IRQ QEMU wants to pass to the
> > guest is associated with a device on a bus that is already being
> > passed to the guest.
> >
> > I guess I'm not super clear on VFIO's threat model when it comes to
> > carrying out ioctls from userspace. Is an unclaimed IRQ (if it's
> > claimed, then VFIO will fail the request_irq() call) a sensitive
>
> We can't simply rely on driver ordering to decide whether userspace
> gets to claim an IRQ that might later prevent a host driver from
> operating correctly.  Maybe, for example, that IRQ is shared with some
> component in a laptop dock and the driver for that component isn't
> necessarily loaded at boot time.
>
> > physical resource that is dangerous to expose to root-level userspace?
> > Doesn't userspace need root to do any of this device passthrough stuff
> > anyway?
>
> No, it doesn't.  Admin privileges are not required beyond binding the
> device to a vfio bus driver and granting the user permissions to the
> resulting vfio group device file.  Setting the user's locked memory
> limits sufficiently is also required for use cases like VMs.  We assume
> a normal unprivileged, untrusted user for interaction through the vfio
> API.
>
> > So if root can attach and detach drivers to hardware devices
> > and so forth, doesn't that seem no less powerful than claiming an IRQ
> > line and letting userspace read/mask/unmask it? I'm sure you've
> > thought about this much more extensively than I have so let me know if
> > you have documentation or thoughts on this.
>
> Driver binding is an entirely separate stage in the process that does
> require privilege.  At the point where QEMU is actually interacting
> with a device where it could request vfio to grab an unrelated IRQ,
> there is no requirement or expectation of privilege.  A libvirt managed
> VM using vfio-pci only privileges the QEMU process with access to the
> device file(s) and sufficient locked memory limits in a default
> configuration.  Thanks,
>
> Alex
>

Thanks for the responses and sorry for the long delay in responding.

Maybe a potential option is to create a separate kernel module for
what I want to do. It probably can't be a platform driver per se
(don't think attaching to a platform device while its bus controller
is attached to vfio-pci makes sense), but it could be a module that
allows for forwarding arbitrary IRQs by calling the appropriate ioctls
from the VMM.

A bit of the VFIO eventfd/irqfd logic would have to be copied to this
hypothetical module, but that may be worth it to avoid retrofitting
vfio-pci to accomplish something for which it wasn't designed. Any use
of such a module could be considered logically equivalent to full
privilege on the system.

I'm just now thinking through this. Feel free to comment, or not. Your
previous messages have me convinced that adding this functionality to
vfio-pci may not be the best solution.

Thanks, Micah
