[vfio-users] Avoiding VFIO NOIOMMU taint in safe situations

Alex Williamson alex.williamson at redhat.com
Thu Apr 4 16:24:31 UTC 2019


On Wed, 3 Apr 2019 23:31:22 -0500
Shawn Anastasio <shawn at anastas.io> wrote:

> On 4/3/19 10:23 PM, Alex Williamson wrote:
> > On Wed, 3 Apr 2019 22:01:14 -0500
> > Shawn Anastasio <shawn at anastas.io> wrote:
> >   
> >> Hello all,
> >>
> >> I'm currently writing an application that makes use of Qemu's ivshmem
> >> shared memory mechanism, which exposes shared memory regions from the
> >> host via PCI-E BARs. MSI-X interrupts that are tied to host eventfds are
> >> also exposed.
> >>
> >> Since ivshmem doesn't have an in-tree kernel driver, I have been using
> >> VFIO's NOIOMMU mode to interface with the device. This works wonderfully
> >> for both BAR mapping and MSI-X interrupts. Unfortunately though, binding
> >> the ivshmem device to vfio_pci to use it in this way results in a kernel
> >> taint. I understand that this is because without an IOMMU, VFIO/Linux
> >> has no way of preventing devices from performing malicious access to
> >> other system memory. In the case of ivshmem though, the device does not
> >> have any DMA capabilities.  
> > 
> > The MSI-X interrupt is a DMA.  
> I hadn't realized this. That means then without an IOMMU, an
> MSI-X capable device is capable of reading/writing arbitrary
> memory?

Writing at least.  This is why, even with an IOMMU, there's an opt-in if
that IOMMU lacks interrupt remapping support.
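
For reference, the opt-ins discussed in this thread are visible as module
parameters on a running kernel. The following is a minimal Python sketch, not
an official tool. It assumes the vfio and vfio_iommu_type1 modules are present
and expose enable_unsafe_noiommu_mode and allow_unsafe_interrupts as boolean
files under /sys/module; on platforms that use a different vfio IOMMU backend
the second parameter may simply be absent. The helper names and the sysfs_root
argument are illustrative only.

```python
from pathlib import Path

# Opt-in knobs mentioned in the thread, as (module, parameter) pairs.
UNSAFE_OPT_INS = [
    ("vfio", "enable_unsafe_noiommu_mode"),
    ("vfio_iommu_type1", "allow_unsafe_interrupts"),
]


def read_bool_param(module, param, sysfs_root="/sys"):
    """Return True/False for a boolean module parameter, or None if unreadable."""
    path = Path(sysfs_root) / "module" / module / "parameters" / param
    try:
        text = path.read_text().strip()
    except OSError:
        # Module not loaded, parameter not built in, or no permission.
        return None
    return text in ("Y", "y", "1")


def unsafe_opt_ins(sysfs_root="/sys"):
    """Collect the state of each opt-in, keyed as 'module.parameter'."""
    return {
        f"{module}.{param}": read_bool_param(module, param, sysfs_root)
        for module, param in UNSAFE_OPT_INS
    }


if __name__ == "__main__":
    for name, value in unsafe_opt_ins().items():
        print(name, value)
```

A value of None here only means the file could not be read; it says nothing
about whether the platform actually provides interrupt remapping.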

> >> This has created a situation in which the
> >> safest possible way to access the device (a kernel driver would be
> >> inherently less safe, UIO can't access the MSI-X functionality of the
> >> device) results in a kernel taint, when other, less safe methods don't.  
> > 
> > MSI-X support in UIO was rejected because MSI-X is a DMA and UIO does
> > not support devices that do DMA.  Vfio-noiommu was a compromise that
> > allows using the vfio API while recognizing that it's inherently unsafe.
> >   
> >> In light of this, I propose a change to the VFIO framework that would
> >> allow use cases such as this without a kernel taint. One solution I see
> >> is only tainting when PCI devices with DMA capabilities are bound to
> >> VFIO. It is my understanding that a device's DMA capability can be
> >> determined by checking the Bus Mastering flag in the device's PCI
> >> configuration space, so something like this should be feasible.  
> > 
> > The bus master bit is not a capability for probing; enabling bus master
> > allows a device to perform DMA, including signaling via MSI
> > interrupts.  No bus master, no MSI.
> >   
> >> Perhaps an additional NOIOMMU mode could be introduced which only allows
> >> devices which meet these criteria, too (VFIO_NOIOMMU_NODMA_IOMMU?).
> >> Along with a separate Kconfig option, this would allow users to enable
> >> this safe usage at kernel build time, while still preventing the
> >> possibility of an unsafe DMA capable device from being used.
> >>
> >> I'm curious to hear feedback on this. If this is something that can be
> >> merged, I'd be more than happy to write a patch.  
> > 
> > Add a vIOMMU to your VM configuration (e.g. intel-iommu) and use proper
> > vfio in the guest.  Thanks,  
> I had looked into this, but my application also targets ppc64, and a
> cross-platform solution is therefore necessary.
> 
> Strangely enough when booting a VM on ppc64, the kernel /does/ report
> an IOMMU, but there's only 1 group that contains all devices, so it
> doesn't seem usable.

Yes, AIUI ppc64 PAPR machines always have an IOMMU and there is a
SPAPR IOMMU model in vfio.  Maybe work with QEMU ppc64 developers to
figure out how the ivshmem device can be in its own group.  This
probably requires configuring the VM with another PCI host bridge and
attaching the ivshmem device under it.
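
To see how devices are grouped inside a guest, the IOMMU group layout can be
read from sysfs. The following is a minimal Python sketch, assuming the usual
/sys/kernel/iommu_groups/<group>/devices/<device> layout. The function names
and the sysfs_root argument are illustrative, and "isolated" here is a
simplification meaning only that the device is the sole member of its group.

```python
from pathlib import Path


def iommu_groups(sysfs_root="/sys"):
    """Map each IOMMU group name to the sorted list of devices it contains."""
    base = Path(sysfs_root) / "kernel" / "iommu_groups"
    groups = {}
    if not base.is_dir():
        return groups  # no IOMMU groups exposed by this kernel
    for group in base.iterdir():
        devices = group / "devices"
        if devices.is_dir():
            groups[group.name] = sorted(d.name for d in devices.iterdir())
    return groups


def is_isolated(device, groups):
    """True if `device` is the only member of some group."""
    return any(members == [device] for members in groups.values())


if __name__ == "__main__":
    for name, members in sorted(iommu_groups().items()):
        print(f"group {name}: {' '.join(members)}")
```

In the situation described above, every device would show up under a single
group, so is_isolated() would return False for the ivshmem device.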

> I guess it all boils down to this - does this usage of VFIO-NOIOMMU
> with an MSI-X device constitute a security risk? If so, it seems
> I'll have no choice but to write a kernel driver for a cross-platform
> solution.

There is no property we can detect about a PCI device to determine that
it doesn't support DMA.  All PCI devices have DMA available to them.
Clearly we can't simply enforce that bus master is never enabled
because that breaks your use case of needing MSI interrupts and
presumes devices actually honor that bit and don't have more nefarious
ways of enabling it.  So if we have no way to know the device
capabilities or the intention of the user, or exploitability of the
user, I don't see how we can create a policy that singles out this use
case as trusted.  Thanks,

Alex
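
As an illustration of the point about the bus master bit, the sketch below
decodes a raw copy of a device's PCI configuration space (for example the
bytes of /sys/bus/pci/devices/<BDF>/config, at least 64 bytes long). It
reports only the current state of the Bus Master Enable and MSI-X Enable
bits; it cannot show that a device is incapable of DMA, which is exactly the
limitation described above. The register offsets and bit values are the
standard PCI ones; the function names are illustrative. Note that
unprivileged reads of the sysfs config file are typically limited to the
first 64 bytes, so the capability walk may find nothing without root.

```python
import struct

PCI_COMMAND = 0x04              # 16-bit command register
PCI_COMMAND_MASTER = 0x0004     # Bus Master Enable bit
PCI_STATUS = 0x06               # 16-bit status register
PCI_STATUS_CAP_LIST = 0x0010    # capability list present
PCI_CAPABILITY_LIST = 0x34      # pointer to first capability
PCI_CAP_ID_MSIX = 0x11          # MSI-X capability ID
PCI_MSIX_FLAGS_ENABLE = 0x8000  # MSI-X Enable bit in message control


def bus_master_enabled(cfg):
    """True if Bus Master Enable is currently set in the command register."""
    (command,) = struct.unpack_from("<H", cfg, PCI_COMMAND)
    return bool(command & PCI_COMMAND_MASTER)


def find_capability(cfg, cap_id):
    """Walk the capability list; return the offset of cap_id, or None."""
    (status,) = struct.unpack_from("<H", cfg, PCI_STATUS)
    if not status & PCI_STATUS_CAP_LIST:
        return None
    pos = cfg[PCI_CAPABILITY_LIST] & 0xFC
    seen = set()  # guard against malformed, looping lists
    while pos and pos not in seen and pos + 1 < len(cfg):
        seen.add(pos)
        if cfg[pos] == cap_id:
            return pos
        pos = cfg[pos + 1] & 0xFC
    return None


def msix_enabled(cfg):
    """True if an MSI-X capability is found and its enable bit is set."""
    pos = find_capability(cfg, PCI_CAP_ID_MSIX)
    if pos is None or pos + 4 > len(cfg):
        return False
    (flags,) = struct.unpack_from("<H", cfg, pos + 2)
    return bool(flags & PCI_MSIX_FLAGS_ENABLE)
```

A device using MSI-X would normally show both bits set while its driver is
active, consistent with "No bus master, no MSI."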



