[vfio-users] problem with passthrough and unsafe interrupts

Janusz januszmk6 at gmail.com
Sat Sep 12 21:41:51 UTC 2015


On 12.09.2015 at 23:19, Janusz wrote:
> On 12.09.2015 at 22:44, Alex Williamson wrote:
>> On Sat, Sep 12, 2015 at 12:59 PM, Janusz <januszmk6 at gmail.com> wrote:
>>
>>     On 12.09.2015 at 19:57, Alex Williamson wrote:
>>>     On Sat, Sep 12, 2015 at 11:04 AM, Janusz <januszmk6 at gmail.com>
>>>     wrote:
>>>
>>>         Hello,
>>>
>>>         I have a question about allowing unsafe interrupts: what
>>>         exactly happens when we allow them, and why is it unsafe?
>>>
>>>
>>>     On a standard PC, Message Signaled Interrupts (MSI) are
>>>     triggered via a DMA write by the device to a special address
>>>     range.  In the early revisions of VT-d, the IOMMU did not
>>>     protect this range, which allowed devices to effectively spoof
>>>     interrupts from other devices and potentially run attacks
>>>     against the host using DMA writes to this interrupt block. 
>>>     Later versions of VT-d introduced the interrupt remapping
>>>     feature which, among other things, protects this range so that a
>>>     device can only signal the interrupts programmed for it.
>>>
>>>     Another thing that interrupt remapping does is provide a
>>>     translation between IOAPIC and X2APIC interrupt domains on the
>>>     system.  Pretty much all new Intel processors support X2APIC,
>>>     which necessitates interrupt remapping support.  So, I'd be
>>>     really surprised if your new Skylake system doesn't have
>>>     interrupt remapping.  You should really only need to enable the
>>>     unsafe interrupts option if you try to assign the device, it
>>>     fails, and dmesg tells you that you need it.  And only then if
>>>     you trust your guests not to be malicious to the host.
>>
>>     Thanks for the explanation.
>>
>>>      
>>>
>>>         I am asking because I am not able to pass through my GPU
>>>         without allowing unsafe interrupts, and I have problems
>>>         running my virtual machine, as it sometimes reboots itself
>>>         before it is able to start Windows. I know that there is
>>>         also a problem with using the iGPU and a non-UEFI BIOS, but
>>>         it also happens with OVMF and a UEFI ROM passed for my GPU
>>>         (except that with OVMF I sometimes also get very high load).
>>>         Without passing VGA, everything works fine. Is it possible
>>>         that this is caused by those unsafe interrupts?
>>>         My hardware: i7 6700k, MSI z170a M7, Sapphire R9 290
>>>
>>>
>>>     Congratulations, you're the first person to show up with a
>>>     Skylake system, welcome to trailblazing a new Intel platform. 
>>>     When you say you're not able to assign the GPU without the
>>>     unsafe interrupts option, does that mean when you run the VM
>>>     QEMU exits and dmesg reports:
>>>
>>>     "No interrupt remapping support.  Use the module param
>>>     "allow_unsafe_interrupts" to enable VFIO IOMMU support on this
>>>     platform"
>>>
>>>     That would be pretty epic if Intel dropped interrupt remapping
>>>     support on Skylake.  Or perhaps you just mean that you have
>>>     something closer to working with unsafe interrupts.  If the
>>>     hardware supports interrupt remapping, then the unsafe option
>>>     makes absolutely no difference.  It's only meant to allow an
>>>     otherwise un-allowed scenario by making the user opt-in to the
>>>     security risk.
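A minimal sketch of the check described above: scan kernel log text for the VFIO complaint before reaching for the unsafe option. The function name is hypothetical, and the match relies only on the opening words of the message quoted above; the exact wording may differ between kernel versions.

```python
import re

# Opening words of the message quoted above (assumed stable across versions).
NEEDLE = "No interrupt remapping support"


def needs_unsafe_interrupts(kernel_log: str) -> bool:
    """Return True if the kernel log contains the VFIO complaint quoted above."""
    # Collapse whitespace first, because terminals and mail clients may
    # re-wrap the message across lines.
    normalised = re.sub(r"\s+", " ", kernel_log)
    return NEEDLE in normalised
```

The log text could come from something like `subprocess.run(["dmesg"], capture_output=True, text=True).stdout`, which may require root on some systems. If the function returns False, the option should make no difference, as explained above.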
>>
>>     Actually, I didn't have CONFIG_IRQ_REMAP set in my kernel config...
>>     Now, when I set vfio_iommu_type1.allow_unsafe_interrupts=0, it's
>>     working.
>>
>>>
>>>         These are my options for non-UEFI:
>>>         -enable-kvm -m 10000 -cpu host -smp
>>>         8,cores=4,threads=2,sockets=1
>>>         -device
>>>         ioh3420,bus=pci.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1
>>>         -device vfio-pci,host=01:00.0,multifunction=on,x-vga=on
>>>         -device vfio-pci,host=01:00.1
>>>
>>>
>>>     Looks like you're adding an ioh3420 for absolutely no reason. 
>>>     Same with the multifunction option.
>>
>>     I didn't know that; I took it from a howto on
>>     wiki.debian.org <http://wiki.debian.org>. I have disabled it now.
>>
>>>      
>>>
>>>         and for UEFI:
>>>
>>>         -drive
>>>         if=pflash,format=raw,readonly,file=/home/janusz/uefi/OVMF_CODE.fd
>>>         -drive if=pflash,format=raw,file=/home/janusz/uefi/OVMF_VARS.fd
>>>         -enable-kvm -m 10000 -cpu host -smp
>>>         8,cores=4,threads=2,sockets=1
>>>         -device
>>>         vfio-pci,host=01:00.0,romfile=/home/janusz/uefi/uefi-vga.bin,multifunction=on
>>>         -device vfio-pci,host=01:00.1
>>>
>>>
>>>     Useless multifunction option again.  Why does this one specify a
>>>     ROM file?  Does the on-card ROM not support UEFI?
>>
>>     Right, my GPU doesn't support UEFI, but I found a UEFI ROM for
>>     this GPU on the web.
>>
>>
>>>      
>>>
>>>         + options for assigning /dev/sdb and two USB devices
>>>
>>>         I am using an OVMF build made recently from master, kernel
>>>         4.2.0 and qemu-2.4.50
>>>
>>>
>>>         My IOMMU groups:
>>>
>>>         /sys/kernel/iommu_groups/0/devices/0000:00:00.0
>>>
>>>      
>>>
>>>         /sys/kernel/iommu_groups/1/devices/0000:00:01.0
>>>         /sys/kernel/iommu_groups/1/devices/0000:01:00.0
>>>         /sys/kernel/iommu_groups/1/devices/0000:01:00.1
>>>
>>>
>>>     So this ought to be the group we care about.
>>>      
>>>
>>>         /sys/kernel/iommu_groups/2/devices/0000:00:02.0
>>>         /sys/kernel/iommu_groups/3/devices/0000:00:08.0
>>>         /sys/kernel/iommu_groups/4/devices/0000:00:14.0
>>>         /sys/kernel/iommu_groups/4/devices/0000:00:14.2
>>>         /sys/kernel/iommu_groups/5/devices/0000:00:15.0
>>>         /sys/kernel/iommu_groups/5/devices/0000:00:15.1
>>>         /sys/kernel/iommu_groups/6/devices/0000:00:16.0
>>>         /sys/kernel/iommu_groups/7/devices/0000:00:17.0
>>>
>>>      
>>>
>>>         /sys/kernel/iommu_groups/8/devices/0000:00:1c.0
>>>         /sys/kernel/iommu_groups/8/devices/0000:00:1c.2
>>>         /sys/kernel/iommu_groups/8/devices/0000:00:1c.7
>>>         /sys/kernel/iommu_groups/8/devices/0000:02:00.0
>>>         /sys/kernel/iommu_groups/8/devices/0000:03:00.0
>>>         /sys/kernel/iommu_groups/8/devices/0000:04:00.0
>>>
>>>
>>>     Lovely, all the PCH root ports are grouped together, people
>>>     looking for hardware please take note.
>>>      
>>>
>>>         /sys/kernel/iommu_groups/9/devices/0000:00:1e.0
>>>         /sys/kernel/iommu_groups/10/devices/0000:00:1f.0
>>>         /sys/kernel/iommu_groups/10/devices/0000:00:1f.2
>>>         /sys/kernel/iommu_groups/10/devices/0000:00:1f.3
>>>         /sys/kernel/iommu_groups/10/devices/0000:00:1f.4
>>>
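A listing like the one above can be produced by walking sysfs. This is a rough sketch that assumes the `/sys/kernel/iommu_groups/<group>/devices/<address>` layout visible in the listing; the function names are hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses it contains."""
    groups = defaultdict(list)
    # Layout assumed from the listing above: <root>/<group>/devices/<address>
    for dev in Path(root).glob("*/devices/*"):
        groups[int(dev.parent.parent.name)].append(dev.name)
    # Sort numerically by group so that group 10 follows group 9.
    return {g: sorted(devs) for g, devs in sorted(groups.items())}


def group_of(address, groups):
    """Return the group number containing a PCI address, or None."""
    for group, devices in groups.items():
        if address in devices:
            return group
    return None
```

With the listing above, `group_of("0000:01:00.0", iommu_groups())` would be expected to report group 1, alongside the root port 0000:00:01.0 and the GPU's audio function. An empty result means the directory is missing or holds no groups.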
>>>         and lspci:
>>>
>>>         00:00.0 Host bridge: Intel Corporation Sky Lake Host Bridge/DRAM
>>>         Registers (rev 07)
>>>
>>>      
>>>
>>>         00:01.0 PCI bridge: Intel Corporation Sky Lake PCIe
>>>         Controller (x16)
>>>         (rev 07)
>>>
>>>      
>>>
>>>         00:02.0 VGA compatible controller: Intel Corporation Sky
>>>         Lake Integrated
>>>         Graphics (rev 06)
>>>
>>>      
>>>
>>>         00:08.0 System peripheral: Intel Corporation Sky Lake
>>>         Gaussian Mixture Model
>>>
>>>      
>>>
>>>         00:14.0 USB controller: Intel Corporation Sunrise Point-H
>>>         USB 3.0 xHCI
>>>         Controller (rev 31)
>>>         00:14.2 Signal processing controller: Intel Corporation
>>>         Sunrise Point-H
>>>         Thermal subsystem (rev 31)
>>>
>>>
>>>     Don't you like how they've put the USB3 controller on a
>>>     multifunction device with some random thermal management device,
>>>     without isolation of course.  Wonder what chaos assigning that
>>>     to a VM would cause.
>>>      
>>>
>>>         00:15.0 Signal processing controller: Intel Corporation
>>>         Sunrise Point-H
>>>         LPSS I2C Controller #0 (rev 31)
>>>         00:15.1 Signal processing controller: Intel Corporation
>>>         Sunrise Point-H
>>>         LPSS I2C Controller #1 (rev 31)
>>>
>>>      
>>>
>>>         00:16.0 Communication controller: Intel Corporation Sunrise
>>>         Point-H CSME
>>>         HECI #1 (rev 31)
>>>
>>>      
>>>
>>>         00:17.0 SATA controller: Intel Corporation Device a102 (rev 31)
>>>
>>>      
>>>
>>>         00:1c.0 PCI bridge: Intel Corporation Sunrise Point-H PCI
>>>         Express Root
>>>         Port #1 (rev f1)
>>>         00:1c.2 PCI bridge: Intel Corporation Sunrise Point-H PCI
>>>         Express Root
>>>         Port #3 (rev f1)
>>>         00:1c.7 PCI bridge: Intel Corporation Sunrise Point-H PCI
>>>         Express Root
>>>         Port #8 (rev f1)
>>>         00:1e.0 Signal processing controller: Intel Corporation
>>>         Sunrise Point-H
>>>         LPSS UART #0 (rev 31)
>>>
>>>      
>>>
>>>         00:1f.0 ISA bridge: Intel Corporation Sunrise Point-H LPC
>>>         Controller
>>>         (rev 31)
>>>         00:1f.2 Memory controller: Intel Corporation Sunrise Point-H
>>>         PMC (rev 31)
>>>         00:1f.3 Audio device: Intel Corporation Sunrise Point-H HD
>>>         Audio (rev 31)
>>>         00:1f.4 SMBus: Intel Corporation Sunrise Point-H SMBus (rev 31)
>>>
>>>
>>>     Oh look, the audio controller that used to be a nice separate
>>>     device is now also buried into a multifunction device without
>>>     isolation, no more assigning that to a VM.
>>>      
>>>
>>>         01:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
>>>         [AMD/ATI] Hawaii PRO [Radeon R9 290]
>>>
>>>         01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI]
>>>         Device aac8
>>>
>>>      
>>>
>>>         02:00.0 USB controller: ASMedia Technology Inc. Device 1242
>>>         03:00.0 Ethernet controller: Qualcomm Atheros Device e0a1
>>>         (rev 10)
>>>         04:00.0 Ethernet controller: Broadcom Corporation NetXtreme
>>>         BCM5721
>>>         Gigabit Ethernet PCI Express (rev 21)
>>>
>>>
>>>         If it's not a problem with unsafe interrupts, does anyone
>>>         know why this can happen?
>>>
>>>
>>>     I don't see any reason to implicate unsafe interrupts.  It would
>>>     be nice to understand exactly what happens without specifying
>>>     unsafe interrupts.  It is possible to turn off interrupt
>>>     remapping support with kernel config options and boot options,
>>>     so please make sure you have CONFIG_IRQ_REMAP=y in your kernel
>>>     and aren't disabling it on the commandline.
>>
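The two checks suggested above (the kernel config option and the boot command line) can be sketched as follows. The helper names are hypothetical. The config locations are assumptions: `/proc/config.gz` exists only when the kernel was built with CONFIG_IKCONFIG_PROC, and `/boot/config-<release>` is a distribution convention. Only the `intremap=off` boot parameter is checked; other ways of disabling remapping are not covered.

```python
import gzip
import platform
from pathlib import Path


def config_enabled(config_text, option="CONFIG_IRQ_REMAP"):
    """Return True if the kernel config text contains `<option>=y`."""
    return any(line.strip() == f"{option}=y" for line in config_text.splitlines())


def intremap_disabled(cmdline):
    """Return True if the boot command line carries `intremap=off`."""
    return "intremap=off" in cmdline.split()


def read_kernel_config():
    """Return the running kernel's config text, or None if it is not found."""
    proc = Path("/proc/config.gz")  # assumption: needs CONFIG_IKCONFIG_PROC
    if proc.exists():
        return gzip.decompress(proc.read_bytes()).decode()
    boot = Path("/boot") / f"config-{platform.release()}"  # distro convention
    if boot.exists():
        return boot.read_text()
    return None
```

A possible use: call `config_enabled(read_kernel_config() or "")` and `intremap_disabled(Path("/proc/cmdline").read_text())`; the first should be True and the second False before unsafe interrupts are even considered.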
>>     So, now it works with unsafe interrupts turned off, but any idea
>>     why the virtual machine restarts itself most of the time?
>>     When I try OVMF, it sometimes restarts when Windows is
>>     starting to load (both the Windows installer and the installed
>>     OS, Windows 8 and Windows 10), sometimes at the TianoCore logo,
>>     and sometimes there is this huge CPU usage... I also noticed that
>>     the OVMF boot menu detects only a 2 GHz CPU and 6 GB of RAM, but
>>     the RAM test at startup detects the proper amount.
>>     Without OVMF, I have no idea at which step it restarts, as the
>>     monitor gets a signal only once Windows loads, and after Windows
>>     loads there is no problem using the VM. But I know it is
>>     restarting because of the problem with using the iGPU with VGA
>>     passthrough: the colors on one of my monitors are corrupted, and
>>     when I restore them by changing the mode with xrandr, they break
>>     again until Windows boots. I also hear some weird noise in the
>>     sound when the VM is starting, and this noise also indicates that
>>     the VM is restarting (I have PulseAudio). The iGPU problem is also
>>     the reason why I want to use OVMF, so that I don't have to fix the
>>     colors or patch the kernel and disable some iGPU functions, but
>>     with OVMF the restarting problem happens more frequently.
>>
>>
>> So it sounds like you're attempting to do VGA mode assignment with
>> i915 as the host graphics without patching your kernel for the i915
>> VGA arbitration issue.  That's just not going to work.  All of the
>> VGA region accesses for your assigned device are being intercepted by
>> i915, causing your color issues and who knows what other damage to
>> the integrity of your system.  VGA mode assignment with Intel
>> integrated primary host graphics requires both a kernel patch and a
>> commandline option to enable it, you're playing with your own fate
>> otherwise.
>>
>> On the OVMF side, you have no log messages from QEMU/libvirt
>> indicating a VM reboot and nothing in dmesg when a VM reboot occurs? 
>> Is there a BSOD from the guest or just a silent reboot?  Is there
>> some reason you're running the development branch of QEMU rather than
>> 2.4.0 proper?  Do you have the AMD Catalyst driver installed in the
>> guest?
> No BSOD, only a silent reboot, or a reset at the UEFI BIOS screen.
> There is also the reset issue for the GPU (I think that was already
> fixed for Hawaii in some kernel/QEMU version, but the monitor still
> sometimes shows the old display after turning off or resetting the
> VM). In dmesg I found only these:
>
> [10145.621272] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
> [10145.641778] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
> [10145.760058] [drm:check_wm_state] *ERROR* mismatch in DDB state pipe A plane 1 (expected (0,0), found (0,289))
> [10145.760061] [drm:check_wm_state] *ERROR* mismatch in DDB state pipe A cursor (expected (0,0), found (289,297))
> [10145.760062] [drm:check_wm_state] *ERROR* mismatch in DDB state pipe B plane 1 (expected (0,0), found (297,586))
> [10145.760063] [drm:check_wm_state] *ERROR* mismatch in DDB state pipe B cursor (expected (0,0), found (586,594))
> [10148.490876] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x19 at 0x270
> [10148.490881] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x1b at 0x2d0
> [10154.080574] usb 1-12: reset low-speed USB device number 5 using xhci_hcd
> [10154.372122] usb 1-12: ep 0x81 - rounding interval to 64 microframes, ep desc says 80 microframes
> [10194.443399] kvm: zapping shadow pages for mmio generation wraparound
> [10194.453708] kvm: zapping shadow pages for mmio generation wraparound
> [10165.930150] usb 1-12: ep 0x81 - rounding interval to 64 microframes, ep desc says 80 microframes
> [10168.912066] usb 1-12: reset low-speed USB device number 5 using xhci_hcd
> [10169.203902] usb 1-12: ep 0x81 - rounding interval to 64 microframes, ep desc says 80 microframes
>
>
> I may be missing something, as I get a lot of warnings from the i915
> driver (a known bug with i915 and Skylake), but I grepped for kvm and
> vfio and didn't find anything else.
>
> I am now running the dev version of QEMU because I wanted to test
> whether a newer version would give a better result (it didn't), and I
> haven't compiled the stable version again yet.
And yes, I have the AMD Catalyst drivers installed in the guest, but as
this also happens before Windows starts to boot and when starting the
Windows installer, I don't think this is the reason.
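The kvm/vfio filtering of the kernel log mentioned above can be sketched as below. The function name is hypothetical, and the keyword list is only a starting point; the excerpt above suggests that `vgaarb` and `drm` lines may also be worth keeping when i915 is the host driver.

```python
import re


def grep_log(log_text, *keywords):
    """Return the log lines that mention any keyword, case-insensitively."""
    if not keywords:
        return []
    # Escape the keywords so that they are matched literally, not as regexes.
    pattern = re.compile("|".join(map(re.escape, keywords)), re.IGNORECASE)
    return [line for line in log_text.splitlines() if pattern.search(line)]
```

For example, `grep_log(log, "kvm", "vfio", "vgaarb")` keeps the decode changes and the VFIO capability messages from the excerpt above while dropping the USB resets.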