[vfio-users] VM doesn't boot, hangs with R9 Fury passthrough

Matti Niemenmaa matti.niemenmaa+vfio at iki.fi
Wed Sep 2 17:22:15 UTC 2015


Hi José,

Since my last message I did try with SeaBIOS (version 
1.8.2-20150617_082717-anatol) as well. And yeah, I did have to 
re-install Windows...

The SeaBIOS-using VM does work better, since at least the SeaBIOS screen 
(that reports the version number and says "booting from hard disk" and 
so on) shows up on the monitor connected to the R9 Fury. From then on 
the behaviour seems to vary:

* In my initial attempts it hung whilst showing the Windows 10 logo. 
There were also dmesg errors: see below (I did not record these but I 
believe they were the same as I see now).

* Now as I try it again, it reboots before getting that far.

* If I then boot to Windows with "-vga std" and no passthrough, it 
displays some options related to debugging failing boots. I first 
disabled automatically rebooting on error and then enabled some sort of 
low resolution mode, but I'm not sure whether both settings stuck, since 
making a selection caused an immediate reboot. After that, the VM again 
hangs while displaying the Windows 10 logo.

But now there's also something in dmesg. First, the following:

DMAR: ERROR: DMA PTE for vPFN 0xfea00 already set (to c7d9c3003 not 383feea00083)
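
To make the numbers in that message easier to read, here is a minimal Python sketch that splits them up. It assumes 4 KiB pages and the usual VT-d second-level PTE layout (bit 0 read, bit 1 write, bit 7 large page). The name decode_pte is made up for this illustration and is not a kernel function.

```python
# Assumption: 4 KiB pages, so a vPFN becomes an IOVA by shifting left 12 bits.
PAGE_SHIFT = 12


def decode_pte(pte):
    """Split a PTE value into its page address and low flag bits.

    Assumed bit meanings (VT-d second-level PTE):
    bit 0 = read, bit 1 = write, bit 7 = large page.
    """
    flags = pte & 0xFFF
    return {
        "addr": pte & ~0xFFF,
        "read": bool(flags & 0x1),
        "write": bool(flags & 0x2),
        "large_page": bool(flags & 0x80),
    }


# Values copied from the DMAR error line.
iova = 0xfea00 << PAGE_SHIFT
existing = decode_pte(0xc7d9c3003)
wanted = decode_pte(0x383feea00083)

print("IOVA:    ", hex(iova))
print("existing:", hex(existing["addr"]), existing)
print("wanted:  ", hex(wanted["addr"]), wanted)
```

Under those assumptions the IOVA is 0xfea00000, the existing entry points at 0xc7d9c3000 as a normal read/write page, and the entry being installed points at 0x383feea00000 with the large-page bit set. The page frame number of the latter, 0x383feea00, also shows up among the raw stack words of the first trace in this message.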

And then the following stack trace, repeated hundreds of times:

WARNING: CPU: 3 PID: 1308 at drivers/vfio/vfio_iommu_type1.c:364 vfio_remove_dma+0x1b6/0x1f0 [vfio_iommu_type1]()
Modules linked in: tun rpcsec_gss_krb5 nfsv4 dns_resolver 
snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic bridge 
stp llc nct6775 hwmon_vid nls_iso8859_1 nls_cp437 radeon snd_hda_intel 
snd_hda_codec x86_pkg_temp_thermal joydev snd_hwdep iTCO_wdt coretemp 
drm_kms_helper kvm_intel xpad mousedev ttm snd_hda_core 
iTCO_vendor_support kvm snd_pcm e1000e drm mei_me ptp pps_core snd_timer 
input_leds ff_memless i2c_algo_bit crc32_pclmul mei lpc_ich led_class 
mfd_core evdev wmi processor button sch_fq_codel nfsd nfs auth_rpcgss 
oid_registry nfs_acl lockd grace sunrpc fscache ip_tables x_tables 
usbhid dm_mod sd_mod atkbd libps2 ahci libahci libata scsi_mod i8042 
serio vfio_pci vfio_virqfd vfio_iommu_type1 vfio
CPU: 3 PID: 1308 Comm: qemu-system-x86 Tainted: G        W 
4.2.0-1-bfq #1
Hardware name: ASUS All Series/X99-A/USB 3.1, BIOS 1801 05/15/2015
  0000000000000000 000000009022e052 ffff880db1193c78 ffffffff816366e4
  0000000000000000 0000000000000000 ffff880db1193cb8 ffffffff810b4906
  00000000fec00000 ffff880fd4314e88 0000000000000000 0000000383feea00
Call Trace:
  [<ffffffff816366e4>] dump_stack+0x4c/0x6e
  [<ffffffff810b4906>] warn_slowpath_common+0x86/0xc0
  [<ffffffff810b4a3a>] warn_slowpath_null+0x1a/0x20
  [<ffffffffa000cb76>] vfio_remove_dma+0x1b6/0x1f0 [vfio_iommu_type1]
  [<ffffffffa000d22b>] vfio_iommu_type1_ioctl+0x3eb/0xa2e [vfio_iommu_type1]
  [<ffffffffa036603c>] ? kvm_set_memory_region+0x3c/0x60 [kvm]
  [<ffffffffa0366485>] ? kvm_vm_ioctl+0x425/0x740 [kvm]
  [<ffffffffa0001987>] vfio_fops_unl_ioctl+0x77/0x290 [vfio]
  [<ffffffff81201a3d>] do_vfs_ioctl+0x29d/0x480
  [<ffffffff8120ba67>] ? __fget+0x77/0xb0
  [<ffffffff81201c99>] SyS_ioctl+0x79/0x90
  [<ffffffff8163bcb2>] entry_SYSCALL_64_fastpath+0x16/0x75

This trace appears hundreds of times at the same timestamp. Then the VM 
appears to hang, but with high CPU usage (saturating between 6 and 8 of 
my 12 logical cores). Had I not been writing this message I probably 
wouldn't have noticed the next step: some 15 minutes later an almost 
identical backtrace was again emitted hundreds of times, after which the 
VM shut down. I wasn't paying attention, so I'm not sure whether anything 
meaningful took place on the VM's monitor while this occurred, but I 
doubt it. The other trace was as follows:

WARNING: CPU: 9 PID: 1313 at drivers/vfio/vfio_iommu_type1.c:364 vfio_remove_dma+0x1b6/0x1f0 [vfio_iommu_type1]()
Modules linked in: tun rpcsec_gss_krb5 nfsv4 dns_resolver 
snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic bridge 
stp llc nct6775 hwmon_vid nls_iso8859_1 nls_cp437 radeon snd_hda_intel 
snd_hda_codec x86_pkg_temp_thermal joydev snd_hwdep iTCO_wdt coretemp 
drm_kms_helper kvm_intel xpad mousedev ttm snd_hda_core 
iTCO_vendor_support kvm snd_pcm e1000e drm mei_me ptp pps_core snd_timer 
input_leds ff_memless i2c_algo_bit crc32_pclmul mei lpc_ich led_class 
mfd_core evdev wmi processor button sch_fq_codel nfsd nfs auth_rpcgss 
oid_registry nfs_acl lockd grace sunrpc fscache ip_tables x_tables 
usbhid dm_mod sd_mod atkbd libps2 ahci libahci libata scsi_mod i8042 
serio vfio_pci vfio_virqfd vfio_iommu_type1 vfio
CPU: 9 PID: 1313 Comm: qemu-system-x86 Tainted: G        W 
4.2.0-1-bfq #1
Hardware name: ASUS All Series/X99-A/USB 3.1, BIOS 1801 05/15/2015
  0000000000000000 00000000f58b5048 ffff880fac0ebaa8 ffffffff816366e4
  0000000000000000 0000000000000000 ffff880fac0ebae8 ffffffff810b4906
  0000000100000000 ffff880fd4314e88 0000000000000000 0000000383fefe00
Call Trace:
  [<ffffffff816366e4>] dump_stack+0x4c/0x6e
  [<ffffffff810b4906>] warn_slowpath_common+0x86/0xc0
  [<ffffffff810b4a3a>] warn_slowpath_null+0x1a/0x20
  [<ffffffffa000cb76>] vfio_remove_dma+0x1b6/0x1f0 [vfio_iommu_type1]
  [<ffffffffa000cbd0>] vfio_iommu_unmap_unpin_all+0x20/0x40 [vfio_iommu_type1]
  [<ffffffffa000cd14>] vfio_iommu_type1_detach_group+0x124/0x130 [vfio_iommu_type1]
  [<ffffffffa00005ae>] __vfio_group_unset_container+0x3e/0x120 [vfio]
  [<ffffffffa00006b8>] vfio_group_try_dissolve_container+0x28/0x30 [vfio]
  [<ffffffffa0000729>] vfio_device_fops_release+0x29/0x40 [vfio]
  [<ffffffff811f14bc>] __fput+0x9c/0x1f0
  [<ffffffff811f165e>] ____fput+0xe/0x10
  [<ffffffff810cff8b>] task_work_run+0x9b/0xb0
  [<ffffffff810b7025>] do_exit+0x395/0xb10
  [<ffffffff810b782b>] do_group_exit+0x3b/0xb0
  [<ffffffff810c261c>] get_signal+0x23c/0x630
  [<ffffffff810bf58f>] ? recalc_sigpending+0x1f/0x60
  [<ffffffff81014337>] do_signal+0x37/0xa30
  [<ffffffff810c32b7>] ? do_sigtimedwait+0xe7/0x1f0
  [<ffffffff81127151>] ? SyS_futex+0x81/0x180
  [<ffffffff81014d8b>] do_notify_resume+0x5b/0x70
  [<ffffffff8163be84>] int_signal+0x12/0x17

The source code line pointed to is the WARN_ON in the following snippet 
in vfio_unmap_unpin (presumably inlined into vfio_remove_dma, which is 
why only that shows in the trace):

		phys = iommu_iova_to_phys(domain->domain, iova);
		if (WARN_ON(!phys)) {
			iova += PAGE_SIZE;
			continue;
		}
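
A rough way to picture why the warning repeats so many times: the check sits inside a loop that advances one page at a time, and WARN_ON (unlike WARN_ON_ONCE) fires on every pass. The Python sketch below is only a simplified model of that loop, not the kernel code. The dict of mappings and the function name count_unmap_warnings are hypothetical stand-ins, and the 2 MiB range in the usage line is an arbitrary example size.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def count_unmap_warnings(mappings, start, size):
    """Walk [start, start + size) page by page, like the loop in the snippet.

    `mappings` is a hypothetical dict of IOVA -> physical address standing in
    for iommu_iova_to_phys(); a missing entry plays the role of phys == 0.
    Returns how many times the WARN_ON branch would be taken.
    """
    warnings = 0
    iova = start
    end = start + size
    while iova < end:
        phys = mappings.get(iova, 0)
        if not phys:
            warnings += 1      # WARN_ON(!phys): one backtrace per page
            iova += PAGE_SIZE  # skip this page and carry on
            continue
        # Mapped page: the real code would unmap and unpin here.
        iova += PAGE_SIZE
    return warnings


# A 2 MiB range with no translations at all, starting at the IOVA from the
# DMAR error, yields one warning for each 4 KiB page.
print(count_unmap_warnings({}, 0xfea00000, 2 * 1024 * 1024))
```

In this model a range with no valid translations produces one warning per page, which would be consistent with seeing hundreds of identical backtraces at a single timestamp.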

As I'm lacking the required low-level knowledge I'll once again simply 
say that any assistance would be welcomed. I'm glad I made at least some 
progress, though. :-)

— Matti

On 2015-09-01 22:51, José Ramón Muñoz Pekkarinen wrote:
> 	Hi Matti,
>
> 	I had similar results on this, using an HD5670 and a Windows 7 guest
> installed on top of OVMF. No signal on the input, nothing strange from qemu,
> and no specific error in /var/log/messages, but the host becomes unresponsive
> enough that I usually have to hard reset it.
>
> 	I then tried the same setup using SeaBIOS and, when the VM decides to
> boot correctly, it works quite well, and I can play something. The only
> problem is that you'll probably have to set up Windows again to work with
> old-fashioned BIOS.
>
> 	I hope it helps.
>
> 	José.
>
> On Saturday 29 August 2015 21:29:10 Matti Niemenmaa wrote:
>> Hello all,
>>
>> I'm trying to pass through an AMD Radeon R9 Fury to a Windows 10 guest,
>> but the VM doesn't seem to work at all when I tell qemu to add the
>> vfio-pci device. No errors of any kind are output anywhere that I can see.
>>
>> Possibly relevant hardware and software versions:
>>
>> Motherboard: Asus X99-A
>> CPU:         Intel Core i7-5930k
>> GPU host:    AMD Radeon R9 270X
>> GPU guest:   AMD Radeon R9 Fury
>> Linux 4.1.6 (I also tried 4.2-rc8 with the same results)
>> QEMU 2.4.0
>> edk2.git-ovmf-x64-0-20150804.b1143.g8ca1489
>>
>> The following command successfully boots the VM with emulated VGA
>> graphics and working audio and networking (modulo some audio distortion,
>> which I haven't bothered looking into yet):
>>
>> qemu-system-x86_64
>>     -nodefconfig -serial none -parallel none -nodefaults -name The-Windows
>>     -enable-kvm -cpu host -smp sockets=1,cores=6,threads=2 -m 16G
>>     -soundhw hda
>>     -drive if=pflash,format=raw,readonly,file=.../edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd
>>     -drive if=pflash,format=raw,file=.../windows-ovmf-vars.fd
>>     -drive if=virtio,format=raw,file=.../windows.img,index=0,cache=none
>>     -rtc base=localtime
>>     -net nic,model=virtio -net bridge,br=br0
>>     -alt-grab
>>     -boot order=c
>>     -vga std
>>
>> But replacing the last option, "-vga std", with the following, doesn't
>> work at all as intended:
>>
>>     -device vfio-pci,host=02:00.0,addr=06.0
>>
>> The monitor the Fury is plugged into doesn't receive a signal, nor does
>> anything show up in the window opened by qemu, so there's no visual
>> feedback at all. It appears that Windows doesn't even start, since I
>> can't connect to it over VNC despite that working when "-vga std" is
>> used. As far as I can tell the VM hangs indefinitely at this point.
>>
>> qemu doesn't report any errors (in fact it outputs nothing), nor is
>> there anything relevant-looking in dmesg.
>>
>> I'm at a loss as to how to proceed. Any assistance would be welcomed.
>>
>> — Matti



