
Re: [vfio-users] Ryzen Primary GPU passthrough success and woes



Thanks for the link. I've tried a number of things but still no further down the line.

I've tried the following one-by-one
I'm using Arch Linux with kernel 4.10.5.

uname -a
Linux amdr7 4.10.5-1-ARCH #1 SMP PREEMPT Wed Mar 22 14:42:03 CET 2017 x86_64 GNU/Linux

This is my kernel command line now:

BOOT_IMAGE=/vmlinuz-linux root=UUID=bf69add2-e36f-453a-b92e-a4343ca20d26 rw quiet amd_iommu=on vfio-pci.ids=1002:67b1,1002:aac8 video=efifb:off amdgpu.msi=0 kvm_amd.avic=1 isolcpus=0-7 kvm-amd.npt=0 iommu=pt
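
The command line above spells the same module two ways (kvm_amd.avic and kvm-amd.npt). The kernel treats dashes and underscores in parameter names as equivalent, so both should reach kvm_amd. A minimal sketch for listing the module parameters on a command line in one normalised spelling; `module_params` is a hypothetical helper name, not an existing tool:

```shell
# Print every "module.param=value" word from a kernel command line,
# with dashes in the name rewritten as underscores.
# module_params is a hypothetical helper, named here for illustration only.
module_params() (
    # The subshell body keeps the variables and `set -f` local to the call.
    set -f                          # no glob expansion while word-splitting
    for word in $1; do
        # Skip bare flags such as "rw" and "quiet".
        case "$word" in *=*) ;; *) continue ;; esac
        name=${word%%=*}            # text before the first '='
        value=${word#*=}            # text after the first '='
        case "$name" in
            # Only names with a dot are module parameters.
            *.*) printf '%s=%s\n' "$(printf '%s' "$name" | tr '-' '_')" "$value" ;;
        esac
    done
)
```

On a running host it could be called as `module_params "$(cat /proc/cmdline)"`.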

This is my full libvirt XML file for the VM:


<domain type='kvm'>
  <name>windows10</name>
  <uuid>7b222825-fc7d-4a66-a72c-5876063752d5</uuid>
  <memory unit='KiB'>8291456</memory>
  <currentMemory unit='KiB'>8291456</currentMemory>
  <vcpu placement='static'>4</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='0'/>
    <vcpupin vcpu='1' cpuset='1'/>
    <vcpupin vcpu='2' cpuset='2'/>
    <vcpupin vcpu='3' cpuset='3'/>
  </cputune>
  <os>
    <type arch='x86_64' machine='pc-i440fx-2.1'>hvm</type>
    <loader type='pflash' readonly='yes'>/home/virtualguests/windows10/OVMF_CODE.fd</loader>
    <nvram>/home/virtualguests/windows10/OVMF_VARS.fd</nvram>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
    </hyperv>
    <kvm>
      <hidden state='on'/>
    </kvm>
  </features>
  <cpus>
    <arch name='x86'>
      <model name='kvm64'>
        <feature name='apic'/>
        <feature name='clflush'/>
        <feature name='cmov'/>
        <feature name='cx16'/>
        <feature name='cx8'/>
        <feature name='de'/>
        <feature name='fpu'/>
        <feature name='fxsr'/>
        <feature name='lm'/>
        <feature name='mca'/>
        <feature name='mce'/>
        <feature name='mmx'/>
        <feature name='msr'/>
        <feature name='mtrr'/>
        <feature name='nx'/>
        <feature name='pae'/>
        <feature name='pat'/>
        <feature name='pge'/>
        <feature name='pni'/>
        <feature name='pse'/>
        <feature name='pse36'/>
        <feature name='sep'/>
        <feature name='sse'/>
        <feature name='sse2'/>
        <feature name='syscall'/>
        <feature name='tsc'/>
      </model>
    </arch>
  </cpus>
  <clock offset='localtime'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
    <timer name='hypervclock' present='yes'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' />
      <source file='/home/virtualguests/windows10/windows10-c-nas.qcow2'/>
      <target dev='vdb' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/home/virtualguests/windows10/Win10_1607_EnglishInternational_x64.iso'/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
      <address type='drive' controller='0' bus='1' unit='0'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/storage/windows10-d.qcow2'/>
      <target dev='vdc' bus='virtio'/>
    </disk>
    <controller type='pci' index='0' model='pci-root' />
    <interface type='bridge'>
      <mac address='52:54:00:12:34:76'/>
      <source bridge='br0'/>
      <target dev='tap8'/>
      <model type='virtio'/>
      <alias name='virtio'/>
      <rom bar='off'/>
    </interface>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <hostdev mode='subsystem' type='usb' managed='no'>
      <source>
        <vendor id='0x046d'/>
        <product id='0xc52e'/>
      </source>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='no'>
      <source>
        <vendor id='0x28de'/>
        <product id='0x1142'/>
      </source>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='no'>
      <source>
        <vendor id='0x0a12'/>
        <product id='0x0001'/>
      </source>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='no'>
      <source>
        <vendor id='0x0fcf'/>
        <product id='0x1009'/>
      </source>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='no'>
      <source>
        <vendor id='0x1b1c'/>
        <product id='0x1c0b'/>
      </source>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='no'>
      <source>
        <vendor id='0x05e3'/>
        <product id='0x0608'/>
        <address bus='1' device='5'/>
      </source>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x09' slot='0x00' function='0x0' />
      </source>
      <rom bar='on' file='/home/virtualguests/windows10/r9290.rom'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0' multifunction='on'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x09' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </hostdev>
    <memballoon model='none'/>
  </devices>
</domain>



When I have host-passthrough, Opteron_G5, athlon or qemu64 CPUs configured, I see a lot of these stack traces. They appear frequently, not just around guest crashes; when the guest actually crashes, nothing is logged at all.

[ 2848.156709] ------------[ cut here ]------------
[ 2848.156719] WARNING: CPU: 0 PID: 1445 at arch/x86/kvm/svm.c:1484 avic_vcpu_load+0x15a/0x180 [kvm_amd]
[ 2848.156720] Modules linked in: vhost_net vhost macvtap macvlan tun nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache hid_logitech_hidpp usb_serial_simple cdc_acm usbserial hid_logitech_dj cfg80211 bridge stp llc amdgpu sd_mod edac_mce_amd radeon edac_core kvm_amd kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc ppdev ttm snd_hda_codec_realtek snd_hda_codec_generic drm_kms_helper aesni_intel drm btusb snd_hda_intel nls_iso8859_1 aes_x86_64 btrtl crypto_simd nls_cp437 btbcm glue_helper syscopyarea btintel sysfillrect snd_hda_codec vfat r8169 joydev sysimgblt fat fb_sys_fops i2c_algo_bit bluetooth cryptd evdev mousedev uas mii input_leds snd_hda_core rfkill pcspkr led_class snd_hwdep mac_hid snd_pcm snd_timer ccp sp5100_tco snd i2c_piix4 soundcore rng_core shpchp wmi parport_pc
[ 2848.156772]  parport fjes 8250_dw i2c_designware_platform tpm_infineon i2c_designware_core button acpi_cpufreq tpm_tis tpm_tis_core tpm nfsd auth_rpcgss oid_registry nfs_acl lockd grace sch_fq_codel sunrpc ip_tables x_tables ext4 crc16 jbd2 fscrypto mbcache usb_storage hid_generic usbhid hid ahci libahci xhci_pci libata xhci_hcd usbcore scsi_mod nvme usb_common nvme_core serio vfio_pci irqbypass vfio_virqfd vfio_iommu_type1 vfio
[ 2848.156798] CPU: 0 PID: 1445 Comm: CPU 0/KVM Tainted: G        W       4.10.5-1-ARCH #1
[ 2848.156798] Hardware name: Gigabyte Technology Co., Ltd. Default string/AB350M-Gaming 3-CF, BIOS F2 02/20/2017
[ 2848.156799] Call Trace:
[ 2848.156807]  dump_stack+0x63/0x83
[ 2848.156812]  __warn+0xcb/0xf0
[ 2848.156816]  warn_slowpath_null+0x1d/0x20
[ 2848.156819]  avic_vcpu_load+0x15a/0x180 [kvm_amd]
[ 2848.156822]  svm_vcpu_unblocking+0x18/0x20 [kvm_amd]
[ 2848.156834]  kvm_vcpu_block+0xd3/0x330 [kvm]
[ 2848.156844]  ? kvm_get_rflags+0x1a/0x30 [kvm]
[ 2848.156856]  kvm_arch_vcpu_ioctl_run+0x4ea/0x1680 [kvm]
[ 2848.156859]  ? _copy_to_user+0x54/0x60
[ 2848.156867]  kvm_vcpu_ioctl+0x339/0x630 [kvm]
[ 2848.156872]  do_vfs_ioctl+0xa3/0x5f0
[ 2848.156876]  ? __fget+0x77/0xb0
[ 2848.156880]  SyS_ioctl+0x79/0x90
[ 2848.156883]  entry_SYSCALL_64_fastpath+0x1a/0xa9
[ 2848.156885] RIP: 0033:0x7f9980dbd0d7
[ 2848.156886] RSP: 002b:00007f9972efb8e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 2848.156887] RAX: ffffffffffffffda RBX: 00007f9987e0d001 RCX: 00007f9980dbd0d7
[ 2848.156888] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000013
[ 2848.156888] RBP: 0000000000000001 R08: 000055c65eff4830 R09: 00000000000000ff
[ 2848.156889] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000001
[ 2848.156889] R13: 00007f9987e0c000 R14: 0000000000000000 R15: 00007f99745a5980
[ 2848.156905] ---[ end trace e49522bc58864bce ]---
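
To put a number on how often these warnings fire, the kernel log can be tallied by the function named in each WARNING line. A rough sketch, assuming the log lines follow the "WARNING: CPU: ... at FILE:LINE FUNCTION+OFFSET" layout shown above; `count_warnings` is a hypothetical helper name:

```shell
# Read dmesg-style text on stdin and print "COUNT FUNCTION" for every
# kernel WARNING line, most frequent first.
# count_warnings is a hypothetical helper, named here for illustration only.
count_warnings() {
    awk '/WARNING: CPU:/ {
        # The function name is the second field after the word "at".
        for (i = 1; i < NF - 1; i++) {
            if ($i == "at") {
                fn = $(i + 2)
                sub(/[+].*/, "", fn)    # drop the +offset/size suffix
                count[fn]++
                break
            }
        }
    }
    END { for (fn in count) printf "%d %s\n", count[fn], fn }' | sort -rn
}
```

Usage would be `dmesg | count_warnings`.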

I haven't tried SeaBIOS yet. I'm still running 'pc-i440fx-2.1' with OVMF, not Q35, because I couldn't get Windows to boot with Q35.

I also noticed something really odd: after the guest crashes I see random pictures on the TV, which I assume are coming from Arch. I see things like a woman at a football stadium and waterfalls. I'm not sure whether this would be expected if the card is assigned to vfio-pci.

@Steven Walter, can you paste a full copy of your libvirt XML file please?

Just for completeness, here are my PCI devices and IOMMU groups:

[gneville amdr7 ~]$ lspci -nn
00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1450]
00:00.2 IOMMU [0806]: Advanced Micro Devices, Inc. [AMD] Device [1022:1451]
00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1452]
00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:1453]
00:01.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:1453]
00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1452]
00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1452]
00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:1453]
00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1452]
00:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1452]
00:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:1454]
00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1452]
00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:1454]
00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 59)
00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1460]
00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1461]
00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1462]
00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1463]
00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1464]
00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1465]
00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1466]
00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1467]
01:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd Device [144d:a804]
03:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:43bb] (rev 02)
03:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b7] (rev 02)
03:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b2] (rev 02)
04:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b4] (rev 02)
04:01.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b4] (rev 02)
04:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b4] (rev 02)
05:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 0c)
07:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd Device [144d:a804]
09:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii PRO [Radeon R9 290/390] [1002:67b1]
09:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii HDMI Audio [Radeon R9 290/290X / 390/390X] [1002:aac8]
11:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device [1022:145a]
11:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Device [1022:1456]
11:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:145c]
12:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device [1022:1455]
12:00.2 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
12:00.3 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Device [1022:1457]
[gneville amdr7 ~]$

virsh nodedev-dumpxml pci_0000_09_00_0
<device>
  <name>pci_0000_09_00_0</name>
  <path>/sys/devices/pci0000:00/0000:00:03.1/0000:09:00.0</path>
  <parent>pci_0000_00_03_1</parent>
  <driver>
    <name>vfio-pci</name>
  </driver>
  <capability type='pci'>
    <domain>0</domain>
    <bus>9</bus>
    <slot>0</slot>
    <function>0</function>
    <product id='0x67b1'>Hawaii PRO [Radeon R9 290/390]</product>
    <vendor id='0x1002'>Advanced Micro Devices, Inc. [AMD/ATI]</vendor>
    <iommuGroup number='2'>
      <address domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
      <address domain='0x0000' bus='0x09' slot='0x00' function='0x1'/>
      <address domain='0x0000' bus='0x00' slot='0x03' function='0x1'/>
      <address domain='0x0000' bus='0x09' slot='0x00' function='0x0'/>
    </iommuGroup>
  </capability>
</device>
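
The driver binding reported by virsh can also be read straight from sysfs, which makes it easy to check both functions of the card (09:00.0 and 09:00.1). A minimal sketch, assuming the usual /sys/bus/pci/devices layout where each device directory has a `driver` symlink when a driver is bound; `bound_driver` is a hypothetical helper name:

```shell
# Print the name of the driver bound to a PCI device, or "none".
# $1 is the full PCI address, $2 an optional sysfs devices directory.
# bound_driver is a hypothetical helper, named here for illustration only.
bound_driver() (
    dev=$1
    root=${2:-/sys/bus/pci/devices}
    link="$root/$dev/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver name, e.g. .../drivers/vfio-pci
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
)
```

For example, `bound_driver 0000:09:00.0` and `bound_driver 0000:09:00.1` should both print vfio-pci if the vfio-pci.ids setting took effect for both functions.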

find /sys/kernel/iommu_groups/ -type l
/sys/kernel/iommu_groups/7/devices/0000:00:18.6
/sys/kernel/iommu_groups/7/devices/0000:00:18.4
/sys/kernel/iommu_groups/7/devices/0000:00:18.2
/sys/kernel/iommu_groups/7/devices/0000:00:18.0
/sys/kernel/iommu_groups/7/devices/0000:00:18.7
/sys/kernel/iommu_groups/7/devices/0000:00:18.5
/sys/kernel/iommu_groups/7/devices/0000:00:18.3
/sys/kernel/iommu_groups/7/devices/0000:00:18.1
/sys/kernel/iommu_groups/5/devices/0000:12:00.2
/sys/kernel/iommu_groups/5/devices/0000:00:08.1
/sys/kernel/iommu_groups/5/devices/0000:12:00.0
/sys/kernel/iommu_groups/5/devices/0000:12:00.3
/sys/kernel/iommu_groups/5/devices/0000:00:08.0
/sys/kernel/iommu_groups/3/devices/0000:00:04.0
/sys/kernel/iommu_groups/1/devices/0000:00:02.0
/sys/kernel/iommu_groups/6/devices/0000:00:14.0
/sys/kernel/iommu_groups/6/devices/0000:00:14.3
/sys/kernel/iommu_groups/4/devices/0000:11:00.2
/sys/kernel/iommu_groups/4/devices/0000:11:00.0
/sys/kernel/iommu_groups/4/devices/0000:00:07.1
/sys/kernel/iommu_groups/4/devices/0000:11:00.3
/sys/kernel/iommu_groups/4/devices/0000:00:07.0
/sys/kernel/iommu_groups/2/devices/0000:00:03.0
/sys/kernel/iommu_groups/2/devices/0000:09:00.1
/sys/kernel/iommu_groups/2/devices/0000:00:03.1
/sys/kernel/iommu_groups/2/devices/0000:09:00.0
/sys/kernel/iommu_groups/0/devices/0000:07:00.0
/sys/kernel/iommu_groups/0/devices/0000:03:00.1
/sys/kernel/iommu_groups/0/devices/0000:00:01.3
/sys/kernel/iommu_groups/0/devices/0000:04:01.0
/sys/kernel/iommu_groups/0/devices/0000:00:01.1
/sys/kernel/iommu_groups/0/devices/0000:04:04.0
/sys/kernel/iommu_groups/0/devices/0000:05:00.0
/sys/kernel/iommu_groups/0/devices/0000:04:00.0
/sys/kernel/iommu_groups/0/devices/0000:03:00.2
/sys/kernel/iommu_groups/0/devices/0000:03:00.0
/sys/kernel/iommu_groups/0/devices/0000:00:01.0
/sys/kernel/iommu_groups/0/devices/0000:01:00.0
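
The flat find output above is easier to read when grouped. In this listing, group 2 holds the GPU (09:00.0), its HDMI audio function (09:00.1), and the 00:03.0 and 00:03.1 bridges. A minimal sketch that prints the same information per group, assuming the standard /sys/kernel/iommu_groups layout; `list_iommu_groups` is a hypothetical helper name:

```shell
# Print each IOMMU group followed by the PCI addresses of its devices.
# $1 is an optional root directory, defaulting to the live sysfs tree.
# list_iommu_groups is a hypothetical helper, named here for illustration only.
list_iommu_groups() (
    root=${1:-/sys/kernel/iommu_groups}
    # Sort numerically so group 10 follows group 9 rather than group 1.
    for group in $(ls "$root" | sort -n); do
        echo "IOMMU group $group:"
        for dev in "$root/$group"/devices/*; do
            # Skip the literal pattern if a group has no device entries.
            [ -e "$dev" ] || [ -L "$dev" ] || continue
            echo "    ${dev##*/}"
        done
    done
)
```

Called with no argument, `list_iommu_groups` reads the running host's groups.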

On Wed, Mar 29, 2017 at 12:24 PM, Steven Walter <stevenrwalter gmail com> wrote:
I got a similar (though multi-GPU) setup working, which I wrote up
here: https://www.reddit.com/r/VFIO/comments/616xih/gpu_passthrough_with_msi_b350_tomahawk/

One thing that may help you is to enable AVIC (kvm_amd.avic=1).  What
I saw without AVIC was that things would work briefly (only a few
seconds for me) before interrupts would stop getting delivered.
Sounds like things are working better for you without AVIC than they
did for me, but perhaps the extra improvement in IRQ latency would fix
the hangs you get during intensive graphics operations.


On Tue, Mar 28, 2017 at 6:02 PM, Graham Neville <grahamneville gmail com> wrote:
> I've managed to get PCIe passthrough working on a Gigabyte Gaming 3 mATX MB
> and Ryzen 1700, no ACS patch, using only 1 GPU - AMD r9 290. However I'm
> facing a problem with the whole KVM setup and not sure what it's related to.
> For the Windows10 guest with the GPU passed through it crashes (guest only,
> host is fine) whenever I try anything graphics intensive, for example
> running Witcher3. Normal desktop is fine.
> Also my Linux guests are acting odd when I try to SSH to them, I notice that
> the SSH terminals just stop working randomly. And then there's the issue
> with very slow network throughput to both VMs. I have no idea what's going
> on. It used to work fine with my Intel setup. There's no logs in dmesg to
> show a problem either.
>
> I'm going to try Seabios instead of OVMF to see if I can stop the crashing.
>
> Is anyone having similar issues, or can anyone advise?
>



--
-Steven Walter <stevenrwalter gmail com>

