[vfio-users] Poor performance with nvidia GTX 980

Georgios Kourachanis geo.kourachanis at gmail.com
Sun Nov 1 17:40:55 UTC 2015


Hello all,

I used Xen with some AMD GPUs for almost two years, until about June
2015, when I found out that KVM and libvirt could do what I wanted with
NVIDIA GPUs as well. I needed the CUDA cores, so I switched to an ASUS
GTX 980 Strix. Unfortunately, I don't get good performance from it in
the VM. On a native Windows 7/10 installation it's a beast, though.
I also have an AMD R7 250, which works great with KVM, but let's leave
that aside.

Let me get to the point:

I have no problems with installing Windows, with OVMF, with the
pass-through itself, or with anything else. The only problem is the
GTX 980's performance.
Performance improved significantly when I used the latest qemu branch
with the hyper-v trick, but I'm still not getting the "almost-native"
performance that many people on this mailing list report (even with
NVIDIA GPUs).


Here are my system's specs:

Archlinux with 4.1.6-1-vfio (with the ACS patch ALONE)
Intel Core i7-3770 (I use the iGPU for Archlinux)
24GiB RAM
ASUS GTX 980 Strix
Sapphire R7 250
------------------------------------------------------------------------
lspci (only pass-through'd stuff):

01:00.0 VGA compatible controller: NVIDIA Corporation GM204 [GeForce GTX 980] (rev a1)
         Subsystem: ASUSTeK Computer Inc. Device 8518
         Kernel driver in use: vfio-pci
         Kernel modules: nouveau
01:00.1 Audio device: NVIDIA Corporation GM204 High Definition Audio Controller (rev a1)
         Subsystem: ASUSTeK Computer Inc. Device 8518
         Kernel driver in use: vfio-pci
         Kernel modules: snd_hda_intel
02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Oland PRO [Radeon R7 240/340]
         Subsystem: PC Partner Limited / Sapphire Technology Device e266
         Kernel modules: radeon
02:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]
         Subsystem: PC Partner Limited / Sapphire Technology Device aab0
         Kernel driver in use: snd_hda_intel
         Kernel modules: snd_hda_intel
08:00.0 USB controller: ASMedia Technology Inc. ASM1042 SuperSpeed USB Host Controller
         Subsystem: ASRock Incorporation Motherboard
         Kernel driver in use: vfio-pci
         Kernel modules: xhci_pci
------------------------------------------------------------------------
booting lines:

linux    /boot/vmlinuz-linux-vfio root=UUID=XXXX rw intel_iommu=on pcie_acs_override=downstream isolcpus=2-3,6-7 nohz_full=2-3,6-7
initrd    /boot/intel-ucode.img /boot/initramfs-linux-vfio.img
------------------------------------------------------------------------
/etc/fstab:

hugetlbfs /hugepages hugetlbfs defaults 0 0
------------------------------------------------------------------------
/etc/sysctl.d/40-hugepage.conf:

vm.nr_hugepages = 8000
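
As a rough sanity check on this value (a sketch, assuming the default
2 MiB hugepage size, since no hugepagesz= option appears on the kernel
command line above), the number of pages the guest needs can be computed
from the 6553600 KiB given in the domain XML below. The helper name
pages_needed is made up for illustration:

```shell
#!/bin/sh
# Number of hugepages needed to back a guest of a given size.
# $1: guest memory in KiB
# $2: hugepage size in KiB (assumed to be 2048, i.e. 2 MiB, if omitted)
pages_needed() {
    guest_kib=$1
    page_kib=${2:-2048}
    # Round up so that a partial page still gets a whole hugepage.
    echo $(( (guest_kib + page_kib - 1) / page_kib ))
}

# Guest memory from the domain XML: 6553600 KiB.
pages_needed 6553600

# Host memory reserved by vm.nr_hugepages = 8000, in MiB.
echo $(( 8000 * 2048 / 1024 ))
```

Under that assumption the guest needs 3200 pages, while 8000 pages
reserve 16000 MiB (about 15.6 GiB) of the 24 GiB of host RAM.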
------------------------------------------------------------------------
/etc/modules-load.d/vfio.conf:

kvm
kvm-intel
vfio
vfio-pci
vfio_iommu_type1
vfio_virqfd
------------------------------------------------------------------------
/etc/modprobe.d/kvm.conf:

options kvm ignore_msrs=1
------------------------------------------------------------------------
/etc/modprobe.d/kvm-intel.conf:

options kvm-intel nested=1
------------------------------------------------------------------------
/etc/modprobe.d/vfio_iommu_type1.conf:

options vfio_iommu_type1 allow_unsafe_interrupts=0
------------------------------------------------------------------------
/etc/modprobe.d/vfio-pci.conf:

options vfio-pci ids=10de:13c0,10de:0fbb,1002:6613,1002:aab0,1b21:1042
------------------------------------------------------------------------

And the virsh xml:

<domain type='kvm'>
   <name>windows_10</name>
   <uuid>63045df8-c782-4cfd-abc7-a3598826ae83</uuid>
   <memory unit='KiB'>6553600</memory>
   <currentMemory unit='KiB'>6553600</currentMemory>
   <memoryBacking>
     <hugepages/>
   </memoryBacking>
   <vcpu placement='static'>4</vcpu>
   <cputune>
     <vcpupin vcpu='0' cpuset='2'/>
     <vcpupin vcpu='1' cpuset='3'/>
     <vcpupin vcpu='2' cpuset='6'/>
     <vcpupin vcpu='3' cpuset='7'/>
   </cputune>
   <os>
     <type arch='x86_64' machine='pc-i440fx-2.4'>hvm</type>
     <loader readonly='yes' type='pflash'>/usr/local/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
     <nvram>/var/lib/libvirt/qemu/nvram/windows_nvidia_VARS.fd</nvram>
   </os>
   <features>
     <acpi/>
     <apic/>
     <pae/>
     <hyperv>
       <relaxed state='on'/>
       <vapic state='on'/>
       <spinlocks state='on' retries='8191'/>
     </hyperv>
     <kvm>
       <hidden state='on'/>
     </kvm>
     <vmport state='off'/>
   </features>
   <cpu mode='host-passthrough'>
     <topology sockets='1' cores='4' threads='1'/>
   </cpu>
   <clock offset='localtime'>
     <timer name='rtc' tickpolicy='catchup'/>
     <timer name='pit' tickpolicy='delay'/>
     <timer name='hpet' present='no'/>
     <timer name='hypervclock' present='yes'/>
   </clock>
   <on_poweroff>destroy</on_poweroff>
   <on_reboot>restart</on_reboot>
   <on_crash>restart</on_crash>
   <pm>
     <suspend-to-mem enabled='no'/>
     <suspend-to-disk enabled='no'/>
   </pm>
   <devices>
     <emulator>/usr/local/bin/qemu-system-x86_64.hv</emulator>
     <disk type='block' device='disk'>
       <driver name='qemu' type='raw' cache='none'/>
       <source dev='/dev/mapper/vg_ssd-lv_kvm_NVIDIA'/>
       <target dev='sda' bus='scsi'/>
       <boot order='1'/>
       <address type='drive' controller='0' bus='0' target='0' unit='0'/>
     </disk>
     <disk type='block' device='disk'>
       <driver name='qemu' type='raw' cache='none'/>
       <source dev='/dev/mapper/vg_raid5-lv_xen_ntfs_files'/>
       <target dev='sdb' bus='scsi'/>
       <address type='drive' controller='0' bus='0' target='0' unit='1'/>
     </disk>
     <controller type='usb' index='0'>
       <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
     </controller>
     <controller type='pci' index='0' model='pci-root'/>
     <controller type='scsi' index='0' model='virtio-scsi'>
       <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
     </controller>
     <interface type='bridge'>
       <mac address='52:54:00:e9:85:8f'/>
       <source bridge='xenbr0'/>
       <model type='e1000'/>
       <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
     </interface>
     <hostdev mode='subsystem' type='pci' managed='yes'>
       <source>
         <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
       </source>
       <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0' multifunction='on'/>
     </hostdev>
     <hostdev mode='subsystem' type='pci' managed='yes'>
       <source>
         <address domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
       </source>
       <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x1'/>
     </hostdev>
     <hostdev mode='subsystem' type='pci' managed='yes'>
       <source>
         <address domain='0x0000' bus='0x08' slot='0x00' function='0x0'/>
       </source>
       <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
     </hostdev>
     <memballoon model='virtio'>
       <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
     </memballoon>
   </devices>
</domain>
------------------------------------------------------------------------

/usr/local/bin/qemu-system-x86_64.hv:
#!/bin/sh
# Wrapper around qemu: add hv_vendor_id wherever hv_time is passed.
exec /usr/bin/qemu-system-x86_64 `echo "\$@" | \
sed 's|hv_time|hv_time,hv_vendor_id=GoobyPLS|g'`
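
To make the effect of that substitution concrete, here is a small sketch
that applies the same sed expression to a sample argument string. The
function name add_vendor_id and the sample -cpu value are assumptions
for illustration, not taken from a real libvirt command line:

```shell
#!/bin/sh
# Apply the wrapper's sed substitution to a single argument string.
add_vendor_id() {
    printf '%s\n' "$1" | sed 's|hv_time|hv_time,hv_vendor_id=GoobyPLS|g'
}

# Hypothetical -cpu argument, for illustration only.
add_vendor_id "-cpu host,hv_time,hv_relaxed,kvm=off"
# prints: -cpu host,hv_time,hv_vendor_id=GoobyPLS,hv_relaxed,kvm=off
```

As far as I understand, hv_time ends up on the qemu command line because
of the hypervclock timer in the XML above, which is why the wrapper keys
on that string. Note that the wrapper's unquoted command substitution
re-splits the arguments on whitespace, so it may misbehave if any qemu
argument contains spaces.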



And some notes:

1) Using "<topology sockets='1' cores='4' threads='1'/>" instead of
"<topology sockets='1' cores='2' threads='2'/>" gave about a 2% boost
in GPU performance, with no change in the RAM or CPU tests. I tested
with PassMark.

2) I tried the emulatorpin method Alex described in a mail here on
vfio-users, but I didn't notice any change in GPU performance. I didn't
test the CPU side, though.

3) The main consequence of the performance gap is that one specific game
I play isn't really playable. That game has been mentioned here on the
list before: it's Tera (the European version from Gameforge, although
the American version from En Masse performs exactly the same).

4) Every other game I have tried is quite playable, though I haven't
checked whether they run at native speed.
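
For notes 1 and 2, it may help to check which host logical CPUs are
hyperthread siblings, since the guest is pinned to host CPUs 2, 3, 6 and
7. A minimal sketch that reads the kernel's thread_siblings_list files
from sysfs; the function name list_thread_siblings and its optional root
argument are made up here for illustration:

```shell
#!/bin/sh
# Print "cpuN: <sibling list>" for every CPU found under a sysfs-style
# directory. $1 defaults to the real sysfs location; passing another
# directory is only useful for testing.
list_thread_siblings() {
    root=${1:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/topology/thread_siblings_list; do
        # Skip the unexpanded glob if nothing matched.
        [ -r "$f" ] || continue
        cpu=${f#"$root"/}   # strip the leading directory
        cpu=${cpu%%/*}      # keep only the "cpuN" component
        printf '%s: %s\n' "$cpu" "$(cat "$f")"
    done
}

list_thread_siblings
```

If the output lists 2 and 6 (and 3 and 7) as siblings, the four vCPUs
sit on two physical cores, which would be worth keeping in mind when
comparing the two topology settings.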


I'd really appreciate some help with this; I really want my server to
run this VM with the NVIDIA GPU. I hate dual booting Windows >_>


Thanks!