[vfio-users] Poor performance with nvidia GTX 980

Georgios Kourachanis geo.kourachanis at gmail.com
Tue Nov 3 16:01:51 UTC 2015


Hello Erik,

Thank you for your interest in my laggy Tera sessions. I'd like to 
ask you something though:
What exactly do you mean by "able" when PvPing? Do you use maxed-out settings?
Also, 80%-85% of a GTX 970 would be more than enough for maxed-out 
settings and full-time PvP.
I hope I can get 80% out of my GTX 980, too. :/

I just tried passing through my second NIC, and at first glance, I 
think performance has been really, no wait, I mean REALLY, boosted. 
Still, it's not even close to what my GTX 980 is made to offer:

When I'm on native hardware (dual-booted Windows 10), I get about 80 fps 
(I think that's capped) in areas with few players. In FC (a PvP 
battleground) I get about 35-40 fps.

When I'm on the VM, I get about 25-35 fps in areas with few players. 
In FC I get 8-10 fps (hardly playable, almost unplayable - unless you 
know what's gonna happen on the next frame :p ).


So, the change from a virtual network to a passed-through NIC improved 
things to the point of being barely playable.
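
For anyone wanting to try the same: the NIC goes into the XML as just 
another hostdev entry, like the GPU ones. A minimal sketch (the 
03:00.0 address is only a placeholder; use your own NIC's lspci 
address, and bind its ID to vfio-pci first):

<hostdev mode='subsystem' type='pci' managed='yes'>
<source>
<address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
</source>
</hostdev>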


Regards,
George


On 02/11/2015 04:35 PM, Erik Adler wrote:
>
> I am gonna download Tera again and see what's up. After patching qemu I 
> was able to pvp in The Gridiron and CS. After a few days I came to 
> the conclusion that I had about 80-85% of native on my GTX 970. Around 
> Highwatch things are always bad with or without virtualization. Tera 
> was completely unplayable without patching in my case.
>
> Looking at your XML, nothing pops out except that you are not using 
> vfio with networking. Tera is very network intensive.
>
> Looking at the amount of hugepages, you can plausibly afford to give a 
> little more RAM to Windows/Tera for testing purposes. 6 GiB might be a 
> bit tight. What kind of fps are you getting with and without 
> virtualization?
>
> ------
>
> virtualkvm.com
>
>
> On Mon, Nov 2, 2015 at 3:21 PM, Okky Hendriansyah
> <okky at nostratech.com> wrote:
>
>     Sorry, it should be kernel 4.1.12-lts. :)
>
>     Best regards,
>     Okky Hendriansyah
>
>     On Nov 2, 2015, at 21:16, Okky Hendriansyah
>     <okky at nostratech.com> wrote:
>
>>     I think the best benchmarks would be the in-game ones. Since I
>>     cannot try Tera (region restricted), what are the other games
>>     that you play? Hopefully I can try to benchmark those on my VM and
>>     start from there. You can also try the Unigine Valley benchmark.
>>
>>     My host has an ASRock Z87 Extreme6 with an Intel Core i7-4770,
>>     kernel patched with the ACS override patch, currently at kernel
>>     4.1.2-lts (patched from the ABS PKGBUILD). I do not isolate any
>>     cores on the host and give all the cores to the VM (exposed as 8
>>     vcpus with topology 1 socket, 4 cores, each core 2 threads).
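>>
>>     In libvirt XML terms, that topology is roughly (a sketch, not
>>     copied from my actual config):
>>
>>     <vcpu placement='static'>8</vcpu>
>>     <cpu>
>>     <topology sockets='1' cores='4' threads='2'/>
>>     </cpu>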
>>
>>     Best regards,
>>     Okky Hendriansyah
>>
>>     On Nov 2, 2015, at 19:54, Eddie Yen
>>     <missile0407 at gmail.com> wrote:
>>
>>>     For now the latest driver is 358.50, and my guest is using the
>>>     latest driver without any problem.
>>>     But I'm using the method that AW talks about, so maybe give it
>>>     a try?
>>>
>>>     2015-11-02 20:47 GMT+08:00 Georgios Kourachanis
>>>     <geo.kourachanis at gmail.com>:
>>>
>>>         It's the same thing, whether you add them with qemu
>>>         arguments or with the wrapper.
>>>
>>>         The point is to use the Hyper-V features. That's what the
>>>         hyper-v vendor-id patch has given us: the ability to hide
>>>         the Hyper-V features from nvidia GPUs so that we can keep
>>>         using them!
>>>
>>>         Also, I've tried with an empty vendor-id name; I got
>>>         the same performance.
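>>>
>>>         (To double-check that the wrapper is actually rewriting the
>>>         flags, you can look at the running process on the host:
>>>
>>>         ps -ef | grep qemu-system
>>>
>>>         and the -cpu argument should show the hv_vendor_id you set.)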
>>>
>>>         The nvidia driver I'm currently using is 358.50.
>>>
>>>         Moreover, could you suggest some nice software to test the
>>>         VM's performance in general? I don't really like PassMark.
>>>
>>>
>>>
>>>
>>>         On 02/11/2015 02:11 PM, Eddie Yen wrote:
>>>>         OK, but I still suggest removing the Hyper-V feature tags
>>>>         from your XML,
>>>>         because we don't know what new tricks NVIDIA has put
>>>>         inside the driver to "surprise" us.
>>>>
>>>>         For me, my GTX 980 works well with the edits above. But I'm
>>>>         using a 4820K, which doesn't need the ACS patch, and without
>>>>         Intel graphics.
>>>>         So I'm not sure whether it's caused by the patch or something else.
>>>>
>>>>         2015-11-02 20:04 GMT+08:00 Georgios Kourachanis
>>>>         <geo.kourachanis at gmail.com>:
>>>>
>>>>             Hello Eddie,
>>>>
>>>>             Thanks for answering, though:
>>>>
>>>>             What you're suggesting I do, I've already done, in
>>>>             this way:
>>>>
>>>>             /usr/local/bin/qemu-system-x86_64.hv:
>>>>             #!/bin/sh
>>>>             exec /usr/bin/qemu-system-x86_64 `echo "$@" | \
>>>>             sed 's|hv_time|hv_time,hv_vendor_id=GoobyPLS|g'`
>>>>
>>>>
>>>>             and by changing the emulator line to this:
>>>>
>>>>             <emulator>/usr/local/bin/qemu-system-x86_64.hv</emulator>
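>>>>
>>>>             (The wrapper script must be executable, or libvirt
>>>>             can't start the domain:
>>>>
>>>>             chmod +x /usr/local/bin/qemu-system-x86_64.hv )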
>>>>
>>>>             I'm just giving the vendor the ID "GoobyPLS". I'll
>>>>             try without a vendor name to see if it changes anything.
>>>>
>>>>             Also, I'm using the qemu git version "r41983.g3a958f5",
>>>>             so it already contains the patch that lets us use the
>>>>             lines above.
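>>>>
>>>>             (You can check which build is actually being run with:
>>>>             /usr/bin/qemu-system-x86_64 --version )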
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>             On 02/11/2015 03:53 AM, Eddie Yen wrote:
>>>>>             According to AW's blog:
>>>>>             "For this step we again need to run virsh edit on the
>>>>>             VM. Within the <features> section, remove everything
>>>>>             between the <hyperv> tags, including the tags
>>>>>             themselves."
>>>>>             and
>>>>>             "Additionally, within the <clock> tag, find the timer
>>>>>             named hypervclock, remove the line containing this tag
>>>>>             completely. Save and exit the edit session."
>>>>>
>>>>>             I found that these still exist in your XML file, so
>>>>>             try to do this:
>>>>>
>>>>>             1. Remove these tags.
>>>>>             2. Re-compile QEMU and re-install it with this patch
>>>>>             http://www.spinics.net/lists/kvm/msg121742.html
>>>>>             3. Add these tags between </devices> and </domain>
>>>>>
>>>>>             <qemu:commandline>
>>>>>              <qemu:arg value='-cpu'/>
>>>>>              <qemu:arg
>>>>>             value='host,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,kvm=off,hv_vendor_id='/>
>>>>>             </qemu:commandline>
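>>>>>
>>>>>             Note that libvirt only accepts <qemu:commandline> if
>>>>>             the opening <domain> tag declares the qemu namespace:
>>>>>
>>>>>             <domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>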
>>>>>
>>>>>             I'm using a GTX 980, too. Before this, I got poor 3D
>>>>>             performance in Windows 10; after the patch and these
>>>>>             edits, I got the performance back.
>>>>>
>>>>>             2015-11-02 1:43 GMT+08:00 Georgios Kourachanis
>>>>>             <geo.kourachanis at gmail.com>:
>>>>>
>>>>>                 Hello all,
>>>>>
>>>>>                 I had been using Xen with some AMD GPUs for almost
>>>>>                 2 years, until about June 2015, when I found out
>>>>>                 that KVM and libvirt could do the same stuff I
>>>>>                 was interested in with nvidia GPUs, too. I needed
>>>>>                 the CUDA cores, so I changed to an ASUS GTX 980
>>>>>                 Strix. But unfortunately, I don't get any good
>>>>>                 performance out of it. On a native Windows 7/10
>>>>>                 installation it's a beast, though.
>>>>>                 I also have an AMD R7 250 which works great with
>>>>>                 KVM, but let's not mess with it.
>>>>>
>>>>>                 Let me get to the point:
>>>>>
>>>>>                 I have no problems with the installation of
>>>>>                 Windows, or OVMF, or passing through, or anything
>>>>>                 else. The only problem is the GTX 980's performance.
>>>>>                 Performance got a significant boost when I used
>>>>>                 the latest qemu branch with the hyper-v trick, but
>>>>>                 I'm still not getting the "almost-native"
>>>>>                 performance many people on this mailing list seem
>>>>>                 to claim (even with nvidia GPUs).
>>>>>
>>>>>
>>>>>                 Here are my system's specs:
>>>>>
>>>>>                 Arch Linux with 4.1.6-1-vfio (with the ACS patch ALONE)
>>>>>                 Intel Core i7-3770 (I use the iGPU for Arch Linux)
>>>>>                 24GiB RAM
>>>>>                 ASUS GTX 980 Strix
>>>>>                 Sapphire R7 250
>>>>>                 ------------------------------------------------------------------------
>>>>>                 lspci (only pass-through'd stuff):
>>>>>
>>>>>                 01:00.0 VGA compatible controller: NVIDIA
>>>>>                 Corporation GM204 [GeForce GTX 980] (rev a1)
>>>>>                 Subsystem: ASUSTeK Computer Inc. Device 8518
>>>>>                 Kernel driver in use: vfio-pci
>>>>>                 Kernel modules: nouveau
>>>>>                 01:00.1 Audio device: NVIDIA Corporation GM204
>>>>>                 High Definition Audio Controller (rev a1)
>>>>>                 Subsystem: ASUSTeK Computer Inc. Device 8518
>>>>>                 Kernel driver in use: vfio-pci
>>>>>                 Kernel modules: snd_hda_intel
>>>>>                 02:00.0 VGA compatible controller: Advanced Micro
>>>>>                 Devices, Inc. [AMD/ATI] Oland PRO [Radeon R7 240/340]
>>>>>                 Subsystem: PC Partner Limited / Sapphire
>>>>>                 Technology Device e266
>>>>>                 Kernel modules: radeon
>>>>>                 02:00.1 Audio device: Advanced Micro Devices, Inc.
>>>>>                 [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon
>>>>>                 HD 7700/7800 Series]
>>>>>                 Subsystem: PC Partner Limited / Sapphire
>>>>>                 Technology Device aab0
>>>>>                 Kernel driver in use: snd_hda_intel
>>>>>                 Kernel modules: snd_hda_intel
>>>>>                 08:00.0 USB controller: ASMedia Technology Inc.
>>>>>                 ASM1042 SuperSpeed USB Host Controller
>>>>>                 Subsystem: ASRock Incorporation Motherboard
>>>>>                 Kernel driver in use: vfio-pci
>>>>>                 Kernel modules: xhci_pci
>>>>>                 ------------------------------------------------------------------------
>>>>>                 booting lines:
>>>>>
>>>>>                 linux /boot/vmlinuz-linux-vfio root=UUID=XXXX rw
>>>>>                 intel_iommu=on pcie_acs_override=downstream
>>>>>                 isolcpus=2-3,6-7 nohz_full=2-3,6-7
>>>>>                 initrd /boot/intel-ucode.img
>>>>>                 /boot/initramfs-linux-vfio.img
>>>>>                 ------------------------------------------------------------------------
>>>>>                 /etc/fstab:
>>>>>
>>>>>                 hugetlbfs /hugepages hugetlbfs defaults 0 0
>>>>>                 ------------------------------------------------------------------------
>>>>>                 /etc/sysctl.d/40-hugepage.conf:
>>>>>
>>>>>                 vm.nr_hugepages = 8000
>>>>>                 ------------------------------------------------------------------------
>>>>>                 /etc/modules-load.d/vfio.conf:
>>>>>
>>>>>                 kvm
>>>>>                 kvm-intel
>>>>>                 vfio
>>>>>                 vfio-pci
>>>>>                 vfio_iommu_type1
>>>>>                 vfio_virqfd
>>>>>                 ------------------------------------------------------------------------
>>>>>                 /etc/modprobe.d/kvm.conf:
>>>>>
>>>>>                 options kvm ignore_msrs=1
>>>>>                 ------------------------------------------------------------------------
>>>>>                 /etc/modprobe.d/kvm-intel.conf:
>>>>>
>>>>>                 options kvm-intel nested=1
>>>>>                 ------------------------------------------------------------------------
>>>>>                 /etc/modprobe.d/vfio_iommu_type1.conf:
>>>>>
>>>>>                 options vfio_iommu_type1 allow_unsafe_interrupts=0
>>>>>                 ------------------------------------------------------------------------
>>>>>                 /etc/modprobe.d/vfio-pci.conf:
>>>>>
>>>>>                 options vfio-pci
>>>>>                 ids=10de:13c0,10de:0fbb,1002:6613,1002:aab0,1b21:1042
>>>>>                 ------------------------------------------------------------------------
>>>>>
>>>>>                 And the virsh xml:
>>>>>
>>>>>                 <domain type='kvm'>
>>>>>                 <name>windows_10</name>
>>>>>                 <uuid>63045df8-c782-4cfd-abc7-a3598826ae83</uuid>
>>>>>                 <memory unit='KiB'>6553600</memory>
>>>>>                 <currentMemory unit='KiB'>6553600</currentMemory>
>>>>>                 <memoryBacking>
>>>>>                 <hugepages/>
>>>>>                 </memoryBacking>
>>>>>                 <vcpu placement='static'>4</vcpu>
>>>>>                 <cputune>
>>>>>                 <vcpupin vcpu='0' cpuset='2'/>
>>>>>                 <vcpupin vcpu='1' cpuset='3'/>
>>>>>                 <vcpupin vcpu='2' cpuset='6'/>
>>>>>                 <vcpupin vcpu='3' cpuset='7'/>
>>>>>                 </cputune>
>>>>>                 <os>
>>>>>                 <type arch='x86_64' machine='pc-i440fx-2.4'>hvm</type>
>>>>>                 <loader readonly='yes'
>>>>>                 type='pflash'>/usr/local/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
>>>>>                 <nvram>/var/lib/libvirt/qemu/nvram/windows_nvidia_VARS.fd</nvram>
>>>>>                 </os>
>>>>>                 <features>
>>>>>                 <acpi/>
>>>>>                 <apic/>
>>>>>                 <pae/>
>>>>>                 <hyperv>
>>>>>                 <relaxed state='on'/>
>>>>>                 <vapic state='on'/>
>>>>>                 <spinlocks state='on' retries='8191'/>
>>>>>                 </hyperv>
>>>>>                 <kvm>
>>>>>                 <hidden state='on'/>
>>>>>                 </kvm>
>>>>>                 <vmport state='off'/>
>>>>>                 </features>
>>>>>                 <cpu mode='host-passthrough'>
>>>>>                 <topology sockets='1' cores='4' threads='1'/>
>>>>>                 </cpu>
>>>>>                 <clock offset='localtime'>
>>>>>                 <timer name='rtc' tickpolicy='catchup'/>
>>>>>                 <timer name='pit' tickpolicy='delay'/>
>>>>>                 <timer name='hpet' present='no'/>
>>>>>                 <timer name='hypervclock' present='yes'/>
>>>>>                 </clock>
>>>>>                 <on_poweroff>destroy</on_poweroff>
>>>>>                 <on_reboot>restart</on_reboot>
>>>>>                 <on_crash>restart</on_crash>
>>>>>                 <pm>
>>>>>                 <suspend-to-mem enabled='no'/>
>>>>>                 <suspend-to-disk enabled='no'/>
>>>>>                 </pm>
>>>>>                 <devices>
>>>>>                 <emulator>/usr/local/bin/qemu-system-x86_64.hv</emulator>
>>>>>                 <disk type='block' device='disk'>
>>>>>                 <driver name='qemu' type='raw' cache='none'/>
>>>>>                 <source dev='/dev/mapper/vg_ssd-lv_kvm_NVIDIA'/>
>>>>>                 <target dev='sda' bus='scsi'/>
>>>>>                 <boot order='1'/>
>>>>>                 <address type='drive' controller='0' bus='0'
>>>>>                 target='0' unit='0'/>
>>>>>                 </disk>
>>>>>                 <disk type='block' device='disk'>
>>>>>                 <driver name='qemu' type='raw' cache='none'/>
>>>>>                 <source dev='/dev/mapper/vg_raid5-lv_xen_ntfs_files'/>
>>>>>                 <target dev='sdb' bus='scsi'/>
>>>>>                 <address type='drive' controller='0' bus='0'
>>>>>                 target='0' unit='1'/>
>>>>>                 </disk>
>>>>>                 <controller type='usb' index='0'>
>>>>>                 <address type='pci' domain='0x0000' bus='0x00'
>>>>>                 slot='0x01' function='0x2'/>
>>>>>                 </controller>
>>>>>                 <controller type='pci' index='0' model='pci-root'/>
>>>>>                 <controller type='scsi' index='0' model='virtio-scsi'>
>>>>>                 <address type='pci' domain='0x0000' bus='0x00'
>>>>>                 slot='0x06' function='0x0'/>
>>>>>                 </controller>
>>>>>                 <interface type='bridge'>
>>>>>                 <mac address='52:54:00:e9:85:8f'/>
>>>>>                 <source bridge='xenbr0'/>
>>>>>                 <model type='e1000'/>
>>>>>                 <address type='pci' domain='0x0000' bus='0x00'
>>>>>                 slot='0x03' function='0x0'/>
>>>>>                 </interface>
>>>>>                 <hostdev mode='subsystem' type='pci' managed='yes'>
>>>>>                 <source>
>>>>>                 <address domain='0x0000' bus='0x01' slot='0x00'
>>>>>                 function='0x0'/>
>>>>>                 </source>
>>>>>                 <address type='pci' domain='0x0000' bus='0x00'
>>>>>                 slot='0x0a' function='0x0' multifunction='on'/>
>>>>>                 </hostdev>
>>>>>                 <hostdev mode='subsystem' type='pci' managed='yes'>
>>>>>                 <source>
>>>>>                 <address domain='0x0000' bus='0x01' slot='0x00'
>>>>>                 function='0x1'/>
>>>>>                 </source>
>>>>>                 <address type='pci' domain='0x0000' bus='0x00'
>>>>>                 slot='0x0a' function='0x1'/>
>>>>>                 </hostdev>
>>>>>                 <hostdev mode='subsystem' type='pci' managed='yes'>
>>>>>                 <source>
>>>>>                 <address domain='0x0000' bus='0x08' slot='0x00'
>>>>>                 function='0x0'/>
>>>>>                 </source>
>>>>>                 <address type='pci' domain='0x0000' bus='0x00'
>>>>>                 slot='0x08' function='0x0'/>
>>>>>                 </hostdev>
>>>>>                 <memballoon model='virtio'>
>>>>>                 <address type='pci' domain='0x0000' bus='0x00'
>>>>>                 slot='0x05' function='0x0'/>
>>>>>                 </memballoon>
>>>>>                 </devices>
>>>>>                 </domain>
>>>>>                 ------------------------------------------------------------------------
>>>>>
>>>>>                 /usr/local/bin/qemu-system-x86_64.hv:
>>>>>                 #!/bin/sh
>>>>>                 exec /usr/bin/qemu-system-x86_64 `echo "$@" | \
>>>>>                 sed 's|hv_time|hv_time,hv_vendor_id=GoobyPLS|g'`
>>>>>
>>>>>
>>>>>
>>>>>                 And some notes:
>>>>>
>>>>>                 1) Using "<topology sockets='1' cores='4'
>>>>>                 threads='1'/>" instead of "<topology sockets='1'
>>>>>                 cores='2' threads='2'/>" provided about a 2% boost
>>>>>                 in GPU performance. No change in RAM or CPU tests.
>>>>>                 I've tested with PassMark.
>>>>>
>>>>>                 2) I tried the emulatorpin method Alex described
>>>>>                 in a mail here on vfio-users, but I didn't notice
>>>>>                 any change in GPU performance. I didn't test it
>>>>>                 on the CPU side, though.
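>>>>>
>>>>>                 (For reference, what I added under <cputune> was
>>>>>                 roughly this, pinning the emulator threads to the
>>>>>                 host cores I left out of isolcpus; the cpuset
>>>>>                 values match my own core layout:
>>>>>
>>>>>                 <emulatorpin cpuset='0-1,4-5'/>
>>>>>                 )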
>>>>>
>>>>>                 3) The main symptom of the lacking performance is
>>>>>                 that a specific game I've been playing isn't
>>>>>                 really playable. That game has been mentioned
>>>>>                 before here on the list: it's Tera (the European
>>>>>                 version (Gameforge), although the American
>>>>>                 version (En Masse) has exactly the same performance).
>>>>>
>>>>>                 4) Every other game I've managed to play is quite
>>>>>                 playable, though I haven't tested them to see if
>>>>>                 they run at native speeds.
>>>>>
>>>>>
>>>>>                 I'd really like some help on this matter; I really
>>>>>                 want my server to run this VM with the nvidia
>>>>>                 GPU. I hate dual-booting Windows >_>
>>>>>
>>>>>
>>>>>                 Thanks!
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>
>
>
>
>
> _______________________________________________
> vfio-users mailing list
> vfio-users at redhat.com
> https://www.redhat.com/mailman/listinfo/vfio-users


