[vfio-users] GPU passthrough errors with linux 5.1 and newer

Ivan Volosyuk ivan.volosyuk at gmail.com
Sat Aug 3 11:35:45 UTC 2019


I just hit this problem with a Windows 8.1 guest on kernel 5.2.5 (Gentoo)
after trying to upgrade from kernel 4.19.57.
The problem does not seem to happen with a Windows 10 guest, or with kernel
4.19.57.
There are graphics artifacts on my RTX 2080 Ti in the W8.1 guest. I have
temporarily switched back to the older kernel. Any news about this? Any way
to debug it?


On Thu, Aug 1, 2019 at 5:08 AM Zoltán Kővágó <dirty.ice.hu at gmail.com> wrote:

> On 2019-07-31 15:41, José Ramón Muñoz Pekkarinen wrote:
> > On Sun, 21 Jul 2019 at 21:59, Zoltán Kővágó <dirty.ice.hu at gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> Recently my previously perfectly working GPU passthrough setup (with a
> >> win8.1 x64 guest with OVMF) started to malfunction in various ways:
> >> screen randomly turned off for a few seconds, BSOD with
> >> VIDEO_TDR_FAILURE, 3d apps randomly crashing, not drawing the windows'
> >> content, and graphical glitches (for example in furmark the OSD text
> >> flickers).
> >>
> >> After fiddling around with various QEMU versions and NVIDIA driver
> >> versions on the guest, I figured out that it works fine with a Linux
> >> 5.0 kernel but fails randomly with 5.1. I bisected it, and the culprit
> >> looks like commit 4e103134b862 "KVM: x86/mmu: Zap only the relevant
> >> pages when removing a memslot"[1]. I tried to revert it on top of 5.2.1,
> >> but too many things have changed in the meantime. Anyway, if I replace
> >> the body of kvm_mmu_invalidate_zap_pages_in_memslot with
> >> kvm_mmu_zap_all(kvm); it works again (probably with horrible performance
> >> degradation).
> >>
> >> Did anyone experience anything like this? I'm using Alex's ACS override
> >> patch, maybe it violates some assumption that the new code has?
> >
> >      Hi,
> >
> >      I noticed some changes that made 5.0 not work well when
> > detecting screen speakers through HDMI, but I have never seen this
> > problem. My problem went away with 5.1.15 (the one I currently use),
> > and no other issues have appeared. I never needed the ACS override
> > patch in my setup. What happens if you try without it? Do your groups
> > come out wrong in any way?
> >
> >      Best regards.
> >
> >      José.
> >
>
> Hi,
>
> Unfortunately, without pcie_acs_override=downstream my IOMMU groups look
> like this (i.e. both video cards and their PCI bridges are in one
> group), and I have never had a problem with the override in the last
> ~4.5 years.
>
> # ls /sys/kernel/iommu_groups/*/devices
> /sys/kernel/iommu_groups/0/devices:
> 0000:00:00.0
>
> /sys/kernel/iommu_groups/10/devices:
> 0000:00:1c.3
>
> /sys/kernel/iommu_groups/11/devices:
> 0000:00:1d.0
>
> /sys/kernel/iommu_groups/12/devices:
> 0000:00:1f.0  0000:00:1f.2  0000:00:1f.3
>
> /sys/kernel/iommu_groups/1/devices:
> 0000:00:01.0  0000:00:01.1  0000:01:00.0  0000:01:00.1  0000:02:00.0
> 0000:02:00.1
>
> /sys/kernel/iommu_groups/2/devices:
> 0000:00:02.0
>
> /sys/kernel/iommu_groups/3/devices:
> 0000:00:03.0
>
> /sys/kernel/iommu_groups/4/devices:
> 0000:00:14.0
>
> /sys/kernel/iommu_groups/5/devices:
> 0000:00:16.0
>
> /sys/kernel/iommu_groups/6/devices:
> 0000:00:19.0
>
> /sys/kernel/iommu_groups/7/devices:
> 0000:00:1a.0
>
> /sys/kernel/iommu_groups/8/devices:
> 0000:00:1b.0
>
> /sys/kernel/iommu_groups/9/devices:
> 0000:00:1c.0
>
> Regards,
> Zoltan
>
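
The IOMMU group listing quoted above can also be checked mechanically for
groups that hold more than one device. The following is only a minimal
sketch, not something posted in the thread: the function names
parse_iommu_listing and shared_groups are made up for illustration, and the
sample text is a shortened copy of the quoted `ls` output. It assumes the
input has the same layout as that listing (a
/sys/kernel/iommu_groups/N/devices: header followed by PCI addresses).

```python
import re

# PCI address in domain:bus:device.function form, e.g. 0000:01:00.0
PCI_ADDR = re.compile(r"[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]")
# Header line printed by `ls` for each group's devices directory
GROUP_HEADER = re.compile(r"/sys/kernel/iommu_groups/(\d+)/devices:")


def parse_iommu_listing(text):
    """Parse `ls /sys/kernel/iommu_groups/*/devices` output.

    Returns a dict mapping group number -> list of PCI addresses.
    Mail quote prefixes ("> ") are tolerated, and tokens that are not
    PCI addresses are ignored.
    """
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.lstrip("> ").strip()
        header = GROUP_HEADER.fullmatch(line)
        if header:
            current = int(header.group(1))
            groups.setdefault(current, [])
        elif current is not None:
            groups[current].extend(
                tok for tok in line.split() if PCI_ADDR.fullmatch(tok)
            )
    return groups


def shared_groups(groups):
    """Return only the groups that contain more than one device."""
    return {g: devs for g, devs in sorted(groups.items()) if len(devs) > 1}


# Shortened copy of the listing quoted in this thread.
SAMPLE = """\
> /sys/kernel/iommu_groups/12/devices:
> 0000:00:1f.0  0000:00:1f.2  0000:00:1f.3
>
> /sys/kernel/iommu_groups/1/devices:
> 0000:00:01.0  0000:00:01.1  0000:01:00.0  0000:01:00.1  0000:02:00.0
> 0000:02:00.1
>
> /sys/kernel/iommu_groups/2/devices:
> 0000:00:02.0
"""

if __name__ == "__main__":
    for group, devices in shared_groups(parse_iommu_listing(SAMPLE)).items():
        print(f"group {group}: {' '.join(devices)}")
```

On the sample, group 1 comes out with six devices, which matches the
description of both video cards and their bridges sharing one group.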