[vfio-users] GPU crashes after a while with dmesg spam

Aria aria at ar1as.space
Wed Jun 21 16:49:43 UTC 2017


Strange: when I swapped it to the host and tried using it there, it seemed fine. I ran FurMark to test it. Can you recommend something more intensive for Linux that I can test with to confirm it's a hardware fault?

On 22 Jun. 2017, at 12:48 am, Alex Williamson <alex.williamson at redhat.com> wrote:
>On Wed, 21 Jun 2017 23:58:58 +0800
>Aria <aria at ar1as.space> wrote:
>
>> After a few minutes of gaming, once a significant event happens
>> (Someone dies ingame) the screen shuts off and claims there's no
>> signal. My dmesg log is spammed with
>> [ 2806.613203] vfio_bar_restore: 0000:01:00.0 reset recovery - restoring bars
>> [ 2808.169346] vfio_bar_restore: 0000:01:00.1 reset recovery - restoring bars
>> 
>> Running archlinux, kernel 4.11.6-1-ARCH, NVIDIA GTX 970.
>> 
>> A curious note is the output of lspci once this happens is
>> 
>> 01:00.0 VGA compatible controller: NVIDIA Corporation GM204 [GeForce GTX 970] (rev ff) (prog-if ff)
>> 	!!! Unknown header type 7f
>> 	Kernel driver in use: vfio-pci
>> 	Kernel modules: nouveau, nvidia_drm, nvidia
>> 
>> 01:00.1 Audio device: NVIDIA Corporation GM204 High Definition Audio Controller (rev ff) (prog-if ff)
>> 	!!! Unknown header type 7f
>> 	Kernel driver in use: vfio-pci
>> 	Kernel modules: snd_hda_intel
>> 
>
>This means that the card doesn't show up in PCI config space anymore
>(all reads return -1).  That's potentially also why vfio thinks the
>device was reset, suddenly the BARs don't contain what we think they
>should because reading them returns -1.  Seems like a hardware issue.
>Thanks,
>
>Alex
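
For anyone who wants to watch for this condition while stress testing, here is a rough sketch of the check Alex describes: read the start of the device's config space from sysfs and see whether every byte comes back as 0xff. It assumes the standard /sys/bus/pci/devices/<address>/config file and the addresses from the lspci output above. The function names (read_config, parse_header, looks_dead) are made up for illustration and are not part of vfio or pciutils.

```python
import struct
from pathlib import Path


def read_config(bdf: str) -> bytes:
    """Read the first 64 bytes of PCI config space for a device via sysfs."""
    return Path(f"/sys/bus/pci/devices/{bdf}/config").read_bytes()[:64]


def parse_header(cfg: bytes) -> dict:
    """Decode the fields that lspci reported as 'ff' / '7f' in the output above."""
    vendor, device = struct.unpack_from("<HH", cfg, 0)  # offsets 0x00 and 0x02
    return {
        "vendor": vendor,
        "device": device,
        "revision": cfg[0x08],
        "prog_if": cfg[0x09],
        # Bit 7 of the header type byte is the multi-function flag, so mask it
        # off; an all-ones read therefore decodes as 0x7f.
        "header_type": cfg[0x0E] & 0x7F,
    }


def looks_dead(cfg: bytes) -> bool:
    """True if the first 16 header bytes all read back as 0xff (i.e. -1)."""
    return len(cfg) >= 16 and all(b == 0xFF for b in cfg[:16])


if __name__ == "__main__":
    # Addresses taken from the lspci output quoted above; adjust for your system.
    for bdf in ("0000:01:00.0", "0000:01:00.1"):
        try:
            cfg = read_config(bdf)
        except OSError as exc:
            print(f"{bdf}: cannot read config space ({exc})")
            continue
        hdr = parse_header(cfg)
        state = "no response (all ff)" if looks_dead(cfg) else "responding"
        print(f"{bdf}: vendor={hdr['vendor']:04x} "
              f"header_type={hdr['header_type']:02x} -> {state}")
```

Running this in a loop on the host during a test should show roughly when the card stops responding. It only detects the symptom and does not say why the device dropped off the bus.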