[vfio-users] Boot using second GPU?

Fri Aug 5 11:44:42 UTC 2016

On 05.08.2016 10:22, Rokas Kupstys wrote:

> Okay, this is unexpected luck. After more tinkering I got it to work!
> Here is my setup:
> 
> * AMD FX-8350 CPU + Sabertooth 990FX R2 motherboard
> * 0000:01:00.0 - gpu in first slot
> * 0000:06:00.0 - gpu in third slot
> * UEFI on host and guest.
> * Archlinux
> 
> To make the host use the non-boot GPU:
> 
> 1. Add the kernel boot parameter "video=efifb:off". This disables the
> EFI framebuffer, so the kernel does not use the first GPU and boot
> messages appear on the second GPU.
> 
> 2. Bind the first GPU (0000:01:00.0) to the vfio-pci driver. I did this
> by adding the line
> 
>> options vfio-pci         ids=1002:677B,1002:AA98
> 
> to /etc/modprobe.d/kvm.conf. The IDs come from "lspci -n", which in my
> case shows:
> 
>> 01:00.0 0300: 1002:677B
>> 01:00.1 0403: 1002:AA98
> 
> 3. Configure Xorg to use the second GPU (0000:06:00.0). I added the file
> /etc/X11/xorg.conf.d/secondary-gpu.conf with these contents:
> 
>> Section "Device"
>> Identifier     "Device0"
>> Driver         "radeon"
>> VendorName     "AMD Corporation"
>> BoardName      "AMD Secondary"
>> BusID          "PCI:6:0:0"
>> EndSection
> 
> And that's it! Now when the machine boots, it shows POST messages and
> the bootloader on the first GPU, but as soon as a boot option is
> selected that display goes blank and kernel boot messages show on the
> second GPU. After boot you can assign the first GPU to a VM as usual
> and it works.
> 
> HELP REQUEST: could someone with Intel hardware (ideally the X99
> chipset) test this method? I am planning a build, and if this works I
> could settle for a 28-lane CPU and save a couple hundred dollars.
> Intel's 40-lane CPUs are way overpriced. With 28-lane CPUs only the
> first slot can run at x16 speed, while the other slots drop to x8 or
> less. Anyhow, I would love to hear whether this works on Intel hardware.
> 
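As a side note on step 2 above, here is a minimal sketch of how the ID
list could be assembled instead of copied by hand. It assumes the plain
"lspci -n" line format quoted above (slot, class, vendor:device) with no
PCI domain prefix; the function name vfio_options_line is made up for
illustration and is not part of any tool.

```shell
# Hypothetical helper: read "lspci -n" output on stdin and print a
# modprobe options line covering every function of the given slot.
vfio_options_line() {
    slot=$1    # bus:device without the function number, e.g. 01:00
    awk -v slot="$slot" '
        # Keep lines whose first field is slot.<function>.
        index($1, slot ".") == 1 {
            ids = (ids == "" ? $3 : ids "," $3)
        }
        END {
            # Print nothing at all if the slot was not found.
            if (ids != "") print "options vfio-pci ids=" ids
        }
    '
}
```

It would be run as "lspci -n | vfio_options_line 01:00", and the printed
line checked by eye before it goes into /etc/modprobe.d/kvm.conf.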

Hi,

I have a Gigabyte GA-X99-UD4 motherboard and i7-5820K. There are two GPUs
in it - a GTX 970 for pass-through on 03:00.0 and a GT 730 as primary GPU
on 06:00.0. The PCIE slot of the GT is selected as primary video output
in the UEFI settings. All text and graphics output goes to it - the output
of the GTX card remains off the entire time until the VM is booted. The X
server does see both cards but since the nvidia module is prevented from
binding to the GTX, X cannot use it and starts on the GT. No fiddling with
the console driver parameters necessary.

Distribution:
    Arch Linux, 4.6.4-1-ARCH

Kernel parameters:
    ... pci-stub.ids=10de:13c2,10de:0fbb,8086:8d20 nvidia-drm.modeset=1 ...

/etc/modprobe.d/vfio.conf:
    options vfio-pci ids=10de:13c2,10de:0fbb,8086:8d20

/etc/mkinitcpio.conf:
    ...
    MODULES="vfio vfio_iommu_type1 vfio_pci vfio_virqfd vfat aes_x86_64 crc32c_intel nvidia nvidia_modeset nvidia_uvm nvidia_drm"
    ...

/etc/X11/xorg.conf.d/20-nvidia.conf:
    Section "Device"
     Identifier                "Device0"
     Driver                    "nvidia"
     VendorName                "NVIDIA Corporation"
     Option "ConnectToAcpid"   "0"
    EndSection
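
To confirm that a configuration like the one above took effect, one can
look at which driver each card ended up with. This is only a sketch: it
assumes the usual sysfs layout, where a bound device has a "driver"
symlink, and it assumes PCI domain 0000 for the addresses. The function
name bound_driver and the SYSFS_PCI variable are invented here.

```shell
# Hypothetical helper: print the driver bound to a PCI device, or "none".
# SYSFS_PCI is an illustrative override so the function can be tried
# against a fake directory tree; it defaults to the real sysfs path.
bound_driver() {
    dev=$1    # full address including the domain, e.g. 0000:03:00.0
    link="${SYSFS_PCI:-/sys/bus/pci/devices}/${dev}/driver"
    if [ -e "$link" ]; then
        # The driver entry is a symlink whose target is named after
        # the driver, so the last path component is the driver name.
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}
```

With the setup described here, "bound_driver 0000:03:00.0" should name
vfio-pci or pci-stub for the GTX, and "bound_driver 0000:06:00.0" should
name nvidia for the GT.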

The only problem with my setup is that the GTX is in PCIE_2, which works
as x8 with i7-5820K installed. I cannot fit the card in PCIE_1 because of
the oversized CPU cooler. This doesn't actually bug me at all as multiple
tests (for example, [1]) have shown negligible difference in gaming FPS
between PCI-e 3.0 x8 and x16. The GT card is in PCIE_4.
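
The negotiated link width of a slot can be read from sysfs rather than
taken from the manual. A small sketch, assuming the current_link_width
and max_link_width attributes that recent kernels expose for PCIe
devices and PCI domain 0000; link_width and SYSFS_PCI are names invented
for this example.

```shell
# Hypothetical helper: print negotiated vs. maximum PCIe link width.
# SYSFS_PCI is an illustrative override for trying this on a fake tree.
link_width() {
    dev=$1    # full address including the domain, e.g. 0000:03:00.0
    base="${SYSFS_PCI:-/sys/bus/pci/devices}/${dev}"
    cur=$(cat "$base/current_link_width")    # lanes currently in use
    max=$(cat "$base/max_link_width")        # lanes the device supports
    echo "x${cur} of x${max}"
}
```

For a x16 card sitting in a slot that runs at x8, "link_width
0000:03:00.0" would be expected to report x8 of x16.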

Kind regards,
Hristo

[1] http://www.gamersnexus.net/guides/2488-pci-e-3-x8-vs-x16-performance-impact-on-gpus

> Rokas Kupstys
> 
> On 2016.08.05 10:34, Rokas Kupstys wrote:
> 
> I think I got halfway there. My primary GPU is at 0000:01:00.0 and the
> secondary is at 0000:06:00.0. I used the following Xorg config:
> 
> Section "Device"
> Identifier     "Device0"
> Driver         "radeon"
> VendorName     "AMD Corporation"
> BoardName      "AMD Secondary"
> BusID          "PCI:6:0:0"
> EndSection
> 
> After booting, 0000:06:00.0 was still bound to vfio-pci (I have yet to
> sort out why, as I removed the modprobe configs and kernel parameters),
> so I ran the following script to bind the GPU to the correct driver:
> 
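> #!/bin/bash
> 
> # Detach a PCI device from its current driver and wait until it is gone.
> unbind() {
>     local dev=$1
>     if [ -e "/sys/bus/pci/devices/${dev}/driver" ]; then
>         echo "${dev}" > "/sys/bus/pci/devices/${dev}/driver/unbind"
>         while [ -e "/sys/bus/pci/devices/${dev}/driver" ]; do
>             sleep 0.1
>         done
>     fi
> }
> 
> # Register the device's vendor/device ID with a driver, then bind it.
> bind() {
>     local dev=$1
>     local driver=$2
>     local vendor device
>     vendor=$(cat "/sys/bus/pci/devices/${dev}/vendor")
>     device=$(cat "/sys/bus/pci/devices/${dev}/device")
>     echo "${vendor} ${device}" > "/sys/bus/pci/drivers/${driver}/new_id"
>     # Writing new_id can already trigger a probe that binds the device,
>     # so only bind explicitly if it is still driverless.
>     if [ ! -e "/sys/bus/pci/devices/${dev}/driver" ]; then
>         echo "${dev}" > "/sys/bus/pci/drivers/${driver}/bind"
>     fi
> }
> 
> unbind "0000:06:00.0"
> bind "0000:06:00.0" "radeon"
> #unbind "0000:01:00.0"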
> 
> After restarting sddm.service (the display manager) I could switch to
> the secondary GPU and log in to the desktop. All of that worked. The
> problem is that I cannot unbind 0000:01:00.0 so that I can pass it
> through. Attempting to unbind its driver froze the display; even the
> secondary GPU froze.
> 
> Rokas Kupstys
> 
> On 2016.08.05 04:55, Nicolas Roy-Renaud wrote:
> 
> That's something you should fix in the BIOS. The boot GPU is special
> because the motherboard has to use it to display things such as POST
> messages and such, so it's already "tainted" by the time the kernel
> gets a hold of it. I had to put my guest GPU on my motherboard's
> second PCI slot because of that (can't change the boot GPU in the BIOS
> settings), which is pretty inconvenient because it blocks access to
> most of my sata ports.
> 
> If there's a way to cleanly pass the boot GPU to a VM, I don't know
> about it. I'd be interested to know too, however.
> 
> - Nicolas
> 
> On 2016-08-04 13:59, Rokas Kupstys wrote:
> 
> Hey, is it possible to make the kernel use a GPU other than the one in
> the first slot? If so, how?
> 
> I have multiple PCIe slots but only the first can run at max speed, so I
> would like to use it for VGA passthrough. However, if I put a powerful
> GPU into the first slot, Linux boots using that GPU. I would like to
> make the kernel use the GPU in slot 3. The result should be the BIOS and
> bootloader running on the GPU in slot #1, while the kernel uses the GPU
> in slot #3. I tried binding the first GPU to the vfio-pci driver, hoping
> the kernel would use the next available GPU. That did not work: I could
> see one line with the systemd version in a low-res console (normally
> it's high-res). I also tried fbcon=map:1234 (not being exactly sure what
> I'm doing), but that yielded a black screen. Not sure what else I could
> try.
>