<div dir="ltr"><div>Sorry, here is the updated information.</div><div><br></div><div><br></div><div>I installed CentOS 7.3 on my 8-GPU machine yesterday and passthrough worked: the guest OS can use the GPU with no problem. So I think this is a software problem, and I need to apply some patch.</div><div><br></div><div>I also ran a test on my 4-GPU machine without any software change, and it succeeded. The 4 GPUs are attached directly to the PCI root without a PCIe switch, so I think the software problem is related to the PCIe switch.</div><div><br></div><div>Thank you, Alex.</div><div><div><br></div><div>[root@64 /data]# lspci|grep NV</div><div>04:00.0 3D controller: NVIDIA Corporation Device 17fd (rev a1)</div><div>05:00.0 3D controller: NVIDIA Corporation Device 1b38 (rev a1)</div><div>08:00.0 3D controller: NVIDIA Corporation Device 1b38 (rev a1)</div><div>09:00.0 3D controller: NVIDIA Corporation Device 1b38 (rev a1)</div><div>85:00.0 3D controller: NVIDIA Corporation Device 1b38 (rev a1)</div><div>86:00.0 3D controller: NVIDIA Corporation Device 17fd (rev a1)</div><div>89:00.0 3D controller: NVIDIA Corporation Device 1b38 (rev a1)</div><div>8a:00.0 3D controller: NVIDIA Corporation Device 1b38 (rev a1)</div></div><div><br></div><div><br></div><div><div>[root@64 /data]# lspci -t</div><div>-+-[0000:ff]-+-08.0</div><div> |           +-08.2</div><div> |           +-08.3</div><div> |           +-09.0</div><div> |           +-09.2</div><div> |           +-09.3</div><div> |           +-0b.0</div><div> |           +-0b.1</div><div> |           +-0b.2</div><div> |           +-0b.3</div><div> |           +-0c.0</div><div> |           +-0c.1</div><div> |           +-0c.2</div><div> |           +-0c.3</div><div> |           +-0c.4</div><div> |           +-0c.5</div><div> |           +-0c.6</div><div> |           +-0c.7</div><div> |           +-0d.0</div><div> |           +-0d.1</div><div> |           +-0d.2</div><div> |           +-0d.3</div><div> |           
+-0d.4</div><div> |           +-0d.5</div><div> |           +-0f.0</div><div> |           +-0f.1</div><div> |           +-0f.2</div><div> |           +-0f.3</div><div> |           +-0f.4</div><div> |           +-0f.5</div><div> |           +-0f.6</div><div> |           +-10.0</div><div> |           +-10.1</div><div> |           +-10.5</div><div> |           +-10.6</div><div> |           +-10.7</div><div> |           +-12.0</div><div> |           +-12.1</div><div> |           +-12.4</div><div> |           +-12.5</div><div> |           +-13.0</div><div> |           +-13.1</div><div> |           +-13.2</div><div> |           +-13.3</div><div> |           +-13.6</div><div> |           +-13.7</div><div> |           +-14.0</div><div> |           +-14.1</div><div> |           +-14.2</div><div> |           +-14.3</div><div> |           +-14.4</div><div> |           +-14.5</div><div> |           +-14.6</div><div> |           +-14.7</div><div> |           +-16.0</div><div> |           +-16.1</div><div> |           +-16.2</div><div> |           +-16.3</div><div> |           +-16.6</div><div> |           +-16.7</div><div> |           +-17.0</div><div> |           +-17.1</div><div> |           +-17.2</div><div> |           +-17.3</div><div> |           +-17.4</div><div> |           +-17.5</div><div> |           +-17.6</div><div> |           +-17.7</div><div> |           +-1e.0</div><div> |           +-1e.1</div><div> |           +-1e.2</div><div> |           +-1e.3</div><div> |           +-1e.4</div><div> |           +-1f.0</div><div> |           \-1f.2</div><div> +-[0000:80]-+-00.0-[81]--+-00.0</div><div> |           |            \-00.1</div><div> |           +-01.0-[82]----00.0</div><div> |           +-02.0-[83-86]----00.0-[84-86]--+-08.0-[85]----00.0</div><div> |           |                                            \-10.0-[86]----00.0</div><div> |           +-03.0-[87-8a]----00.0-[88-8a]--+-08.0-[89]----00.0</div><div> |           |                                          
  \-10.0-[8a]----00.0</div><div> |           +-04.0</div><div> |           +-04.1</div><div> |           +-04.2</div><div> |           +-04.3</div><div> |           +-04.4</div><div> |           +-04.5</div><div> |           +-04.6</div><div> |           +-04.7</div><div> |           +-05.0</div><div> |           +-05.1</div><div> |           +-05.2</div><div> |           \-05.4</div><div> +-[0000:7f]-+-08.0</div><div> |           +-08.2</div><div> |           +-08.3</div><div> |           +-09.0</div><div> |           +-09.2</div><div> |           +-09.3</div><div> |           +-0b.0</div><div> |           +-0b.1</div><div> |           +-0b.2</div><div> |           +-0b.3</div><div> |           +-0c.0</div><div> |           +-0c.1</div><div> |           +-0c.2</div><div> |           +-0c.3</div><div> |           +-0c.4</div><div> |           +-0c.5</div><div> |           +-0c.6</div><div> |           +-0c.7</div><div> |           +-0d.0</div><div> |           +-0d.1</div><div> |           +-0d.2</div><div> |           +-0d.3</div><div> |           +-0d.4</div><div> |           +-0d.5</div><div> |           +-0f.0</div><div> |           +-0f.1</div><div> |           +-0f.2</div><div> |           +-0f.3</div><div> |           +-0f.4</div><div> |           +-0f.5</div><div> |           +-0f.6</div><div> |           +-10.0</div><div> |           +-10.1</div><div> |           +-10.5</div><div> |           +-10.6</div><div> |           +-10.7</div><div> |           +-12.0</div><div> |           +-12.1</div><div> |           +-12.4</div><div> |           +-12.5</div><div> |           +-13.0</div><div> |           +-13.1</div><div> |           +-13.2</div><div> |           +-13.3</div><div> |           +-13.6</div><div> |           +-13.7</div><div> |           +-14.0</div><div> |           +-14.1</div><div> |           +-14.2</div><div> |           +-14.3</div><div> |           +-14.4</div><div> |           +-14.5</div><div> |           +-14.6</div><div> |           
+-14.7</div><div> |           +-16.0</div><div> |           +-16.1</div><div> |           +-16.2</div><div> |           +-16.3</div><div> |           +-16.6</div><div> |           +-16.7</div><div> |           +-17.0</div><div> |           +-17.1</div><div> |           +-17.2</div><div> |           +-17.3</div><div> |           +-17.4</div><div> |           +-17.5</div><div> |           +-17.6</div><div> |           +-17.7</div><div> |           +-1e.0</div><div> |           +-1e.1</div><div> |           +-1e.2</div><div> |           +-1e.3</div><div> |           +-1e.4</div><div> |           +-1f.0</div><div> |           \-1f.2</div><div> \-[0000:00]-+-00.0</div><div>             +-01.0-[01]--</div><div>             +-02.0-[02-05]----00.0-[03-05]--+-08.0-[04]----00.0</div><div>             |                                            \-10.0-[05]----00.0</div><div>             +-03.0-[06-09]----00.0-[07-09]--+-08.0-[08]----00.0</div><div>             |                                            \-10.0-[09]----00.0</div><div>             +-04.0</div><div>             +-04.1</div><div>             +-04.2</div><div>             +-04.3</div><div>             +-04.4</div><div>             +-04.5</div><div>             +-04.6</div><div>             +-04.7</div><div>             +-05.0</div><div>             +-05.1</div><div>             +-05.2</div><div>             +-05.4</div><div>             +-11.0</div><div>             +-11.4</div><div>             +-14.0</div><div>             +-16.0</div><div>             +-16.1</div><div>             +-1a.0</div><div>             +-1c.0-[0a]--</div><div>             +-1c.7-[0b-0c]----00.0-[0c]----00.0</div><div>             +-1d.0</div><div>             +-1f.0</div><div>             +-1f.2</div></div><div class="gmail_extra"><br></div><div class="gmail_extra"><br></div><div class="gmail_extra">and the xml </div><div class="gmail_extra"><br></div><div class="gmail_extra"><div class="gmail_extra"><domain type='kvm'></div><div 
class="gmail_extra">  <name>win</name></div><div class="gmail_extra">  <uuid>a2021423-89d8-4a33-aaa5-07102ae7ad4e</uuid></div><div class="gmail_extra">  <memory unit='KiB'>8388608</memory></div><div class="gmail_extra">  <currentMemory unit='KiB'>8388608</currentMemory></div><div class="gmail_extra">  <vcpu placement='static' cpuset='0-8'>8</vcpu></div><div class="gmail_extra">  <sysinfo type='smbios'></div><div class="gmail_extra">    <system></div><div class="gmail_extra">      <entry name='serial'>21aa32e5-8233-40d4-b323-128824f6becf</entry></div><div class="gmail_extra">      <entry name='uuid'>a2021423-89d8-4a33-aaa5-07102ae7ad4e</entry></div><div class="gmail_extra">    </system></div><div class="gmail_extra">  </sysinfo></div><div class="gmail_extra">  <os></div><div class="gmail_extra">    <type arch='x86_64' machine='pc'>hvm</type></div><div class="gmail_extra">    <boot dev='hd'/></div><div class="gmail_extra">    <smbios mode='sysinfo'/></div><div class="gmail_extra">  </os></div><div class="gmail_extra">  <features></div><div class="gmail_extra">    <acpi/></div><div class="gmail_extra">    <apic/></div><div class="gmail_extra">    <pae/></div><div class="gmail_extra">    <hap/></div><div class="gmail_extra">    <hyperv></div><div class="gmail_extra">      <relaxed state='on'/></div><div class="gmail_extra">    </hyperv></div><div class="gmail_extra">  </features></div><div class="gmail_extra">  <cpu></div><div class="gmail_extra">    <topology sockets='2' cores='6' threads='2'/></div><div class="gmail_extra">  </cpu></div><div class="gmail_extra">  <clock offset='localtime'></div><div class="gmail_extra">    <timer name='pit' tickpolicy='delay'/></div><div class="gmail_extra">    <timer name='rtc' tickpolicy='catchup' track='guest'/></div><div class="gmail_extra">    <timer name='hpet' present='no'/></div><div class="gmail_extra">  </clock></div><div class="gmail_extra">  <on_poweroff>destroy</on_poweroff></div><div class="gmail_extra">  
<on_reboot>restart</on_reboot></div><div class="gmail_extra">  <on_crash>restart</on_crash></div><div class="gmail_extra">  <devices></div><div class="gmail_extra">    <emulator>/usr/local/bin/qemu-system-x86_64</emulator></div><div class="gmail_extra">    <disk type='file' device='disk'></div><div class="gmail_extra">      <driver name='qemu' type='qcow2' cache='none'/></div><div class="gmail_extra">      <source file='/data/win.qcow2'/></div><div class="gmail_extra">      <target dev='vda' bus='virtio'/></div><div class="gmail_extra">      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/></div><div class="gmail_extra">    </disk></div><div class="gmail_extra">    <controller type='usb' index='0'></div><div class="gmail_extra">      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/></div><div class="gmail_extra">    </controller></div><div class="gmail_extra">    <serial type='pty'></div><div class="gmail_extra">      <target port='0'/></div><div class="gmail_extra">    </serial></div><div class="gmail_extra">    <console type='pty'></div><div class="gmail_extra">      <target type='serial' port='0'/></div><div class="gmail_extra">    </console></div><div class="gmail_extra">    <input type='tablet' bus='usb'/></div><div class="gmail_extra">    <input type='mouse' bus='ps2'/></div><div class="gmail_extra">    <graphics type='vnc' port='-1' autoport='yes' listen='0.0.0.0' keymap='en-us'></div><div class="gmail_extra">      <listen type='address' address='0.0.0.0'/></div><div class="gmail_extra">    </graphics></div><div class="gmail_extra">    <video></div><div class="gmail_extra">      <model type='cirrus' vram='9216' heads='1'/></div><div class="gmail_extra">      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/></div><div class="gmail_extra">    </video></div><div class="gmail_extra">    <memballoon model='virtio'></div><div class="gmail_extra">      <address type='pci' domain='0x0000' 
bus='0x00' slot='0x05' function='0x0'/></div><div class="gmail_extra">    </memballoon></div><div class="gmail_extra">    <hostdev mode='subsystem' type='pci' managed='yes'></div><div class="gmail_extra">      <source></div><div class="gmail_extra">        <address domain='0x0000' bus='0x84' slot='0x00' function='0x0'/></div><div class="gmail_extra">      </source></div><div class="gmail_extra">    </hostdev></div><div class="gmail_extra">  </devices></div><div class="gmail_extra"></domain></div></div><div class="gmail_extra"><br></div><div class="gmail_extra"><br></div><div class="gmail_extra"><br><div class="gmail_quote">2017-03-09 21:49 GMT+08:00 Alex Williamson <span dir="ltr"><<a href="mailto:alex.williamson@redhat.com" target="_blank">alex.williamson@redhat.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class="gmail-">On Thu, 9 Mar 2017 11:47:32 +0800<br>
rhett rhett <<a href="mailto:rhett.kernel@gmail.com">rhett.kernel@gmail.com</a>> wrote:<br>
<br>
> Can somebody help me?<br>
<br>
</span>I asked for VM commandline or XML, you haven't provided it.  I asked<br>
for lspci info, you haven't provided it.  Help us help you.<br>
<div class="gmail-HOEnZb"><div class="gmail-h5"><br>
> 2017-03-08 14:34 GMT+08:00 rhett rhett <<a href="mailto:rhett.kernel@gmail.com">rhett.kernel@gmail.com</a>>:<br>
><br>
> > Here are some more error logs from the CentOS guest:<br>
> ><br>
> > Mar  7 05:38:07 localhost kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel<br>
> > Module  375.39  Tue Jan 31 20:47:00 PST 2017 (using threaded interrupts)<br>
> > Mar  7 05:38:08 localhost kernel: nvidia-modeset: Loading NVIDIA Kernel<br>
> > Mode Setting Driver for UNIX platforms  375.39  Tue Jan 31 19:41:48 PST 2017<br>
> > Mar  7 05:39:27 localhost kernel: NVRM: RmInitAdapter failed!<br>
> > (0x24:0x51:1060)<br>
> > Mar  7 05:39:27 localhost kernel: NVRM: rm_init_adapter failed for device<br>
> > bearing minor number 0<br>
> > Mar  7 05:43:40 localhost kernel: NVRM: RmInitAdapter failed!<br>
> > (0x24:0x51:1060)<br>
> > Mar  7 05:43:40 localhost kernel: NVRM: rm_init_adapter failed for device<br>
> > bearing minor number 0<br>
> > Mar  8 05:07:47 localhost kernel: nvidia: module license 'NVIDIA' taints<br>
> > kernel.<br>
> > Mar  8 05:07:47 localhost kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel<br>
> > Module<br>
> ><br>
> > 2017-03-08 14:31 GMT+08:00 rhett rhett <<a href="mailto:rhett.kernel@gmail.com">rhett.kernel@gmail.com</a>>:<br>
> ><br>
> >> I have two guests, a Windows Server 2008 and a CentOS 7.2. In Windows,<br>
> >> Device Manager says the GPU can't start, error code 10.<br>
> >> In CentOS, when I run nvidia-smi, it says no device found.<br>
> >><br>
> >> There is no special VM configuration. With the same config, I can use the GPU<br>
> >> successfully in my two-GPU server. The biggest difference is that that server<br>
> >> has no PCIe switch.<br>
> >><br>
> >> 2017-03-08 11:55 GMT+08:00 Alex Williamson <<a href="mailto:alex.williamson@redhat.com">alex.williamson@redhat.com</a>>:<br>
> >><br>
> >>> On Wed, 8 Mar 2017 11:26:17 +0800<br>
> >>> rhett rhett <<a href="mailto:rhett.kernel@gmail.com">rhett.kernel@gmail.com</a>> wrote:<br>
> >>><br>
> >>> > The two GPUs share the same IRQ, and I found the reason: MSI is<br>
> >>> > disabled later, so IRQ 140 is reused.<br>
> >>> ><br>
> >>> > But I don't know why something calls vfio_pci_ioctl to disable MSI.<br>
> >>><br>
> >>> vfio just does what the guest requests, but you're really providing<br>
> >>> hardly any more information than when you asked off list.  My wild<br>
> >>> guess, is that maybe you're running a Windows guest and not configuring<br>
> >>> the VM for a vCPU type where Windows supports MSI.  For more<br>
> >>> assistance, please provide basic information, like the QEMU command<br>
> >>> line or VM XML, also the PCI information from the host (sudo lspci<br>
> >>> -vvv), and of course any error codes in the guest or an actual<br>
> >>> description of how the device doesn't work in the guest.  Thanks,<br>
> >>><br>
> >>> Alex<br>
> >>><br>
> >>><br>
> >>> > 2017-03-08 10:55 GMT+08:00 rhett rhett <<a href="mailto:rhett.kernel@gmail.com">rhett.kernel@gmail.com</a>>:<br>
> >>> ><br>
> >>> > > I have a question about vfio; here is my description.<br>
> >>> > ><br>
> >>> > > I have 8 GPUs in my server machine, but they are all behind a PCIe<br>
> >>> > > bridge. When I set up vfio passthrough, I can't use the GPUs in my<br>
> >>> > > guest OS.<br>
> >>> > > dmesg shows the following messages:<br>
> >>> > ><br>
> >>> > > [  662.208072] vfio-pci 0000:87:00.0: irq 140 for MSI/MSI-X<br>
> >>> > > [  725.761623] vfio-pci 0000:04:00.0: irq 140 for MSI/MSI-X<br>
> >>> > ><br>
> >>> > > I started two VMs, one using 87 and the other using 04. dmesg shows that<br>
> >>> > > they share the same IRQ 140. Is this normal?<br>
> >>> > ><br>
> >>> > > I also looked at the IOMMU groups: each GPU is in a separate group, with<br>
> >>> > > no other device in the group. Does this mean ACS works correctly?<br>
> >>> > ><br>
> >>> > > I hope to get your help!<br>
> >>> > ><br>
> >>><br>
> >>><br>
> >><br>
> ><br>
<br>
</div></div></blockquote></div><br></div></div>
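<div><br></div><div>For the IOMMU group check mentioned earlier in the thread (each GPU in its own group, with no other device), a minimal sketch of how the groups could be listed on the host is below. It assumes the standard sysfs layout /sys/kernel/iommu_groups/N/devices/ADDRESS; the function names iommu_groups and shared_groups are hypothetical helpers made up for this illustration, not part of vfio or any library.</div>

```python
import os


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses it contains.

    Assumes the sysfs layout root/N/devices/ADDRESS.
    """
    groups = {}
    for name in os.listdir(root):
        devices_dir = os.path.join(root, name, "devices")
        # Skip anything that is not a numbered group directory.
        if name.isdigit() and os.path.isdir(devices_dir):
            groups[int(name)] = sorted(os.listdir(devices_dir))
    return groups


def shared_groups(groups):
    """Return only the groups that hold more than one device."""
    return {g: devs for g, devs in groups.items() if len(devs) > 1}


if __name__ == "__main__":
    root = "/sys/kernel/iommu_groups"
    # Only touch sysfs when the directory exists (IOMMU enabled on the host).
    if os.path.isdir(root):
        for group, devices in sorted(iommu_groups(root).items()):
            print(group, " ".join(devices))
```

<div>If shared_groups returns an empty dict, every device sits alone in its group, which matches the layout described in the original question. This only reports the grouping; it does not by itself show whether MSI setup or the PCIe switch is the cause of the failure.</div>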