[libvirt-users] Using hostdev to plug a PCI-E host device into Q35 pcie-root port

Thomas Kuther tom at kuther.net
Tue Nov 19 13:51:59 UTC 2013


On 19.11.2013 11:36, Laine Stump wrote:
> On 11/15/2013 03:35 PM, Thomas Kuther wrote:
>> Hello,
>> 
>> I'm trying to migrate a working qemu command line configuration to
>> libvirt.
>> The part I'm currently failing on is:
>> 
>> $ qemu-system-x86_64 -M Q35 ... -device vfio-pci,host=05:00.0,bus=pcie.0
>> 
>> The right way to translate this into libvirt XML seems to be using
>> <hostdev>, but I seem to be unable to plug it into the pcie-root port.
>> 
>> This is what the relevant part looks like when I let "virsh edit"
>> generate an <address>:
>> 
>>     <controller type='pci' index='0' model='pcie-root'/>
>>     <controller type='pci' index='1' model='dmi-to-pci-bridge'>
>>       <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
>>     </controller>
>>     <controller type='pci' index='2' model='pci-bridge'>
>>       <address type='pci' domain='0x0000' bus='0x01' slot='0x01' function='0x0'/>
>>     </controller>
>>     [...]
>>     <hostdev mode='subsystem' type='pci' managed='yes'>
>>       <driver name='vfio'/>
>>       <source>
>>         <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
>>       </source>
>>       <address type='pci' domain='0x0000' bus='0x02' slot='0x06' function='0x0'/>
>>     </hostdev>
>>     [...]
>> 
>> As I understand it, this plugs the host device into the pci-bridge
>> controller. The guest OS doesn't boot with this and resets right after
>> the BIOS.
> 
> Ugh. That's very unfortunate. This is the first report I've heard of
> something failing in such a bad way due to being plugged into a
> pci-bridge slot; up until now I'd only heard that there is some extra
> PCIe functionality that would be missing if a device was plugged into a
> PCI slot vs. PCIe.
> 
> Can I ask what type of device this is?
> 

It's a Marvell 88SE9172 SATA controller. Here is the lspci -vvv output:

03:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller (rev 11) (prog-if 01 [AHCI 1.0])
         Subsystem: Gigabyte Technology Co., Ltd Device b000
         Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
         Interrupt: pin A routed to IRQ 47
         Region 0: I/O ports at d040 [disabled] [size=8]
         Region 1: I/O ports at d030 [disabled] [size=4]
         Region 2: I/O ports at d020 [disabled] [size=8]
         Region 3: I/O ports at d010 [disabled] [size=4]
         Region 4: I/O ports at d000 [disabled] [size=16]
         Region 5: Memory at f7610000 (32-bit, non-prefetchable) [disabled] [size=512]
         Expansion ROM at f7600000 [disabled by cmd] [size=64K]
         Capabilities: [40] Power Management version 3
                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
                 Status: D3 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
         Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit-
                 Address: 00000000  Data: 0000
         Capabilities: [70] Express (v2) Legacy Endpoint, MSI 00
                 DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us
                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                 DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                         RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                         MaxPayload 128 bytes, MaxReadReq 512 bytes
                 DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
                 LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <512ns, L1 <64us
                         ClockPM- Surprise- LLActRep- BwNot-
                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                 LnkSta: Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                 DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
                 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                 LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                          Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                          Compliance De-emphasis: -6dB
                 LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
                          EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
         Capabilities: [100 v1] Advanced Error Reporting
                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                 AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
         Kernel driver in use: vfio-pci


The second one I'm trying to pass through is a Renesas uPD720201 USB 3.0 
Host Controller, but first I wanted to get the SATA controller working 
in libvirt. I will try to leave out the SATA controller and see what 
happens with only the USB3 controller.


>> 
>> Manually setting
>> <address type='pci' domain='0x0000' bus='0x00' slot='0x1E' function='0x0'/>
>> causes an XML validation failure.
>> 
>> Is there any way in libvirt XML to plug a host's PCI-E device directly
>> into the pcie-root port, as is possible on the qemu command line?
> 
> 
> I'm sorry to say, no. With very few (and specific) exceptions, libvirt
> insists that all guest devices be plugged into a hot-pluggable PCI slot.
> This rules out both the PCIe "root complex" (a.k.a. pcie.0) and the
> dmi-to-pci-bridge that is plugged into pcie.0 (because the
> dmi-to-pci-bridge's slots don't support hot-plug).
> 
> This is done because, for now, almost all devices that qemu knows about
> are PCI (not PCIe) devices, and if we allowed plugging them into pcie.0
> now, then on the day in the future when qemu begins enforcing the
> difference between PCI and PCIe (currently it doesn't), the world would
> be full of libvirt configs that would no longer work.
> 
> There was some discussion about this a month or two ago either on
> libvir-list or maybe it was the qemu-devel list. We decided that qemu
> needs to provide some sort of introspection of the devices' connection
> types so that libvirt can determine what device can plug into which
> slots; at that time we'll be able to allow exactly what's proper in each
> case. In the meantime we're stuck with being overly cautious in order to
> prevent future catastrophe.
> 

Understood, thanks for the explanation.

>> 
>> I'm aware I could use something like
>> 
>>   <qemu:commandline>
>>     <qemu:arg value='-device'/>
>>     <qemu:arg value='vfio-pci,host=05:00.0,bus=pcie.0'/>
>>   </qemu:commandline>
>> 
>> but I insist on running the VM as non-root, and if I got that right I
>> need to configure at least one vfio device (or memory locking) in
>> order for libvirt to set a proper RLIMIT_MEMLOCK value.
>> 
>> Any help would be appreciated.
> 
> For now at least, you'll need to let it plug into the pci-bridge device
> pci.2 (which, as you've found, libvirt will automatically find when you
> don't specify any address). Unfortunately that doesn't do you much good,
> since that particular device you're assigning actually requires that it
> be plugged into the PCIe bus.
> 
> I'm wondering as I type if possibly we could relax the enforcement of
> the "PCI only" rule such that we allow explicitly placing any device on
> any type of bus, but only auto-assign to a plain PCI slot. That may be a
> reasonable compromise until qemu has the required new device/controller
> introspection info available.
> 

I like the idea.
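
To make the proposal concrete, here is a rough sketch of what I would expect
to be able to write if explicit addresses on pcie.0 were accepted. This is
only an assumption about how such a relaxed rule might look, not something
the current libvirt validates; the slot number is simply the one I tried
above and is otherwise arbitrary.

```xml
<!-- Hypothetical: only valid if libvirt relaxes the "PCI only" rule
     for explicitly specified addresses. Rejected by current libvirt. -->
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <!-- host address of the SATA controller, as in my config above -->
    <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
  </source>
  <!-- bus='0x00' is the pcie-root controller (index 0), i.e. pcie.0 -->
  <address type='pci' domain='0x0000' bus='0x00' slot='0x1e' function='0x0'/>
</hostdev>
```

Devices without an explicit <address> would presumably still be
auto-assigned to the pci-bridge, as they are today.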


Regards,
Thomas



