[libvirt-users] Using hostdev to plug a PCI-E host device into Q35 pcie-root port
Thomas Kuther
tom at kuther.net
Tue Nov 19 13:51:59 UTC 2013
On 19.11.2013 11:36, Laine Stump wrote:
> On 11/15/2013 03:35 PM, Thomas Kuther wrote:
>> Hello,
>>
>> I'm trying to migrate a working qemu command line configuration to
>> libvirt.
>> The part I'm currently failing on is:
>>
>> $ qemu-system-x86_64 -M Q35 ... -device
>> vfio-pci,host=05:00.0,bus=pcie.0
>>
>> The right way to translate this into libvirt XML seems to be using
>> <hostdev>, but I seem to be unable to plug it into the pcie-root port.
>>
>> This is what the relevant part looks like when I let "virsh edit"
>> generate an <address>:
>>
>> <controller type='pci' index='0' model='pcie-root'/>
>> <controller type='pci' index='1' model='dmi-to-pci-bridge'>
>> <address type='pci' domain='0x0000' bus='0x00' slot='0x02'
>> function='0x0'/>
>> </controller>
>> <controller type='pci' index='2' model='pci-bridge'>
>> <address type='pci' domain='0x0000' bus='0x01' slot='0x01'
>> function='0x0'/>
>> </controller>
>> [...]
>> <hostdev mode='subsystem' type='pci' managed='yes'>
>> <driver name='vfio'/>
>> <source>
>> <address domain='0x0000' bus='0x03' slot='0x00'
>> function='0x0'/>
>> </source>
>> <address type='pci' domain='0x0000' bus='0x02' slot='0x06'
>> function='0x0'/>
>> </hostdev>
>> [...]
>>
>> To my understanding, this will plug the host device into the
>> pci-bridge controller.
>> The guest OS doesn't boot with this and resets right after the BIOS.
>
> Ugh. That's very unfortunate. This is the first report I've heard of
> something failing in such a bad way due to being plugged into a
> pci-bridge slot; up until now I'd only heard that there is some extra
> PCIe functionality that would be missing if a device was plugged into a
> PCI slot vs. PCIe.
>
> Can I ask what type of device this is?
>
It's a Marvell 88SE9172 SATA controller; here is the lspci -vvv output:
03:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9172 SATA
6Gb/s Controller (rev 11) (prog-if 01 [AHCI 1.0])
Subsystem: Gigabyte Technology Co., Ltd Device b000
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 47
Region 0: I/O ports at d040 [disabled] [size=8]
Region 1: I/O ports at d030 [disabled] [size=4]
Region 2: I/O ports at d020 [disabled] [size=8]
Region 3: I/O ports at d010 [disabled] [size=4]
Region 4: I/O ports at d000 [disabled] [size=16]
Region 5: Memory at f7610000 (32-bit, non-prefetchable)
[disabled] [size=512]
Expansion ROM at f7600000 [disabled by cmd] [size=64K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot+,D3cold-)
Status: D3 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit-
Address: 00000000 Data: 0000
Capabilities: [70] Express (v2) Legacy Endpoint, MSI 00
DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s
<1us, L1 <8us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal- Fatal-
Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr-
TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1,
Latency L0 <512ns, L1 <64us
ClockPM- Surprise- LLActRep- BwNot-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain-
CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x1, TrErr- Train- SlotClk+
DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Not Supported, TimeoutDis+,
LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-,
LTR-, OBFF Disabled
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance-
SpeedDis-
Transmit Margin: Normal Operating Range,
EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -3.5dB,
EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-,
LinkEqualizationRequest-
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt-
UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout-
NonFatalErr+
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout-
NonFatalErr+
AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap-
ChkEn-
Kernel driver in use: vfio-pci
The second one I'm trying to pass through is a Renesas uPD720201 USB 3.0
Host Controller, but first I wanted to get the SATA controller working
in libvirt. I will try to leave out the SATA controller and see what
happens with only the USB3 controller.
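
For reference, the host address that lspci reports (03:00.0 above) is what
goes into the <address> element inside <source>. A minimal Python sketch of
that mapping follows; the helper name host_pci_to_libvirt is hypothetical
and not part of libvirt, and the domain is assumed to be 0000 when lspci
omits it:

```python
import re

# lspci prints addresses as [dddd:]bb:ss.f, all fields in hexadecimal.
_PCI_ADDR = re.compile(
    r"(?:([0-9a-fA-F]{4}):)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])"
)


def host_pci_to_libvirt(addr: str) -> str:
    """Return a libvirt <address/> element for an lspci-style address.

    Hypothetical helper for illustration only; it is not a libvirt API.
    """
    match = _PCI_ADDR.fullmatch(addr.strip())
    if match is None:
        raise ValueError(f"not a PCI address: {addr!r}")
    domain, bus, slot, function = match.groups()
    if int(slot, 16) > 0x1F:
        # A PCI slot (device) number is only 5 bits wide.
        raise ValueError(f"slot out of range: {addr!r}")
    domain = domain or "0000"  # assumption: default PCI domain
    return (
        f"<address domain='0x{domain.lower()}' bus='0x{bus.lower()}' "
        f"slot='0x{slot.lower()}' function='0x{function}'/>"
    )


if __name__ == "__main__":
    print(host_pci_to_libvirt("03:00.0"))
```

For 03:00.0 this yields the same <source> address as in the quoted XML.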
>>
>> Manually setting
>> <address type='pci' domain='0x0000' bus='0x00' slot='0x1E'
>> function='0x0'/>
>> causes an XML validation failure.
>>
>> Is there any way in libvirt XML to plug a host's PCI-E device directly
>> into the pcie-root port, like it works on qemu command line?
>
>
> I'm sorry to say, no. With very few (and specific) exceptions, libvirt
> insists that all guest devices be plugged into a hot-pluggable PCI slot
> - this eliminates both the PCIe "root complex" (a.k.a. pcie.0) as well
> as the dmi-to-pci-bridge that is plugged into pcie.0 (because a
> dmi-to-pci-bridge's slots don't support hot-plug).
>
> This is done because, for now, almost all devices that qemu knows about
> are PCI (not PCIe) devices, and if we allowed plugging them into pcie.0
> now, then on the day in the future when qemu begins enforcing the
> difference between PCI and PCIe (currently it doesn't), the world would
> be full of libvirt configs that would no longer work.
>
> There was some discussion about this a month or two ago either on
> libvir-list or maybe it was the qemu-devel list. We decided that qemu
> needs to provide some sort of introspection of the devices' connection
> types so that libvirt can determine what device can plug into which
> slots; at that time we'll be able to allow exactly what's proper in each
> case. In the meantime we're stuck with being overly cautious in order to
> prevent future catastrophe.
>
Understood, thanks for the explanation.
>>
>> I'm aware I could use something like
>>
>> <qemu:commandline>
>> <qemu:arg value='-device'/>
>> <qemu:arg value='vfio-pci,host=05:00.0,bus=pcie.0'/>
>> </qemu:commandline>
>>
>> but I insist on running the VM as non-root, and if I got that right I
>> need to configure at least one vfio device (or memory locking) in
>> order for libvirt to set a proper RLIMIT_MEMLOCK value.
>>
>> Any help would be appreciated.
>
> For now at least, you'll need to let it plug into the pci-bridge device
> pci.2 (which, as you've found, libvirt will automatically find when you
> don't specify any address). Unfortunately that doesn't do you much good,
> since that particular device you're assigning actually requires that it
> be plugged into the PCIe bus.
>
> I'm wondering as I type if possibly we could relax the enforcement of
> the "PCI only" rule such that we allow explicitly placing any device on
> any type of bus, but only auto-assign to a plain PCI slot. That may be a
> reasonable compromise until qemu has the required new device/controller
> introspection info available.
>
I like the idea.
Regards,
Thomas
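
To illustrate why bus='0x02' in the quoted <hostdev> lands on the pci-bridge:
the bus attribute of a guest-side PCI <address> names the index of the PCI
controller the device sits on, and each controller's own <address> names its
parent. Below is a rough Python sketch that walks this chain for the XML
quoted in this thread. The surrounding <domain>/<devices> wrapper and the
function names are assumptions added only to make the fragment parseable:

```python
import xml.etree.ElementTree as ET

# The controllers and hostdev from this thread, wrapped in a minimal
# <domain>/<devices> skeleton (the wrapper is an assumption).
DOMAIN_XML = """
<domain type='kvm'>
  <devices>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='dmi-to-pci-bridge'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </controller>
    <controller type='pci' index='2' model='pci-bridge'>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x01' function='0x0'/>
    </controller>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x06' function='0x0'/>
    </hostdev>
  </devices>
</domain>
"""


def hostdev_guest_buses(domain_xml):
    """Return the guest-side bus number of every <hostdev> with an address."""
    root = ET.fromstring(domain_xml)
    buses = []
    for hostdev in root.iter("hostdev"):
        # find() looks at direct children only, so this is the guest
        # address, not the host address nested inside <source>.
        addr = hostdev.find("address")
        if addr is not None:
            buses.append(int(addr.get("bus"), 16))
    return buses


def controller_chain(domain_xml, bus):
    """List controller models from the given bus up to the root bus."""
    root = ET.fromstring(domain_xml)
    controllers = {}
    for ctrl in root.iter("controller"):
        if ctrl.get("type") != "pci":
            continue
        addr = ctrl.find("address")
        parent = int(addr.get("bus"), 16) if addr is not None else None
        controllers[int(ctrl.get("index"))] = (ctrl.get("model"), parent)
    chain = []
    while bus is not None:
        model, parent = controllers[bus]
        chain.append(model)
        bus = parent
    return chain


if __name__ == "__main__":
    for bus in hostdev_guest_buses(DOMAIN_XML):
        print(bus, " -> ".join(controller_chain(DOMAIN_XML, bus)))
```

With the configuration above, the hostdev resolves to bus 2, i.e. the
pci-bridge, which hangs off the dmi-to-pci-bridge on pcie-root.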