[libvirt] new libvirt "pci" controller type and pcie/q35 (was Re: [PATCH 4/7] add pci-bridge controller type)

Alex Williamson alex.williamson at redhat.com
Mon Apr 15 17:27:03 UTC 2013


On Fri, 2013-04-12 at 11:46 -0400, Laine Stump wrote:
> On 04/11/2013 07:23 AM, Michael S. Tsirkin wrote:
> > On Thu, Apr 11, 2013 at 07:03:56AM -0400, Laine Stump wrote:
> >> On 04/10/2013 05:26 AM, Daniel P. Berrange wrote:
> >>> On Tue, Apr 09, 2013 at 04:06:06PM -0400, Laine Stump wrote:
> >>>> On 04/09/2013 04:58 AM, Daniel P. Berrange wrote:
> >>>>> On Mon, Apr 08, 2013 at 03:32:07PM -0400, Laine Stump wrote:
> >>>>> Actually I do wonder if we should represent a PCI root as two
> >>>>> <controller> elements, one representing the actual PCI root
> >>>>> device, and the other representing the host bridge that is
> >>>>> built-in.
> >>>>>
> >>>>> Also we should use the actual model names, not 'pci-root' or
> >>>>> 'pcie-root' but rather i440FX for "pc" machine type, and whatever
> >>>>> the q35 model name is.
> >>>>>
> >>>>>  - One PCI root with built-in PCI bus (ie today's setup)
> >>>>>
> >>>>>    <controller type="pci-root" index="0">
> >>>>>      <model name="i440FX"/>
> >>>>>    </controller>
> >>>>>    <controller type="pci" index="0"> <!-- Host bridge -->
> >>>>>      <address type='pci' domain='0' bus='0' slot='0'/>
> >>>> Isn't this saying that the bridge connects to itself? (since bus 0 is
> >>>> this bus)
> >>>>
> >>>> I understand (again, possibly wrongly) that the builtin PCI bus connects
> >>>> to the chipset using its own slot 0 (that's why it's reserved), but
> >>>> that's its address on itself. How is this bridge associated with the
> >>>> pci-root?
> >>>>
> >>>> Ah, I *think* I see it - the domain attribute of the pci controller is
> >>>> matched to the index of the pci-root controller, correct? But there's
> >>>> still something strange about the <address> of the pci controller being
> >>>> self-referential.
> >>> Yes, the index of the pci-root matches the 'domain' of <address>
> >>
> >> Okay, then the way that libvirt differentiates between a pci bridge that
> >> is connected to the root, and one that is connected to a slot of another
> >> bridge is 1) the "bus" attribute of the bridge's <address> matches the
> >> "index" attribute of the bridge itself, and 2) "slot" is always 0. Correct?
> >>
> >> (The corollary of this is that if slot == 0 and bus != index, or bus ==
> >> index and slot != 0, it is a configuration error).
> >>
> >> I'm still unclear on the usefulness of the pci-root controller though -
> >> all the necessary information is contained in the pci controller, except
> >> for the type of root. But in the case of pcie root, I think you're not
> >> allowed to connect a standard bridge to it, only a "dmi-to-pci-bridge"
> >> (i82801b11-bridge)
> > Yes you can connect a pci bridge to pcie-root.
> > It's represented as a root complex integrated device.

Is this accurate?  Per the PCI express spec, any PCI express device
needs to have a PCI express capability, which our pci-bridge does not.
I think this is one of the main differences for our i82801b11-bridge,
that it exposes itself as a root complex integrated endpoint, so we know
it's effectively a PCIe-to-PCI bridge.  We'll be asking for trouble
if/when we get guest IOMMU support if we are lax about using PCI-to-PCI
bridges where we should have PCIe-to-PCI bridges.  There are plenty of
examples to the contrary of root complex integrated endpoints without an
express capability, but that doesn't make it correct to the spec.
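
As an illustration only (not code from QEMU or libvirt), the spec
requirement above can be checked by walking the capability list in a
device's config space and looking for the PCI Express capability (ID
0x10). The register offsets and Device/Port Type values come from the
PCI/PCIe specs; the function names are made up for this sketch.

```python
# Sketch: find the PCI Express capability in a raw config-space dump.
PCI_STATUS = 0x06            # Status register offset
PCI_STATUS_CAP_LIST = 0x10   # Status bit 4: capability list present
PCI_CAPABILITY_LIST = 0x34   # Capabilities pointer offset
PCI_CAP_ID_EXP = 0x10        # PCI Express capability ID

# Device/Port Type field (bits 7:4 of the PCI Express Capabilities register)
PCIE_TYPE_PCIE_TO_PCI_BRIDGE = 0x7
PCIE_TYPE_RC_INTEGRATED_ENDPOINT = 0x9


def find_capability(cfg: bytes, cap_id: int):
    """Return the offset of capability cap_id, or None if it is absent."""
    status = int.from_bytes(cfg[PCI_STATUS:PCI_STATUS + 2], "little")
    if not status & PCI_STATUS_CAP_LIST:
        return None
    ptr = cfg[PCI_CAPABILITY_LIST] & 0xFC
    seen = set()  # guard against a looping capability list
    while ptr and ptr not in seen and ptr + 1 < len(cfg):
        seen.add(ptr)
        if cfg[ptr] == cap_id:
            return ptr
        ptr = cfg[ptr + 1] & 0xFC
    return None


def pcie_device_type(cfg: bytes):
    """Return the PCIe Device/Port Type, or None for a device without
    a PCI Express capability (e.g. a conventional PCI-to-PCI bridge)."""
    off = find_capability(cfg, PCI_CAP_ID_EXP)
    if off is None or off + 4 > len(cfg):
        return None
    caps = int.from_bytes(cfg[off + 2:off + 4], "little")
    return (caps >> 4) & 0xF
```

On a Linux host the bytes could be read from
/sys/bus/pci/devices/<address>/config; note that unprivileged reads
typically return only the first 64 bytes, which is not enough to walk
the capability list.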

> ARGHH!! Just when I think I'm starting to understand *something* about
> these devices...
> 
> (later edit: after some coaching on IRC, I *think* I've got a bit better
> handle on it.)
> 
> >>>>>    </controller>
> >>>>>    <interface type='direct'>
> >>>>>       ...
> >>>>>      <address type='pci' domain='0' bus='0' slot='3'/>
> >>>>>    </interface>
> >>>>>
> >>>>>  - One PCI root with built-in PCI bus and extra PCI bridge
> >>>>>
> >>>>>    <controller type="pci-root" index="0">
> >>>>>      <model name="i440FX"/>
> >>>>>    </controller>
> >>>>>    <controller type="pci" index="0"> <!-- Host bridge -->
> >>>>>      <address type='pci' domain='0' bus='0' slot='0'/>
> >>>>>    </controller>
> >>>>>    <controller type="pci" index="1"> <!-- Additional bridge -->
> >>>>>      <address type='pci' domain='0' bus='0' slot='1'/>
> >>>>>    </controller>
> >>>>>    <interface type='direct'>
> >>>>>       ...
> >>>>>      <address type='pci' domain='0' bus='1' slot='3'/>
> >>>>>    </interface>
> >>>>>
> >>>>>  - One PCI root with built-in PCI bus, PCI-E bus and extra PCI bridge
> >>>>>    (ie possible q35 setup)
> >>>> Why would a q35 machine have an i440FX pci-root?
> >>> It shouldn't, that's a typo
> >>>
> >>>>>    <controller type="pci-root" index="0">
> >>>>>      <model name="i440FX"/>
> >>>>>    </controller>
> >>>>>    <controller type="pci" index="0"> <!-- Host bridge -->
> >>>>>      <address type='pci' domain='0' bus='0' slot='0'/>
> >>>>>    </controller>
> >>>>>    <controller type="pci" index="1"> <!-- Additional bridge -->
> >>>>>      <address type='pci' domain='0' bus='0' slot='1'/>
> >>>>>    </controller>
> >>>>>    <controller type="pci" index="1"> <!-- Additional bridge -->
> >>>>>      <address type='pci' domain='0' bus='0' slot='1'/>
> >>>>>    </controller>
> >>>> I think you did a cut-paste here and intended to change something, but
> >>>> didn't - those two bridges are identical.
> >>> Yep, the slot should be 2 in the second one
> >>>
> >>>>>    <interface type='direct'>
> >>>>>       ...
> >>>>>      <address type='pci' domain='0' bus='1' slot='3'/>
> >>>>>    </interface>
> >>>>>
> >>>>> So if we later allowed for multiple PCI roots, then we'd have something
> >>>>> like
> >>>>>
> >>>>>    <controller type="pci-root" index="0">
> >>>>>      <model name="i440FX"/>
> >>>>>    </controller>
> >>>>>    <controller type="pci-root" index="1">
> >>>>>      <model name="i440FX"/>
> >>>>>    </controller>
> >>>>>    <controller type="pci" index="0"> <!-- Host bridge 1 -->
> >>>>>      <address type='pci' domain='0' bus='0' slot='0'/>
> >>>>>    </controller>
> >>>>>    <controller type="pci" index="0"> <!-- Host bridge 2 -->
> >>>>>      <address type='pci' domain='1' bus='0' slot='0'/>
> >>>>>    </controller>
> 
> 
> There is a problem here - within a given controller type, we will now
> have the possibility of multiple controllers with the same index - the
> differentiating attribute will be in the <address> subelement, which
> could create some awkwardness. Maybe instead this should be handled with
> a different model of pci controller, and we can add a "domain" attribute
> at the toplevel rather than specifying an <address>?

On real hardware, the platform can specify the _BBN (Base Bus Number =
bus) and the _SEG (Segment = domain) of the host bridge.  So perhaps you
want something like:

<controller type="pci-host-bridge">
  <model name="i440FX"/>
  <address type="pci-host-bridge-addr" domain='1' bus='0'/>
</controller>

"index" is confusing to me.
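
To make the idea concrete, here is a minimal sketch (hypothetical
names, not libvirt code) in which a host bridge is identified by its
(domain, bus) pair, mirroring _SEG and _BBN, rather than by an index.
Two host bridges can then coexist as long as the pairs differ, and a
clash is easy to detect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostBridgeAddr:
    """Hypothetical host bridge address: domain ~ _SEG, bus ~ _BBN."""
    domain: int = 0
    bus: int = 0


def register_host_bridge(bridges: dict, addr: HostBridgeAddr, model: str) -> None:
    """Record a host bridge, rejecting a duplicate (domain, bus) pair."""
    if addr in bridges:
        raise ValueError(
            f"host bridge already defined at domain={addr.domain} bus={addr.bus}"
        )
    bridges[addr] = model


bridges = {}
register_host_bridge(bridges, HostBridgeAddr(), "i440FX")          # 0000:00
register_host_bridge(bridges, HostBridgeAddr(domain=1), "i440FX")  # 0001:00
```

With this shape there is no separate index to keep in sync with the
address; the address is the identity.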

> >>>>>    <interface type='direct'> <!-- NIC on host bridge 2 -->
> >>>>>       ...
> >>>>>      <address type='pci' domain='1' bus='0' slot='3'/>
> >>>>>    </interface>
> >>>>>
> >>>>>
> >>>>> NB this means that 'index' values can be reused against the
> >>>>> <controller>, provided they are setup on different pci-roots.
> >>>>>
> >>>>>> (also note that it might happen that the bus number in libvirt's config
> >>>>>> will correspond to the bus numbering that shows up in the guest OS, but
> >>>>>> that will just be a happy coincidence)
> >>>>>>
> >>>>>> Does this make sense?
> >>>>> Yep, I think we're fairly close.
> >>>> What about the other types of pci controllers that are used by PCIe? We
> >>>> should make sure they fit in this model before we settle on it.
> >>> What do they do ?
> 
> (The descriptions of different models below tell what each of these
> other devices does; in short, they're all just some sort of electronic
> Lego to help connect PCI and PCIe devices into a tree).
> 
> Okay, I'll make yet another attempt at understanding these devices, and
> suggesting how they can all be described in the XML. I'm thinking that
> *all* of the express hubs, switch ports, bridges, etc can be described
> in xml in the manner above, i.e.:
> 
>    <controller type='pci' index='n'>
>      <model type='xxx'/>
>    </controller>
> 
> and that the method for connecting a device to any of them would be by
> specifying:
> 
>      <address type='pci' domain='n' bus='n' slot='n' function='n'/>
> 
> Any limitations about which devices/controllers can connect to which
> controllers, and how many devices can connect to any particular
> controller will be derived from the <model type='xxx'/>. (And, as we've
> said before, although qemu doesn't assign each of these controllers a
> numeric bus id, and although we can make no guarantee that the bus id we
> use for a particular controller is what will be used by the guest
> BIOS/OS, it's still a convenient notation and works well with other
> hypervisors as well as qemu. I'll also note that when I run lspci on an
> X58-based machine I have here, *all* of the relationships between all
> the devices listed below are described with simple bus:slot.function
> numbers.)
> 
> Here is a list of the pci controller model types and their restrictions
> (thanks to mst and aw for repeating these over and over to me; I'm sure
> I still have made mistakes, but at least it's getting closer).
> 
> 
> <controller type='pci-root'>
> ============================
> 
> Upstream:         nothing
> Downstream:       only a single pci-root-bus (implied)
> qemu commandline: nothing (it's implied in the q35 machinetype)
> 
> Explanation:
> 
> Each machine will have a different controller called "pci-root" as
> outlined above by Daniel. Two types of pci-root will be supported:
> i440FX and q35. If a pci-root is not spelled out in the config, one will
> be auto-added (depending on machinetype).
> 
> An i440FX pci-root has an implicitly added pci-bridge at 0:0:0.0 (and
> any bridge that has an address of slot='0' on its own bus is, by
> definition, connected to a pci-root controller - the two are matched by
> setting "domain" in the address of the pci-bridge to "index" of the
> pci-root). This bridge can only have PCI devices added.
> 
> A q35 pci-root also implies a different kind of pci-bridge device - one
> that can only have PCIe devices/controllers attached, but is otherwise
> identical to the pci-bridge added for i440FX. This bus will be called
> "root-bus" (Note that there are generally followed conventions for what
> can be connected to which slot on this bus, and we will probably follow
> those conventions when building a machine, *but* we will not hardcode
> this convention into libvirt; each q35 machine will be an empty slate)
> 
> 
> <controller type='pci'>
> =======================
> 
> This will be used for *all* of the following controller devices
> supported by qemu:
> 
> <model type='pcie-root-bus'/> (implicit/integrated)
> ----------------------------
> 
> Upstream:         connect to pci-root controller *only*
> Downstream:       32 slots, PCIe devices only, no hotplug.
> qemu commandline: nothing (implicit in the q35-* machinetype)
> 
> This controller is the bus described above that connects to a q35's
> pci-root, and provides places for PCIe devices to connect. Examples are
> root-ports, dmi-to-pci-bridges, sata controllers, integrated
> sound/usb/ethernet devices (do any of those that can be connected to the
> pcie-root-bus exist yet?).
> 
> There is only one of these controllers, and it will *always* be
> index='0', and will always have the following address:
> 
>   <address type='pci' domain='0' bus='0' slot='0' function='0'/>

Implicit devices make me nervous, why wouldn't this just be a pcie-root
(or pcie-host-bridge)?  If we want to support multiple host bridges,
there can certainly be more than one, so the index='0' assumption seems
to fall apart.

> <model type='root-port'/> (ioh3420)
> -------------------------
> 
> Upstream:         PCIe, connect to pcie-root-bus *only* (?)

yes

> Downstream:       1 slot, PCIe devices only (?)

yes

> qemu commandline: -device ioh3420,...
> 
> These can only connect to the "pcie-root-bus" of a q35 (implying that
> this bus will need to have a different model name than the simple
> "pci-bridge").
> 
> 
> <model type='dmi-to-pci-bridge'/> (i82801b11-bridge)

I'm worried this name is either too specific or too generic.  What
happens when we add a generic pcie-bridge and want to use that instead
of the i82801b11-bridge?  The guest really only sees this as a
PCIe-to-PCI bridge, it just happens that on q35 this attaches at the DMI
port of the MCH.
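
One way to picture the concern, purely as a sketch: keep the XML model
name tied to what the guest sees and resolve the concrete qemu device
separately. The model name "pcie-to-pci-bridge" and the lookup
function are hypothetical; only the qemu device names appear in this
thread.

```python
# Hypothetical mapping from (guest-visible model, machine family) to the
# qemu device that implements it. A future generic PCIe-to-PCI bridge
# could be swapped in here without changing the XML model name.
QEMU_DEVICE = {
    ("pcie-to-pci-bridge", "q35"): "i82801b11-bridge",
    ("pci-bridge", "q35"): "pci-bridge",
    ("pci-bridge", "pc"): "pci-bridge",
}


def qemu_device_for(model: str, machine: str) -> str:
    """Return the qemu -device name for a controller model on a machine."""
    try:
        return QEMU_DEVICE[(model, machine)]
    except KeyError:
        raise ValueError(f"model {model!r} is not available on {machine!r}")
```

For example, qemu_device_for("pcie-to-pci-bridge", "q35") yields
"i82801b11-bridge", while asking for the same model on "pc" is
rejected because an i440FX machine has no PCIe root bus to attach it
to.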

> ---------------------------------
> 
> (btw, what does "dmi" mean?)

http://en.wikipedia.org/wiki/Direct_Media_Interface

> Upstream:         pcie-root-bus *only*

And only to a specific q35 slot (1e.0) for the i82801b11-bridge.

> Downstream:       32 slots, any PCI device, no hotplug (?)

Not yet, but I think this is where we want to implement ACPI-based hotplug.

> qemu commandline: -device i82801b11-bridge,...
> 
> 
> <model type='upstream-switch-port'/> (x3130-upstream)
> ------------------------------------
> 
> Upstream:         PCIe, connect to pcie-root-bus, root-port, or
> downstream-switch-port (?)

yes

> Downstream:       32 slots, connect *only* to downstream-switch-port

I can't verify that there are 32 slots. mst?  I've only set up downstream
ports within slot 0.

> qemu-commandline: -device x3130-upstream
> 
> 
> This is the upper side of a switch that can multiplex multiple devices
> onto a single port. It's only useful when one or more downstream switch
> ports are connected to it.
> 
> <model type='downstream-switch-port'/> (xio3130-downstream)
> --------------------------------------
> 
> Upstream:         connect *only* to upstream-switch-port
> Downstream:       1 slot, any PCIe device
> qemu commandline: -device xio3130-downstream
> 
> You can connect one or more of these to an upstream-switch-port in order
> to effectively plug multiple devices into a single PCIe port.
> 
> <model type='pci-bridge'/> (pci-bridge)
> --------------------------
> 
> Upstream:         PCI, connect to 1) pci-root, 2) dmi-to-pci-bridge, 3)
> another pci-bridge
> Downstream:       any PCI device, 32 slots
> qemu commandline: -device pci-bridge,...
> 
> This differs from dmi-to-pci-bridge in that its upstream connection is
> PCI rather than PCIe (so it will work on an i440FX system, which has no
> root PCIe bus) and that hotplug is supported. In general, if a guest
> will have any PCI devices, one of these controllers should be added, and
> 
> ===============================================================
> 
> 
> Comment: I'm not quite convinced that we really need the separate
> "pci-root" device. Since 1) every pci-root will *always* have either a
> pcie-root-bus or a pci-bridge connected to it, 2) the pci-root-bus will
> only ever be connected to the pci-root, and 3) the pci-bridge that
> connects to it will need special handling within the pci-bridge case
> anyway, why not:
> 
> 1) eliminate the separate pci-root controller type
> 
> 2) within <controller type='pci'>, a new <model type='pci-root-bus'/>
> will be added.
> 
> 3) a pcie-root-bus will automatically be added for q35 machinetypes, and
> pci-root-bus for any machinetype that supports a PCI bus (e.g. "pc-*")
> 
> 4) model type='pci-root-bus' will behave like pci-bridge, except that it
> will be an implicit device (nothing on qemu commandline) and it won't
> need an <address> element (neither will pcie-root-bus).

I think they should both have a domain + bus address to make it possible
to build multi-domain/multi-host bridge systems.  They do not use any
slots though.

> 5) to support multiple domains, we can simply add a "domain" attribute
> to the toplevel of controller.
> 

Or this wouldn't even be necessary if we supported a 'pci-root-addr'
address type for the above, with the default being domain=0, bus=0?  I
suppose it doesn't matter whether it's a separate attribute or new
address type though.  Thanks,

Alex