[libvirt] new libvirt "pci" controller type and pcie/q35 (was Re: [PATCH 4/7] add pci-bridge controller type)
Alex Williamson
alex.williamson at redhat.com
Mon Apr 15 17:27:03 UTC 2013
On Fri, 2013-04-12 at 11:46 -0400, Laine Stump wrote:
> On 04/11/2013 07:23 AM, Michael S. Tsirkin wrote:
> > On Thu, Apr 11, 2013 at 07:03:56AM -0400, Laine Stump wrote:
> >> On 04/10/2013 05:26 AM, Daniel P. Berrange wrote:
> >>> On Tue, Apr 09, 2013 at 04:06:06PM -0400, Laine Stump wrote:
> >>>> On 04/09/2013 04:58 AM, Daniel P. Berrange wrote:
> >>>>> On Mon, Apr 08, 2013 at 03:32:07PM -0400, Laine Stump wrote:
> >>>>> Actually I do wonder if we should represent a PCI root as two
> >>>>> <controller> elements, one representing the actual PCI root
> >>>>> device, and the other representing the host bridge that is
> >>>>> built-in.
> >>>>>
> >>>>> Also we should use the actual model names, not 'pci-root' or
> >>>>> 'pcie-root' but rather i440FX for "pc" machine type, and whatever
> >>>>> the q35 model name is.
> >>>>>
> >>>>> - One PCI root with built-in PCI bus (i.e. today's setup)
> >>>>>
> >>>>> <controller type="pci-root" index="0">
> >>>>> <model name="i440FX"/>
> >>>>> </controller>
> >>>>> <controller type="pci" index="0"> <!-- Host bridge -->
> >>>>> <address type='pci' domain='0' bus='0' slot='0'/>
> >>>> Isn't this saying that the bridge connects to itself? (since bus 0 is
> >>>> this bus)
> >>>>
> >>>> I understand (again, possibly wrongly) that the builtin PCI bus connects
> >>>> to the chipset using its own slot 0 (that's why it's reserved), but
> >>>> that's its address on itself. How is this bridge associated with the
> >>>> pci-root?
> >>>>
> >>>> Ah, I *think* I see it - the domain attribute of the pci controller is
> >>>> matched to the index of the pci-root controller, correct? But there's
> >>>> still something strange about the <address> of the pci controller being
> >>>> self-referential.
> >>> Yes, the index of the pci-root matches the 'domain' of <address>
> >>
> >> Okay, then the way that libvirt differentiates between a pci bridge that
> >> is connected to the root, and one that is connected to a slot of another
> >> bridge is 1) the "bus" attribute of the bridge's <address> matches the
> >> "index" attribute of the bridge itself, and 2) "slot" is always 0. Correct?
> >>
> >> (The corollary of this is that if slot == 0 and bus != index, or bus ==
> >> index and slot != 0, it is a configuration error).
> >>
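As a sketch only, the rule as stated above (bus == index and slot == 0 means "connected to the root", exactly one of the two holding means a configuration error) could be checked like this. classify_bridge is a made-up helper for illustration, not anything that exists in libvirt:

```python
def classify_bridge(index, bus, slot):
    """Classify a <controller type='pci'> by the quoted rule (hypothetical helper)."""
    if bus == index and slot == 0:
        return "root"      # self-referential address: attached to the pci-root
    if bus == index or slot == 0:
        return "error"     # only one of the two conditions holds
    return "nested"        # plugged into a slot of another bridge


# Host bridge from the examples: index 0 at bus 0, slot 0.
print(classify_bridge(index=0, bus=0, slot=0))
# Additional bridge: index 1 at bus 0, slot 1.
print(classify_bridge(index=1, bus=0, slot=1))
```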
> >> I'm still unclear on the usefulness of the pci-root controller though -
> >> all the necessary information is contained in the pci controller, except
> >> for the type of root. But in the case of pcie root, I think you're not
> >> allowed to connect a standard bridge to it, only a "dmi-to-pci-bridge"
> >> (i82801b11-bridge)
> > Yes you can connect a pci bridge to pcie-root.
> > It's represented as a root complex integrated device.
Is this accurate? Per the PCI express spec, any PCI express device
needs to have a PCI express capability, which our pci-bridge does not.
I think this is one of the main differences for our i82801b11-bridge,
that it exposes itself as a root complex integrated endpoint, so we know
it's effectively a PCIe-to-PCI bridge. We'll be asking for trouble
if/when we get guest IOMMU support if we are lax about using PCI-to-PCI
bridges where we should have PCIe-to-PCI bridges. There are plenty of
examples to the contrary of root complex integrated endpoints without an
express capability, but that doesn't make it correct to the spec.
> ARGHH!! Just when I think I'm starting to understand *something* about
> these devices...
>
> (later edit: after some coaching on IRC, I *think* I've got a bit better
> handle on it.)
>
> >>>>> </controller>
> >>>>> <interface type='direct'>
> >>>>> ...
> >>>>> <address type='pci' domain='0' bus='0' slot='3'/>
> >>>>> </interface>
> >>>>>
> >>>>> - One PCI root with built-in PCI bus and extra PCI bridge
> >>>>>
> >>>>> <controller type="pci-root" index="0">
> >>>>> <model name="i440FX"/>
> >>>>> </controller>
> >>>>> <controller type="pci" index="0"> <!-- Host bridge -->
> >>>>> <address type='pci' domain='0' bus='0' slot='0'/>
> >>>>> </controller>
> >>>>> <controller type="pci" index="1"> <!-- Additional bridge -->
> >>>>> <address type='pci' domain='0' bus='0' slot='1'/>
> >>>>> </controller>
> >>>>> <interface type='direct'>
> >>>>> ...
> >>>>> <address type='pci' domain='0' bus='1' slot='3'/>
> >>>>> </interface>
> >>>>>
> >>>>> - One PCI root with built-in PCI bus, PCI-E bus and an extra PCI bridge
> >>>>> (ie possible q35 setup)
> >>>> Why would a q35 machine have an i440FX pci-root?
> >>> It shouldn't, that's a typo
> >>>
> >>>>> <controller type="pci-root" index="0">
> >>>>> <model name="i440FX"/>
> >>>>> </controller>
> >>>>> <controller type="pci" index="0"> <!-- Host bridge -->
> >>>>> <address type='pci' domain='0' bus='0' slot='0'/>
> >>>>> </controller>
> >>>>> <controller type="pci" index="1"> <!-- Additional bridge -->
> >>>>> <address type='pci' domain='0' bus='0' slot='1'/>
> >>>>> </controller>
> >>>>> <controller type="pci" index="1"> <!-- Additional bridge -->
> >>>>> <address type='pci' domain='0' bus='0' slot='1'/>
> >>>>> </controller>
> >>>> I think you did a cut-paste here and intended to change something, but
> >>>> didn't - those two bridges are identical.
> >>> Yep, the slot should be 2 in the second one
> >>>
> >>>>> <interface type='direct'>
> >>>>> ...
> >>>>> <address type='pci' domain='0' bus='1' slot='3'/>
> >>>>> </interface>
> >>>>>
> >>>>> So if we later allowed for multiple PCI roots, then we'd have something
> >>>>> like
> >>>>>
> >>>>> <controller type="pci-root" index="0">
> >>>>> <model name="i440FX"/>
> >>>>> </controller>
> >>>>> <controller type="pci-root" index="1">
> >>>>> <model name="i440FX"/>
> >>>>> </controller>
> >>>>> <controller type="pci" index="0"> <!-- Host bridge 1 -->
> >>>>> <address type='pci' domain='0' bus='0' slot='0'/>
> >>>>> </controller>
> >>>>> <controller type="pci" index="0"> <!-- Host bridge 2 -->
> >>>>> <address type='pci' domain='1' bus='0' slot='0'/>
> >>>>> </controller>
>
>
> There is a problem here - within a given controller type, we will now
> have the possibility of multiple controllers with the same index - the
> differentiating attribute will be in the <address> subelement, which
> could create some awkwardness. Maybe instead this should be handled with
> a different model of pci controller, and we can add a "domain" attribute
> at the toplevel rather than specifying an <address>?
On real hardware, the platform can specify the _BBN (Base Bus Number =
bus) and the _SEG (Segment = domain) of the host bridge. So perhaps you
want something like:
<controller type="pci-host-bridge">
<model name="i440FX"/>
<address type="pci-host-bridge-addr" domain='1' bus='0'/>
</controller>
"index" is confusing to me.
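
A minimal sketch of what I mean, identifying a host bridge by its (domain, bus) pair rather than by an index. The "pci-host-bridge" controller type and "pci-host-bridge-addr" address type are only the proposal above, not existing libvirt XML, and falling back to domain=0, bus=0 when no address is given is an assumption:

```python
import xml.etree.ElementTree as ET


def host_bridge_key(controller_xml):
    """Return (domain, bus) for a proposed pci-host-bridge controller element."""
    ctrl = ET.fromstring(controller_xml)
    addr = ctrl.find("address")
    if addr is None:
        return (0, 0)  # assumed default: domain 0, bus 0
    # Base 0 lets int() accept both decimal ("1") and hex ("0x1") values.
    domain = int(addr.get("domain", "0"), 0)
    bus = int(addr.get("bus", "0"), 0)
    return (domain, bus)


example = """
<controller type="pci-host-bridge">
  <model name="i440FX"/>
  <address type="pci-host-bridge-addr" domain="1" bus="0"/>
</controller>
"""
print(host_bridge_key(example))
```

Two host bridges then never collide as long as their (domain, bus) pairs differ, which is the same way _SEG and _BBN distinguish them on real hardware.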
> >>>>> <interface type='direct'> <!-- NIC on host bridge 2 -->
> >>>>> ...
> >>>>> <address type='pci' domain='1' bus='0' slot='3'/>
> >>>>> </interface>
> >>>>>
> >>>>>
> >>>>> NB this means that 'index' values can be reused against the
> >>>>> <controller>, provided they are setup on different pci-roots.
> >>>>>
> >>>>>> (also note that it might happen that the bus number in libvirt's config
> >>>>>> will correspond to the bus numbering that shows up in the guest OS, but
> >>>>>> that will just be a happy coincidence)
> >>>>>>
> >>>>>> Does this make sense?
> >>>>> Yep, I think we're fairly close.
> >>>> What about the other types of pci controllers that are used by PCIe? We
> >>>> should make sure they fit in this model before we settle on it.
> >>> What do they do ?
>
> (The descriptions of different models below tell what each of these
> other devices does; in short, they're all just some sort of electronic
> Lego to help connect PCI and PCIe devices into a tree).
>
> Okay, I'll make yet another attempt at understanding these devices, and
> suggesting how they can all be described in the XML. I'm thinking that
> *all* of the express hubs, switch ports, bridges, etc can be described
> in xml in the manner above, i.e.:
>
> <controller type='pci' index='n'>
> <model type='xxx'/>
> </controller>
>
> and that the method for connecting a device to any of them would be by
> specifying:
>
> <address type='pci' domain='n' bus='n' slot='n' function='n'/>
>
> Any limitations about which devices/controllers can connect to which
> controllers, and how many devices can connect to any particular
> controller will be derived from the <model type='xxx'/>. (And, as we've
> said before, although qemu doesn't assign each of these controllers a
> numeric bus id, and although we can make no guarantee that the bus id we
> use for a particular controller is what will be used by the guest
> BIOS/OS, it's still a convenient notation and works well with other
> hypervisors as well as qemu. I'll also note that when I run lspci on an
> X58-based machine I have here, *all* of the relationships between all
> the devices listed below are described with simple bus:slot.function
> numbers.)
>
> Here is a list of the pci controller model types and their restrictions
> (thanks to mst and aw for repeating these over and over to me; I'm sure
> I still have made mistakes, but at least it's getting closer).
>
>
> <controller type='pci-root'>
> ============================
>
> Upstream: nothing
> Downstream: only a single pci-root-bus (implied)
> qemu commandline: nothing (it's implied in the q35 machinetype)
>
> Explanation:
>
> Each machine will have a different controller called "pci-root" as
> outlined above by Daniel. Two types of pci-root will be supported:
> i440FX and q35. If a pci-root is not spelled out in the config, one will
> be auto-added (depending on machinetype).
>
> An i440FX pci-root has an implicitly added pci-bridge at 0:0:0.0 (and
> any bridge that has an address of slot='0' on its own bus is, by
> definition, connected to a pci-root controller - the two are matched by
> setting "domain" in the address of the pci-bridge to "index" of the
> pci-root). This bridge can only have PCI devices added.
>
> A q35 pci-root also implies a different kind of pci-bridge device - one
> that can only have PCIe devices/controllers attached, but is otherwise
> identical to the pci-bridge added for i440FX. This bus will be called
> "root-bus" (Note that there are generally followed conventions for what
> can be connected to which slot on this bus, and we will probably follow
> those conventions when building a machine, *but* we will not hardcode
> this convention into libvirt; each q35 machine will be an empty slate)
>
>
> <controller type='pci'>
> =======================
>
> This will be used for *all* of the following controller devices
> supported by qemu:
>
> <model type='pcie-root-bus'/> (implicit/integrated)
> ----------------------------
>
> Upstream: connect to pci-root controller *only*
> Downstream: 32 slots, PCIe devices only, no hotplug.
> qemu commandline: nothing (implicit in the q35-* machinetype)
>
> This controller is the bus described above that connects to a q35's
> pci-root, and provides places for PCIe devices to connect. Examples are
> root-ports, dmi-to-pci-bridges, SATA controllers, integrated
> sound/usb/ethernet devices (do any of those that can be connected to the
> pcie-root-bus exist yet?).
>
> There is only one of these controllers, and it will *always* be
> index='0', and will always have the following address:
>
> <address type='pci' domain='0' bus='0' slot='0' function='0'/>
Implicit devices make me nervous, why wouldn't this just be a pcie-root
(or pcie-host-bridge)? If we want to support multiple host bridges,
there can certainly be more than one, so the index='0' assumption seems
to fall apart.
> <model type='root-port'/> (ioh3420)
> -------------------------
>
> Upstream: PCIe, connect to pcie-root-bus *only* (?)
yes
> Downstream: 1 slot, PCIe devices only (?)
yes
> qemu commandline: -device ioh3420,...
>
> These can only connect to the "pcie-root-bus" of a q35 (implying that
> this bus will need to have a different model name than the simple
> "pci-bridge").
>
>
> <model type='dmi-to-pci-bridge'/> (i82801b11-bridge)
I'm worried this name is either too specific or too generic. What
happens when we add a generic pcie-bridge and want to use that instead
of the i82801b11-bridge? The guest really only sees this as a
PCIe-to-PCI bridge, it just happens that on q35 this attaches at the DMI
port of the MCH.
> ---------------------------------
>
> (btw, what does "dmi" mean?)
http://en.wikipedia.org/wiki/Direct_Media_Interface
> Upstream: pcie-root-bus *only*
And only to a specific q35 slot (1e.0) for the i82801b11-bridge.
> Downstream: 32 slots, any PCI device, no hotplug (?)
Not yet, but I think this is where we want to implement ACPI-based hotplug.
> qemu commandline: -device i82801b11-bridge,...
>
>
> <model type='upstream-switch-port'/> (x3130-upstream)
> ------------------------------------
>
> Upstream: PCIe, connect to pcie-root-bus, root-port, or
> downstream-switch-port (?)
yes
> Downstream: 32 slots, connect *only* to downstream-switch-port
I can't verify that there are 32 slots, mst? I've only set up downstream
ports within slot 0.
> qemu-commandline: -device x3130-upstream
>
>
> This is the upper side of a switch that can multiplex multiple devices
> onto a single port. It's only useful when one or more downstream switch
> ports are connected to it.
>
> <model type='downstream-switch-port'/> (xio3130-downstream)
> --------------------------------------
>
> Upstream: connect *only* to upstream-switch-port
> Downstream: 1 slot, any PCIe device
> qemu commandline: -device xio3130-downstream
>
> You can connect one or more of these to an upstream-switch-port in order
> to effectively plug multiple devices into a single PCIe port.
>
> <model type='pci-bridge'/> (pci-bridge)
> --------------------------
>
> Upstream: PCI, connect to 1) pci-root, 2) dmi-to-pci-bridge, 3)
> another pci-bridge
> Downstream: any PCI device, 32 slots
> qemu commandline: -device pci-bridge,...
>
> This differs from dmi-to-pci-bridge in that its upstream connection is
> PCI rather than PCIe (so it will work on an i440FX system, which has no
> root PCIe bus) and that hotplug is supported. In general, if a guest
> will have any PCI devices, one of these controllers should be added, and
>
> ===============================================================
>
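For illustration only, the "Upstream:" restrictions listed above could be collected into one lookup table keyed by model type. The model names are the ones proposed in this thread; ALLOWED_UPSTREAM and can_attach are hypothetical names, not libvirt code, and the entries marked (?) above are taken as written:

```python
# Which parent a controller model may plug into, per the list above.
ALLOWED_UPSTREAM = {
    "pcie-root-bus": {"pci-root"},
    "root-port": {"pcie-root-bus"},
    "dmi-to-pci-bridge": {"pcie-root-bus"},
    "upstream-switch-port": {"pcie-root-bus", "root-port",
                             "downstream-switch-port"},
    "downstream-switch-port": {"upstream-switch-port"},
    "pci-bridge": {"pci-root", "dmi-to-pci-bridge", "pci-bridge"},
}


def can_attach(child_model, parent_model):
    """True if child_model may connect upstream to parent_model."""
    return parent_model in ALLOWED_UPSTREAM.get(child_model, set())


print(can_attach("root-port", "pcie-root-bus"))
print(can_attach("pci-bridge", "pcie-root-bus"))
```

Whether a plain pci-bridge should be accepted directly on the pcie-root-bus is exactly the open question earlier in this mail; the table follows the list as written and rejects it.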
>
> Comment: I'm not quite convinced that we really need the separate
> "pci-root" device. Since 1) every pci-root will *always* have either a
> pcie-root-bus or a pci-bridge connected to it, 2) the pci-root-bus will
> only ever be connected to the pci-root, and 3) the pci-bridge that
> connects to it will need special handling within the pci-bridge case
> anyway, why not:
>
> 1) eliminate the separate pci-root controller type
>
> 2) within <controller type='pci'>, a new <model type='pci-root-bus'/>
> will be added.
>
> 3) a pcie-root-bus will automatically be added for q35 machinetypes, and
> pci-root-bus for any machinetype that supports a PCI bus (e.g. "pc-*")
>
> 4) model type='pci-root-bus' will behave like pci-bridge, except that it
> will be an implicit device (nothing on qemu commandline) and it won't
> need an <address> element (neither will pcie-root-bus).
I think they should both have a domain + bus address to make it possible
to build multi-domain/multi-host bridge systems. They do not use any
slots though.
> 5) to support multiple domains, we can simply add a "domain" attribute
> to the toplevel of controller.
>
Or this wouldn't even be necessary if we supported a 'pci-root-addr'
address type for the above, with the default being domain=0, bus=0? I
suppose it doesn't matter whether it's a separate attribute or a new
address type though. Thanks,
Alex