[libvirt] new libvirt "pci" controller type and pcie/q35 (was Re: [PATCH 4/7] add pci-bridge controller type)

Laine Stump laine at laine.org
Mon Apr 8 16:37:49 UTC 2013


On 04/05/2013 03:26 PM, Alex Williamson wrote:
> On Fri, 2013-04-05 at 14:42 -0400, Laine Stump wrote:
>> On 04/05/2013 01:38 PM, Daniel P. Berrange wrote:
>>> On Fri, Apr 05, 2013 at 12:32:04PM -0400, Laine Stump wrote:
>>>> On 04/03/2013 11:50 AM, Ján Tomko wrote:
>>>>> From: liguang <lig.fnst at cn.fujitsu.com>
>>>>>
>>>>> add a new controller type, then one can
>>>>> define a pci-bridge controller like this:
>>>>>     <controller type='pci-bridge' index='0'/>
>>>> In the next patch we're prohibiting exactly this config (index='0')
>>>> because the pre-existing pci bus on the "pc-*" machinetypes is already
>>>> named pci.0. If we don't allow it, we shouldn't include it as an example
>>>> in the commit log :-)
>>> NB, it isn't always named 'pci.0' - on many arches it is merely 'pci'.
>> Yeah, I'm just using that as a convenient shorthand. The final decision
>> on whether to use pci.0 or pci happens down in the qemuBuildCommandline().
>>
>>>> More on this - one of the things this points out is that there is no
>>>> representation in the config of the pci.0 bus, it's just assumed to
>>>> always be there. That is the case for pc-* machinetypes (and probably
>>>> several others with PCI buses), but for q35, there is no pci.0 bus in
>>>> the basic machine, only a pcie.0; if you want a pci.0 on q35 (which
>>>> *will* be necessary in order to attach any pci devices, so I imagine we
>>>> will always want one), you have to attach a pcie->pci bridge, which is
>>>> the device "i82801b11-bridge", to pcie.0.
>>>> The reason I bring this up here, is I'm wondering:
>>>>
>>>> 1) should we have some representation of the default pci.0 bus in the
>>>> config, even though it is just "always there" for the pc machinetypes
>>>> and there is no way to disable it, and nothing on the commandline that
>>>> specifies its existence?
>>> Yep, we should be aiming for the XML to fully describe the machine
>>> hardware. So since we're adding the concept of PCI controllers/bridges
>>> etc to the XML, we should be auto-adding the default bus to the XML.
>>>
>>>> 2) For the q35 machinetype, should we just always add an
>>>> i82801b11-bridge device and name it pci.0? Or should that need to be
>>>> present in the xml?
>>> We've been burnt before auto-adding stuff that ought to have
>>> been optional. So I'd tend towards only having the minimal
>>> config that is required. If the users want this, let them
>>> explicitly ask for the bridge

Okay. This makes for a larger burden on the
user/virt-manager/boxes/libvirt-designer, but does prevent us from
setting up an undesirable default that we can't rescue ourselves from :-)


>>>
>>> Also from the apps POV the QEMU device name is irrelevant. The
>>> XML config works off the PCI addresses. So there's no need
>>> to force/specialcase a i82801b11-bridge to use the name
>>> 'pci.0'.
>>
>> Sure. I just mean "pci bus 0" (hmm, but actually this does point out a
>> problem with my logic - the same namespace (well, "numbering space") is
>> used for both pcie and pci buses, so on a q35 system, bus=0 is already
>> taken by pcie.0; that means that the first pci bus would need to use a
>> different bus number anyway, so it wouldn't be so easy to switch an
>> existing domain from pc to q35 - every PCI device would need to have its
>> bus number modified. I suppose that's reasonable to expect, though.)
> I would think you'd want to differentiate PCI from PCIe anyway.  PCI is
> a bus and you have 32 slots per bus to fill.  PCIe is a point-to-point
> link and you really only have slot 0 available.  Perhaps that puts them
> in different number spaces already.

Are you saying that it's okay to have a bus=0 for pci and a different
bus=0 for pcie?

I was hoping that what is used in libvirt's config could mirror as
closely as possible the numbering that you see in the output of lspci on
the guest, but it sounds like that numbering is something done at the
whim of the guest, with no basis in (standard) reality, is that right?


>>>> 3) Most important - depending on the answers to (1) and (2), should we
>>>> maybe name this device "pci", and use a different backend depending on
>>>> index and machinetype? (or alternately explicitly specifiable with a
>>>> <driver> subelement). To be specific, we would have:
>>>>
>>>>    <controller type='pci' index='0'/>
>>>>
>>>> which on pc machinetypes would just be a placeholder in the config (and
>>>> always inserted if it wasn't there, for machinetypes that have a pci
>>>> bus). On the q35 machinetype, that same line would equate to adding an
>>>> i82801b11-bridge device (with source defaulting to
>>>> bus=pcie.0,addr=1e.0). This would serve several purposes:
>>>>
>>>> a) on pc machinetypes, it would be a visual aid indicating that pci.0
>>>> exists, and that index='0' isn't available for a new pci controller.
>>>>
>>>> b) it would make switching a domain config from pc to q35 simpler, since
>>>> pci.0 would always already be in place for attaching pci devices
>>>> (including pci.1, pci.2, etc)
>>>>
>>>> c) it would make the config a true complete description of the machine
>>>> being created.
>>>>
>>>> (I've suggested naming the controller "pci" rather than "pci-bridge"
>>>> because in the case of a "root" bus like pci.0 it seems to not be a
>>>> "bridge", but maybe the name "pci-bridge" is always appropriate, even
>>>> when it's a root bus. Maybe someone with better pci/pcie knowledge can
>>>> provide an opinion on this)
>>> I think "pci" is a little too generic - how about we call it  'pci-root'
>> Okay, so a separate "pci-root" device along with "pci-bridge"? What I
>> was really hoping was to have all PCI buses represented in a common way
>> in the config. How about a controller called "pci" with different types,
>> "root" and "bridge"? And since they use the same numbering space as pcie
>> buses, maybe the pcie controllers (including the root and the hubs and
>> ???) would be different types of PCI controllers. That would make it
>> easier (i.e. *possible*) to avoid collisions in use of bus numbers.
>>
>> Alex or mst, any advice/opinions on how to represent all the different
>> q35 devices that consume bus numbers in a succinct fashion?
> Note that none of these are really bus numbers, they're just bus
> identifiers.  The BIOS and the guest running define the bus numbers.
> "root" also has special meaning in PCI, so for instance I wouldn't name
> a bus behind the i82801b11-bridge "pci-root".  Somehow we also need to
> deal with what can be attached where.  For instance a pci-bridge is a
> PCI device and can only go on a PCI bus.  The equivalent structure on
> PCIe is an upstream switch port with some number of downstream switch
> ports.  Each of those are specific to the bus type.

I think we're starting to get closer to the concrete problem that's
bothering me. As I understand it (and again - "what I understand" has
repeatedly been shown to be incorrect in this thread :-):

* There are multiple different types of devices that provide a bus with
1 or more "slots" that PCI devices (e.g., the virtio-net-pci device, the
e1000 network device, etc) can be plugged into.

* In the config for those devices, there is a required (auto-generated
if not explicitly provided) <address> element that indicates what
controller that device is plugged into e.g.:

    <interface type='direct'>
      ...
      <address type='pci' domain='0' bus='0' slot='3' function='0'/>
      ...
    </interface>

* domain is always hardcoded to 0, and in the past bus was also always
hardcoded to 0 because until now there has only been a single place
where PCI devices could be connected - the builtin pci.0 bus, which is a
part of the basic "pc" (and some others) virtual machine and provides 32
slots.

* Now we are adding the ability to define new PCI buses, for now just a
single kind - a pci-bridge controller, which itself must connect to an
existing PCI slot, and provides 32 new PCI slots. But in the future
there will be more different types of controllers that provide one or
more PCI slots where PCI devices/controllers can be plugged in.

* In these patches adding support for pci-bridge, we are making the
assumption that there is a 1:1 correspondence between the "index='n'"
attribute of the pci-bridge controller and the "bus='n'" attribute of
the <address> element in devices that will be plugged into that
controller. So for example if we have:


   <controller type='pci-bridge' index='1'>
      <address type='pci' domain='0' bus='0' slot='10' function='0'/>
   </controller>

and then change the <interface> definition above to say "bus='1'", that
interface device will plug into this new bus at slot 3.

* So let's assume that we add a new controller called "dmi-to-pci-bridge":

  <controller type='dmi-to-pci-bridge' index='0'/>

Ignoring for now the question of what address we give in the definition
of *this* device (which is itself problematic - do we need a new "pcie"
address type?), if some device is then defined with


   <address type='pci' bus='0' .../>

How do we differentiate between that meaning "the dmi-to-pci-bridge
controller that is index='0'" and "the pci-bridge controller that is
index='0'"? Do
we need to expand our <address> element further? If, as I think you
suggest, we have multiple different kinds of controllers that provide
PCI slots, each with its own namespace, the current pci address element
is inadequate to unambiguously describe where a pci device should be
plugged in.
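
To make the ambiguity concrete, here is a minimal Python sketch. It is not
libvirt code: the XML fragment, the <devices> wrapper and the helper name
controllers_for_bus are assumptions made up for illustration. It applies
the "bus='n' means the controller with index='n'" rule from the patches to
a config that contains both controller types at index='0'.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: two different controller types that both provide
# PCI slots and both claim index='0'.
DOMAIN_XML = """
<devices>
  <controller type='pci-bridge' index='0'/>
  <controller type='dmi-to-pci-bridge' index='0'/>
  <interface type='direct'>
    <address type='pci' domain='0' bus='0' slot='3' function='0'/>
  </interface>
</devices>
"""


def controllers_for_bus(devices, bus):
    """Return the type of every controller whose index equals the bus number."""
    return [c.get('type') for c in devices.findall('controller')
            if int(c.get('index')) == int(bus)]


devices = ET.fromstring(DOMAIN_XML)
address = devices.find('interface/address')
matches = controllers_for_bus(devices, address.get('bus'))
print(matches)  # more than one entry means the address is ambiguous
```

With a single shared numbering space the list would have at most one
entry; with one namespace per controller type it has two, and nothing in
the <address> element says which one was meant.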

Perhaps we should be referencing the "<alias name='nnn'/>" element of
each controller in the pci address of the target device, e.g.:

   <controller type='pci-bridge' index='0'>
     <!-- obviously on a machine with no builtin pci.0! -->
     <alias name='pci.0'/>
   </controller>
   <controller type='dmi-to-pci-bridge' index='0'>
     <alias name='dmi-to-pci-bridge.0'/>
   </controller>
   <interface type='direct'>
     ...
     <address type='pci' controller='dmi-to-pci-bridge.0' slot='3'
function='0'/>
   </interface>

(or, since this "controller" attribute really obsoletes the numeric
"bus" attribute, maybe it could be "bus='dmi-to-pci-bridge.0'", and we
could continue to support "bus='0'" for legacy configs).

I believe right now the alias name is always auto-generated; we would
need to change that so that an explicitly provided alias is guaranteed
never to change (and if that's not possible to do in a backward
compatible manner, then we need to come up with some new attribute to
use in this manner).
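
A rough Python sketch of how the alias-based lookup could behave. The
"controller" attribute on <address> is only the proposal above, not
something that exists today, and the helper name controller_by_alias is
made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical config using the proposed controller='<alias>' attribute.
DOMAIN_XML = """
<devices>
  <controller type='pci-bridge' index='0'>
    <alias name='pci.0'/>
  </controller>
  <controller type='dmi-to-pci-bridge' index='0'>
    <alias name='dmi-to-pci-bridge.0'/>
  </controller>
  <interface type='direct'>
    <address type='pci' controller='dmi-to-pci-bridge.0' slot='3' function='0'/>
  </interface>
</devices>
"""


def controller_by_alias(devices, alias):
    """Return the single controller carrying the given alias name."""
    found = []
    for ctrl in devices.findall('controller'):
        alias_elem = ctrl.find('alias')
        if alias_elem is not None and alias_elem.get('name') == alias:
            found.append(ctrl)
    if len(found) != 1:
        raise LookupError('alias %r matches %d controllers' % (alias, len(found)))
    return found[0]


devices = ET.fromstring(DOMAIN_XML)
address = devices.find('interface/address')
target = controller_by_alias(devices, address.get('controller'))
print(target.get('type'), target.get('index'))
```

This only works if alias names are unique across all controller types and
stable once assigned, which is exactly the guarantee discussed above.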

Alternately, we could add new types to address, one for each new type of
controller, then define the devices like this:

    <interface type='direct'>
      <address type='pci-bridge' bus='0' slot='3' function='0'/>
    </interface>
    <interface type='direct'>
      <address type='dmi-to-pci-bridge' bus='0' slot='3' function='0'/>
    </interface>

(yes, I know you wouldn't want to plug a network device into the
dmi-to-pci-bridge directly, this is just for the sake of example)

You'll notice that this makes the bus attribute obsolete.
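
A small Python sketch of this second alternative, again purely
illustrative: the per-controller address types are the proposal above and
the <devices> wrapper is an assumption. Each controller is keyed by the
pair (type, index), so index='0' can be reused by different controller
types without a collision.

```python
import xml.etree.ElementTree as ET

# Hypothetical config where the address type names the controller type.
DOMAIN_XML = """
<devices>
  <controller type='pci-bridge' index='0'/>
  <controller type='dmi-to-pci-bridge' index='0'/>
  <interface type='direct'>
    <address type='pci-bridge' bus='0' slot='3' function='0'/>
  </interface>
  <interface type='direct'>
    <address type='dmi-to-pci-bridge' bus='0' slot='3' function='0'/>
  </interface>
</devices>
"""

devices = ET.fromstring(DOMAIN_XML)

# The index alone is not unique here, but the (type, index) pair is.
controllers = {(c.get('type'), c.get('index')): c
               for c in devices.findall('controller')}

# Resolve every device address to the controller it plugs into.
resolved = [controllers[(a.get('type'), a.get('bus'))].get('type')
            for a in devices.findall('interface/address')]
print(resolved)
```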


(side note: I know that this discussion has gone far beyond just talking
about adding a single new type of controller (pci-bridge), but how we do
this device will have implications far beyond, so we need to figure it
out now.)

> For PCIe, we create new buses for root ports (ioh3420), upstream switch
> ports (xio3130-upstream), downstream switch ports (xio3130-downstream),
> and the dmi-to-pci bridge (i82801b11-bridge).  For PCI, PCI-to-PCI
> bridges create new buses (pci-bridge and dec-21154).
>
> One of my goals is to move us away from emulation of specific chips and
> create more devices like pci-bridge that adhere to the standard, but
> don't try to emulate a specific device.  Then we might have "root-port",
> "pcie-upstream-switch-port", "pcie-downstream-switch-port", and
> "dmi-to-pci-bridge" (none of these names have been discussed).

That makes sense to me at the level of libvirt, but in qemu don't you
need to "emulate specific devices" anyway, in order for the guest OS to
operate properly? If that's the case and there are different chips that
implement the same functionality in a different manner, how would you
decide which of those should be chosen as "the *only* dmi-to-pci-bridge"?




