[libvirt] [PATCH v3 15/18] qemu: auto-add pcie-root-port/dmi-to-pci-bridge controllers as needed

Laine Stump laine at laine.org
Thu Oct 13 17:43:52 UTC 2016


On 10/11/2016 05:34 AM, Andrea Bolognani wrote:
> On Mon, 2016-10-10 at 15:43 -0400, Laine Stump wrote:
>>>> 3) Although it is strongly discouraged, it is legal for a pci-bridge
>>>>        to be directly plugged into pcie-root, and we don't want to
>>>>        auto-add a dmi-to-pci-bridge if there is already a pci-bridge
>>>>        that's been forced directly into pcie-root. Finally, although I
>>>>        fail to see the utility of it, it is legal to have something like
>>>>        this in the xml:
>>>>     
>>>>          <controller type='pci' model='pcie-root' index='0'/>
>>>>          <controller type='pci' model='pci-bridge' index='2'/>
>>>>     
>>>>        and that will lead to an automatically added dmi-to-pci-bridge at
>>>>        index=1 (to give the unaddressed pci-bridge a proper place to plug
>>>>        in):
>>>>     
>>>>          <controller type='pci' model='dmi-to-pci-bridge' index='1'/>
>>>>     
>>>>        (for example, see the existing test case
>>>>        "usb-controller-default-q35"). This is handled in
>>>>        qemuDomainPCIAddressSetCreate() when it's adding in controllers to
>>>>        fill holes in the indexes.
>>>   
>>> I wonder how this "feature" came to be... It seems to be the
>>> reason for quite a bit of convoluted code down below, which
>>> we could avoid if this had never worked at all. As is so
>>> often the case, too late for that :(
>>   
>> Maybe not. The only place I ever saw that was in the above test case,
>> and another named "usb-controller-explicit-q35", and the author of both
>> of those tests was (wait for it!), no, not Albert Einstein.  Andrea
>> Bolognani!
> Oh, *that* guy! It's always *that* guy, isn't it? :P
>
>> The only reason it worked in the past was because we always
>> automatically added the dmi-to-pci-bridge very early in the post-parse
>> processing. This implies that any saved config anywhere will already
>> have the necessary dmi-to-pci-bridge at index='1', so we only need to
>> preserve the behavior for new domain definitions that have a pci-bridge
>> at index='2' but nothing at index='1'.
>>   
>> Since you're the only person documented to have ever created a config
>> like that, and it would only be problematic if someone tried to create
>> another new config, maybe we should just stop accidentally supporting it
>> and count it as a bug that's being fixed. What's your opinion?
> Given the evidence you're presenting, I'm all for getting
> rid of it, especially since it will make the code below
> much simpler and hence more maintainable.
>
> Considering how critical that part of libvirt is, anything
> we can do to make it leaner and meaner is going to be a huge
> win in the long run.

I'm looking back through the code and wondering how to simplify it - it 
may be that the alternate method I had initially used (which failed that 
one test) is just as complicated as what I have now; unfortunately I 
didn't save it.

>
>>>> @@ -82,6 +82,30 @@ virDomainPCIControllerModelToConnectType(virDomainControllerModelPCI model)
>>>>          return 0;
>>>>      }
>>>>      
>>>> +
>>>> +static int
>>> s/int/virDomainControllerModelPCI/
>>   
>> But then we can't return -1 when there isn't a perfect match (that's why
>> I made it int)
> Right. Disregard my comments then :)
>
>>> Alternatively you could turn it into
>>>   
>>>      if ((flags & VIR_PCI_CONNECT_PCIE_DEVICE) ||
>>>          (flags & VIR_PCI_CONNECT_PCIE_SWITCH_UPSTREAM_PORT))
>>>   
>>> which is more obviously correct and also nicer to look at :)
>>   
>> ....but takes two operations instead of one.
> I hardly think this would turn out to be a bottleneck, but
> feel free to stick to the original implementation - after
> fixing it, of course :)
>
>>>> +            if (lowestDMIToPCIBridge > idx)
>>>> +                lowestDMIToPCIBridge = idx;
>>>> +        } else if (cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE) {
>>>> +            if (virDeviceInfoPCIAddressWanted(&cont->info)) {
>>>> +                if (lowestUnaddressedPCIBridge > idx)
>>>> +                    lowestUnaddressedPCIBridge = idx;
>>>> +            } else {
>>>> +                if (lowestAddressedPCIBridge > idx)
>>>> +                    lowestAddressedPCIBridge = idx;
>>>> +            }
>>>>              }
>>>> +    }
>>>> +
>>>> +    if (nbuses > 0 && !addrs->buses[0].model) {
>>>> +        if (virDomainPCIAddressBusSetModel(&addrs->buses[0],
>>>> +                                           VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) < 0)
>>>> +            goto error;
>>>> +    }
>>>   
>>> Shouldn't we use either PCI_ROOT or PCIE_ROOT based on the
>>> machine type here? And set hasPCIeRoot *after* doing so?
>>>   
>>> Sorry for the questions, I guess this is the point in the
>>> patch where I got a bit lost :(
> I'm afraid you missed this question :)


Sorry about the omission.  I've tried to limit the code that decides 
whether or not there should be a pci-root or a pcie-root to the one 
place when default controllers are being added (I forget the name of the 
function right now), and after that only decide based on whether a 
pci-root or pcie-root really is there in the config. My subconscious 
reasoning for this is that the extra calisthenics around supporting 
aarch64/virt with PCI(e) vs with mmio has made me wonder if there might 
be machinetypes in the future that could support one type of root bus or 
another (or none) according to what is in the config, rather than having 
a root bus just builtin to the machine. I don't know if that would ever 
happen, but if it did, this code would work without change - only the 
function adding the default controllers would need to be changed.

(Note that I used the same logic when deciding how to right 
qemuDomainMachineHasPCI[e]Root())(still not sure about that name, but it 
can always be changed later to remove the "Machine" if someone doesn't 
like it)


>
>>>> +    else if (lowestUnaddressedPCIBridge < MIN(lowestAddressedPCIBridge,
>>>> +                                              lowestDMIToPCIBridge))
>>>> +        defaultModel = VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE;
>>>   
>>> Again, a bit lost here, sorry :(
>>>   
>>> Basically if we've found a PCI bridge without address that
>>> can't be plugged into any existing PCI bridge or DMI-to-PCI
>>> bridge, we want to add a DMI-to-PCI bridge so that we have
>>> somewhere to plug it in, right?
> And I'm pretty sure I got it right here, but just to be on
> the safe side, it would be great if you could confirm or
> deny :)
>
>>>> +    else
>>>> +        defaultModel = VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT;
>>>> +
>>>> +    for (i = 1; i < addrs->nbuses; i++) {
>>>> +
>>>> +        if (addrs->buses[i].model)
>>>> +            continue;
>>>> +
>>>> +        if (virDomainPCIAddressBusSetModel(&addrs->buses[i], defaultModel) < 0)
>>>> +            goto error;
>>>> +
>>>> +        VIR_DEBUG("Auto-adding <controller type='pci' model='%s' index='%zu'/>",
>>>> +                  virDomainControllerModelPCITypeToString(defaultModel), i);
>>>> +        /* only add a single dmi-to-pci-bridge, then add pcie-root-port
>>>> +         * for any other unspecified controller indexes.
>>>> +         */
>>>> +        if (hasPCIeRoot)
>>>> +            defaultModel = VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT;
>>>   
>>> Okay, so
>>>   
>>>      * if the machine has a PCIe Root Bus and we have a PCI
>>>        bridge that we don't know where to plug (no existing
>>>        legacy PCI hierarchy below it), we add a single
>>>        DMI-to-PCI bridge and then plug PCIe Root Ports into
>>>        PCIe Root Ports until we have as many buses as we need
>>   
>> You mean "plug pcie-root-ports into pcie-root".
> No, I actually meant what I wrote, but I think that thanks to
> your remark I see my error now.
>
> What I thought was going on was that whese additional buses
> were *chained* to each other, eg. the first PCIe Root Port
> (bus 1) would be plugged into the PCIe Root Bus (bus 0), the
> second PCIe Root Port (bus 2) would be plugged into the first
> PCIe Root Port (bus 1) and so on.

That wouldn't work because 1) pcie-root-port can only plug into 
pcie-root or pcie-expander-bus, and 2) even if it could plug into 
another pcie-root-port, there is only a single slot so you would have to 
plug the pcie-root-port and the endpoint device into different functions 
of the single slot, so hotplug would be impossible.

>
> What happens instead is that the first PCIe Root Port (bus 1)
> is plugged into the PCIe Root Bus (bus 0), and so is the
> second PCIe Root Port (bus 2) and all subsequent ones.
>
> Basically for a second there I forgot that the bus number
> doesn't increase only when increasing the depth of the PCI
> hierarchy, but also when increasing its breadth.
>
>>>      * if the machine has a PCIe Root Bus and we're not in the
>>>        situation above when it comes to PCI bridges (there is
>>>        an existing legacy PCI hierarchy below it), we plug PCIe
>>>        Root Ports into PCIe Root Ports until we have as many
>>>        buses as we need
>>>   
>>>      * otherwise (no PCIe Root Bus) we just plug PCI bridges
>>>        into PCI bridges until we have as many buses as we need
>>>   
>>> Does that sound about right?
>>   
>> Yep.
> The same as above applies, though.
>
> On the other hand, the virDomainPCIAddressSet structure
> doesn't really store any information about the relationship
> between controllers: whether eg. the PCIe Root Port providing
> bus 5 is plugged directly into the PCIe Root Bus, or into
> another PCIe Root Port, or into a PCIe Switch Downstream
> Port, is something that we just don't know at this stage.

Yes, which may be necessary at some point - e.g. I've been thinking that 
we really don't want the buses that are children of a 
pci[e]-expander-bus to be used for auto-assigned addresses (although 
maybe that could just be solved with a flag that's set at the time the 
AddressSet is created, rather than needing to teach it the full 
intracacies of the topology.)

>
>>>> +    <controller type='pci' index='1' model='dmi-to-pci-bridge'>
>>>> +      <model name='i82801b11-bridge'/>
>>>> +      <address type='pci' domain='0x0000' bus='0x00' slot='0x1e' function='0x0'/>
>>>> +    </controller>
>>>   
>>> A DMI-to-PCI bridge with index='1' has been added
>>> automatically here...
>>>   
>>> ... but it's not being used by any of the devices. So why
>>> would it be added in the first place?
>>   
>> That is a *very* good question!
> Can't wait to know the answer! ;)

Oh, now that I've looked in context of the patch ordering, I undestand: 
it's because patch 16/18 hasn't been applied yet. I'd forgotten the 
ordering...




More information about the libvir-list mailing list