[libvirt] Analysis of the effect of adding PCIe root ports

Laine Stump laine at laine.org
Fri Oct 7 14:17:34 UTC 2016


On 10/06/2016 12:10 PM, Daniel P. Berrange wrote:
> On Thu, Oct 06, 2016 at 11:57:17AM -0400, Laine Stump wrote:
>> On 10/06/2016 11:31 AM, Daniel P. Berrange wrote:
>>> On Thu, Oct 06, 2016 at 12:58:51PM +0200, Andrea Bolognani wrote:
>>>> On Wed, 2016-10-05 at 18:36 +0100, Richard W.M. Jones wrote:
>>>>>>> (b) It would be nice to turn the whole thing off for people who don't
>>>>>>> care about / need hotplugging.
>>>>>> I had contemplated having an "availablePCIeSlots" (or something like
>>>>>> that) that was either an attribute of the config, or an option in
>>>>>> qemu.conf or libvirtd.conf. If we had such a setting, it could be
>>>>>> set to "0".
>>>> I remember some pushback when this was proposed. Maybe we
>>>> should just give up on the idea of providing spare
>>>> hotpluggable PCIe slots by default and ask the user to add
>>>> them explicitly after all.
>>>>
>>>>> Note that changes to libvirt conf files are not usable by libguestfs.
>>>>> The setting would need to go into the XML, and please also make it
>>>>> possible to determine if $random version of libvirt supports the
>>>>> setting, either by a version check or something in capabilities.
>>>> Note that you can avoid using any PCIe root port at all by
>>>> assigning PCI addresses manually. It looks like the overhead
>>>> for the small (I'm assuming) number of devices a libguestfs
>>>> appliance will use is low enough that you will probably not
>>>> want to open that can of worms, though.
>>> For most apps the performance impact of the PCI enumeration
>>> is not a big deal. So having libvirt ensure there's enough
>>> available hotpluggable PCIe slots is reasonable, as long as
>>> we leave a get-out clause for libguestfs.
>>>
>>> This could be as simple as declaring that *if* we see one
>>> or more <controller type="pci"> in the input XML, then libvirt
>>> will honour those and not try to add new controllers to the
>>> guest.
>>>
>>> That way, by default libvirt will just "do the right thing"
>>> and auto-create a suitable number of controllers needed to
>>> boot the guest.
>>>
>>> Apps that want strict control though, can specify the
>>> <controllers> elements themselves.  Libvirt can still
>>> auto-assign device addresses onto these controllers.
>>> It simply wouldn't add any further controllers itself
>>> at that point.

Even if the device were being added offline, and there wasn't any place 
to put it? (i.e., the operation would fail rather than add a 
controller, even though libvirt could have added one). Or am I taking 
your statement beyond its intent? (I'm good at that :-))


>>>   NB I'm talking cold-boot here. So libguestfs
>>> would specify <controller> itself to the minimal set it wants
>>> to optimize its boot performance.
>> That works for the initial definition of the domain, but as soon as you've
>> saved it once, there will be controllers explicitly in the config, and since
>> we don't have any way of differentiating between auto-added controllers and
>> those specifically requested by the user, we have to assume they were
>> explicitly added, so such a check is then meaningless because you will
>> *always* have PCI controllers.
> Ok, so coldplug was probably the wrong word to use. What I actually
> meant was "at time of initial define", since that's when libvirt
> actually does its controller auto-creation. If you later add more
> devices to the guest, whether it is online or offline, that libvirt
> would still be auto-adding more controllers if required (and if
> possible) . I was not expecting libvirt to remember whether we
> were auto-adding controllers the first time or not.
>
>> Say you create a domain definition with no controllers, you would get enough
>> for the devices in the initial config, plus "N" more empty root ports. Let's
>> say you then add 4 more devices (either hotplug or coldplug, doesn't
>> matter). Those devices are placed on the existing unused pcie-root-ports.
>> But now all your ports are full, and since you have PCI controllers in the
>> config, libvirt is going to say "Ah, this user knows what they want to do,
>> so I'm not going to add any extras! I'm so smart!". This would be especially
>> maddening in the case of "coldplug", where libvirt could have easily added a
>> new controller to accommodate the new device, but didn't.
>>
>> Unless we don't care what happens after the initial definition (and then
>> adding of "N" new devices), trying to behave properly purely based on
>> whether or not there are any PCI controllers present in the config isn't
>> going to work.
> I think that's fine.
>
> Lets stop talking about coldplug since that's very misleading.

Do you mean use of the term "coldplug" at all, or talking about what 
happens when you add a device to the persistent config of the domain but 
not to the currently running guest itself?

>
> What I mean is that...
>
> 1. When initially defining a guest
>
>     If no controllers are present, auto-add controllers implied
>     by the machine type, sufficient to deal with all currently
>     listed devices, plus "N" extra spare ports.
>
>     Else, simply assign devices to the controllers listed in
>     the XML config. If there are no extra spare ports after
>     doing this, so be it. It was the application's choice
>     to have not listed enough controllers to allow later
>     addition of more devices.
>
>
> 2. When adding further devices (whether to an offline or online
>     guest)
>
>     If there's not enough slots left, add further controllers
>     to host the devices.

Right. That works great for adding devices "offline" (since you don't 
like the term "coldplug" :-). It's just in the case of hotplug that it's 
problematic, because you can't add new PCI controllers to a running 
system.

(big digression here - skip if you like...) (well, it *could* be 
possible to hotplug an upstream port plus some number of downstream 
ports, but qemu doesn't support it because it attaches devices one by 
one, and the guest has to be notified of the entire contraption at once 
when the upstream port is attached - so you would have to send the 
attach for all the downstream ports first, then the upstream, but you 
*can't* do it in that order because qemu doesn't yet know about the id 
(alias) you're going to give to the upstream port at the time you're 
attaching the downstreams. Anyway, even if and when qemu does support 
hotplugging upstream+downstream ports, you can't add more downstream 
ports to an existing upstream afterwards, so you would end up with some 
horrid scheme where you had to always make sure you had at least one 
downstream or root-port open every time a device was added.)


>    If there were not enough slots left
>     to allow adding further controllers, that must be due to
>     the initial application decision at time of defining the
>     original XML

Or it's because there have already been "N" new devices added since the 
domain was defined, and they're now trying to *hotplug* device "N+1".

I'm fine with that behavior, I just want to make sure everyone 
understands this restriction beforehand.


So here's a rewording of your description (with a couple additional 
conditions) to see if I understand everything correctly:

1) during initial domain definition:

A) If there are *no pci controllers at all* (not even a pci-root or 
pcie-root) *and there are any unaddressed devices that need a PCI slot* 
then auto-add enough controllers for the requested devices, *and* make 
sure there are enough empty slots for "N" (do we stick with 4? or make 
it 3?) devices to be added later without needing more controllers. (So 
if the domain has no PCI devices we don't add anything extra, and 
likewise if it only has PCI devices that already have addresses.)

B) If there is at least one pci controller specified in the XML, and 
there are any unused slots in the pci controllers in the provided XML, 
then use them for the unaddressed devices. If there are more devices 
that need an address at this time, also add controllers for them, but no 
*extra* controllers.
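
To make case B concrete, here's a hypothetical domain XML fragment (the 
indexes, models, and device are purely illustrative, not a proposal for 
specific defaults) where the application supplies its own controllers 
and leaves only the device addressing to libvirt:

```xml
<!-- hypothetical q35 guest fragment: the application lists the
     controllers explicitly, so under this proposal libvirt would
     only assign addresses and never add extra controllers -->
<controller type='pci' index='0' model='pcie-root'/>
<controller type='pci' index='1' model='pcie-root-port'/>
<controller type='pci' index='2' model='pcie-root-port'/>
<!-- no <address> given: libvirt would place this on one of the
     two root ports above, leaving one spare -->
<interface type='network'>
  <source network='default'/>
  <model type='virtio'/>
</interface>
```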

(Note to Rich: libguestfs could avoid the extra controllers either by 
adding a pci-root/pcie-root to the XML, or by manually addressing the 
devices. The latter would actually be better, since it would avoid the 
need for any pcie-root-ports).
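
For illustration, manual addressing might look something like this 
(file paths and the choice of virtio-scsi are made up for the sketch): 
each device gets an explicit slot on bus 0, i.e. directly on pcie-root 
as an integrated endpoint, so no pcie-root-ports are needed at all:

```xml
<!-- hypothetical libguestfs-style fragment: the one PCI device is
     given an explicit address on pcie-root (bus 0x00), so libvirt
     sees no unaddressed devices and creates no root ports -->
<controller type='pci' index='0' model='pcie-root'/>
<controller type='scsi' model='virtio-scsi'>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x02'
           function='0x0'/>
</controller>
<disk type='file' device='disk'>
  <source file='/tmp/appliance.img'/>
  <target dev='sda' bus='scsi'/>
</disk>
```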

2) When adding a device to the persistent config (i.e. offline): if 
there is an empty slot on a controller, use it. If not, add a controller 
for that device, *but no extra controllers*.
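
As a concrete sketch of the offline case (the filename and image path 
are invented), the device XML passed to "virsh attach-device guest 
newdisk.xml --config" could be as simple as:

```xml
<!-- newdisk.xml: no <address> element, so libvirt picks a slot,
     adding a new pcie-root-port first if all existing ones are
     already occupied (offline only, under this proposal) -->
<disk type='file' device='disk'>
  <source file='/var/lib/libvirt/images/extra.qcow2'/>
  <target dev='vdb' bus='virtio'/>
</disk>
```

With --config (persistent definition only) libvirt is free to create 
the extra controller; with --live the same XML would have to fail once 
the spare ports are exhausted, which is the restriction in 3).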

3) when adding a device to the guest machine (i.e. hotplug / online), if 
there is an empty slot on a controller, use it. If not, then fail.

The differences I see from what (I think) you suggested are:

* if there aren't any unaddressed PCI devices (even if there are no 
controllers in the config), then we also don't add any extra 
controllers (although we will of course still add the pci-root or 
pcie-root, to acknowledge that it is there).

* if another controller is needed for adding a device offline, it's okay 
to add it.



