[libvirt] [PATCH v3 19/18] [RFC] qemu: assure there are always at least 4 open pcie-root-ports for hotplug

Andrea Bolognani abologna at redhat.com
Fri Oct 14 12:40:54 UTC 2016


On Thu, 2016-09-29 at 15:20 -0400, Laine Stump wrote:
> For machinetypes with a pci-root bus (all legacy PCI), libvirt will
> make a "fake" reservation for one extra slot prior to assigning
> addresses to unaddressed PCI endpoint devices in the domain. This will
> trigger auto-adding of a pci-bridge for the final device to be
> assigned an address *if that device would have otherwise instead been
> the last device on the last available pci-bridge*; thus it assures
> that there will always be at least one slot left open in the domains
> bus topology for expansion (which is important both for hotplug (since
> a new pci-bridge can't be added while the guest is running) as well as
> for coldplug (since adding a new device might otherwise in some cases
> require re-addressing existing devices, which we want to avoid)).
> 
> It's important to note that for this (legacy PCI) case, we must check
> for the special case of all slots on all buses being occupied *prior
> to assigning any addresses*, and avoid attempting to reserve the extra
> address in that case, because there is no free address in the existing
> topology, so no place to auto-add a pci-bridge for expansion. Since
> that condition can only be reached by manual intervention, this is
> acceptable.
> 
> For machinetypes with pcie-root, libvirt's methodology for
> automatically expanding the bus topology is different -
> pcie-root-ports are plugged into slots (soon to be functions) of
> pcie-root as needed, and the new endpoint devices are assigned to the
> single slot in each pcie-root-port. This is done so that the devices
> are, by default, hotpluggable (the slots of pcie-root don't support
> hotplug, but the single slot of the pcie-root-port does). Since
> pcie-root-ports can only be plugged into pcie-root, and we don't
> auto-assign endpoint devices to those slots, this means topology
> expansion doesn't compete with endpoint devices for slots, so we don't
> need to worry about checking for all "useful" slots being free prior
> to assigning addresses to new endpoint devices - as a matter of fact,
> if we attempt to reserve the fake slots before the useful slots, it
> can lead to errors.
> 
> Instead this patch just reserves slots for "fake" devices after doing
> the assignment for actual devices - if there were already enough
> unoccupied pcie-root-ports, then none will be added; if not, then
> enough will be added to support the desired (hardcoded) potential
> number of hotplugged devices.
> 
> Since there hasn't been any other concrete suggestion for the number
> of "available hotpluggable slots" libvirt should assure, this patch
> uses "4" as the answer (thanks Drew!) Alternatives could be:
> 
> 0 - hotplug would only work if the user had thought to add an extra
> pcie-root to the config *as an extra step after adding and saving each
> new device*. (this would preclude creating a new domain that had a
> pcie-root-port available for hotplug - the extra modify/save step
> would be needed even during initial domain creation, and would need to
> be repeated every time a device was added to the domain)
> 
> 1 - this is the *minimum* number of hotpluggable slots libvirt
> guarantees for legacy PCI domains (with pci-root). Of course on
> average these domains are more likely to have *15* slots available
> (half of the slots on a pci-bridge).
> 
> "anything else" - really any choice made for this is going to be
> considered wrong by *somebody*. I hope we can all agree that "0" is a
> wrong choice, just because it will require so much ongoing
> babysitting.
> 
> A couple of us have brought up the idea of having
> "availableHotplugSlots" be configurable in the domain. There has also
> been sentiment against that. Perhaps it could be configurable in
> qemu.conf? (I don't really like that either, though - I think
> everything about the domain should be self-contained in the domain
> XML, and if something is tunable, it should be tunable separately for
> each domain).
> 
> In the absense of anything configurable, we will need to pick a number
> though. I've done that here, and now we can argue about it (or not :-)

As we have already agreed on a slightly different approach,x
eg. doing pretty much exactly this, but only if the
configuration provided by the user contains no PCI controller
(except possibly for pcie-root), I think it's fair to say:

NACK

-- 
Andrea Bolognani / Red Hat / Virtualization




More information about the libvir-list mailing list