[libvirt] [PATCH 4/6] conf: new <hotplug require='yes|no'/> element for hotpluggable devices

Laine Stump laine at laine.org
Mon Aug 8 16:41:48 UTC 2016

On 08/08/2016 04:56 AM, Laine Stump wrote:
> When faced with a guest device that requires a PCI address but doesn't
> have one manually assigned in the config, libvirt has always insisted
> (well... *tried* to insist) on auto-assigning an address that is on a
> PCI controller that supports hotplug. One big problem with this is
> that it prevents automatic use of a Q35 (or aarch64/virt) machine's
> pcie-root (since the PCIe root complex doesn't support hotplug).
> In order to promote simpler domain configs (more devices on pcie-root
> rather than on a pci-bridge), this patch adds a new sub-element to all
> guest devices that have a PCI address and support hotplug:
>    <hotplug require='no'/>
> For devices that have hotplug require='no', we turn off the
> VIR_PCI_CONNECT_HOTPLUGGABLE bit in the devFlags when searching for an
> available PCI address. Since pcie-root now allows standard PCI
> devices, this results in those devices being placed on pcie-root
> rather than pci-bridge.

I've been playing around with this and, by itself, it works very well. 
With this solved, combined with taking advantage of PCIe for virtio when 
available, it's very easy to create q35 domains that have no legacy-PCI 
without needing to resort to manually assigning addresses.
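
As an illustration only (the device details here are made up, and the
<hotplug> element exists only in this patch series, not in any released
libvirt), a virtio network device that gives up hotplug and has no
<address> at all might be written as:

      <interface type='network'>
        <source network='default'/>
        <model type='virtio'/>
        <hotplug require='no'/>
      </interface>

With no <address> given, a device like this becomes a candidate for
placement directly on pcie-root rather than on a pci-bridge.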

However, there is still another item that we need to be able to 
configure - stating a preference of legacy PCI vs. PCIe when both are 
available for a device (again, the aim is to do this *without* needing 
to manually assign an address). The following devices have this choice:

1) vfio assigned devices
2) virtio devices
3) the nec-xhci USB controller

You might think that it would always be preferable to use PCIe if it's 
available, but especially in these "early days" of using PCIe in guests 
it would be useful to have the ability to *easily* force use of a legacy
PCI slot in case some PCIe-related bug is encountered (in particular, 
people have pointed out in discussions about vfio device assignment that 
it could be possible for a guest OS to misbehave when presented with a 
device's PCIe configuration block (which hasn't been visible in the past 
because the device was attached to a legacy PCI slot)).

In order to maintain functionality while any such bugs are figured out and
fixed, we need to be able to force the device onto a PCI slot. There are 
two ways of doing this:

1) manually specify the full PCI address of a legacy PCI slot in the config
2) provide an option in the config that simply says "use any PCI slot" 
or "use any PCIe slot".
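
For comparison, option (1) would look roughly like the following. This
is a sketch that assumes bus 0x02 happens to be a pci-bridge (i.e.
legacy PCI) in the domain in question; the bus and slot numbers are
illustrative and have to match the controllers that domain actually has:

      <interface type='network'>
        <source network='default'/>
        <model type='virtio'/>
        <address type='pci' domain='0x0000' bus='0x02' slot='0x01' function='0x0'/>
      </interface>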

Assuming that (1) is too cumbersome, we need to come up with a 
reasonable name/location for a config option (providing the backend for 
it will be trivial). Some possible places:

2a) add a new attribute to the <address> element

I don't like this option because that makes it impossible to easily 
force re-addressing of the devices in a domain by simply removing all 
the <address> lines. (Yes, I know that's a non-issue in production, 
especially when there is some other management system (OpenStack, oVirt) 
sitting on top of libvirt. But it is a *big* help for developers who are 
messing around with it).

2b) Add a new attribute to an existing subelement, e.g. <target>

This makes parsing and formatting cumbersome, because every device type 
has its own code to parse/format its <target> subelement. Also, in the 
case of <interface>, the <target> subelement is being mis-used to hold 
the name *on the host* of the tap device, and it would be confusing to 
see something like this:

      <target dev='vnet1' preferredBus='pci'/>

2c) add a new subelement just for this, e.g. <bus prefer='pci'/> or ???

I don't like this because it adds to the toplevel clutter in the 
devices. XML's hierarchical structure is useful to organize attributes
so they are easier to comprehend, and we should take advantage of that 
as much as possible.

2d) Try to find a common subelement that can be used for *all* address 
assignment preferences/restrictions, including hotplug.

This is the option that has prompted my writing this message in response 
to my own patch mail. What if, instead of:

      <hotplug require='no'/>
      <bus prefer='pci'/> (or whatever)

we had something like this?

      <addressPreferences hotplug='no' bus='pci'/>

(*PLEASE* think of a better name!)
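
To make that concrete, here is a sketch of an assigned device using the
combined element. The element and attribute names are only the
placeholders proposed above, and the host address in <source> is
invented for the example:

      <hostdev mode='subsystem' type='pci' managed='yes'>
        <source>
          <address domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
        </source>
        <addressPreferences hotplug='no' bus='pci'/>
      </hostdev>

The intent would be that, with no guest-side <address> given, the
device lands on any free legacy PCI slot and the slot is not required
to support hotplug.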

In my mind, the choice is between 1 and 2d - if everyone thinks this is 
something only needed during a short transitional stage, maybe (1) is an 
adequate solution. If not, then we should decide now on the name for 
this option, and potentially rename the hotplug option accordingly.

What are your opinions?

(BTW, just to throw another wrench into the works - I think it would 
also be useful to be able to specify a NUMA node for devices, so that a
device could be placed on a particular NUMA node in the guest (i.e. on a
particular pci[e]-expander-bus or one of its subordinate buses) without 
needing to know the full PCI address. That could be done by specifying 
it the same way it's done in the pci[e]-expander-bus itself:

     <hostdev ....>

or it could be made a part of this new proposed element:

       <addressPreferences hotplug='no' bus='pci' numa='2'/>

This is something that we will want in the long term (not just a 
temporary method of working around potential bugs), so if we're going to 
want it in a separate element rather than in <target>, we'll need to 
consider it *now* in order to avoid giving the wrong name to the new 
hotplug option defined in the parent of this message.)
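
For reference, this is roughly how the guest NUMA node is specified
today on the expander bus controller itself (the values are
illustrative; index, busNr, and node number depend on the domain):

      <controller type='pci' index='1' model='pci-expander-bus'>
        <target busNr='254'>
          <node>1</node>
        </target>
      </controller>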
