[libvirt] [PATCH 2/2] HACK: qemu: aarch64: Use virtio-pci if user specifies PCI controller
Andrea Bolognani
abologna at redhat.com
Fri Feb 26 15:13:43 UTC 2016
On Wed, 2016-02-17 at 15:03 -0500, Laine Stump wrote:
> On 01/28/2016 04:14 PM, Cole Robinson wrote:
> >
> > If a user manually specifies this XML snippet for aarch64 machvirt:
> >
> > <controller type='pci' index='0' model='pci-root'/>
> As you've noted below, this isn't correct. aarch64 machvirt has no
> implicit pci-root controller (aka "pci.0"). It instead has a pcie-root
> controller ("pcie.0"). Since a pci[e]-root controller cannot be
> explicitly added, by definition this couldn't work.
>
> >
> >
> > Libvirt will interpret this to mean that the OS supports virtio-pci,
> > and will allocate PCI addresses (instead of virtio-mmio) for virtio
> > devices.
> >
> > This is a giant hack. Trying to improve it led me into the maze of PCI
> > address code and I gave up for now. Here are the issues:
> >
> > * I'd prefer that to be model='pcie-root' which matches what
> > qemu-system-aarch64 -M virt actually provides by default... however
> > libvirt isn't happy with a single pcie-root specified by the user, it
> > will error with:
> >
> > error: unsupported configuration: failed to create PCI bridge on bus 1: too many devices with fixed addresses
> That's not the right error, but it's caused by the fact that libvirt
> wants the pci-bridge device to be plugged into a standard PCI slot, but
> all the slots of pcie-root are PCIe slots. Since we now know that qemu
> doesn't mind if any standard PCI device is plugged into a PCIe slot,
Should we rely on this behavior? Isn't this something that might
change in the future? Or at least be quite puzzling for users?
Just thinking out loud :)
> the
> decision of how we want to solve this problem depends on whether or not
> we want the devices in question to be hot-pluggable - the ports of
> pcie-root do not support hot-plugging devices (at least on Q35), while
> the ports on pci-bridge do. So if we require that all devices be
> hot-pluggable, then we have a few choices:
>
> 1) create the same PCI controller Frankenstein we currently have for Q35
> - a dmi-to-pci-bridge plugged into pcie-root, and a pci-bridge plugged
> into dmi-to-pci-bridge. This is easiest because it already works, but it
> does create an extra unnecessary controller.
This is the current situation, right?
qemu-kvm in current aarch64 RHEL doesn't have the i82801b11-bridge
device compiled in, by the way. However, since qemu-system-aarch64
in Fedora 23 *does* have it, I assume enabling it would simply be
a matter of flipping a build configuration bit.
> 2) auto-add a pci-bridge in cases when there is a pcie-root but no
> standard PCI slots. This would take only a slight amount more work.
>
> 3) auto-add a pcie-root-port to each port of the pcie-root controller.
> This would still leave us with PCIe ports, so we would need to teach
> libvirt that it's okay to plug PCI devices into PCIe ports.
As mentioned above, I'm not sure this is a good idea. Maybe I'm just
afraid of my own shadow though :)
> If we don't require hot-pluggability, then we can just teach the
> address-assignment code that PCI devices can plug into non-hotpluggable
> PCIe ports and we're done.
>
> Or we can do a hybrid that's kind of a continuation of the "use PCI if
> it's available, otherwise mmio" - we could do this:
>
> A) If there are any standard PCI slots, then auto-assign to PCI slots
> (creating new pci-bridge controllers as necessary)
>
> B) else if there are any PCIe slots, then auto-assign to hot-pluggable
> PCIe if available, or straight PCIe if not.
>
> C) else use virtio-mmio.
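To check that I read the A/B/C order correctly, here is a minimal Python sketch of it. The function name and the three slot counters are made up for illustration; they don't correspond to anything in libvirt's address-assignment code.

```python
def pick_address_kind(pci_slots, hotplug_pcie_slots, plain_pcie_slots):
    """Pick an address kind for one device, following the A/B/C order.

    All three arguments are hypothetical counts of the slots of each
    kind that the guest's controllers provide.
    """
    if pci_slots > 0:
        # A) standard PCI slots exist; a real implementation would also
        #    create new pci-bridge controllers as they fill up
        return "pci"
    if hotplug_pcie_slots > 0:
        # B) prefer hot-pluggable PCIe ports when there are any ...
        return "pcie-hotpluggable"
    if plain_pcie_slots > 0:
        # ... otherwise fall back to straight (non-hotpluggable) PCIe
        return "pcie"
    # C) no PCI or PCIe slots at all
    return "virtio-mmio"
```

With only a bare pcie-root (no bridges, no root ports) this would land in the "straight PCIe" branch, which is exactly the case that relies on PCI devices being accepted in PCIe slots.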
>
> -------------------------------------------
>
> Mixed in with all of this discussion is my thinking that we should have
> some way to specify, in XML, constraints for the address of each device
> *without specifying the address itself*. Things we need to be able to
> specify:
>
> 1) Is the device PCI-only, PCIe-only, or either one (maybe this could be
> used in the future to constrain to virtio-mmio as well)?
>
> 2) Must the device be hot-pluggable? (default would be yes)
>
> 3) guest-side NUMA node? (I'm not sure if this needs to be user
> specifiable - in the case of a vfio-assigned device, I think all we need
> to do is inform the guest which NUMA node the device is on in the host (via
> putting it on a PXB controller that is configured with that same NUMA
> node number). For emulated devices - is there any use to putting an
> *emulated* device on the same controller as a particular vfio-assigned
> device that is on a specific node? If not, then maybe it will never matter).
>
> It would be better if these "address constraints" were in a different
> part of the XML than the <address> element itself - this would maintain
> the simplicity of being able to just remove all <address> elements in
> order to force libvirt to re-assign all device addresses.
>
> This isn't something that needs doing immediately, but it's worth keeping
> in mind while putting together something that works for aarch64.
>
>
>
> >
> >
> > Instead this patch uses hacks to make pci-root use the pcie.0 bus for
> > aarch64, since that code path already works.
> I think that's a dead-end that we would have to back-track on, so
> probably not a good solution even temporarily.
>
>
> Here's an attempt at a plan:
>
> 1) change the PCI address assignment code so that for aarch64/virt it
> prefers PCIe addresses, but still requires hot-pluggable (currently it
> almost always prefers PCI, and requires hot-pluggable). (alternate - if
> aarch64 doesn't support pcie-root-port or pcie-switch-*-port, then don't
> require hot-pluggable either).
>
> 2) put something on the front of that that checks for existence of
> pcie-root, and if it's not found, uses virtio-mmio instead (is there
> something already that auto-adds the virtio-mmio address? I haven't
> looked and am too lazy to do so now).
>
> At this point, as long as you manually add a bunch of pcie-root-port
> controllers along with the manual pcie-root, everything should just
> work. Then we would go to step 3:
>
> 3) enhance the auto-assign code so that, in addition to auto-adding a
> pci-bridge when needed, it would auto-add either a single pcie-root-port
> or a pcie-switch-upstream-port and 32 pcie-switch-downstream-ports
> anytime a hotpluggable PCIe port was needed and couldn't be found. (the
> latter assumes that aarch64 supports those controllers).
>
> Does that make any sense? I could try to code some of this up if you
> could test it (or help me get set up to test it myself).
I'm not sure I fully understand all of the above, but I'll pitch
in with my own proposal regardless :)
First, we make sure that
<controller type='pci' index='0' model='pcie-root'/>
is always added automatically to the domain XML when using the
mach-virt machine type. Then, if
<controller type='pci' index='1' model='dmi-to-pci-bridge'/>
<controller type='pci' index='2' model='pci-bridge'/>
are present as well, we default to virtio-pci; otherwise we use
the current default of virtio-mmio. This should allow management
applications, based on knowledge about the guest OS, to easily
pick between the two address schemes.
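In case it helps, the check I have in mind could be sketched roughly like this in Python. This is only an illustration of the rule, not libvirt code: the function name is hypothetical, and it assumes the implicit pcie-root has already been added to the XML.

```python
import xml.etree.ElementTree as ET


def default_virtio_transport(domain_xml):
    """Return 'virtio-pci' or 'virtio-mmio' for a mach-virt domain.

    Hypothetical helper: virtio-pci is chosen only when the user has
    added both a dmi-to-pci-bridge and a pci-bridge controller.
    """
    root = ET.fromstring(domain_xml)
    # Collect the models of all <controller type='pci'> elements
    models = {
        ctrl.get("model")
        for ctrl in root.iter("controller")
        if ctrl.get("type") == "pci"
    }
    if {"dmi-to-pci-bridge", "pci-bridge"} <= models:
        return "virtio-pci"
    # Only pcie-root (or nothing) is present: keep the current default
    return "virtio-mmio"
```

A domain containing just the automatically added pcie-root would keep using virtio-mmio, so existing guests would not change behavior.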
Does this sound like a good idea?
Cheers.
--
Andrea Bolognani
Software Engineer - Virtualization Team