[libvirt] [PATCH 2/2] qemu: Order PCI controllers on the command line

Laine Stump laine at laine.org
Thu Sep 7 01:36:10 UTC 2017


On 09/05/2017 09:25 AM, Andrea Bolognani wrote:
> Unlike other controllers, PCI controller can plug into each other,

s/PCI controller/PCI controllers/

> which might turn into a problem if they are not listed in the
> expected order on the QEMU command line, because the guest will
> then not be able to start with an error such as
>
>   qemu-kvm: -device pci-bridge,chassis_nr=3,id=pci.3,bus=pci.2.0,addr=0x7:
>   Bus 'pci.2.0' not found
>
> Add some logic to make sure PCI controller appear in the correct

s/PCI controller/PCI controllers/

> order on the QEMU command line, regardless of the order they were
> added to the guest configuration.

Before examining the code, one concern I had was whether an
*unnecessary* re-ordering would cause a guest ABI change. So I decided
to try testing it by making an XML file with the pci controller indexes
out of order. Unfortunately, when you try to do that, libvirt ends up
inserting the controllers into its internal list in sorted order, so
it's not even possible to force/fake libvirt into putting the
controllers in the wrong order - no matter what you try, the following
will happen:

1) if the indexes are out of order in the input XML, libvirt will
re-order them as it parses.

2) if any pci controller's address (the slot it's plugged into) has a
bus attribute that is higher than that controller's own index, libvirt
logs an error and fails the operation.

So under normal circumstances, I think that the situation you're talking
about can't happen.

*BUT*

The one exception to this is when libvirt auto-adds controllers in order
to fill in unused indexes (e.g. if you have a pci controller with
index='8', but none with index='7', you'll automatically get an extra
controller added with index='7') or to provide a controller at the
proper index for a device that has a PCI address pointing to a
non-existent bus (this is the bug described in the referenced BZ). The
problem is that, when libvirt auto-adds these new controllers to the
controller array, it just appends them to the end, rather than inserting
them in proper sorted order. THAT is what leads to the problem. (as a
matter of fact, the next time you edit the domain in *any way*, the
issue is resolved because the parser re-orders the list of controllers).

I haven't tried it, but it looks to me like this could all be fixed by
simply changing the function virDomainDefAddController() to call
virDomainControllerInsert() to add the new controller at the proper
(sorted) place in the controller array instead of using
VIR_APPEND_ELEMENT_COPY() (which is what it does now). That has the nice
side effect of keeping the controllers listed in the XML in sorted
order, and making sure that the XML doesn't magically change when you
edit something unrelated.


I don't doubt that the code you've written below fixes the problem at
the time the commandline is generated (haven't tried it either :-), but
I think it's better to fix it back at its source, when the controllers
are auto-added.

>
> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1479674
>
> Signed-off-by: Andrea Bolognani <abologna at redhat.com>
> ---
>  src/qemu/qemu_command.c                            | 72 ++++++++++++++++------
>  .../qemuxml2argv-pci-autoadd-idx.args              |  2 +-
>  .../qemuxml2argv-pseries-many-buses-2.args         |  4 +-
>  3 files changed, 56 insertions(+), 22 deletions(-)
>
> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
> index 83430b33f..c0e60a184 100644
> --- a/src/qemu/qemu_command.c
> +++ b/src/qemu/qemu_command.c
> @@ -3096,6 +3096,8 @@ qemuBuildControllerDevCommandLine(virCommandPtr cmd,
>                                    virQEMUCapsPtr qemuCaps)
>  {
>      size_t i, j;
> +    size_t *pciContIndex;
> +    size_t npciContIndex;
>      int usbcontroller = 0;
>      bool usblegacy = false;
>      int contOrder[] = {
> @@ -3107,14 +3109,10 @@ qemuBuildControllerDevCommandLine(virCommandPtr cmd,
>           * machines. For newer Q35 machines it is added out of the
>           * controllers loop, after the floppy drives.
>           *
> -         * We don't add PCI/PCIe root controller either, because it's
> -         * implicit, but we do add PCI bridges and other PCI
> -         * controllers, so we leave that in to check each
> -         * one. Likewise, we don't do anything for the primary IDE
> +         * Likewise, we don't do anything for the primary IDE
>           * controller on an i440fx machine or primary SATA on q35, but
>           * we do add those beyond these two exceptions.
>           */
> -        VIR_DOMAIN_CONTROLLER_TYPE_PCI,
>          VIR_DOMAIN_CONTROLLER_TYPE_USB,
>          VIR_DOMAIN_CONTROLLER_TYPE_SCSI,
>          VIR_DOMAIN_CONTROLLER_TYPE_IDE,
> @@ -3124,6 +3122,54 @@ qemuBuildControllerDevCommandLine(virCommandPtr cmd,
>      };
>      int ret = -1;
>  
> +    /* PCI controllers require special handling because they plug into
> +     * each other, which makes getting the ordering right on the QEMU
> +     * command line crucial: if we don't, and eg. list a pci-bridge
> +     * before the pci-root it plugs into, the guest will not start */
> +
> +    npciContIndex = 0;
> +    if (VIR_ALLOC_N(pciContIndex, def->ncontrollers) < 0)
> +        goto cleanup;
> +
> +    for (i = 0; i < def->ncontrollers; i++) {
> +        virDomainControllerDefPtr cont = def->controllers[i];
> +
> +        if (cont->type != VIR_DOMAIN_CONTROLLER_TYPE_PCI)
> +            continue;
> +
> +        pciContIndex[cont->idx] = i;
> +        npciContIndex = MAX(npciContIndex, cont->idx + 1);
> +    }
> +
> +    for (i = 0; i < npciContIndex; i++) {
> +        virDomainControllerDefPtr cont = def->controllers[pciContIndex[i]];
> +        char *devstr;
> +
> +        if (cont->type != VIR_DOMAIN_CONTROLLER_TYPE_PCI)
> +            continue;
> +
> +        /* skip pcie-root */
> +        if (cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT)
> +            continue;
> +
> +        /* Skip pci-root, except for pSeries guests (which actually
> +         * support more than one PCI Host Bridge per guest) */
> +        if (!qemuDomainIsPSeries(def) &&
> +            cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) {
> +            continue;
> +        }
> +
> +        if (qemuBuildControllerDevStr(def, cont, qemuCaps,
> +                                      &devstr, &usbcontroller) < 0)
> +            goto cleanup;
> +
> +        if (devstr) {
> +            virCommandAddArg(cmd, "-device");
> +            virCommandAddArg(cmd, devstr);
> +            VIR_FREE(devstr);
> +        }
> +    }
> +
>      for (j = 0; j < ARRAY_CARDINALITY(contOrder); j++) {
>          for (i = 0; i < def->ncontrollers; i++) {
>              virDomainControllerDefPtr cont = def->controllers[i];
> @@ -3139,20 +3185,6 @@ qemuBuildControllerDevCommandLine(virCommandPtr cmd,
>                  continue;
>              }
>  
> -            /* skip pcie-root */
> -            if (cont->type == VIR_DOMAIN_CONTROLLER_TYPE_PCI &&
> -                cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT) {
> -                continue;
> -            }
> -
> -            /* Skip pci-root, except for pSeries guests (which actually
> -             * support more than one PCI Host Bridge per guest) */
> -            if (!qemuDomainIsPSeries(def) &&
> -                cont->type == VIR_DOMAIN_CONTROLLER_TYPE_PCI &&
> -                cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) {
> -                continue;
> -            }
> -
>              /* first SATA controller on Q35 machines is implicit */
>              if (cont->type == VIR_DOMAIN_CONTROLLER_TYPE_SATA &&
>                  cont->idx == 0 && qemuDomainIsQ35(def))
> @@ -3215,6 +3247,8 @@ qemuBuildControllerDevCommandLine(virCommandPtr cmd,
>      ret = 0;
>  
>   cleanup:
> +    VIR_FREE(pciContIndex);
> +
>      return ret;
>  }
>  
> diff --git a/tests/qemuxml2argvdata/qemuxml2argv-pci-autoadd-idx.args b/tests/qemuxml2argvdata/qemuxml2argv-pci-autoadd-idx.args
> index 6b2f21bba..c03f72d96 100644
> --- a/tests/qemuxml2argvdata/qemuxml2argv-pci-autoadd-idx.args
> +++ b/tests/qemuxml2argvdata/qemuxml2argv-pci-autoadd-idx.args
> @@ -17,7 +17,6 @@ QEMU_AUDIO_DRV=none \
>  server,nowait \
>  -mon chardev=charmonitor,id=monitor,mode=readline \
>  -boot c \
> --device pci-bridge,chassis_nr=8,id=pci.8,bus=pci.0,addr=0x3 \
>  -device pci-bridge,chassis_nr=1,id=pci.1,bus=pci.0,addr=0x4 \
>  -device pci-bridge,chassis_nr=2,id=pci.2,bus=pci.0,addr=0x5 \
>  -device pci-bridge,chassis_nr=3,id=pci.3,bus=pci.0,addr=0x6 \
> @@ -25,6 +24,7 @@ server,nowait \
>  -device pci-bridge,chassis_nr=5,id=pci.5,bus=pci.0,addr=0x8 \
>  -device pci-bridge,chassis_nr=6,id=pci.6,bus=pci.0,addr=0x9 \
>  -device pci-bridge,chassis_nr=7,id=pci.7,bus=pci.0,addr=0xa \
> +-device pci-bridge,chassis_nr=8,id=pci.8,bus=pci.0,addr=0x3 \

Yeah, this.
>  -usb \
>  -drive file=/var/iso/f18kde.iso,format=raw,if=none,media=cdrom,\
>  id=drive-ide0-1-0,readonly=on \
> diff --git a/tests/qemuxml2argvdata/qemuxml2argv-pseries-many-buses-2.args b/tests/qemuxml2argvdata/qemuxml2argv-pseries-many-buses-2.args
> index 13fed02f8..7b2171f59 100644
> --- a/tests/qemuxml2argvdata/qemuxml2argv-pseries-many-buses-2.args
> +++ b/tests/qemuxml2argvdata/qemuxml2argv-pseries-many-buses-2.args
> @@ -18,5 +18,5 @@ QEMU_AUDIO_DRV=none \
>  server,nowait \
>  -mon chardev=charmonitor,id=monitor,mode=readline \
>  -boot c \
> --device spapr-pci-host-bridge,index=1,id=pci.2 \
> --device spapr-pci-host-bridge,index=2,id=pci.1
> +-device spapr-pci-host-bridge,index=2,id=pci.1 \
> +-device spapr-pci-host-bridge,index=1,id=pci.2





More information about the libvir-list mailing list