[PATCH RESEND 20/20] virhostdev.c: remove missing PCI devs from hostdev manager

Laine Stump laine at redhat.com
Mon Feb 22 04:12:32 UTC 2021


On 1/18/21 2:53 PM, Daniel Henrique Barboza wrote:
> virHostdevReAttachPCIDevices() is called when we want to re-attach
> a list of hostdevs back to the host, either on the shutdown path or
> via a 'virsh detach-device' call.  This function always counts on the
> device still existing in the host, but this can lead to
> problems. For example, a SR-IOV device can be removed by an admin via
> "echo 0 > /sys/bus/pci/devices/<addr>/sriov_numvfs", making the kernel
> fire an eventfd_signal() to the process, asking the process to
> release the device. The result might vary depending on the device driver
> and OS/arch, but two possible outcomes are:
> 
> 1) the hypervisor driver will detach the device from the VM, issuing a
> delete event to Libvirt. This can be observed in QEMU;
> 
> 2) the 'echo 0 > ...' will hang waiting for the device to be unplugged.
> This means that the VM process failed/refused to release the hostdev back
> to the host, and the hostdev will be detached during VM shutdown.
> 
> Today we don't behave well in either case. We'll fail to remove the PCI device
> reference from mgr->activePCIHostdevs and mgr->inactivePCIHostdevs because
> we rely on the existence of the PCI device config file in sysfs. Attempting
> to re-use the same device (assuming it is now present back in the host)
> can result in an error like this:
> 
> $ ./run tools/virsh start vm1-sriov --console
> error: Failed to start domain vm1-sriov
> error: Requested operation is not valid: PCI device 0000:01:00.2 is in use by driver QEMU, domain vm1-sriov
> 
> For (1), a VM destroy/start cycle is needed to re-use the VF in the guest.
> For (2), the effect is more pernicious, requiring a libvirtd restart
> to use the VF again in any guest.
> 
> We can make it a bit better by checking, during virHostdevReAttachPCIDevices(),
> if there is any missing PCI device that will be left behind in activePCIHostdevs
> and inactivePCIHostdevs lists. Remove any missing device found from both lists,
> unconditionally, matching the current state of the host. This change affects
> the code path in (1) (processDeviceDeletedEvent into qemuDomainRemoveDevice, all
> the way back to qemuHostdevReAttachPCIDevices) and also in (2) (qemuProcessStop
> into qemuHostdevReAttachDomainDevices).
> 
> NB: Although this patch enables the possibility of 'outside Libvirt' SR-IOV
> hotunplug of PCI devices, if the hypervisor and the PCI driver cope with it,
> our goal is to mitigate what is still considered an user oopsie. For all

This is one of those odd cases where you use "a" before a word starting 
with a vowel rather than "an". It's that way because the *sound* 
starting the next word is "you" rather than "ooh".

> supported purposes, the admin must remove the SR-IOV VFs from all running domains
> before removing the VFs from the host.
> 
> Resolves:  https://gitlab.com/libvirt/libvirt/-/issues/72
> Signed-off-by: Daniel Henrique Barboza <danielhb413 at gmail.com>
> ---
>   src/hypervisor/virhostdev.c | 38 +++++++++++++++++++++++++++++++++++++
>   1 file changed, 38 insertions(+)
> 
> diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c
> index c708791eec..ed43733e71 100644
> --- a/src/hypervisor/virhostdev.c
> +++ b/src/hypervisor/virhostdev.c
> @@ -1077,6 +1077,40 @@ virHostdevReAttachPCIDevicesImpl(virHostdevManagerPtr mgr,
>   }
>   
>   
> +static void
> +virHostdevDeleteMissingPCIDevices(virHostdevManagerPtr mgr,
> +                                  virDomainHostdevDefPtr *hostdevs,
> +                                  int nhostdevs)
> +{
> +    size_t i;
> +
> +    if (nhostdevs == 0)
> +        return;
> +
> +    virObjectLock(mgr->activePCIHostdevs);
> +    virObjectLock(mgr->inactivePCIHostdevs);
> +
> +    for (i = 0; i < nhostdevs; i++) {
> +        virDomainHostdevDef *hostdev = hostdevs[i];
> +        virDomainHostdevSubsysPCI *pcisrc = &hostdev->source.subsys.u.pci;
> +        g_autoptr(virPCIDevice) pci = NULL;
> +
> +        if (virHostdevGetPCIHostDevice(hostdev, &pci) != -2)
> +            continue;
> +
> +        /* The PCI device from 'hostdev' does not exist in the host
> +         * anymore. Delete it from both active and inactive lists to
> +         * reflect the current host state.
> +         */
> +        virPCIDeviceListDel(mgr->activePCIHostdevs, &pcisrc->addr);
> +        virPCIDeviceListDel(mgr->inactivePCIHostdevs, &pcisrc->addr);
> +    }
> +
> +    virObjectUnlock(mgr->activePCIHostdevs);
> +    virObjectUnlock(mgr->inactivePCIHostdevs);

It makes me nervous that you're unlocking these lists in the same order 
that you lock them, since that could theoretically lead to a deadlock. I 
haven't thought beyond this single function to figure out whether it's 
even a possibility in this case, but I would still feel better if you 
unlocked in the opposite order that you lock.

With that fixed:

Reviewed-by: Laine Stump <laine at redhat.com>


> +}
> +
> +
>   /* @oldStateDir:
>    * For upgrade purpose: see virHostdevRestoreNetConfig
>    */
> @@ -1102,6 +1136,10 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr mgr,
>   
>       virHostdevReAttachPCIDevicesImpl(mgr, drv_name, dom_name, pcidevs,
>                                        hostdevs, nhostdevs, oldStateDir);
> +
> +    /* Handle the case where PCI devices from the host went missing
> +     * during the domain lifetime */
> +    virHostdevDeleteMissingPCIDevices(mgr, hostdevs, nhostdevs);
>   }
>   
>   
> 



