[PATCH RESEND 20/20] virhostdev.c: remove missing PCI devs from hostdev manager
Laine Stump
laine at redhat.com
Mon Feb 22 04:12:32 UTC 2021
On 1/18/21 2:53 PM, Daniel Henrique Barboza wrote:
> virHostdevReAttachPCIDevices() is called when we want to re-attach
> a list of hostdevs back to the host, either on the shutdown path or
> via a 'virsh detach-device' call. This function always count on the
> existence of the device in the host to work, but this can lead to
> problems. For example, a SR-IOV device can be removed via an admin
> "echo 0 > /sys/bus/pci/devices/<addr>/sriov_numvfs", making the kernel
> fire up and eventfd_signal() to the process, asking for the process to
> release the device. The result might vary depending on the device driver
> and OS/arch, but two possible outcomes are:
>
> 1) the hypervisor driver will detach the device from the VM, issuing a
> delete event to Libvirt. This can be observed in QEMU;
>
> 2) the 'echo 0 > ...' will hang waiting for the device to be unplugged.
> This means that the VM process failed/refused to release the hostdev back
> to the host, and the hostdev will be detached during VM shutdown.
>
> Today we don't behave well for both cases. We'll fail to remove the PCI device
> reference from mgr->activePCIHostdevs and mgr->inactivePCIHostdevs because
> we rely on the existence of the PCI device conf file in the sysfs. Attempting
> to re-utilize the same device (assuming it is now present back in the host)
> can result in an error like this:
>
> $ ./run tools/virsh start vm1-sriov --console
> error: Failed to start domain vm1-sriov
> error: Requested operation is not valid: PCI device 0000:01:00.2 is in use by driver QEMU, domain vm1-sriov
>
> For (1), a VM destroy/start cycle is needed to re-use the VF in the guest.
> For (2), the effect is more nefarious, requiring a Libvirtd daemon restart
> to use the VF again in any guest.
>
> We can make it a bit better by checking, during virHostdevReAttachPCIDevices(),
> if there is any missing PCI device that will be left behind in activePCIHostdevs
> and inactivePCIHostdevs lists. Remove any missing device found from both lists,
> unconditionally, matching the current state of the host. This change affects
> the code path in (1) (processDeviceDeletedEvent into qemuDomainRemoveDevice, all
> the way back to qemuHostdevReAttachPCIDevices) and also in (b) (qemuProcessStop
> into qemuHostdevReAttachDomainDevices).
>
> NB: Although this patch enables the possibility of 'outside Libvirt' SR-IOV
> hotunplug of PCI devices, if the hypervisor and the PCI driver copes with it,
> our goal is to mitigate what it is still considered an user oopsie. For all''
This is one of those odd cases where you use "a" before a word starting
with a vowel rather than "an". It's that way because the *sound*
starting the next word is "you" rather than "ooh".
> supported purposes, the admin must remove the SR-IOV VFs from all running domains
> before removing the VFs from the host.
>
> Resolves: https://gitlab.com/libvirt/libvirt/-/issues/72
> Signed-off-by: Daniel Henrique Barboza <danielhb413 at gmail.com>
> ---
> src/hypervisor/virhostdev.c | 38 +++++++++++++++++++++++++++++++++++++
> 1 file changed, 38 insertions(+)
>
> diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c
> index c708791eec..ed43733e71 100644
> --- a/src/hypervisor/virhostdev.c
> +++ b/src/hypervisor/virhostdev.c
> @@ -1077,6 +1077,40 @@ virHostdevReAttachPCIDevicesImpl(virHostdevManagerPtr mgr,
> }
>
>
> +static void
> +virHostdevDeleteMissingPCIDevices(virHostdevManagerPtr mgr,
> + virDomainHostdevDefPtr *hostdevs,
> + int nhostdevs)
> +{
> + size_t i;
> +
> + if (nhostdevs == 0)
> + return;
> +
> + virObjectLock(mgr->activePCIHostdevs);
> + virObjectLock(mgr->inactivePCIHostdevs);
> +
> + for (i = 0; i < nhostdevs; i++) {
> + virDomainHostdevDef *hostdev = hostdevs[i];
> + virDomainHostdevSubsysPCI *pcisrc = &hostdev->source.subsys.u.pci;
> + g_autoptr(virPCIDevice) pci = NULL;
> +
> + if (virHostdevGetPCIHostDevice(hostdev, &pci) != -2)
> + continue;
> +
> + /* The PCI device from 'hostdev' does not exist in the host
> + * anymore. Delete it from both active and inactive lists to
> + * reflect the current host state.
> + */
> + virPCIDeviceListDel(mgr->activePCIHostdevs, &pcisrc->addr);
> + virPCIDeviceListDel(mgr->inactivePCIHostdevs, &pcisrc->addr);
> + }
> +
> + virObjectUnlock(mgr->activePCIHostdevs);
> + virObjectUnlock(mgr->inactivePCIHostdevs);
It makes me nervous that you're unlocking these lists in the same order
that you lock them, since that could theoretically lead to a deadlock. I
haven't even thought beyond this single function to figure out if it's
even a possibility in this case, but still would feel better if you
unlocked in the opposite order that you lock.
With that fixed:
Reviewed-by: Laine Stump <laine at redhat.com>
> +}
> +
> +
> /* @oldStateDir:
> * For upgrade purpose: see virHostdevRestoreNetConfig
> */
> @@ -1102,6 +1136,10 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr mgr,
>
> virHostdevReAttachPCIDevicesImpl(mgr, drv_name, dom_name, pcidevs,
> hostdevs, nhostdevs, oldStateDir);
> +
> + /* Handle the case where PCI devices from the host went missing
> + * during the domain lifetime */
> + virHostdevDeleteMissingPCIDevices(mgr, hostdevs, nhostdevs);
> }
>
>
>
More information about the libvir-list
mailing list