[PATCH RESEND 06/20] virhostdev.c: virHostdevGetPCIHostDevice() now reports missing device

Laine Stump laine at redhat.com
Thu Jan 21 01:33:49 UTC 2021


On 1/18/21 2:53 PM, Daniel Henrique Barboza wrote:
> Gitlab issue #72 [1] reports that removing SR-IOVs VFs before
> removing the devices from the running domains can have strange
> consequences.


Resolution: "Don't do that!!" :-)


Seriously, though, there are any number of things that someone with root 
access to the host could do that would completely confuse libvirt and 
break everything. For example, assigning a PF to a guest after having 
assigned one or more of that PF's VFs to guests.


But I guess there's no harm in trying to mitigate the damage. As long as 
it doesn't complicate the code to the extent that it becomes more 
confusing, difficult to maintain, and liable to other regressions.


>   QEMU might be able to hotunplug the device inside the
> guest, but Libvirt will not be aware of that,


If qemu knows enough to unplug the device from the guest, then I'm 
presuming that it must also be sending a DEVICE_DELETED event up to 
libvirt. Of course libvirt isn't expecting this, and so probably throws 
it away, right?


>   and then the guest is
> now inconsistent with the domain definition.
>
> There's also the possibility of the VFs removal not succeeding
> while the domain is running but then, as soon as the domain
> is shutdown, all the VFs are removed. Libvirt can't handle
> the removal of the PCI devices while trying to reattach the
> hostdevs, and the Libvirt daemon can be left in an inconsistent
> state (see [2]).
>
> This patch starts to address the issue related in Gitlab #72, most
> notably the issue described in [2]. When shutting down a domain
> with SR-IOV hostdevs that got missing, virHostdevReAttachPCIDevices()
> is failing the whole process and failing to reattach all the
> PCI devices, including the ones that aren't related to the VFs that
> went missing. Let's make it more resilient with host changes by
> changing virHostdevGetPCIHostDevice() to return an exclusive error
> code '-2' for this case. virHostdevGetPCIHostDeviceList() can then
> tell when virHostdevGetPCIHostDevice() failed to find the PCI
> device of a hostdev and continue to make the list of PCI devices.
>
> virHostdevReAttachPCIDevices() will now be able to proceed reattaching
> all other valid PCI devices, at least. The 'ghost hostdevs' will be
> handled later on.
>
> [1] https://gitlab.com/libvirt/libvirt/-/issues/72
> [2] https://gitlab.com/libvirt/libvirt/-/issues/72#note_459032148
>
> Signed-off-by: Daniel Henrique Barboza <danielhb413 at gmail.com>
> ---
>   src/hypervisor/virhostdev.c | 8 ++++++--
>   1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c
> index bd35397f2c..dbba36193b 100644
> --- a/src/hypervisor/virhostdev.c
> +++ b/src/hypervisor/virhostdev.c
> @@ -220,7 +220,8 @@ virHostdevManagerGetDefault(void)
>    * is returned.
>    *
>    * Returns: 0 on success (@pci might be NULL though),
> - *         -1 otherwise (with error reported).
> + *         -1 otherwise (with error reported),
> + *         -2 PCI device not found. @pci will be NULL
>    */
>   static int
>   virHostdevGetPCIHostDevice(const virDomainHostdevDef *hostdev,
> @@ -235,6 +236,9 @@ virHostdevGetPCIHostDevice(const virDomainHostdevDef *hostdev,
>           hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI)
>           return 0;
>   
> +    if (!virPCIDeviceExists(&pcisrc->addr))
> +        return -2;


What happens if the device disappeared and then came back before you got 
here? (i.e. by setting sriov_numvfs = 0, and then sriov_numvfs = 128 (or 
whatever)?

It's too late in the day for my to process this too much, so I'm going 
to come back to it tomorrow. I will continue on to see if there is other 
"simple" stuff in subsequent patches though - even if we can't get all 
of this pushed right away, there's a significant amount that can be pushed.


> +
>       actual = virPCIDeviceNew(&pcisrc->addr);
>   
>       if (!actual)
> @@ -270,7 +274,7 @@ virHostdevGetPCIHostDeviceList(virDomainHostdevDefPtr *hostdevs, int nhostdevs)
>           virDomainHostdevDefPtr hostdev = hostdevs[i];
>           g_autoptr(virPCIDevice) pci = NULL;
>   
> -        if (virHostdevGetPCIHostDevice(hostdev, &pci) < 0)
> +        if (virHostdevGetPCIHostDevice(hostdev, &pci) == -1)
>               return NULL;
>   
>           if (!pci)





More information about the libvir-list mailing list