[libvirt PATCH v4 25/25] nodedev: fix hang when destroying an mdev in use

Erik Skultety eskultet at redhat.com
Mon Feb 22 07:29:54 UTC 2021


On Wed, Feb 03, 2021 at 11:39:09AM -0600, Jonathon Jongsma wrote:
> Calling `mdevctl stop` for a mediated device that is in use by an active
> domain will block until that vm exits (or the vm closes the device).
> Since the nodedev driver cannot query the hypervisor driver to see
> whether any active domains are using the device, we resort to a
> workaround that relies on the fact that a vfio group can only be opened
> by one user at a time. If we get an EBUSY error when attempting to open
> the group file, we assume the device is in use and refuse to try to
> destroy that device.
> 
> Signed-off-by: Jonathon Jongsma <jjongsma at redhat.com>
> ---
>  src/node_device/node_device_driver.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c
> index bf97291041..e6b0213157 100644
> --- a/src/node_device/node_device_driver.c
> +++ b/src/node_device/node_device_driver.c
> @@ -1164,6 +1164,23 @@ nodeDeviceDestroy(virNodeDevicePtr device)
>  
>          ret = 0;
>      } else if (nodeDeviceHasCapability(def, VIR_NODE_DEV_CAP_MDEV)) {
> +        /* If this mediated device is in use by a vm, attempting to stop it
> +         * will block until the vm closes the device. Since the nodedev driver
> +         * cannot query the hypervisor driver to determine whether the device

A nice, detailed comment, but for future reference I'd still add the reason
*why* the nodedev driver cannot poke the hypervisor driver.

> +         * is in use by any active domains, we need to resort to a workaround.
> +         * vfio only allows the group for a device to be opened by one user at
> +         * a time. So if we get EBUSY when opening the group, we infer that the
> +         * device is in use and shouldn't try to remove the device. */
> +        g_autofree char *vfiogroup =
> +            virMediatedDeviceGetIOMMUGroupDev(def->caps->data.mdev.uuid);
> +        VIR_AUTOCLOSE fd = open(vfiogroup, O_RDONLY);
> +
> +        if (fd < 0 && errno == EBUSY) {
> +            virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> +                           _("Unable to destroy a mediated device that is in use"));

I think a slightly better message would look like:
_("Unable to destroy '%s': device in use"), def->name


This was a simple workaround indeed :).
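
For readers following along, the probe can be sketched outside of libvirt
roughly as below. This is only a minimal illustration, not the code from the
patch: the function name group_looks_busy() is hypothetical, plain open()/close()
stand in for virMediatedDeviceGetIOMMUGroupDev() and VIR_AUTOCLOSE, and the NULL
check is an extra assumption that the hunk above does not contain.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if opening 'path' fails with EBUSY,
 * which the workaround interprets as "the vfio group is held by another
 * user", and 0 in every other case (successful open or any other error). */
static int
group_looks_busy(const char *path)
{
    int fd;
    int saved_errno;

    /* Assumption: treat a missing path as "not known to be busy". */
    if (path == NULL)
        return 0;

    fd = open(path, O_RDONLY);
    if (fd >= 0) {
        /* The open succeeded, so nobody else holds the group. */
        close(fd);
        return 0;
    }

    /* Read errno right after the failed open(), before any other call. */
    saved_errno = errno;
    return saved_errno == EBUSY;
}
```

A caller would refuse to stop the device when this returns 1 and carry on with
the normal stop path otherwise. Note that any failure other than EBUSY (for
example ENOENT) is deliberately not treated as "in use" here, which matches the
condition in the hunk.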

Reviewed-by: Erik Skultety <eskultet at redhat.com>



