[libvirt PATCH v6 27/30] nodedev: fix hang when destroying an mdev in use

John Ferlan jferlan at redhat.com
Mon Apr 12 11:57:13 UTC 2021



On 3/26/21 12:48 PM, Jonathon Jongsma wrote:
> Calling `mdevctl stop` for a mediated device that is in use by an active
> domain will block until that vm exits (or the vm closes the device).
> Since the nodedev driver cannot query the hypervisor driver to see
> whether any active domains are using the device, we resort to a
> workaround that relies on the fact that a vfio group can only be opened
> by one user at a time. If we get an EBUSY error when attempting to open
> the group file, we assume the device is in use and refuse to try to
> destroy that device.
> 
> Signed-off-by: Jonathon Jongsma <jjongsma at redhat.com>
> Reviewed-by: Erik Skultety <eskultet at redhat.com>
> ---
>   src/node_device/node_device_driver.c | 19 +++++++++++++++++++
>   1 file changed, 19 insertions(+)
> 
> diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c
> index b1b3217afb..31322e3afb 100644
> --- a/src/node_device/node_device_driver.c
> +++ b/src/node_device/node_device_driver.c
> @@ -1177,8 +1177,27 @@ nodeDeviceDestroy(virNodeDevicePtr device)
>   
>           ret = 0;
>       } else if (nodeDeviceHasCapability(def, VIR_NODE_DEV_CAP_MDEV)) {
> +        /* If this mediated device is in use by a vm, attempting to stop it
> +         * will block until the vm closes the device. The nodedev driver
> +         * cannot query the hypervisor driver to determine whether the device
> +         * is in use by any active domains, since that would introduce circular
> +         * dependencies between daemons and add a risk of deadlocks. So we need
> +         * to resort to a workaround.  vfio only allows the group for a device
> +         * to be opened by one user at a time. So if we get EBUSY when opening
> +         * the group, we infer that the device is in use and therefore we
> +         * shouldn't try to remove the device. */
> +        g_autofree char *vfiogroup =
> +            virMediatedDeviceGetIOMMUGroupDev(def->caps->data.mdev.uuid);

FWIW: Coverity points out that @vfiogroup could be NULL here, since
virMediatedDeviceGetIOMMUGroupDev() can fail (with virReportError already
called). In that case the open() call below is handed a NULL path and
may fail...
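
One way to address this might be to check the path before opening it. The
following is only a standalone sketch of that idea, not the actual fix:
vfio_group_is_busy() is a hypothetical helper that does not exist in
libvirt, and it uses plain open()/close() rather than VIR_AUTOCLOSE. In the
patch itself the NULL case would presumably just jump to cleanup, since the
error has already been reported.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not part of libvirt).
 *
 * Returns:
 *   -1 if @path is NULL (caller is assumed to have reported the error),
 *    1 if opening @path fails with EBUSY (device presumed in use),
 *    0 otherwise (opened successfully, or failed for another reason).
 */
static int
vfio_group_is_busy(const char *path)
{
    int fd;

    /* Guard against the NULL return that Coverity flagged. */
    if (path == NULL)
        return -1;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno == EBUSY ? 1 : 0;

    /* The probe succeeded, so release the file descriptor right away. */
    close(fd);
    return 0;
}
```

With something like this, nodeDeviceDestroy() could bail out on -1, report
"device in use" on 1, and only call virMdevctlStop() on 0.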

John

> +        VIR_AUTOCLOSE fd = open(vfiogroup, O_RDONLY);
>           g_autofree char *errmsg = NULL;
>   
> +        if (fd < 0 && errno == EBUSY) {
> +            virReportError(VIR_ERR_INTERNAL_ERROR,
> +                           _("Unable to destroy '%s': device in use"),
> +                           def->name);
> +            goto cleanup;
> +        }
> +
>           if (virMdevctlStop(def, &errmsg) < 0) {
>               virReportError(VIR_ERR_INTERNAL_ERROR,
>                              _("Unable to destroy '%s': %s"), def->name,
> 



