[libvirt PATCH v6 27/30] nodedev: fix hang when destroying an mdev in use
John Ferlan
jferlan at redhat.com
Mon Apr 12 11:57:13 UTC 2021
On 3/26/21 12:48 PM, Jonathon Jongsma wrote:
> Calling `mdevctl stop` for a mediated device that is in use by an active
> domain will block until that vm exits (or the vm closes the device).
> Since the nodedev driver cannot query the hypervisor driver to see
> whether any active domains are using the device, we resort to a
> workaround that relies on the fact that a vfio group can only be opened
> by one user at a time. If we get an EBUSY error when attempting to open
> the group file, we assume the device is in use and refuse to try to
> destroy that device.
>
> Signed-off-by: Jonathon Jongsma <jjongsma at redhat.com>
> Reviewed-by: Erik Skultety <eskultet at redhat.com>
> ---
> src/node_device/node_device_driver.c | 19 +++++++++++++++++++
> 1 file changed, 19 insertions(+)
>
> diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c
> index b1b3217afb..31322e3afb 100644
> --- a/src/node_device/node_device_driver.c
> +++ b/src/node_device/node_device_driver.c
> @@ -1177,8 +1177,27 @@ nodeDeviceDestroy(virNodeDevicePtr device)
>
> ret = 0;
> } else if (nodeDeviceHasCapability(def, VIR_NODE_DEV_CAP_MDEV)) {
> + /* If this mediated device is in use by a vm, attempting to stop it
> + * will block until the vm closes the device. The nodedev driver
> + * cannot query the hypervisor driver to determine whether the device
> + * is in use by any active domains, since that would introduce circular
> + * dependencies between daemons and add a risk of deadlocks. So we need
> + * to resort to a workaround. vfio only allows the group for a device
> + * to be opened by one user at a time. So if we get EBUSY when opening
> + * the group, we infer that the device is in use and therefore we
> + * shouldn't try to remove the device. */
> + g_autofree char *vfiogroup =
> + virMediatedDeviceGetIOMMUGroupDev(def->caps->data.mdev.uuid);
FWIW: Coverity points out that @vfiogroup could be returned as NULL here
(with virReportError already called), so the subsequent open() call would
be handed a NULL path and could fail...
John
> + VIR_AUTOCLOSE fd = open(vfiogroup, O_RDONLY);
> g_autofree char *errmsg = NULL;
>
> + if (fd < 0 && errno == EBUSY) {
> + virReportError(VIR_ERR_INTERNAL_ERROR,
> + _("Unable to destroy '%s': device in use"),
> + def->name);
> + goto cleanup;
> + }
> +
> if (virMdevctlStop(def, &errmsg) < 0) {
> virReportError(VIR_ERR_INTERNAL_ERROR,
> _("Unable to destroy '%s': %s"), def->name,
>
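
For illustration only, here is a minimal standalone sketch of how the
probe could guard against a NULL group path before calling open(). It is
not the patch's code and not libvirt API: probe_group_in_use() and the
group_probe enum are hypothetical names, and I'm assuming (as the patch
does) that any open() failure other than EBUSY should not block the
destroy.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not part of libvirt. */
enum group_probe {
    GROUP_PROBE_ERROR = -1,   /* no group path was available */
    GROUP_PROBE_NOT_BUSY = 0, /* open() succeeded, or failed without EBUSY */
    GROUP_PROBE_BUSY = 1      /* open() failed with EBUSY: assume in use */
};

/* Hypothetical helper: probe a vfio group file without ever passing a
 * NULL path to open(). */
static enum group_probe
probe_group_in_use(const char *group_path)
{
    int fd;

    /* The lookup that produces the path may have failed and already
     * reported an error, so bail out instead of calling open(NULL). */
    if (group_path == NULL)
        return GROUP_PROBE_ERROR;

    fd = open(group_path, O_RDONLY);
    if (fd < 0) {
        /* Only EBUSY is taken to mean "another user holds the group";
         * other errors are assumed not to indicate an active user. */
        return errno == EBUSY ? GROUP_PROBE_BUSY : GROUP_PROBE_NOT_BUSY;
    }

    /* We only wanted to know whether the open would succeed. */
    close(fd);
    return GROUP_PROBE_NOT_BUSY;
}
```

In nodeDeviceDestroy() the equivalent would presumably be a NULL check
on @vfiogroup followed by "goto cleanup" before the open(), leaving the
EBUSY handling as it is in the patch.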