[libvirt PATCH v4 25/25] nodedev: fix hang when destroying an mdev in use
Erik Skultety
eskultet at redhat.com
Mon Feb 22 07:29:54 UTC 2021
On Wed, Feb 03, 2021 at 11:39:09AM -0600, Jonathon Jongsma wrote:
> Calling `mdevctl stop` for a mediated device that is in use by an active
> domain will block until that vm exits (or the vm closes the device).
> Since the nodedev driver cannot query the hypervisor driver to see
> whether any active domains are using the device, we resort to a
> workaround that relies on the fact that a vfio group can only be opened
> by one user at a time. If we get an EBUSY error when attempting to open
> the group file, we assume the device is in use and refuse to try to
> destroy that device.
>
> Signed-off-by: Jonathon Jongsma <jjongsma at redhat.com>
> ---
> src/node_device/node_device_driver.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c
> index bf97291041..e6b0213157 100644
> --- a/src/node_device/node_device_driver.c
> +++ b/src/node_device/node_device_driver.c
> @@ -1164,6 +1164,23 @@ nodeDeviceDestroy(virNodeDevicePtr device)
>
> ret = 0;
> } else if (nodeDeviceHasCapability(def, VIR_NODE_DEV_CAP_MDEV)) {
> + /* If this mediated device is in use by a vm, attempting to stop it
> + * will block until the vm closes the device. Since the nodedev driver
> + * cannot query the hypervisor driver to determine whether the device
A nice detailed commentary, but for future reference I'd still add the reason
*why* it is that nodedev driver cannot poke the hypervisor driver.
> + * is in use by any active domains, we need to resort to a workaround.
> + * vfio only allows the group for a device to be opened by one user at
> + * a time. So if we get EBUSY when opening the group, we infer that the
> + * device is in use and shouldn't try to remove the device. */
> + g_autofree char *vfiogroup =
> + virMediatedDeviceGetIOMMUGroupDev(def->caps->data.mdev.uuid);
> + VIR_AUTOCLOSE fd = open(vfiogroup, O_RDONLY);
> +
> + if (fd < 0 && errno == EBUSY) {
> + virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> + _("Unable to destroy a mediated device that is in use"));
I think a slightly better message would look like:
_("Unable to destroy '%s': device in use"), def->name
This was a simple workaround indeed :).
Reviewed-by: Erik Skultety <eskultet at redhat.com>
More information about the libvir-list
mailing list