[libvirt] [PATCH 1/2] qemu: Fix @vm locking issue when connecting to the monitor
Daniel Henrique Barboza
danielhb413 at gmail.com
Tue Oct 8 20:15:24 UTC 2019
On 10/8/19 6:15 AM, Michal Privoznik wrote:
> When connecting to qemu's monitor the @vm object is unlocked.
> This is justified - connecting may take a long time and we don't
> want to wait with the domain object locked. However, just before
> the domain object is locked again, the monitor's FD is registered
> in the event loop. Therefore, there is a small window where the
> event loop has a chance to call a handler for an event that
> occurred on the monitor FD but vm is not initalized properly just
> yet (i.e. priv->mon is not set). For instance, if there's an
> incoming migration, qemu creates its socket but then fails to
> initialize (for various reasons, I'm reproducing this by using
> hugepages but leaving the HP pool empty) then the following may
> happen:
>
> 1) qemuConnectMonitor() unlocks @vm
>
> 2) qemuMonitorOpen() connects to the monitor socket and by
> calling qemuMonitorOpenInternal() which subsequently calls
> qemuMonitorRegister() the event handler is installed
>
> 3) qemu fails to initialize and exit()-s, which closes the
> monitor
>
> 4) The even loop sees EOF on the monitor and the control gets to
> qemuProcessEventHandler() which locks @vm and calls
> processMonitorEOFEvent() which then calls
> qemuMonitorLastError(priv->mon). But priv->mon is not set just
> yet.
>
> 5) qemuMonitorLastError() dereferences NULL pointer
>
> The solution is to unlock the domain object for a shorter time
> and most importantly, register event handler with domain object
> locked so that any possible event processing is done only after
> @vm's private data was properly initialized.
>
> This issue is also mentioned in v4.2.0-99-ga5a777a8ba.
>
> Since we are unlocking @vm and locking it back, another thread
> might have destroyed the domain meanwhile. Therefore we have to
> check if domain is still activem, and we have to do it at the
s/activem/active
> same place where domain lock is acquired back, i.e. in
> qemuMonitorOpen(). This creates a small problem for our test
> suite which calls qemuMonitorOpen() directly and passes @vm which
> has no definition. This makes virDomainObjIsActive() call crash.
> Fortunately, allocating empty domain definition is sufficient.
>
> Signed-off-by: Michal Privoznik <mprivozn at redhat.com>
> ---
LGTM. Hopefully this will be enough to prevent this vm lock race
with the monitor at VM startup.
Reviewed-by: Daniel Henrique Barboza <danielhb413 at gmail.com>
>
> This is an alternative approach to:
>
> https://www.redhat.com/archives/libvir-list/2019-September/msg00749.html
>
> src/qemu/qemu_monitor.c | 33 +++++++++++++++++++++++++--------
> src/qemu/qemu_process.c | 12 ------------
> tests/qemumonitortestutils.c | 2 ++
> 3 files changed, 27 insertions(+), 20 deletions(-)
>
> diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
> index 58de26a276..a53d3b1d60 100644
> --- a/src/qemu/qemu_monitor.c
> +++ b/src/qemu/qemu_monitor.c
> @@ -810,35 +810,52 @@ qemuMonitorOpen(virDomainObjPtr vm,
> qemuMonitorCallbacksPtr cb,
> void *opaque)
> {
> - int fd;
> + int fd = -1;
> bool hasSendFD = false;
> - qemuMonitorPtr ret;
> + qemuMonitorPtr ret = NULL;
>
> timeout += QEMU_DEFAULT_MONITOR_WAIT;
>
> + /* Hold an extra reference because we can't allow 'vm' to be
> + * deleted until the monitor gets its own reference. */
> + virObjectRef(vm);
> +
> switch (config->type) {
> case VIR_DOMAIN_CHR_TYPE_UNIX:
> hasSendFD = true;
> - if ((fd = qemuMonitorOpenUnix(config->data.nix.path,
> - vm->pid, retry, timeout)) < 0)
> - return NULL;
> + virObjectUnlock(vm);
> + fd = qemuMonitorOpenUnix(config->data.nix.path,
> + vm->pid, retry, timeout);
> + virObjectLock(vm);
> break;
>
> case VIR_DOMAIN_CHR_TYPE_PTY:
> - if ((fd = qemuMonitorOpenPty(config->data.file.path)) < 0)
> - return NULL;
> + virObjectUnlock(vm);
> + fd = qemuMonitorOpenPty(config->data.file.path);
> + virObjectLock(vm);
> break;
>
> default:
> virReportError(VIR_ERR_INTERNAL_ERROR,
> _("unable to handle monitor type: %s"),
> virDomainChrTypeToString(config->type));
> - return NULL;
> + break;
> + }
> +
> + if (fd < 0)
> + goto cleanup;
> +
> + if (!virDomainObjIsActive(vm)) {
> + virReportError(VIR_ERR_OPERATION_FAILED, "%s",
> + _("domain is not running"));
> + goto cleanup;
> }
>
> ret = qemuMonitorOpenInternal(vm, fd, hasSendFD, cb, opaque);
> + cleanup:
> if (!ret)
> VIR_FORCE_CLOSE(fd);
> + virObjectUnref(vm);
> return ret;
> }
>
> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> index a50cd54393..b303e5ba05 100644
> --- a/src/qemu/qemu_process.c
> +++ b/src/qemu/qemu_process.c
> @@ -1955,13 +1955,8 @@ qemuConnectMonitor(virQEMUDriverPtr driver, virDomainObjPtr vm, int asyncJob,
> * 1GiB of guest RAM. */
> timeout = vm->def->mem.total_memory / (1024 * 1024);
>
> - /* Hold an extra reference because we can't allow 'vm' to be
> - * deleted until the monitor gets its own reference. */
> - virObjectRef(vm);
> -
> ignore_value(virTimeMillisNow(&priv->monStart));
> monConfig = virObjectRef(priv->monConfig);
> - virObjectUnlock(vm);
>
> mon = qemuMonitorOpen(vm,
> monConfig,
> @@ -1978,15 +1973,8 @@ qemuConnectMonitor(virQEMUDriverPtr driver, virDomainObjPtr vm, int asyncJob,
> qemuProcessMonitorLogFree);
> }
>
> - virObjectLock(vm);
> virObjectUnref(monConfig);
> - virObjectUnref(vm);
> priv->monStart = 0;
> -
> - if (!virDomainObjIsActive(vm)) {
> - qemuMonitorClose(mon);
> - mon = NULL;
> - }
> priv->mon = mon;
>
> if (qemuSecurityClearSocketLabel(driver->securityManager, vm->def) < 0) {
> diff --git a/tests/qemumonitortestutils.c b/tests/qemumonitortestutils.c
> index e9dff123f8..c7580c5f28 100644
> --- a/tests/qemumonitortestutils.c
> +++ b/tests/qemumonitortestutils.c
> @@ -1085,6 +1085,8 @@ qemuMonitorCommonTestNew(virDomainXMLOptionPtr xmlopt,
> test->vm = virDomainObjNew(xmlopt);
> if (!test->vm)
> goto error;
> + if (!(test->vm->def = virDomainDefNew()))
> + goto error;
> }
>
> if (virNetSocketNewListenUNIX(path, 0700, geteuid(), getegid(),
More information about the libvir-list
mailing list