[libvirt] [PATCH] libxl: rework reference counting

Martin Kletzander mkletzan at redhat.com
Tue Jul 14 15:31:20 UTC 2015


On Mon, Jun 15, 2015 at 08:36:47PM -0600, Jim Fehlig wrote:
>Similar to commit 540c339a for the QEMU driver, rework reference
>counting in the libxl driver to make it more deterministic and
>the code a bit cleaner.
>
>Signed-off-by: Jim Fehlig <jfehlig at suse.com>
>---
>
>I've been testing this patch on and off for a few weeks now using
>libvirt-tck domain tests, local test scripts, and some manual tests
>for good measure.  I sent the patch to Anthony Perard (cc'd) nearly
>two weeks ago for testing in his OpenStack+Xen+libvirt CI loop,
>although I haven't received any feedback thus far.  Also included
>Martin in the cc since he encouraged this patch
>

I'm sorry I didn't get to it earlier.  I'm not familiar with the libxl
driver, but I'll at least give it a try since this is on the list for
some time.

>https://www.redhat.com/archives/libvir-list/2015-April/msg00014.html
>
> src/libxl/libxl_domain.c    |  32 ++----
> src/libxl/libxl_domain.h    |   5 +-
> src/libxl/libxl_driver.c    | 274 ++++++++++++++++++--------------------------
> src/libxl/libxl_migration.c |   6 +-
> 4 files changed, 128 insertions(+), 189 deletions(-)
>
>diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c
>index 0652270..ce188ee 100644
>--- a/src/libxl/libxl_domain.c
>+++ b/src/libxl/libxl_domain.c
>@@ -294,6 +294,25 @@ libxlDomObjFromDomain(virDomainPtr dom)
>     return vm;
> }
>
>+/*
>+ * Finish working with a domain object in an API.  This function
>+ * clears whatever was left of a domain that was gathered using
>+ * libxlDomObjFromDomain().  Currently that means only unlocking and
>+ * decrementing the reference counter of that domain.  And in order to
>+ * make sure the caller does not access the domain, the pointer is
>+ * cleared.
>+ */
>+static void
>+libxlDomObjEndAPI(virDomainObjPtr *vm)
>+{

This function already exists as virDomainObjEndAPI().

>+    if (!*vm)
>+        return;
>+
>+    virObjectUnlock(*vm);
>+    virObjectUnref(*vm);
>+    *vm = NULL;
>+}
>+
> static int
> libxlAutostartDomain(virDomainObjPtr vm,
>                      void *opaque)
>@@ -2760,8 +2729,7 @@ libxlDomainUndefineFlags(virDomainPtr dom,
>
>  cleanup:
>     VIR_FREE(name);
>-    if (vm)
>-        virObjectUnlock(vm);
>+    libxlDomObjEndAPI(&vm);

In this function the domain can leak because the reference you get by
libxlDomObjFromDomain() is not returned anywhere.  You might get to
this cleanup phase with a domain that has a reference count, but vm is
set to NULL.  virDomainObjListRemove() leaves the domain unlocked,
Unless I missed something this will leak.

Stream of Random thoughts follows.

Not setting 'vm = NULL' will help a bit, but in that case the domain
will be unlocked twice.  At first I thought maybe something similar to
how QEMU driver does that in qemuDomainRemoveInactive() might work.
But after looking at it for a while, I see this bug is in QEMU, the
domain will be unlocked twice there.  I need to send a fix for that
because now we are resetting a job when the domain is unlocked.  That
leads me to an idea.  Could we somehow make most of the things that
are so similar abstracted somehow?  That's not related to this patch
though.  But for example that would fix stuff like the fact that libxl
doesn't hold a job when removing the domain.

>@@ -4963,23 +4915,20 @@ libxlDomainMigrateFinish3Params(virConnectPtr dconn,
>                        NULLSTR(dname));
>         return NULL;
>     }
>+    virObjectRef(vm);
>
>-    if (virDomainMigrateFinish3ParamsEnsureACL(dconn, vm->def) < 0) {
>-        virDomainObjEndAPI(&vm);
>-        return NULL;
>-    }
>+    if (virDomainMigrateFinish3ParamsEnsureACL(dconn, vm->def) < 0)
>+        goto cleanup;
>
>-    if (libxlDomainObjBeginJob(driver, vm, LIBXL_JOB_MODIFY) < 0) {
>-        virDomainObjEndAPI(&vm);
>-        return NULL;
>-    }
>+    if (libxlDomainObjBeginJob(driver, vm, LIBXL_JOB_MODIFY) < 0)
>+        goto cleanup;
>
>     ret = libxlDomainMigrationFinish(dconn, vm, flags, cancelled);
>
>-    if (!libxlDomainObjEndJob(driver, vm))
>-        vm = NULL;
>+    libxlDomainObjEndJob(driver, vm);
>
>-    virDomainObjEndAPI(&vm);
>+ cleanup:
>+    libxlDomObjEndAPI(&vm);
>

Wow, and it was called here, how funny :)  Although that might mean
something is wrong since you're not adding unref, but added ref on top
of this hunk.

Martin

P.S.: Now 'make check' fails for me due to false-positives about
mismatched ACL functions from migrate APIs, but that might be due to
poor rebase on top of master which I did.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20150714/32461bc8/attachment-0001.sig>


More information about the libvir-list mailing list