[libvirt] [PATCH v2 2/2] qemu: completely rework reference counting

Martin Kletzander mkletzan at redhat.com
Tue Dec 16 15:51:59 UTC 2014


On Tue, Dec 16, 2014 at 02:00:09PM +0100, Michal Privoznik wrote:
>On 16.12.2014 11:15, Martin Kletzander wrote:
>> There is one problem that causes various errors in the daemon.  When
>> domain is waiting for a job, it is unlocked while waiting on the
>> condition.  However, if that domain is for example transient and being
>> removed in another API (e.g. cancelling incoming migration), it get's
>> unref'd.  If the first call, that was waiting, fails to get the job, it
>> unref's the domain object, and because it was the last reference, it
>> causes clearing of the whole domain object.  However, when finishing the
>> call, the domain must be unlocked, but there is no way for the API to
>> know whether it was cleaned or not (unless there is some ugly temporary
>> variable, but let's scratch that).
>>
>> The root cause is that our APIs don't ref the objects they are using and
>> all use the implicit reference that the object has when it is in the
>> domain list.  That reference can be removed when the API is waiting for
>> a job.  And because each domain doesn't do its ref'ing, it results in
>> the ugly checking of the return value of virObjectUnref() that we have
>> everywhere.
>>
>> This patch changes qemuDomObjFromDomain() to ref the domain (using
>> virDomainObjListFindByUUIDRef()) and adds qemuDomObjEndAPI() which
>> should be the only function in which the return value of
>> virObjectUnref() is checked.  This makes all reference counting
>> deterministic and makes the code a bit clearer.
>>
>> Signed-off-by: Martin Kletzander <mkletzan at redhat.com>
>> ---
>> v2:
>>   - Comments from Peter and Daniel on v1 implemented, rather not
>>     listing them here as the list was pretty comprehensive and it would
>>     make the reviewer focus on that.
>>
>>   src/qemu/THREADS.txt      |  40 ++-
>>   src/qemu/qemu_domain.c    |  49 ++--
>>   src/qemu/qemu_domain.h    |  12 +-
>>   src/qemu/qemu_driver.c    | 708 ++++++++++++++++------------------------------
>>   src/qemu/qemu_migration.c | 111 +++-----
>>   src/qemu/qemu_migration.h |  10 +-
>>   src/qemu/qemu_process.c   |  77 ++---
>>   7 files changed, 371 insertions(+), 636 deletions(-)
>>
>
>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>> index f652237..30ea9df 100644
>> --- a/src/qemu/qemu_driver.c
>> +++ b/src/qemu/qemu_driver.c
>
>> @@ -4409,8 +4333,7 @@ static void qemuProcessEventHandler(void *data, void *opaque)
>>           break;
>>       }
>>
>> -    if (virObjectUnref(vm))
>> -        virObjectUnlock(vm);
>> +    qemuDomObjEndAPI(&vm);
>
>Interesting, why the heck have we Unref() here? I mean, in the calltrace
>I don't see where the @vm is Ref()-ed. And this should not use the
>reference obtained via qemuMonitorOpen() as we are connecting only once
>and receiving events multiple times. But this is pre-existing code.
>

You won't see that as that is done in another thread.  Each one of
processing methods (e.g. qemuProcessHandleDeviceDeleted,
qemuProcessHandleGuestPanic) reference the VM themselves before
unlocking it to make sure it is not getting disposed of before the
event is processed.  And that processing is left to the worker pool
that does qemuProcessEventHandler().

>>       VIR_FREE(processEvent);
>>   }
>>
>
>> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
>> index a19e71a..4d9ce09 100644
>> --- a/src/qemu/qemu_process.c
>> +++ b/src/qemu/qemu_process.c
>
>> @@ -5491,6 +5470,7 @@ int qemuProcessAutoDestroyAdd(virQEMUDriverPtr driver,
>>                                 virConnectPtr conn)
>>   {
>>       VIR_DEBUG("vm=%s, conn=%p", vm->def->name, conn);
>> +    virObjectRef(vm);
>>       return virCloseCallbacksSet(driver->closeCallbacks, vm, conn,
>>                                   qemuProcessAutoDestroy);
>
>This function is called conditionally in qemuProcessStart() depending on
>VIR_QEMU_PROCESS_START_AUTODESTROY flag passed...
>
>>   }
>> @@ -5498,9 +5478,12 @@ int qemuProcessAutoDestroyAdd(virQEMUDriverPtr driver,
>>   int qemuProcessAutoDestroyRemove(virQEMUDriverPtr driver,
>>                                    virDomainObjPtr vm)
>>   {
>> +    int ret;
>>       VIR_DEBUG("vm=%s", vm->def->name);
>> -    return virCloseCallbacksUnset(driver->closeCallbacks, vm,
>> -                                  qemuProcessAutoDestroy);
>> +    ret = virCloseCallbacksUnset(driver->closeCallbacks, vm,
>> +                                 qemuProcessAutoDestroy);
>> +    virObjectUnref(vm);
>> +    return ret;
>
>.. on the other hand, this function is called unconditionally in
>qemuProcessStop(). So after 'virsh start' and 'virsh destroy' the
>reference counts don't match. I've been able to debug this and I came up
>with this diff:
>

This diff seems accurate even for use outside of qemu.  I'm squashing
it in, but I'll wait with the pushing so others have a chance to
react.

Thanks for the review,
Martin

>diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
>index 4d9ce09..cbd0389 100644
>--- a/src/qemu/qemu_process.c
>+++ b/src/qemu/qemu_process.c
>@@ -5470,7 +5470,6 @@ int qemuProcessAutoDestroyAdd(virQEMUDriverPtr driver,
>                                virConnectPtr conn)
>  {
>      VIR_DEBUG("vm=%s, conn=%p", vm->def->name, conn);
>-    virObjectRef(vm);
>      return virCloseCallbacksSet(driver->closeCallbacks, vm, conn,
>                                  qemuProcessAutoDestroy);
>  }
>@@ -5482,7 +5481,6 @@ int qemuProcessAutoDestroyRemove(virQEMUDriverPtr
>driver,
>      VIR_DEBUG("vm=%s", vm->def->name);
>      ret = virCloseCallbacksUnset(driver->closeCallbacks, vm,
>                                   qemuProcessAutoDestroy);
>-    virObjectUnref(vm);
>      return ret;
>  }
>
>diff --git a/src/util/virclosecallbacks.c b/src/util/virclosecallbacks.c
>index 4f26172..4128057 100644
>--- a/src/util/virclosecallbacks.c
>+++ b/src/util/virclosecallbacks.c
>@@ -138,6 +138,7 @@ virCloseCallbacksSet(virCloseCallbacksPtr
>closeCallbacks,
>              VIR_FREE(closeDef);
>              goto cleanup;
>          }
>+        virObjectRef(vm);
>      }
>
>      ret = 0;
>@@ -172,7 +173,11 @@ virCloseCallbacksUnset(virCloseCallbacksPtr
>closeCallbacks,
>          goto cleanup;
>      }
>
>-    ret = virHashRemoveEntry(closeCallbacks->list, uuidstr);
>+    if (virHashRemoveEntry(closeCallbacks->list, uuidstr) < 0)
>+        goto cleanup;
>+
>+    virObjectUnref(vm);
>+    ret = 0;
>   cleanup:
>      virObjectUnlock(closeCallbacks);
>      return ret;
>
>
>ACK with this (or equivalent) diff squashed in.
>
>Michal
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: Digital signature
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20141216/ae8c1634/attachment-0001.sig>


More information about the libvir-list mailing list