[libvirt] [PATCH] events: Fix domain event race on client disconnect

Christophe Fergeau cfergeau at redhat.com
Fri Sep 7 12:44:03 UTC 2012


On Fri, Sep 07, 2012 at 01:24:35PM +0100, Daniel P. Berrange wrote:
> A nice long detailed explanation. I agree that this scenario you
> outline is plausible as an explanation for why Boxes sometimes
> stops getting events from libvirtd.

I've run more tests in the meantime without this patch applied, but
with the one below applied to add some debugging:

diff --git a/src/conf/domain_event.c b/src/conf/domain_event.c
index 43ecdcf..33d90fb 100644
--- a/src/conf/domain_event.c
+++ b/src/conf/domain_event.c
@@ -1501,7 +1501,13 @@ virDomainEventStateRegisterID(virConnectPtr conn,
     int ret = -1;

     virDomainEventStateLock(state);
+    VIR_WARN("RegisterID");

+    if ((state->callbacks->count == 0) && (state->timer == -1)) {
+        if (state->queue->count != 0) {
+            VIR_WARN("REG: queue's not empty: %d", state->queue->count);
+        }
+    }
     if ((state->callbacks->count == 0) &&
         (state->timer == -1) &&
         (state->timer = virEventAddTimeout(-1,
@@ -1584,6 +1590,7 @@ virDomainEventStateDeregisterID(virConnectPtr conn,
 {
     int ret;

+    VIR_WARN("DeregisterID");
     virDomainEventStateLock(state);
     if (state->isDispatching)
         ret = virDomainEventCallbackListMarkDeleteID(conn,
@@ -1596,6 +1603,9 @@ virDomainEventStateDeregisterID(virConnectPtr conn,
         state->timer != -1) {
         virEventRemoveTimeout(state->timer);
         state->timer = -1;
+        if (state->queue->count != 0) {
+            VIR_WARN("DEREG: queue's not empty: %d", state->queue->count);
+        }
     }

     virDomainEventStateUnlock(state);


I hit the lost-events issue once, and right when it started happening,
the log showed:
2012-09-06 11:37:06.094+0000: 30498: warning : virDomainEventStateDeregisterID:1593 : DeregisterID
2012-09-06 11:37:06.094+0000: 30498: warning : virDomainEventStateDeregisterID:1607 : DEREG: queue's not empty: 1
2012-09-06 11:45:42.363+0000: 30502: warning : virDomainEventStateRegisterID:1504 : RegisterID
2012-09-06 11:45:42.363+0000: 30502: warning : virDomainEventStateRegisterID:1508 : REG: queue's not empty: 1

After that, no more events were delivered, and these warnings kept
appearing with an ever-increasing number of queued events, which is
consistent with the hypothesis I made in this patch.
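
A minimal sketch of the failure mode these logs suggest, written as a
standalone toy model rather than libvirt code. It rests on two assumptions
that are not shown in this mail: that a freshly added timer starts out
disabled, and that the timer is only armed when the queue goes from empty
to non-empty. Every toy_* name is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in libvirt. */
struct toy_state {
    int callbacks;    /* number of registered callbacks */
    int timer;        /* -1 means "no timer", anything else is a timer id */
    bool timer_armed; /* true when the timer is scheduled to fire */
    int queued;       /* events waiting to be dispatched */
};

static void toy_init(struct toy_state *s)
{
    s->callbacks = 0;
    s->timer = -1;
    s->timer_armed = false;
    s->queued = 0;
}

/* First registration creates the timer, disabled (assumption). */
static void toy_register(struct toy_state *s)
{
    assert(s != NULL);
    if (s->callbacks == 0 && s->timer == -1) {
        s->timer = 1;
        s->timer_armed = false;
    }
    s->callbacks++;
}

/* Last deregistration removes the timer but leaves the queue alone. */
static void toy_deregister(struct toy_state *s)
{
    if (s->callbacks > 0)
        s->callbacks--;
    if (s->callbacks == 0 && s->timer != -1) {
        s->timer = -1;
        s->timer_armed = false;
    }
}

/* Assumption: the timer is armed only on the empty -> non-empty edge. */
static void toy_queue_event(struct toy_state *s)
{
    s->queued++;
    if (s->queued == 1 && s->timer != -1)
        s->timer_armed = true;
}

/* One event loop pass: dispatch everything if the timer is armed. */
static int toy_run_timer(struct toy_state *s)
{
    int delivered = 0;
    if (s->timer != -1 && s->timer_armed) {
        delivered = s->queued;
        s->queued = 0;
        s->timer_armed = false;
    }
    return delivered;
}

/* Replays the order seen in the log; returns the stuck event count. */
static int toy_replay_log_sequence(void)
{
    struct toy_state s;
    toy_init(&s);
    toy_register(&s);
    toy_queue_event(&s);  /* event queued, timer armed                   */
    toy_deregister(&s);   /* client goes away before dispatch: queue = 1 */
    toy_register(&s);     /* new timer, disabled, queue still = 1        */
    toy_queue_event(&s);  /* queue 1 -> 2 is not an edge: never armed    */
    (void)toy_run_timer(&s);
    return s.queued;
}
```

In this model toy_replay_log_sequence() leaves two events stuck in the
queue, and every later toy_queue_event() only grows that number, which
matches the increasing counts in the warnings. Emptying the queue when the
last callback goes away would restore the empty -> non-empty edge in the
model, but whether that is the right fix for the real code is a separate
question.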

Christophe