[libvirt] [PATCH 2/2 RFC] Purge marked callbacks before dispatching events

Michal Privoznik mprivozn at redhat.com
Tue Oct 11 01:25:48 UTC 2016


On 10.10.2016 21:13, Martin Kletzander wrote:
> On Mon, Oct 10, 2016 at 04:03:32PM +0800, Michal Privoznik wrote:
>> On 07.10.2016 19:52, Martin Kletzander wrote:
>>> I don't have any justification for this except an empirical one: with
>>> this patch the code from Bug 1363628 doesn't crash after "leaking".
>>>
>>> I currently don't have properly working valgrind, but I'm working on it
>>> and it might shed some light into this (although it might also not
>>> happen due to the slowdown).
>>>
>>> However in the meantime I attempted some analysis and I got even more
>>> confused than before, I guess.  With this patch the code doesn't crash,
>>> even though virObjectEventStateQueueDispatch() properly skips callbacks
>>> marked as deleted.  What I feel is even more weird, that if I duplicate
>>> the purgatory function (instead of moving it), it crashes again, and I
>>> feel like even more often.
>>>
>>> Weirdly-fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1363628
>>>
>>> Signed-off-by: Martin Kletzander <mkletzan at redhat.com>
>>> ---
>>>  src/conf/object_event.c | 6 +++---
>>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/src/conf/object_event.c b/src/conf/object_event.c
>>> index e5af4be68a7e..4066b4673b42 100644
>>> --- a/src/conf/object_event.c
>>> +++ b/src/conf/object_event.c
>>> @@ -821,13 +821,13 @@ virObjectEventStateFlush(virObjectEventStatePtr
>>> state)
>>>      if (state->timer != -1)
>>>          virEventUpdateTimeout(state->timer, -1);
>>>
>>> +    /* Purge any deleted callbacks */
>>> +    virObjectEventCallbackListPurgeMarked(state->callbacks);
>>> +
>>>      virObjectEventStateQueueDispatch(state,
>>>                                       &tempQueue,
>>>                                       state->callbacks);
>>>
>>> -    /* Purge any deleted callbacks */
>>> -    virObjectEventCallbackListPurgeMarked(state->callbacks);
>>> -
>>>      state->isDispatching = false;
>>>      virObjectEventStateUnlock(state);
>>>  }
>>
>> Well, I have no idea either, because virObjectEventStateQueueDispatch()
>> calls virObjectEventStateDispatchCallbacks() which in turn calls
>> virObjectEventDispatchMatchCallback() which returns false if the
>> callback->deleted is truem and thus the callback is skipped - just like
>> it is after this patch of yours.
>>
>> This whole bug feels like a refcounting problem and IMO your patch is
>> just making the scenario more unlikely.
>>
> 
> So I've found out what it is thanks to Luayo Huang.  And if I understand
> it correctly, it won't happen with stateful drivers.  Stateless,
> however, will have this problem.
> 
> The problem is that testConnectClose() clears the driver structure,
> including calling virObjectEventStateFree() on the eventState.  It
> clears the whole eventState, but it can happen while the eventState is
> dispatching.  So there's the race.  And it would also explain what I was
> seeing all the time.  qemuCleanup() is the only function in qemu driver
> that would cleanup the eventState and since that gets called from the
> daemon only after other stuff is cleaned up, it won't happen there.
> 

Ah, yeah. I can see that in the code now too. Good job!

> I'll try turning virObjectEventStatePtr into virObject, we'll have
> refcounting then and it will not disappear until it's stopped being
> used.  I'll see if that helps or not, but it could.

Yep, this is classical example of why we need refcounted objects.

Michal




More information about the libvir-list mailing list