[libvirt] [PATCH 2/2 RFC] Purge marked callbacks before dispatching events
Michal Privoznik
mprivozn at redhat.com
Tue Oct 11 01:25:48 UTC 2016
On 10.10.2016 21:13, Martin Kletzander wrote:
> On Mon, Oct 10, 2016 at 04:03:32PM +0800, Michal Privoznik wrote:
>> On 07.10.2016 19:52, Martin Kletzander wrote:
>>> I don't have any justification for this except an empirical one: with
>>> this patch the code from Bug 1363628 doesn't crash after "leaking".
>>>
>>> I currently don't have properly working valgrind, but I'm working on it
>>> and it might shed some light into this (although it might also not
>>> happen due to the slowdown).
>>>
>>> However in the meantime I attempted some analysis and I got even more
>>> confused than before, I guess. With this patch the code doesn't crash,
>>> even though virObjectEventStateQueueDispatch() properly skips callbacks
>>> marked as deleted. What I feel is even more weird, that if I duplicate
>>> the purgatory function (instead of moving it), it crashes again, and I
>>> feel like even more often.
>>>
>>> Weirdly-fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1363628
>>>
>>> Signed-off-by: Martin Kletzander <mkletzan at redhat.com>
>>> ---
>>> src/conf/object_event.c | 6 +++---
>>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/src/conf/object_event.c b/src/conf/object_event.c
>>> index e5af4be68a7e..4066b4673b42 100644
>>> --- a/src/conf/object_event.c
>>> +++ b/src/conf/object_event.c
>>> @@ -821,13 +821,13 @@ virObjectEventStateFlush(virObjectEventStatePtr
>>> state)
>>> if (state->timer != -1)
>>> virEventUpdateTimeout(state->timer, -1);
>>>
>>> + /* Purge any deleted callbacks */
>>> + virObjectEventCallbackListPurgeMarked(state->callbacks);
>>> +
>>> virObjectEventStateQueueDispatch(state,
>>> &tempQueue,
>>> state->callbacks);
>>>
>>> - /* Purge any deleted callbacks */
>>> - virObjectEventCallbackListPurgeMarked(state->callbacks);
>>> -
>>> state->isDispatching = false;
>>> virObjectEventStateUnlock(state);
>>> }
>>
>> Well, I have no idea either, because virObjectEventStateQueueDispatch()
>> calls virObjectEventStateDispatchCallbacks() which in turn calls
>> virObjectEventDispatchMatchCallback() which returns false if the
>> callback->deleted is truem and thus the callback is skipped - just like
>> it is after this patch of yours.
>>
>> This whole bug feels like a refcounting problem and IMO your patch is
>> just making the scenario more unlikely.
>>
>
> So I've found out what it is thanks to Luayo Huang. And if I understand
> it correctly, it won't happen with stateful drivers. Stateless,
> however, will have this problem.
>
> The problem is that testConnectClose() clears the driver structure,
> including calling virObjectEventStateFree() on the eventState. It
> clears the whole eventState, but it can happen while the eventState is
> dispatching. So there's the race. And it would also explain what I was
> seeing all the time. qemuCleanup() is the only function in qemu driver
> that would cleanup the eventState and since that gets called from the
> daemon only after other stuff is cleaned up, it won't happen there.
>
Ah, yeah. I can see that in the code now too. Good job!
> I'll try turning virObjectEventStatePtr into virObject, we'll have
> refcounting then and it will not disappear until it's stopped being
> used. I'll see if that helps or not, but it could.
Yep, this is classical example of why we need refcounted objects.
Michal
More information about the libvir-list
mailing list