[libvirt] [PATCH v5 3/3] libvirtd: fix crash on termination
Nikolay Shirokovskiy
nshirokovskiy at virtuozzo.com
Tue Oct 31 06:54:20 UTC 2017
On 30.10.2017 19:21, Martin Kletzander wrote:
> On Mon, Oct 30, 2017 at 07:14:39AM -0400, John Ferlan wrote:
>> From: Nikolay Shirokovskiy <nshirokovskiy at virtuozzo.com>
>>
>> The problem is incorrect order of qemu driver shutdown and shutdown
>> of netserver threads that serve client requests (thru qemu driver
>> particularly).
>>
>> Net server threads are shutdown upon dispose which is triggered
>> by last daemon object unref at the end of main function. At the same
>> time qemu driver is shutdown earlier in virStateCleanup. As a result
>> netserver threads see invalid driver object in the middle of request
>> processing.
>>
>> Let's move shutting down netserver threads earlier to virNetDaemonClose.
>>
>> Note: order of last daemon unref and virStateCleanup
>> is introduced in 85c3a182 for a valid reason.
>>
>
> I must say I don't believe that's true. Reading it back, that patch is
> wrong IMHO. I haven't gone through all the details of it and I don't
> remember them from when I rewrote part of it, but the way this should
> be handled is different.
>
> The way you can clearly identify such issues is when you see that one
> thread determines the validity of data (pointer in this case). This
> must never happen. That means that the pointer is used from more places
> than how many references that object has. However each of the pointers
> for such object should have their own reference.
>
> So the problem is probably somewhere else.
If I understand you correctly we can fix issue in 85c3a182 in
a differenct way. Like adding reference to daemon for every
driver that uses it for shutdown inhibition. It will require to
pass unref function to driver init function or a bit of wild casting
to virObjectPtr for unref. Not sure it is worth it in this case.
Anyway the initical conditions for current patch stay the same -
daemon object will be unrefed after state drivers cleanup and
RPC requests could deal with invalid state driver objects.
>
>> Signed-off-by: John Ferlan <jferlan at redhat.com>
>> ---
>> src/rpc/virnetdaemon.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/src/rpc/virnetdaemon.c b/src/rpc/virnetdaemon.c
>> index 8c21414897..33bd8e3b06 100644
>> --- a/src/rpc/virnetdaemon.c
>> +++ b/src/rpc/virnetdaemon.c
>> @@ -881,6 +881,7 @@ virNetDaemonClose(virNetDaemonPtr dmn)
>> virObjectLock(dmn);
>>
>> virHashForEach(dmn->servers, daemonServerClose, NULL);
>> + virHashRemoveAll(dmn->servers);
>>
>
> If I get this correctly, you are removing the servers so that their workers get
> cleaned up, but that should be done in a separate step. Most probably what
> daemonServerClose should be doing. This is a workaround, but not a proper fix.
Well, original patch has a different approach for stopping RPC calls (see [1]) close
to what you suggest. I think in this case it is matter of taste because there are
no usecases which help us to prefer one way over another so I'm ok with both of them.
[1] https://www.redhat.com/archives/libvir-list/2017-September/msg00999.html
[2] https://www.redhat.com/archives/libvir-list/2017-October/msg00643.html (thread dicussing [1])
> If that's not true than the previous commit mentioned here should also be fixed
> differently.
Sorry I don't understand it. Can you elaborate on this point?
Nikolay
>
> So this is a clear NACK from my POV.
>
>> virObjectUnlock(dmn);
>> }
>> --
>> 2.13.6
>>
>> --
>> libvir-list mailing list
>> libvir-list at redhat.com
>> https://www.redhat.com/mailman/listinfo/libvir-list
>
>
> --
> libvir-list mailing list
> libvir-list at redhat.com
> https://www.redhat.com/mailman/listinfo/libvir-list
>
More information about the libvir-list
mailing list