[Date Prev][Date Next] [Thread Prev][Thread Next]
Re: [libvirt] [PATCH v5 3/3] libvirtd: fix crash on termination
- From: Nikolay Shirokovskiy <nshirokovskiy virtuozzo com>
- To: John Ferlan <jferlan redhat com>, Martin Kletzander <mkletzan redhat com>
- Cc: libvir-list redhat com
- Subject: Re: [libvirt] [PATCH v5 3/3] libvirtd: fix crash on termination
- Date: Thu, 2 Nov 2017 10:49:35 +0300
On 01.11.2017 21:51, John Ferlan wrote:
> On 10/31/2017 02:54 AM, Nikolay Shirokovskiy wrote:
>> On 30.10.2017 19:21, Martin Kletzander wrote:
>>> On Mon, Oct 30, 2017 at 07:14:39AM -0400, John Ferlan wrote:
>>>> From: Nikolay Shirokovskiy <nshirokovskiy virtuozzo com>
>>>> The problem is incorrect order of qemu driver shutdown and shutdown
>>>> of netserver threads that serve client requests (thru qemu driver
>>>> Net server threads are shutdown upon dispose which is triggered
>>>> by last daemon object unref at the end of main function. At the same
>>>> time qemu driver is shutdown earlier in virStateCleanup. As a result
>>>> netserver threads see invalid driver object in the middle of request
>>>> Let's move shutting down netserver threads earlier to virNetDaemonClose.
>>>> Note: order of last daemon unref and virStateCleanup
>>>> is introduced in 85c3a182 for a valid reason.
>>> I must say I don't believe that's true. Reading it back, that patch is
>>> wrong IMHO. I haven't gone through all the details of it and I don't
>>> remember them from when I rewrote part of it, but the way this should
>>> be handled is different.
>>> The way you can clearly identify such issues is when you see that one
>>> thread determines the validity of data (pointer in this case). This
>>> must never happen. That means that the pointer is used from more places
>>> than how many references that object has. However each of the pointers
>>> for such object should have their own reference.
>>> So the problem is probably somewhere else.
>> If I understand you correctly we can fix issue in 85c3a182 in
>> a differenct way. Like adding reference to daemon for every
>> driver that uses it for shutdown inhibition. It will require to
>> pass unref function to driver init function or a bit of wild casting
>> to virObjectPtr for unref. Not sure it is worth it in this case.
> caveat: I'm still in a post KVM Forum travel brain fog...
> Perhaps a bit more simple than that... Since the 'inhibitCallback' and
> the 'inhibitOpaque' are essentially the mechanism that would pass/use
> @dmn, then it'd be up to the inhibitCallback to add/remove the reference
> with the *given* that the inhibitOpaque is a lockable object anyway...
> Thus, virNetDaemonAddShutdownInhibition could use virObjectRef and
> virNetDaemonRemoveShutdownInhibition could use virObjectUnref... The
> "catch" is that for Shutdown the Unref would need to go after Unlock.
> I believe that then would "replicate" the virObjectRef done in
> daemonStateInit and the virObjectUnref done at the end of
> daemonRunStateInit with respect to "some outside thread" being able to
> use @dmn and not have it be Unref'd for the last time at any point in
> time until the "consumer" was done with it.
> Moving the Unref to after all the StateCleanup calls were done works
> because it accomplishesd the same/similar task, but just held onto @dmn
> longer because the original implementation didn't properly reference dmn
> when it was being used for some other consumer/thread. Of course that
> led to other problems which this series is addressing and I'm not clear
> yet as to the impact vis-a-vis your StateShutdown series.
Ok, 85c3a182 can be rewritten this way. It is more staightforward to
ref/unref by consumer thread instead of relying on consumer structure
(in this case 85c3a182 relies upon the fact that after virStateCleanup
no hypervisor driver would use @dmn object).
But we still have the issues address by this patch series and state
shutdown series because the order in which @dmn and other objects
will be disposed would be same for 85c3a182 and proposed 85c3a182
>> Anyway the initical conditions for current patch stay the same -
>> daemon object will be unrefed after state drivers cleanup and
>> RPC requests could deal with invalid state driver objects.
>>>> Signed-off-by: John Ferlan <jferlan redhat com>
>>>> src/rpc/virnetdaemon.c | 1 +
>>>> 1 file changed, 1 insertion(+)
>>>> diff --git a/src/rpc/virnetdaemon.c b/src/rpc/virnetdaemon.c
>>>> index 8c21414897..33bd8e3b06 100644
>>>> --- a/src/rpc/virnetdaemon.c
>>>> +++ b/src/rpc/virnetdaemon.c
>>>> @@ -881,6 +881,7 @@ virNetDaemonClose(virNetDaemonPtr dmn)
>>>> virHashForEach(dmn->servers, daemonServerClose, NULL);
>>>> + virHashRemoveAll(dmn->servers);
>>> If I get this correctly, you are removing the servers so that their workers get
>>> cleaned up, but that should be done in a separate step. Most probably what
>>> daemonServerClose should be doing. This is a workaround, but not a proper fix.
> Are you sure? The daemonServerClose is the callback for virHashForEach
> which means the table->iterating would be set thus preventing
> daemonServerClose from performing a virHashRemoveEntry of the server
> element from the list.
> So this is to me the right fix.
>> Well, original patch has a different approach for stopping RPC calls (see ) close
>> to what you suggest. I think in this case it is matter of taste because there are
>> no usecases which help us to prefer one way over another so I'm ok with both of them.
>>  https://www.redhat.com/archives/libvir-list/2017-September/msg00999.html
>>  https://www.redhat.com/archives/libvir-list/2017-October/msg00643.html (thread dicussing )
>>> If that's not true than the previous commit mentioned here should also be fixed
>> Sorry I don't understand it. Can you elaborate on this point?
>>> So this is a clear NACK from my POV.
>>>> libvir-list mailing list
>>>> libvir-list redhat com
>>> libvir-list mailing list
>>> libvir-list redhat com
[Date Prev][Date Next] [Thread Prev][Thread Next]