[libvirt] [PATCH v5 3/3] libvirtd: fix crash on termination

Nikolay Shirokovskiy nshirokovskiy at virtuozzo.com
Fri Nov 3 09:20:53 UTC 2017



On 03.11.2017 11:42, Martin Kletzander wrote:
> On Fri, Nov 03, 2017 at 11:07:36AM +0300, Nikolay Shirokovskiy wrote:
>> On 02.11.2017 19:32, Martin Kletzander wrote:
>>> This just makes the window of opportunity (between daemonServerClose()
>>> and the actual removal of the virNetServerPtr from the hash) smaller.
>>> That's why I don't see it as a fix, rather as a workaround.
>>>
>>> What I think should be done is the order of what's done with services,
>>> servers, drivers, daemon, workers, etc.
>>>
>>> I read the other threds and I think we're on the same note here, it's
>>> just that there is lot of confusion and we're all approaching it
>>> differently.
>>>
>>> So first let's agree on what should be done when virNetDaemonClose() is
>>> called and the loop in virNetDaemonRun() exits.  My opinion it is:
>>>
>>> 1) Stop accepting new calls and services:
>>>    virNetServerServiceToggle(false);
>>
>> This not really necessary as this has effect only if loop is running.
>>
> 
> Sure, good point.
> 
>>>
>>> 2) Wait for all the workers to finish:
>>>    First part of virThreadPoolFree()
>>
>> I was thinking of splitting this function to finish/free too.
>> After finish it is not really usable because there are no
>> more threads in pool anymore so we have to introduce state
>> finished and check it in every thread pool api function to
>> be safe. So this will introduce a bit of complexity only
>> for the sake of some scheme IMO.
>>
> 
> Yeah, it's similar to the daemon when you are between close and dispose, but for
> the daemon we need it.
> 
> What function do you think we would need to check this for?  The finish would
> end all the threads and since the event loop is not running no new jobs could be
> posted.

virThreadPoolSendJob is already good in this respect as it checks @quit at the beginning.
Looks like we need only to add same check to virThreadPoolSetParameters. So
not so much complexity. 

I can imagine thread pool function being called from another
thread pools thread. AFAIK there are not such situation though. So just to be on a safe side.

> 
>>>
>>> 3) Kill off clients:
>>>    virNetServerClientClose()
>>>
>>> 4) Clean up drivers:
>>>    virStateCleanup()
>>
>> I think it is better for net daemon/servers not to know about
>> hypervisor drivers directly.
>>
> 
> Sure, I was only talking about the libvirtd for the sake of simplicity.
> 
>>>
>>> 5) Clean up services, servers, daemon, etc.
>>>    including the second part of virThreadPoolFree()
>>>
>>> The last thing is what should be done in virNetDaemonDispose(), but
>>> unfortunately we (mis)use it for more than that.  Numbers 1-4 should be,
>>> IMO in virNet{Daemon,Server}Close().
>>>
>>> Would this solve your problem?  I mean I'm not against more reference
>>> counting, it should be done properly, but I think the above would do the
>>> trick and also make sense from the naming point of view.
>>>
>>> Since both the daemon and server will probably not ever be consistent
>>> after calling the Close function, it might make sense to just include
>>> virNetDaemonClose() into virNetDaemonDispose() and I'm not against that.
>>> But the order of the steps should still be as described above IMO.
>>
>> This won't work IMO. If we are already in virNetDaemonDispose then
>> @dmn object is invalid but we don't yet call virStateCleanup yet
>> (if it is moved to virNetDaemonClose) and this is the problem 85c3a182 dealt with.
>>
> 
> It would solve it if order above is followed.  Sure, virNetDaemon should not

It is ok if virStateCleanup is called from virNetDaemonClose somehow. I was
only against calling virNetDaemonClose from virNetDaemonDispose in this case.

> know about any drivers, but we can simply register a cleanup callback for the
> daemon which the Dispose function will call after the thread pools are cleaned
> up (just thought of that now).

Well to me it is too generic for this case. We can code this order just
as order of statements.

1. close netservers (which imply finishing rpc thread pool)
2. close hypervisor drivers
3. unref @dmn

Or we can use refcounts and take point 3 out of consideration.
Also there is a patch series [1] that introduces 0 step to help 1 step to finish actually
in the situation when loop is down.

0. shutdown hypervirsor drivers.

[1] https://www.redhat.com/archives/libvir-list/2017-October/msg01134.html
> 
>>>
>>> Let me know what you think.  And don't forget to have a nice day!
>>>
>>> Martin
>>>
>>> P.S.: The order of clean ups in libvirtd's main() is absolutely terrible and
>>>      unreadable.  Also, even though I guess it goes without saying, if proper
>>>      reference counting is in place, the order of virObjectUnref()s and
>>>      VIR_FREE()s should not matter.  I think it chould be changed like this:
>>
>> It makes sense that in cleanup section we can first make all cleanup and
>> then all dispose. It is only not clear to me can we call virStateCleanup
>> before virNetlinkShutdown. Also I would keep comment for @dmn unref
>> until proper refcount for @dmn is implemented.
>>
> 
> I don't think virStateCleanup() and virNetlinkShutdown() are dependent.  The
> comment can stay, I was just trying to show that it's also a mess.  What I would
> like, actually, is to have the daemon running with all the references and
> pointers (that the main() function doesn't need anymore) freed, but both this
> and what I suggested below is unrelated to what we're trying to fix, I don't
> know why I started talking about it.




More information about the libvir-list mailing list