[libvirt] [PATCH v5 3/3] libvirtd: fix crash on termination

Nikolay Shirokovskiy nshirokovskiy at virtuozzo.com
Fri Dec 22 07:52:45 UTC 2017



On 21.12.2017 15:14, Erik Skultety wrote:
> On Thu, Dec 21, 2017 at 12:48:44PM +0300, Nikolay Shirokovskiy wrote:
>>
>>
>> On 21.12.2017 11:49, Erik Skultety wrote:
>>> On Thu, Dec 21, 2017 at 09:39:16AM +0100, Cedric Bosdonnat wrote:
>>>> Hi John,
>>>>
>>>> On Wed, 2017-12-20 at 16:52 -0500, John Ferlan wrote:
>>>>> So I moved the Unref(dmn) after virStateCleanup... That of course led to
>>>>> the virt-manager issue.  So I figured I'd give adding this patch in and
>>>>> that then fixed the virt-manager crash (obviously as we'd be Unref'ing
>>>>> all those servers we added).
>>>>
>>>> Seems like I forgot to git-send my patch yesterday evening: here it is:
>>>>
>>>> https://www.redhat.com/archives/libvir-list/2017-December/msg00739.html
>>>>
>>>> That fixes the virt-manager crash at least.
>>>
>>> Hi, so the patch looks good, at first glance it makes complete sense to close
>>> the client connections when closing the server and unreffing the objects when
>>> disposing of the server object. However, since this and the other related
>>> issues have been bugging us for quite some time I'm not going to say ACK just
>>> yet, I'd like to think about it for a bit - but if someone else already has and
>>> they feel brave enough :P, then I'm not going to block this to go in.
>>>
>>> Erik
>>>
>>
>> Patch looks good to me too. But still original "libvirtd: fix crash on termination"
>> fixes another issue and if applied fixes "virt-manager issue" as well as John
>> figured out.
> 
> Finally I'm back on track with this (sorry for it taking so long), although
> you're right about this, it's not the correct fix, it's just a byproduct of
> your patch, however, the whole thing about closing connections and releasing
> memory is a bit of a mess. For example, right now, we only dispose of service
> objs (but we don't close them, we do that in virNetServerClose), but we both
> close and dispose of the client objs. Another thing, we toggle the service to
> stop accepting any new connections, but that's irrelevant at that point because
> we've closed them already by that time as a product of calling
> virNetDaemonClose->virNetServerClose->virNetServerServiceClose - so that
> should be removed...I drifted a bit here, anyway, we need to make a clear
> distinction when we're releasing memory and when we're shutting down some kind
> of service - sometimes both can be done at the same time (see below), however
> it's very clear we can't do it here. Your issue is that Close currently only
> closes the sockets (this reflects my point of "shutting down a service") but
> does nothing with the threadpool connected to the server, thus leaving the
> worker threads to continue running and executing APIs the results of which they
> shouldn't even be able to return back to the client, since we're shutting down.

To me it is not the first order problem. On daemon shutdown clients will anyway
most probably receive some kind of error. 

But we should finish worker threads before we cleanup drivers.

> Now the thing with threadpool is that both memory release and teardown are
> merged into one "object disposal" operation and therefore done as part of
> virNetServerDispose. Since I understand a removal from the hash table as a
> memory release operation, we should not be calling virHashRemoveAll from
> virNetDaemonClose. Now, I see 2 options:
> 
> 1) split virThreadPoolFree into 2 functions, one which only broadcasts the
> "die" message and joins the threads (or waits for them in this case...) and the
> other releasing the resources - can't say I'm a fan of this one
> 
> 2) call virThreadPoolFree from virNetServerClose instead...
> 
> None of those approaches is ideal, but I can't seem to think off anything
> better at the moment.
> 
> I'm open to discuss any other suggestions.
> Erik
> 

2nd options looks good to me. Netserver is finished upon close so there are no workers
anymore. And we are not going to restart/reuse netserver after that and reininialize
workers (and if we do we change thread pool as well to wait for threads in a pool
on stop but not finishing them etc etc). So it quite sane to free thread pool
on close to me. Why we not freeing other internal objects, why we not destroing
locks? Well other threads could be using this object so to keep things simple we
keep some basic internal objects likes locks but we can dispose clients and
services on close as well.

By the way we will need to check whether workers != NULL in virNetServerGetThreadPoolParameters
for example after that. (Quite possible situation - somebody queries thread pool parameters of main
server thru admin server).

Nikolay




More information about the libvir-list mailing list