[libvirt] [PATCH 0/9] Resolve libvirtd hang on termination with connected long running client

Nikolay Shirokovskiy nshirokovskiy at virtuozzo.com
Fri Jan 26 08:47:04 UTC 2018



On 19.01.2018 20:23, John Ferlan wrote:
> RFC:
> https://www.redhat.com/archives/libvir-list/2018-January/msg00318.html
> 
> Adjustments since RFC...
> 
> Patches 1&2: No change, were already R-B'd
> Patch 3: Removed code as noted in code review, update commit message
> Patch 4: From old series removed, see below for more details
> Patch 9: no change
> NB: Patches 5-8 and 10 from Nikolay Shirokovskiy <nshirokovskiy at virtuozzo.com>
>     are removed as they seemed to not be necessary
> 
> Replaced the former patch 4 with series of patches to (slowly) provide
> support to disable new connections, handle removing waiting jobs, causing
> the waiting workers to quit, and allow any running jobs to complete.
> 
> As it turns out, waiting for running jobs to complete cannot be done
> from the virNetServerClose callbacks because that means the event loop
> processing done during virNetServerRun will not allow any currently
> long running worker job thread a means to complete.

Hi, John.

So you suggest a different appoarch. Instead of introducing means to 
help rpc threads to finish after event loop is finished (stateShutdown hook in my series)
you suggest to block futher new rpc processing and then waiting running
rpc calls to finish keeping event loop running for that purpose.

This could lead to libvirtd never finish its termination if one of
qemu processes do not respond for example. In case of long running
operation such as migration finishing can take much time. On the 
over hand if we finish rpc threads abruptly as in case of stateShutdown hook
we will deal with all possible error paths on daemon finishing. May
be good approach is to abort long running operation, then give rpc threads
some time to finish as you suggest and only after that finish them abruptly to handle
hanging qemu etc.

Also in this approach we keep event loop running for rpc threads only 
but there can be other threads that depend on the loop. For example if 
qemu is rebooted we spawn a thread that executes qemuProcessFakeReboot
that sends commands to qemu (i.e. depends on event loop). We need to await
such threads finishing too while keep event loop running. In approach of
stateShutdown hook we finish all threads that uses qemu monitor by closing
the monitor.

And hack with timeout in a loop... I think of a different aproach for waiting
rpc threads to finish their work. First let's make drain function only flush
job queue and take some callback which will be called when all workers will
be free from work (let's keep worker threads as Dan suggested). In this callback we
can use same technique as in virNetDaemonSignalHandler. That is make some
IO to set some flag in the event loop thread and cause virEventRunDefaultImpl
to finish and then test this flag in virNetDaemonRun.

Nikolay

> 
> So when virNetDaemonQuit is called as a result of the libvirtd signal
> handlers for SIG{QUIT|INT|TERM}, instead of just causing virNetServerRun
> to quit immediately, alter to using a quitRequested flag and then use
> that quitRequested flag to check for long running worker threads before
> causing the event loop to quit resulting in libvirtd being able to run
> through the virNetDaemonClose processing.
> 
> John Ferlan (9):
>   libvirtd: Alter refcnt processing for domain server objects
>   libvirtd: Alter refcnt processing for server program objects
>   netserver: Remove ServiceToggle during ServerDispose
>   util: Introduce virThreadPoolDrain
>   rpc: Introduce virNetServerQuitRequested
>   rpc: Introduce virNetServerWorkerCount
>   rpc: Alter virNetDaemonQuit processing
>   docs: Add news article for libvirtd issue
>   APPLY ONLY FOR TESTING PURPOSES
> 
>  daemon/libvirtd.c        | 43 +++++++++++++++++++++++---------
>  docs/news.xml            | 12 +++++++++
>  src/libvirt_private.syms |  1 +
>  src/libvirt_remote.syms  |  2 ++
>  src/qemu/qemu_driver.c   |  5 ++++
>  src/rpc/virnetdaemon.c   | 45 +++++++++++++++++++++++++++++++++-
>  src/rpc/virnetserver.c   | 52 ++++++++++++++++++++++++++++++++++++---
>  src/rpc/virnetserver.h   |  4 +++
>  src/util/virthreadpool.c | 64 ++++++++++++++++++++++++++++++++++++++++--------
>  src/util/virthreadpool.h |  2 ++
>  10 files changed, 204 insertions(+), 26 deletions(-)
> 




More information about the libvir-list mailing list