[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [libvirt] [PATCH 0/4] libvirtd: fix hang on termination in qemu driver

On 10/25/2017 05:05 AM, Nikolay Shirokovskiy wrote:
> Libvirtd termination can hang. For example if some API call in qemu
> driver awaiting monitor response it will never finish because event
> loop does not functional during termination. As a result we hang
> in virNetDaemonClose call during termination as this call finishes RPC
> threads.
> Let's ask hypervisor drivers to finish all API calls by calling
> introduced state driver shutdown function before call to virNetDaemonClose.
> Nikolay Shirokovskiy (4):
>   libvirt: introduce hypervisor driver shutdown function
>   qemu: implement state driver shutdown function
>   qemu: agent: fix monitor close during first sync
>   qemu: monitor: check monitor not closed upon send
>  daemon/libvirtd.c        |  2 ++
>  src/driver-state.h       |  4 ++++
>  src/libvirt.c            | 18 ++++++++++++++++++
>  src/libvirt_internal.h   |  1 +
>  src/libvirt_private.syms |  1 +
>  src/qemu/qemu_agent.c    | 14 +++++++-------
>  src/qemu/qemu_driver.c   | 39 +++++++++++++++++++++++++++++++++++++++
>  src/qemu/qemu_monitor.c  | 27 +++++++++++++--------------
>  8 files changed, 85 insertions(+), 21 deletions(-)

Moving the discussion started in the refcount thread to here via just
simple cut-n-paste... I didn't CC Erik or Martin directly because I'm
fairly confident they read libvir-list posts anyway ;-)


[jferlan comment]

The other series seems to be adding a state shutdown step to be run
before state cleanup; however, if we alter the cleanup code in libvirtd
does that make the state shutdown step unnecessary? I don't know and
really didn't want to think about it until we got through thinking about
the cleanup processing order. Still if one considers that state
initialization is really two steps (init and auto start), then splitting
cleanup into shutdown and cleanup seems reasonable, but I don't think
affects whether we have a crash or latent usage of dmn->servers. I could
be wrong, but it's so hard to tell.

[eskultety comment/followup]

Actually, no, the stateShutdown was addressing the issue with running
<driver>StateCleanup while there still was an active worker thread that
could potentially access the @driver object, in which case - SIGSEGV,
but as I wrote in my response, I don't like the idea of having another
state driver callback, as I feel it introduces a bit of ambiguity, since
by just looking at the callback names, you can't have an idea, what the
difference between stateCleanup and stateShutdown is. I think this
should be handled as part of the final stage of leaving the event loop
and simply register an appropriate callback to the handle, that way, you
don't need to define a new state callback which would most likely used
by a couple of drivers.

[mkletzan comment/followup]

If stateShutdown is what makes everything go away, then why not?  The
names are pretty descriptive, shutdown is called when the daemon is
shutting down and cleanup when you need to clean up after yourself.
Used by only few drivers? Well, we can implement it everywhere and not
make it make optional, but rather mandatory.  If that is checked on the
registration then we'll know right away when we missed something (aat
runtime, but right after someone tries running it).


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]