[PATCH v2 10/13] qemu: avoid deadlock in qemuDomainObjStopWorker

Daniel P. Berrangé berrange at redhat.com
Fri Sep 4 16:51:22 UTC 2020


On Fri, Sep 04, 2020 at 05:46:18PM +0100, Daniel P. Berrangé wrote:
> On Thu, Jul 23, 2020 at 01:14:10PM +0300, Nikolay Shirokovskiy wrote:
> > We are dropping the only reference here so that the event loop thread
> > is going to be exited synchronously. In order to avoid deadlocks we
> > need to unlock the VM so that any handler being called can finish
> > execution and thus even loop thread be finished too.
> > 
> > Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy at virtuozzo.com>
> > ---
> >  src/qemu/qemu_domain.c | 18 ++++++++++++++----
> >  1 file changed, 14 insertions(+), 4 deletions(-)
> > 
> > diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> > index 5b22eb2..82b3d11 100644
> > --- a/src/qemu/qemu_domain.c
> > +++ b/src/qemu/qemu_domain.c
> > @@ -1637,11 +1637,21 @@ void
> >  qemuDomainObjStopWorker(virDomainObjPtr dom)
> >  {
> >      qemuDomainObjPrivatePtr priv = dom->privateData;
> > +    virEventThread *eventThread;
> >  
> > -    if (priv->eventThread) {
> > -        g_object_unref(priv->eventThread);
> > -        priv->eventThread = NULL;
> > -    }
> > +    if (!priv->eventThread)
> > +        return;
> > +
> > +    /*
> > +     * We are dropping the only reference here so that the event loop thread
> > +     * is going to be exited synchronously. In order to avoid deadlocks we
> > +     * need to unlock the VM so that any handler being called can finish
> > +     * execution and thus even loop thread be finished too.
> > +     */
> > +    eventThread = g_steal_pointer(&priv->eventThread);
> > +    virObjectUnlock(dom);
> > +    g_object_unref(eventThread);
> > +    virObjectLock(dom);
> 
> Hmm, I'm really not at all comfortable with this. I don't have confidence
> that qemuProcessStop() safe if we drpo & reacquire the lock half way
> through its cleanup process.  The 'virDomainObj' is in a semi-clean
> state, 1/2 running 1/2 shutoff.
> 
> This is the key reason I think I didn't do the synchronous thread join
> in the event loop.

Hmm, I see your justification in the other reply now.

Reviewed-by: Daniel P. Berrangé <berrange at redhat.com>


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




More information about the libvir-list mailing list