[libvirt] libvirtd deadlock on shutdown

Daniel P. Berrange berrange at redhat.com
Thu Jun 21 16:00:25 UTC 2012


On Thu, Jun 21, 2012 at 09:56:54AM -0600, Jim Fehlig wrote:
> Daniel P. Berrange wrote:
> > On Wed, Jun 20, 2012 at 03:07:16PM -0600, Jim Fehlig wrote:
> >   
> >> I'm looking into a libvirtd deadlock on daemon shutdown.  The deadlock
> >> occurs when shutting down virNetServer.  If the handling of a job is in
> >> flight in virNetServerHandleJob(), the virNetServer lock is acquired
> >> when freeing job->prog (src/rpc/virnetserver.c:167).  But the lock is
> >> already held in virNetServerFree(), which is blocked in
> >> virThreadPoolFree() waiting for all the workers to finish.  No progress
> >> can be made.
> >>
> >> The attached hack fixes the problem, but I'm not convinced this is an
> >> appropriate fix.  Is it necessary to hold the virNetServer lock when
> >> calling virNetServerProgramFree(job->prog)?  I notice the lock is not
> >> held in the error path of virNetServerHandleJob().
> >>
> >> Thanks,
> >> Jim
> >>
> >>     
> >
> >   
> >> diff --git a/src/rpc/virnetserver.c b/src/rpc/virnetserver.c
> >> index ae19e84..edd3196 100644
> >> --- a/src/rpc/virnetserver.c
> >> +++ b/src/rpc/virnetserver.c
> >> @@ -774,7 +774,9 @@ void virNetServerFree(virNetServerPtr srv)
> >>      for (i = 0 ; i < srv->nservices ; i++)
> >>          virNetServerServiceToggle(srv->services[i], false);
> >>  
> >> +    virNetServerUnlock(srv);
> >>      virThreadPoolFree(srv->workers);
> >> +    virNetServerLock(srv);
> >>  
> >>      for (i = 0 ; i < srv->nsignals ; i++) {
> >>          sigaction(srv->signals[i]->signum, &srv->signals[i]->oldaction, NULL);
> >>     
> >
> > The virNetServerPtr object is ref-counted, so technical;ly
> > only decrementing the ref count needs to be protected. Once
> > the ref count hits 0, then only the current thread (and the
> > workers) should be using the virNetServerPtr instance.
> >   
> 
> Ah, right.  Thanks for clarifying.
> 
> > So, yes, it is safe to call virNetServerUnlock before
> > virThreadPoolFree. Furthermore, it is /not/ required
> > to call virNetServerLock afterwards - no other thread
> > should be using it now the workers are dead. So you
> > can skip that extra lock call, and also remove the
> > unlock call much later
> >   
> 
> Something like the attached patch?
> 
> Regards,
> Jim
> 

> From 583be33213e922899b23f036494886397b2549dc Mon Sep 17 00:00:00 2001
> From: Jim Fehlig <jfehlig at suse.com>
> Date: Thu, 21 Jun 2012 09:21:44 -0600
> Subject: [PATCH] Fix deadlock on libvirtd shutdown
> 
> When shutting down libvirtd, the virNetServer shutdown can deadlock
> if there are in-flight jobs being handled by virNetServerHandleJob().
> virNetServerFree() will acquire the virNetServer lock and call
> virThreadPoolFree() to terminate the workers, waiting for the workers
> to finish.  But in-flight workers will attempt to acquire the
> virNetServer lock, resulting in deadlock.
> 
> Fix the deadlock by unlocking the virNetServer lock before calling
> virThreadPoolFree().  This is safe since the virNetServerPtr object
> is ref-counted and only decrementing the ref count needs to be
> protected.  Additionally, there is no need to re-acquire the lock
> after virThreadPoolFree() completes as all the workers have
> terminated.
> ---
>  src/rpc/virnetserver.c |    6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/src/rpc/virnetserver.c b/src/rpc/virnetserver.c
> index ae19e84..9d71e53 100644
> --- a/src/rpc/virnetserver.c
> +++ b/src/rpc/virnetserver.c
> @@ -766,10 +766,9 @@ void virNetServerFree(virNetServerPtr srv)
>      virNetServerLock(srv);
>      VIR_DEBUG("srv=%p refs=%d", srv, srv->refs);
>      srv->refs--;
> -    if (srv->refs > 0) {
> -        virNetServerUnlock(srv);
> +    virNetServerUnlock(srv);
> +    if (srv->refs > 0)
>          return;
> -    }
>  
>      for (i = 0 ; i < srv->nservices ; i++)
>          virNetServerServiceToggle(srv->services[i], false);
> @@ -805,7 +804,6 @@ void virNetServerFree(virNetServerPtr srv)
>      virNetServerMDNSFree(srv->mdns);
>  #endif
>  
> -    virNetServerUnlock(srv);
>      virMutexDestroy(&srv->lock);
>      VIR_FREE(srv);
>  }

ACK,

We really should resurrect the patches adding virAtomic for refcounting :-)


Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|




More information about the libvir-list mailing list