[libvirt] libvirtd deadlock on shutdown

Hu Tao hutao at cn.fujitsu.com
Fri Jun 22 03:38:14 UTC 2012


Cced Danp.

I'm sorry for the unconvenience, but my mail server occasionally cuts my
CC list for unknown reason.

On Fri, Jun 22, 2012 at 11:26:03AM +0800, Hu Tao wrote:
> On Thu, Jun 21, 2012 at 09:56:54AM -0600, Jim Fehlig wrote:
> > Daniel P. Berrange wrote:
> > > On Wed, Jun 20, 2012 at 03:07:16PM -0600, Jim Fehlig wrote:
> > >   
> > >> I'm looking into a libvirtd deadlock on daemon shutdown.  The deadlock
> > >> occurs when shutting down virNetServer.  If the handling of a job is in
> > >> flight in virNetServerHandleJob(), the virNetServer lock is acquired
> > >> when freeing job->prog (src/rpc/virnetserver.c:167).  But the lock is
> > >> already held in virNetServerFree(), which is blocked in
> > >> virThreadPoolFree() waiting for all the workers to finish.  No progress
> > >> can be made.
> > >>
> > >> The attached hack fixes the problem, but I'm not convinced this is an
> > >> appropriate fix.  Is it necessary to hold the virNetServer lock when
> > >> calling virNetServerProgramFree(job->prog)?  I notice the lock is not
> > >> held in the error path of virNetServerHandleJob().
> > >>
> > >> Thanks,
> > >> Jim
> > >>
> > >>     
> > >
> > >   
> > >> diff --git a/src/rpc/virnetserver.c b/src/rpc/virnetserver.c
> > >> index ae19e84..edd3196 100644
> > >> --- a/src/rpc/virnetserver.c
> > >> +++ b/src/rpc/virnetserver.c
> > >> @@ -774,7 +774,9 @@ void virNetServerFree(virNetServerPtr srv)
> > >>      for (i = 0 ; i < srv->nservices ; i++)
> > >>          virNetServerServiceToggle(srv->services[i], false);
> > >>  
> > >> +    virNetServerUnlock(srv);
> > >>      virThreadPoolFree(srv->workers);
> > >> +    virNetServerLock(srv);
> > >>  
> > >>      for (i = 0 ; i < srv->nsignals ; i++) {
> > >>          sigaction(srv->signals[i]->signum, &srv->signals[i]->oldaction, NULL);
> > >>     
> > >
> > > The virNetServerPtr object is ref-counted, so technical;ly
> > > only decrementing the ref count needs to be protected. Once
> > > the ref count hits 0, then only the current thread (and the
> > > workers) should be using the virNetServerPtr instance.
> > >   
> > 
> > Ah, right.  Thanks for clarifying.
> > 
> > > So, yes, it is safe to call virNetServerUnlock before
> > > virThreadPoolFree. Furthermore, it is /not/ required
> > > to call virNetServerLock afterwards - no other thread
> > > should be using it now the workers are dead. So you
> > > can skip that extra lock call, and also remove the
> > > unlock call much later
> > >   
> > 
> > Something like the attached patch?
> > 
> > Regards,
> > Jim
> > 
> 
> > >From 583be33213e922899b23f036494886397b2549dc Mon Sep 17 00:00:00 2001
> > From: Jim Fehlig <jfehlig at suse.com>
> > Date: Thu, 21 Jun 2012 09:21:44 -0600
> > Subject: [PATCH] Fix deadlock on libvirtd shutdown
> > 
> > When shutting down libvirtd, the virNetServer shutdown can deadlock
> > if there are in-flight jobs being handled by virNetServerHandleJob().
> > virNetServerFree() will acquire the virNetServer lock and call
> > virThreadPoolFree() to terminate the workers, waiting for the workers
> > to finish.  But in-flight workers will attempt to acquire the
> > virNetServer lock, resulting in deadlock.
> > 
> > Fix the deadlock by unlocking the virNetServer lock before calling
> > virThreadPoolFree().  This is safe since the virNetServerPtr object
> > is ref-counted and only decrementing the ref count needs to be
> > protected.  Additionally, there is no need to re-acquire the lock
> > after virThreadPoolFree() completes as all the workers have
> > terminated.
> > ---
> >  src/rpc/virnetserver.c |    6 ++----
> >  1 file changed, 2 insertions(+), 4 deletions(-)
> > 
> > diff --git a/src/rpc/virnetserver.c b/src/rpc/virnetserver.c
> > index ae19e84..9d71e53 100644
> > --- a/src/rpc/virnetserver.c
> > +++ b/src/rpc/virnetserver.c
> > @@ -766,10 +766,9 @@ void virNetServerFree(virNetServerPtr srv)
> >      virNetServerLock(srv);
> >      VIR_DEBUG("srv=%p refs=%d", srv, srv->refs);
> >      srv->refs--;
> > -    if (srv->refs > 0) {
> > -        virNetServerUnlock(srv);
> > +    virNetServerUnlock(srv);
> 
> At this point other threads may have changed srv->refs...
> 
> > +    if (srv->refs > 0)
> 
> ...so it's unsafe to test srv->refs here without locking.
> 
> For example, assume srv->refs is 2 at the beginning of virNetServerFree,
> 
>   thread A              thread B
> 
>   lock srv              lock srv (blocked)
> 
>   dec srv->refs
> 
>   (srv->refs is 1)
> 
>   unlock srv
> 
>                         lock srv
> 
> 			dec srv->refs
> 
> 			(srv->refs is 0)
> 
> 			unlock src
> 
>   test srv->refs        test srv->refs
> 
> 
> In this case both threads have found srv->refs is 0 and are going to
> free srv...
> 
> Following patch fixes problem.
> 
> >From 25aa97e05aa76054b781a4e5e83781ee16d5afee Mon Sep 17 00:00:00 2001
> From: Hu Tao <hutao at cn.fujitsu.com>
> Date: Fri, 22 Jun 2012 11:15:01 +0800
> Subject: [PATCH] fix a bug of ref count in virnetserver.c
> 
> The test of ref count is not protected by lock, which is unsafe because
> the ref count may have been changed by other threads during the test.
> 
> This patch fixes this.
> ---
>  src/rpc/virnetserver.c |    5 +++--
>  1 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/src/rpc/virnetserver.c b/src/rpc/virnetserver.c
> index 9d71e53..247ddd7 100644
> --- a/src/rpc/virnetserver.c
> +++ b/src/rpc/virnetserver.c
> @@ -759,15 +759,16 @@ void virNetServerQuit(virNetServerPtr srv)
>  void virNetServerFree(virNetServerPtr srv)
>  {
>      int i;
> +    int refs;
>  
>      if (!srv)
>          return;
>  
>      virNetServerLock(srv);
>      VIR_DEBUG("srv=%p refs=%d", srv, srv->refs);
> -    srv->refs--;
> +    refs = --srv->refs;
>      virNetServerUnlock(srv);
> -    if (srv->refs > 0)
> +    if (refs > 0)
>          return;
>  
>      for (i = 0 ; i < srv->nservices ; i++)
> -- 
> 1.7.4.4
> 
> 
> -- 
> Thanks,
> Hu Tao
> 
> --
> libvir-list mailing list
> libvir-list at redhat.com
> https://www.redhat.com/mailman/listinfo/libvir-list

-- 
Thanks,
Hu Tao




More information about the libvir-list mailing list