[libvirt] libvirtd deadlock on shutdown
Daniel P. Berrange
berrange at redhat.com
Thu Jun 21 16:00:25 UTC 2012
On Thu, Jun 21, 2012 at 09:56:54AM -0600, Jim Fehlig wrote:
> Daniel P. Berrange wrote:
> > On Wed, Jun 20, 2012 at 03:07:16PM -0600, Jim Fehlig wrote:
> >
> >> I'm looking into a libvirtd deadlock on daemon shutdown. The deadlock
> >> occurs when shutting down virNetServer. If the handling of a job is in
> >> flight in virNetServerHandleJob(), the virNetServer lock is acquired
> >> when freeing job->prog (src/rpc/virnetserver.c:167). But the lock is
> >> already held in virNetServerFree(), which is blocked in
> >> virThreadPoolFree() waiting for all the workers to finish. No progress
> >> can be made.
> >>
> >> The attached hack fixes the problem, but I'm not convinced this is an
> >> appropriate fix. Is it necessary to hold the virNetServer lock when
> >> calling virNetServerProgramFree(job->prog)? I notice the lock is not
> >> held in the error path of virNetServerHandleJob().
> >>
> >> Thanks,
> >> Jim
> >>
> >>
> >
> >
> >> diff --git a/src/rpc/virnetserver.c b/src/rpc/virnetserver.c
> >> index ae19e84..edd3196 100644
> >> --- a/src/rpc/virnetserver.c
> >> +++ b/src/rpc/virnetserver.c
> >> @@ -774,7 +774,9 @@ void virNetServerFree(virNetServerPtr srv)
> >> for (i = 0 ; i < srv->nservices ; i++)
> >> virNetServerServiceToggle(srv->services[i], false);
> >>
> >> + virNetServerUnlock(srv);
> >> virThreadPoolFree(srv->workers);
> >> + virNetServerLock(srv);
> >>
> >> for (i = 0 ; i < srv->nsignals ; i++) {
> >> sigaction(srv->signals[i]->signum, &srv->signals[i]->oldaction, NULL);
> >>
> >
> > The virNetServerPtr object is ref-counted, so technical;ly
> > only decrementing the ref count needs to be protected. Once
> > the ref count hits 0, then only the current thread (and the
> > workers) should be using the virNetServerPtr instance.
> >
>
> Ah, right. Thanks for clarifying.
>
> > So, yes, it is safe to call virNetServerUnlock before
> > virThreadPoolFree. Furthermore, it is /not/ required
> > to call virNetServerLock afterwards - no other thread
> > should be using it now the workers are dead. So you
> > can skip that extra lock call, and also remove the
> > unlock call much later
> >
>
> Something like the attached patch?
>
> Regards,
> Jim
>
> From 583be33213e922899b23f036494886397b2549dc Mon Sep 17 00:00:00 2001
> From: Jim Fehlig <jfehlig at suse.com>
> Date: Thu, 21 Jun 2012 09:21:44 -0600
> Subject: [PATCH] Fix deadlock on libvirtd shutdown
>
> When shutting down libvirtd, the virNetServer shutdown can deadlock
> if there are in-flight jobs being handled by virNetServerHandleJob().
> virNetServerFree() will acquire the virNetServer lock and call
> virThreadPoolFree() to terminate the workers, waiting for the workers
> to finish. But in-flight workers will attempt to acquire the
> virNetServer lock, resulting in deadlock.
>
> Fix the deadlock by unlocking the virNetServer lock before calling
> virThreadPoolFree(). This is safe since the virNetServerPtr object
> is ref-counted and only decrementing the ref count needs to be
> protected. Additionally, there is no need to re-acquire the lock
> after virThreadPoolFree() completes as all the workers have
> terminated.
> ---
> src/rpc/virnetserver.c | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/src/rpc/virnetserver.c b/src/rpc/virnetserver.c
> index ae19e84..9d71e53 100644
> --- a/src/rpc/virnetserver.c
> +++ b/src/rpc/virnetserver.c
> @@ -766,10 +766,9 @@ void virNetServerFree(virNetServerPtr srv)
> virNetServerLock(srv);
> VIR_DEBUG("srv=%p refs=%d", srv, srv->refs);
> srv->refs--;
> - if (srv->refs > 0) {
> - virNetServerUnlock(srv);
> + virNetServerUnlock(srv);
> + if (srv->refs > 0)
> return;
> - }
>
> for (i = 0 ; i < srv->nservices ; i++)
> virNetServerServiceToggle(srv->services[i], false);
> @@ -805,7 +804,6 @@ void virNetServerFree(virNetServerPtr srv)
> virNetServerMDNSFree(srv->mdns);
> #endif
>
> - virNetServerUnlock(srv);
> virMutexDestroy(&srv->lock);
> VIR_FREE(srv);
> }
ACK,
We really should resurrect the patches adding virAtomic for refcounting :-)
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
More information about the libvir-list
mailing list