[libvirt] [PATCH] Fix race in starting transient VMs

Daniel P. Berrange berrange at redhat.com
Fri Nov 1 09:44:43 UTC 2013


On Thu, Oct 31, 2013 at 01:02:54PM -0600, Eric Blake wrote:
> On 10/31/2013 01:00 PM, Eric Blake wrote:
> > On 10/31/2013 12:41 PM, Daniel P. Berrange wrote:
> >> From: "Daniel P. Berrange" <berrange at redhat.com>
> >>
> >> When starting a transient VM the first thing done is to check
> >> for duplicates. The check looks for any running VMs
> >> with a matching name/uuid. It explicitly allows there to
> >> be inactive VMs, so that a persistent VM can be temporarily
> >> booted with a different config.
> >>
> > 
> >>
> >> The fix is to only allow an existing inactive VM if it is
> >> marked as persistent.
> >>
> >> Signed-off-by: Daniel P. Berrange <berrange at redhat.com>
> >> ---
> >>  src/conf/domain_conf.c | 6 ++++++
> >>  1 file changed, 6 insertions(+)
> > 
> > ACK.  What a nasty bug to track down.
> 
> If I'm correct, this bug is a regression from the time that we first
> converted to no longer holding the driver lock around entire API calls
> (commit a9e97e0, v1.0.3)

No, it goes way, way back. It affects at least 0.10.2, and probably
all earlier versions too, for as long as the
qemuDomainBeginJobWithDriver code has existed.

Removing the driver lock merely made the problem worse: it added a
crash scenario on top of the pre-existing orphaned VMs problem.
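
For anyone skimming the thread, the rule from the commit message ("only
allow an existing inactive VM if it is marked as persistent") can be
sketched roughly as below. This is a simplified illustration, not the
actual domain_conf.c change: the struct and function names are
hypothetical, and the reading that an inactive, non-persistent object is
a transient VM still being started is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a domain object.
 * This is NOT libvirt's real structure. */
struct sketch_domain {
    bool active;      /* the VM is currently running */
    bool persistent;  /* the VM has a stored persistent config */
};

/* Decide whether a new transient VM may be started when `existing`
 * is the object already registered under the same name/uuid
 * (NULL if there is none). Hypothetical helper for illustration. */
static bool sketch_may_start_transient(const struct sketch_domain *existing)
{
    if (existing == NULL)
        return true;    /* no duplicate at all */

    if (existing->active)
        return false;   /* a running duplicate is always rejected */

    /* Inactive duplicate: allowed only if it is persistent, i.e. a
     * persistent VM being temporarily booted with a different config.
     * An inactive, non-persistent object is assumed here to be another
     * transient VM that has not finished starting, so it is rejected. */
    return existing->persistent;
}
```

The old check corresponds to returning true for every inactive duplicate;
the sketch differs only in the final return.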

Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|



