[libvirt] [PATCH] Fix race condition reconnecting to vms & loading configs

Daniel P. Berrange berrange at redhat.com
Mon Oct 28 17:06:36 UTC 2013


On Mon, Oct 28, 2013 at 01:03:49PM -0400, Cole Robinson wrote:
> On 10/28/2013 07:52 AM, Daniel P. Berrange wrote:
> > From: "Daniel P. Berrange" <berrange at redhat.com>
> > 
> > Consider the following sequence:
> > 
> >  1. Define a persistent QEMU guest
> >  2. Start the QEMU guest
> >  3. Stop libvirtd
> >  4. Kill the QEMU process
> >  5. Start libvirtd
> >  6. List persistent guests
> > 
> > At the last step, the previously running persistent guest
> > will be missing. This is because of a race condition in the
> > QEMU driver startup code. It does
> > 
> >  1. Load all VM state files
> >  2. Spawn thread to reconnect to each VM
> >  3. Load all VM config files
> > 
> > Only at the end of step 3 does the 'virDomainObjPtr' get
> > marked as "persistent". There is therefore a window where
> > the thread reconnecting to the VM will remove the persistent
> > VM from the list.
> > 
> > The easy fix is to simply switch the order of steps 2 & 3.
> > 
> > Signed-off-by: Daniel P. Berrange <berrange at redhat.com>
> > ---
> >  src/qemu/qemu_driver.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> > 
> > diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> > index c613967..9c3daad 100644
> > --- a/src/qemu/qemu_driver.c
> > +++ b/src/qemu/qemu_driver.c
> > @@ -816,8 +816,6 @@ qemuStateInitialize(bool privileged,
> >  
> >      conn = virConnectOpen(cfg->uri);
> >  
> > -    qemuProcessReconnectAll(conn, qemu_driver);
> > -
> >      /* Then inactive persistent configs */
> >      if (virDomainObjListLoadAllConfigs(qemu_driver->domains,
> >                                         cfg->configDir,
> > @@ -828,6 +826,7 @@ qemuStateInitialize(bool privileged,
> >                                         NULL, NULL) < 0)
> >          goto error;
> >  
> > +    qemuProcessReconnectAll(conn, qemu_driver);
> >  
> >      virDomainObjListForEach(qemu_driver->domains,
> >                              qemuDomainSnapshotLoad,
> > 
> 
> I tried testing this patch to see if it would fix:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1015246
> 
> from current master I did:
> 
> git revert a924d9d083c215df6044387057c501d9aa338b96
> reproduce the bug
> git am <your-patch>
> 
> But the daemon won't even start up after your patch is built:
> 
> (gdb) bt
> #0  qemuMonitorOpen (vm=vm@entry=0x7fffd4211090, config=0x0, json=false,
>     cb=cb@entry=0x7fffddcae720 <monitorCallbacks>,
>     opaque=opaque@entry=0x7fffd419b840) at qemu/qemu_monitor.c:852
> #1  0x00007fffdda1083d in qemuConnectMonitor (
>     driver=driver@entry=0x7fffd419b840, vm=vm@entry=0x7fffd4211090,
>     logfd=logfd@entry=-1) at qemu/qemu_process.c:1412
> #2  0x00007fffdda1685a in qemuProcessReconnect (
>     opaque=opaque@entry=0x7fffd422fef0) at qemu/qemu_process.c:3086
> #3  0x00007ffff7528dce in virThreadHelper (data=<optimized out>)
>     at util/virthreadpthread.c:161
> #4  0x00007ffff4782f33 in start_thread (arg=0x7fffcb7fe700)
>     at pthread_create.c:309
> #5  0x00007ffff40a9ead in clone ()
>     at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

What is this trace showing? Or rather, what error is reported
when the daemon fails to start?
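
For reference, the ordering problem described in the commit message can
be sketched as a small single-threaded model. This is only an
illustration, not libvirt code: 'struct dom' and every function name
below are hypothetical stand-ins. It also makes one simplifying
assumption, namely that a domain removed by the reconnect step is not
added back by the later config load.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a domain object; not libvirt's virDomainObj. */
struct dom {
    bool present;     /* still in the driver's domain list */
    bool persistent;  /* set only once the config file has been loaded */
    bool proc_alive;  /* is the QEMU process still running? */
};

/* Models "load all VM state files": the domain enters the list,
 * but is not yet marked persistent. */
static void load_state(struct dom *d, bool proc_alive)
{
    assert(d != NULL);
    d->present = true;
    d->persistent = false;
    d->proc_alive = proc_alive;
}

/* Models "load all VM config files": marks the domain persistent.
 * Simplifying assumption: an entry already removed stays removed. */
static void load_config(struct dom *d)
{
    assert(d != NULL);
    if (d->present)
        d->persistent = true;
}

/* Models the reconnect step: if the QEMU process is gone and the
 * domain is not (yet) marked persistent, drop it from the list. */
static void reconnect(struct dom *d)
{
    assert(d != NULL);
    if (!d->present || d->proc_alive)
        return;
    if (!d->persistent)
        d->present = false;
}

/* Returns true if the persistent guest is still listed after startup.
 * reconnect_first selects the old ordering (reconnect, then configs)
 * or the patched ordering (configs, then reconnect). */
bool guest_listed_after_startup(bool reconnect_first)
{
    struct dom d = { false, false, false };

    load_state(&d, false);  /* QEMU was killed while libvirtd was down */

    if (reconnect_first) {
        reconnect(&d);      /* runs inside the window: flag not yet set */
        load_config(&d);
    } else {
        load_config(&d);    /* flag is set before reconnect looks at it */
        reconnect(&d);
    }
    return d.present && d.persistent;
}
```

In this model guest_listed_after_startup(true) returns false (the guest
is lost), while guest_listed_after_startup(false) returns true. In the
real driver the reconnect runs in a separate thread, so with the old
ordering the outcome depends on timing rather than being fixed as it is
here.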


Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|