[libvirt] [PATCH] Fix race condition reconnecting to vms & loading configs

Cole Robinson crobinso at redhat.com
Mon Oct 28 17:03:49 UTC 2013


On 10/28/2013 07:52 AM, Daniel P. Berrange wrote:
> From: "Daniel P. Berrange" <berrange at redhat.com>
> 
> The following sequence triggers the bug:
> 
>  1. Define a persistent QEMU guest
>  2. Start the QEMU guest
>  3. Stop libvirtd
>  4. Kill the QEMU process
>  5. Start libvirtd
>  6. List persistent guests
> 
> At the last step, the previously running persistent guest
> will be missing. This is because of a race condition in the
> QEMU driver startup code. It does
> 
>  1. Load all VM state files
>  2. Spawn thread to reconnect to each VM
>  3. Load all VM config files
> 
> Only at the end of step 3 does the 'virDomainObjPtr' get
> marked as "persistent". There is therefore a window where
> the thread reconnecting to the VM will remove the persistent
> VM from the list.
> 
> The easy fix is to simply switch the order of steps 2 & 3.
> 
> Signed-off-by: Daniel P. Berrange <berrange at redhat.com>
> ---
>  src/qemu/qemu_driver.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index c613967..9c3daad 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -816,8 +816,6 @@ qemuStateInitialize(bool privileged,
>  
>      conn = virConnectOpen(cfg->uri);
>  
> -    qemuProcessReconnectAll(conn, qemu_driver);
> -
>      /* Then inactive persistent configs */
>      if (virDomainObjListLoadAllConfigs(qemu_driver->domains,
>                                         cfg->configDir,
> @@ -828,6 +826,7 @@ qemuStateInitialize(bool privileged,
>                                         NULL, NULL) < 0)
>          goto error;
>  
> +    qemuProcessReconnectAll(conn, qemu_driver);
>  
>      virDomainObjListForEach(qemu_driver->domains,
>                              qemuDomainSnapshotLoad,
> 

I tried testing this patch to see if it would fix:

https://bugzilla.redhat.com/show_bug.cgi?id=1015246

Starting from current master, I did:

git revert a924d9d083c215df6044387057c501d9aa338b96
(reproduce the bug)
git am <your-patch>

But the daemon won't even start up after your patch is built:

(gdb) bt
#0  qemuMonitorOpen (vm=vm@entry=0x7fffd4211090, config=0x0, json=false,
    cb=cb@entry=0x7fffddcae720 <monitorCallbacks>,
    opaque=opaque@entry=0x7fffd419b840) at qemu/qemu_monitor.c:852
#1  0x00007fffdda1083d in qemuConnectMonitor (
    driver=driver@entry=0x7fffd419b840, vm=vm@entry=0x7fffd4211090,
    logfd=logfd@entry=-1) at qemu/qemu_process.c:1412
#2  0x00007fffdda1685a in qemuProcessReconnect (
    opaque=opaque@entry=0x7fffd422fef0) at qemu/qemu_process.c:3086
#3  0x00007ffff7528dce in virThreadHelper (data=<optimized out>)
    at util/virthreadpthread.c:161
#4  0x00007ffff4782f33 in start_thread (arg=0x7fffcb7fe700)
    at pthread_create.c:309
#5  0x00007ffff40a9ead in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

The reverted commit is the fix for a simple crash that I used to reproduce
bug 1015246. There might be multiple issues here, but I don't have time to
poke at it right now.
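
For anyone following along, the ordering problem described in the commit
message can be sketched as a small standalone pthreads program. This is only
a toy model, not libvirt code: 'struct dom', reconnect(), load_config() and
startup() are hypothetical names, and the thread is joined immediately to
force the worst-case interleaving rather than leave it to chance.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a domain object; NOT libvirt's virDomainObj. */
struct dom {
    bool active;      /* state file said the guest was running */
    bool persistent;  /* set only once the config file has been loaded */
    bool in_list;     /* still present in the driver's domain list */
};

/* Stand-in for the reconnect thread: the QEMU process is gone, so the
 * domain is dropped from the list unless it is already persistent. */
static void *reconnect(void *opaque)
{
    struct dom *d = opaque;

    d->active = false;
    if (!d->persistent)
        d->in_list = false;
    return NULL;
}

/* Stand-in for config loading: marks a domain still in the list as
 * persistent. (Assumption: in this toy model a removed domain stays gone.) */
static void load_config(struct dom *d)
{
    if (d->in_list)
        d->persistent = true;
}

/* Runs the startup steps in either order and reports whether the guest is
 * still listed afterwards. */
static bool startup(bool config_first)
{
    struct dom d = { .active = true, .persistent = false, .in_list = true };
    pthread_t th;

    if (config_first)
        load_config(&d);

    if (pthread_create(&th, NULL, reconnect, &d) != 0)
        return false;
    /* Join right away so the thread always wins the race in the old order. */
    pthread_join(th, NULL);

    if (!config_first)
        load_config(&d);

    return d.in_list;
}
```

In this model startup(false), the old order, loses the guest, while
startup(true), the order the patch switches to, keeps it. It says nothing
about the crash in the backtrace above.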

Thanks,
Cole



