[libvirt] [PATCH] qemu: fix libvirtd crash in migration after vm shutdown

John Ferlan jferlan at redhat.com
Tue Sep 13 22:15:40 UTC 2016


something weird with your email client and adding extra <LF>'s
everywhere... Plaintext is your friend as much as 'git send-email'...

On 08/01/2016 10:20 PM, weifuqiang wrote:
> [PATCH] qemu: fix libvirtd crash in migration after vm shutdown
> 
>  
> 
>  
> 
> If we shutdown a guest, then migrate it without the arg XML, libvirtd
> will get crashed.
> 

Well it would be "nice" to see the crash trace(s). You give some context
below, but it skips a few steps...  I think understanding how
reproducible something is would be nice to know as well as whether
there's some edge (or error) condition that causes what you saw.

We're engineers, we can handle and want the details especially for crashes.

>  
> 
> The reason is that:
> 
> 1 during shutdown callback, qemuProcessStop() , it points vm->def  to
> vm->newDef
> 

Since changed to be in virDomainObjRemoveTransientDef

Hmmm... is this traces of some error path or edge condition? What causes
the shutdown?  From inside the guest?

> 2 during migration, it frees persistentDef, which points to vm->newDef
> when the arg XML is NULL.
> 
>    However, because vm->newDef is now vm->def, what we IN FACT freed is
> vm->def.
> 
> 3 it will refer to vm->def after step2, thus invalid read/write causes
> libvirtd crash
> 
>  
> 
> We needn't to free persistentDef if persist_xml is NULL, because no
> extra def was alloced if persistent_xml is NULL.
> 
>  
> 
>  
> 

I know there's been recent changes in migration - I wonder if those fix
what you've seen here... Perhaps not, but I never rule that out either.

> ---
> 
> src/qemu/qemu_migration.c | 2 +-
> 
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
>  
> 
> diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
> 
> index 6a683f7..3636c93 100644
> 
> --- a/src/qemu/qemu_migration.c
> 
> +++ b/src/qemu/qemu_migration.c
> 
> @@ -4915,7 +4915,7 @@ qemuMigrationRun(virQEMUDriverPtr driver,
> 
>          VIR_WARN("Unable to encode migration cookie");
> 
>      }
> 
> -    if (persistDef != vm->newDef)
> 
> +    if (persist_xml && persistDef)

So, looking at the code,

1. "persistDef" is only ever set if "flags & VIR_MIGRATE_PERSIST_DEST",
so we have to have taken that path

2. If we have "persist_xml", then we "know" that qemuMigrationPrepareDef
was called.

So I suppose this could be a solution. Alternatively, I would think

    if (persistDef != vm->newDef && persistDef != vm->def)
       virDomainDefFree(persistDef);

would work as well; however, is this the right solution?

It's the what has caused vm->newDef to become vm->def and whether that
should be happening while we're migrating.

So I wonder should/could the shutdown be blocked until the migration is
complete or are we perhaps lacking a lock or extra check somewhere to
sure we don't hit some edge condition.

Alternatively, is there some nuance with virDomainObjRemoveTransientDef
that needs adjustment vis-a-vis domain->persistent (I get brain cramps
going through that code).

In any case, perhaps this analysis prompts others to chime in...

John

> 
>          virDomainDefFree(persistDef);
> 
>      qemuMigrationCookieFree(mig);
> 
> -- 
> 
> 1.9.5.msysgit.1
> 
> 
> 
> --
> libvir-list mailing list
> libvir-list at redhat.com
> https://www.redhat.com/mailman/listinfo/libvir-list
> 




More information about the libvir-list mailing list