[libvirt] [PATCH] migration: Fix possible bug for migrate cancel

Gonglei (Arei) arei.gonglei at huawei.com
Tue Mar 25 11:15:51 UTC 2014


> -----Original Message-----
> From: Eric Blake [mailto:eblake at redhat.com]
> Sent: Tuesday, March 25, 2014 12:01 AM
> To: Paolo Bonzini; Gonglei (Arei); qemu-devel at nongnu.org
> Cc: quintela at redhat.com; owasserm at redhat.com; Yanqiangjun; Zhaoyanbin
> (A); Zengjunliang; libvir-list at redhat.com
> Subject: Re: [PATCH] migration: Fix possible bug for migrate cancel
> 
> [adding libvirt]
> 
> On 03/24/2014 09:47 AM, Paolo Bonzini wrote:
> > Il 24/03/2014 14:04, arei.gonglei at huawei.com ha scritto:
> >> From: zengjunliang <zengjunliang at huawei.com>
> >>
> >> Return error for migrate cancel, when migration status is not
> >> MIG_STATE_SETUP or MIG_STATE_ACTIVE. Thus, libvirt can
> >> perceive that the operation failed.
> >>
> >> Signed-off-by: zengjunliang <zengjunliang at huawei.com>
> >> Signed-off-by: Gonglei <arei.gonglei at huawei.com>
> >
> > I think this is done on purpose, because canceling migration is racy.
> > Instead, libvirt should do "query-migrate" and check if the migration
> > was completed or canceled.
> 
> Can you please give more details on how you are triggering the problem
> with libvirt?  I think Paolo is probably right - the bug is more likely
> to be in libvirt not expecting the race and not recovering correctly
> when the race occurs, than it is to be in changing qemu's state algorithm.
> 
When the migration progress reaches 100%, the migration status becomes MIG_STATE_COMPLETED in QEMU.
There is then a delay between the status becoming MIG_STATE_COMPLETED and the migration thread's resources being reclaimed.
If we cancel the migration during this window, the migrate_fd_cancel function breaks out directly without reporting
an error code. Libvirt then treats the cancel operation as a success, which is contrary to the facts.
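
To make the window concrete, here is a rough toy model in Python. It is not QEMU or libvirt code: the ToyMigration class and the cancel_and_verify helper are hypothetical names invented for illustration, and only the state names follow the ones discussed in this thread. It shows a cancel that silently does nothing once the state is COMPLETED, and a caller that queries the state afterwards (as Paolo suggested with "query-migrate") instead of trusting the cancel call's return.

```python
from enum import Enum


class MigState(Enum):
    SETUP = "setup"
    ACTIVE = "active"
    COMPLETED = "completed"
    CANCELLED = "cancelled"


class ToyMigration:
    """Hypothetical stand-in for the migration state, for illustration only."""

    def __init__(self, state):
        self.state = state

    def cancel(self):
        # Models the behaviour described above: the cancel only takes
        # effect in SETUP or ACTIVE. In any other state it does nothing
        # and reports no error, so the caller cannot tell the difference.
        if self.state in (MigState.SETUP, MigState.ACTIVE):
            self.state = MigState.CANCELLED

    def query(self):
        # Models asking for the current migration status after the fact.
        return self.state


def cancel_and_verify(mig):
    """Issue a cancel, then check whether it actually took effect."""
    mig.cancel()
    return mig.query() is MigState.CANCELLED


if __name__ == "__main__":
    print(cancel_and_verify(ToyMigration(MigState.ACTIVE)))
    print(cancel_and_verify(ToyMigration(MigState.COMPLETED)))
```

In this sketch the caller only learns that the cancel lost the race by checking the state afterwards. The patch takes the other approach and makes the cancel itself return an error in that case.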

Best regards,
-Gonglei




