[libvirt PATCH 2/2] qemu: Ignore failure in post-copy migration when QEMU says completed

Jiri Denemark jdenemar at redhat.com
Fri Nov 18 15:37:22 UTC 2022


When post-copy migration is in the Finish phase, we have already done
everything needed and we're just waiting for all the memory to transfer
to the destination. The domain is already running there at this
point. Once all data is transferred (QEMU sends a MIGRATION completed
event) we're done. So in this specific post-copy case the source does
not need to care about the result of the Finish call as long as QEMU
says migration completed. The Finish call to the destination daemon may
fail for reasons that do not affect QEMU, e.g., the libvirt daemon was
restarted there or the libvirt connection broke.

Currently we just mark the post-copy migration as failed on the source
and keep the domain paused there. But when the libvirt daemon is
restarted at this point, it detects that migration finished successfully
and kills the domain as migrated. It makes sense to do the same without
having to restart the daemon.

Closes: https://gitlab.com/libvirt/libvirt/-/issues/338

Signed-off-by: Jiri Denemark <jdenemar at redhat.com>
---
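For reviewers, the rule described in the commit message can be modelled
in isolation. The following is only a minimal standalone sketch, not
libvirt code: the mig_status enum and effective_retcode() are
hypothetical names made up for illustration, standing in for the QEMU
migration status and for the retcode override in the Confirm phase.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the migration status reported by QEMU. */
typedef enum {
    MIG_STATUS_ACTIVE,
    MIG_STATUS_POSTCOPY,
    MIG_STATUS_COMPLETED,
    MIG_STATUS_ERROR,
} mig_status;

/*
 * Hypothetical helper: return the retcode the Confirm phase should act on.
 *
 * finish_retcode: result of the Finish call on the destination (0 = success)
 * is_postcopy:    whether the outgoing migration is in post-copy mode
 * qemu_status:    last migration status reported by the source QEMU
 *
 * A failed Finish call is ignored only when the migration is post-copy
 * and QEMU itself reports completion; every other case is passed through.
 */
static int
effective_retcode(int finish_retcode, bool is_postcopy, mig_status qemu_status)
{
    if (finish_retcode != 0 &&
        is_postcopy &&
        qemu_status == MIG_STATUS_COMPLETED)
        return 0;

    return finish_retcode;
}
```

Under these assumptions, a failed Finish call is forced to success only
for a completed post-copy migration. A failure during pre-copy, or while
post-copy is still transferring memory, is passed through unchanged.
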
 src/qemu/qemu_migration.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index bba4e1dbf3..bef06f4caf 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -3901,6 +3901,7 @@ qemuMigrationSrcConfirmPhase(virQEMUDriver *driver,
     g_autoptr(qemuMigrationCookie) mig = NULL;
     qemuDomainObjPrivate *priv = vm->privateData;
     qemuDomainJobPrivate *jobPriv = vm->job->privateData;
+    qemuDomainJobDataPrivate *currentData = vm->job->current->privateData;
     virDomainJobData *jobData = NULL;
     qemuMigrationJobPhase phase;
 
@@ -3911,6 +3912,13 @@ qemuMigrationSrcConfirmPhase(virQEMUDriver *driver,
 
     virCheckFlags(QEMU_MIGRATION_FLAGS, -1);
 
+    if (retcode != 0 &&
+        virDomainObjIsPostcopy(vm, VIR_DOMAIN_JOB_OPERATION_MIGRATION_OUT) &&
+        currentData->stats.mig.status == QEMU_MONITOR_MIGRATION_STATUS_COMPLETED) {
+        VIR_DEBUG("Finish phase failed, but QEMU reports post-copy migration is completed; forcing success");
+        retcode = 0;
+    }
+
     if (flags & VIR_MIGRATE_POSTCOPY_RESUME) {
         phase = QEMU_MIGRATION_PHASE_CONFIRM_RESUME;
     } else if (virDomainObjIsFailedPostcopy(vm)) {
-- 
2.38.1


