[libvirt PATCH 50/80] qemu: Use QEMU_MIGRATION_PHASE_POSTCOPY_FAILED

Jiri Denemark jdenemar at redhat.com
Wed May 18 12:53:50 UTC 2022


On Thu, May 12, 2022 at 13:50:08 +0200, Peter Krempa wrote:
> On Tue, May 10, 2022 at 17:21:11 +0200, Jiri Denemark wrote:
> > This phase marks a migration protocol as broken in a post-copy phase.
> > Libvirt is no longer actively watching the migration in this phase as
> > the migration API that started the migration failed.
> > 
> > This may either happen when post-copy migration really fails (QEMU
> > enters postcopy-paused migration state) or when the migration still
> > progresses between both QEMU processes, but libvirt lost control of it
> > because the connection between libvirt daemons (in p2p migration) or a
> > daemon and client (non-p2p migration) was closed. For example, when one
> > of the daemons was restarted.
> > 
> > Signed-off-by: Jiri Denemark <jdenemar at redhat.com>
> > ---
> >  src/qemu/qemu_migration.c | 15 +++++++++++----
> >  src/qemu/qemu_process.c   | 16 +++++++++++++---
> >  2 files changed, 24 insertions(+), 7 deletions(-)
> > 
> 
> [...]
> 
> 
> > @@ -6327,9 +6334,9 @@ qemuMigrationProcessUnattended(virQEMUDriver *driver,
> >                vm->def->name);
> >  
> >      if (job == VIR_ASYNC_JOB_MIGRATION_IN)
> > -        phase = QEMU_MIGRATION_PHASE_FINISH3;
> > +        phase = QEMU_MIGRATION_PHASE_FINISH_RESUME;
> >      else
> > -        phase = QEMU_MIGRATION_PHASE_CONFIRM3;
> > +        phase = QEMU_MIGRATION_PHASE_CONFIRM_RESUME;
> >  
> >      if (qemuMigrationJobStartPhase(vm, phase) < 0)
> >          return;
> 
> This hunk seems to be misplaced or at least doesn't really seem to be
> related to anything this patch is claiming to do.

Thanks to changes in this patch migration is in
QEMU_MIGRATION_PHASE_POSTCOPY_FAILED phase when we get here so to make
it all work, we need to use RESUME phases >
QEMU_MIGRATION_PHASE_POSTCOPY_FAILED. Otherwise qemuMigrationCheckPhase
would complain. Perhaps this could be moved to "Add new migration phases
for post-copy recovery", but I haven't checked for sure.

> 
> [..]
> 
> > @@ -3751,9 +3752,18 @@ qemuProcessRecoverMigration(virQEMUDriver *driver,
> >          return -1;
> >  
> >      if (rc > 0) {
> > +        job->phase = QEMU_MIGRATION_PHASE_POSTCOPY_FAILED;
> > +
> >          if (migStatus == VIR_DOMAIN_JOB_STATUS_POSTCOPY) {
> >              VIR_DEBUG("Post-copy migration of domain %s still running, it "
> >                        "will be handled as unattended", vm->def->name);
> > +
> > +            if (state == VIR_DOMAIN_RUNNING)
> > +                reason = VIR_DOMAIN_RUNNING_POSTCOPY;
> > +            else
> > +                reason = VIR_DOMAIN_PAUSED_POSTCOPY;
> 
> This bit also doesn't seem to be justified by what this patch is
> supposed to do.

Until now broken migration protocol in post-copy phase has been
indicated with VIR_DOMAIN_*_POSTCOPY_FAILED states, which changed in
this patch and we can use the right state here depending on the QEMU
state. That said, it could be moved into a separate patch.

Jirka


More information about the libvir-list mailing list