[libvirt PATCH 50/80] qemu: Use QEMU_MIGRATION_PHASE_POSTCOPY_FAILED
Peter Krempa
pkrempa at redhat.com
Wed May 18 14:01:51 UTC 2022
On Wed, May 18, 2022 at 14:53:50 +0200, Jiri Denemark wrote:
> On Thu, May 12, 2022 at 13:50:08 +0200, Peter Krempa wrote:
> > On Tue, May 10, 2022 at 17:21:11 +0200, Jiri Denemark wrote:
> > > This phase marks a migration protocol as broken in a post-copy phase.
> > > Libvirt is no longer actively watching the migration in this phase as
> > > the migration API that started the migration failed.
> > >
> > > This may either happen when post-copy migration really fails (QEMU
> > > enters postcopy-paused migration state) or when the migration still
> > > progresses between both QEMU processes, but libvirt lost control of it
> > > because the connection between libvirt daemons (in p2p migration) or a
> > > daemon and client (non-p2p migration) was closed. For example, when one
> > > of the daemons was restarted.
> > >
> > > Signed-off-by: Jiri Denemark <jdenemar at redhat.com>
> > > ---
> > > src/qemu/qemu_migration.c | 15 +++++++++++----
> > > src/qemu/qemu_process.c | 16 +++++++++++++---
> > > 2 files changed, 24 insertions(+), 7 deletions(-)
> > >
> >
> > [...]
> >
> >
> > > @@ -6327,9 +6334,9 @@ qemuMigrationProcessUnattended(virQEMUDriver *driver,
> > > vm->def->name);
> > >
> > > if (job == VIR_ASYNC_JOB_MIGRATION_IN)
> > > - phase = QEMU_MIGRATION_PHASE_FINISH3;
> > > + phase = QEMU_MIGRATION_PHASE_FINISH_RESUME;
> > > else
> > > - phase = QEMU_MIGRATION_PHASE_CONFIRM3;
> > > + phase = QEMU_MIGRATION_PHASE_CONFIRM_RESUME;
> > >
> > > if (qemuMigrationJobStartPhase(vm, phase) < 0)
> > > return;
> >
> > This hunk seems to be misplaced or at least doesn't really seem to be
> > related to anything this patch is claiming to do.
>
> Thanks to changes in this patch migration is in
> QEMU_MIGRATION_PHASE_POSTCOPY_FAILED phase when we get here so to make
> it all work, we need to use RESUME phases >
> QEMU_MIGRATION_PHASE_POSTCOPY_FAILED. Otherwise qemuMigrationCheckPhase
> would complain. Perhaps this could be moved to "Add new migration phases
> for post-copy recovery", but I haven't checked for sure.
>
> >
> > [..]
> >
> > > @@ -3751,9 +3752,18 @@ qemuProcessRecoverMigration(virQEMUDriver *driver,
> > > return -1;
> > >
> > > if (rc > 0) {
> > > + job->phase = QEMU_MIGRATION_PHASE_POSTCOPY_FAILED;
> > > +
> > > if (migStatus == VIR_DOMAIN_JOB_STATUS_POSTCOPY) {
> > > VIR_DEBUG("Post-copy migration of domain %s still running, it "
> > > "will be handled as unattended", vm->def->name);
> > > +
> > > + if (state == VIR_DOMAIN_RUNNING)
> > > + reason = VIR_DOMAIN_RUNNING_POSTCOPY;
> > > + else
> > > + reason = VIR_DOMAIN_PAUSED_POSTCOPY;
> >
> > This bit also doesn't seem to be justified by what this patch is
> > supposed to do.
>
> Until now broken migration protocol in post-copy phase has been
> indicated with VIR_DOMAIN_*_POSTCOPY_FAILED states, which changed in
> this patch and we can use the right state here depending on the QEMU
> state. That said, it could be moved into a separate patch.
Fair enough on both points.
Reviewed-by: Peter Krempa <pkrempa at redhat.com>
More information about the libvir-list
mailing list