[libvirt] [PATCH] Migration: Preserve the failed job in case migration job is terminated beyond the perform phase.

Jiri Denemark jdenemar at redhat.com
Thu Jan 25 15:13:44 UTC 2018


On Thu, Jan 25, 2018 at 19:51:23 +0530, Prerna Saxena wrote:
> In case of non-p2p migration, in case libvirt client gets disconnected from source libvirt
> after PERFORM phase is over, the daemon just resets the current migration job.
> However, the VM could be left paused on both source and destination in such case. In case
> the client reconnects and queries migration status, the job has been blanked out from source libvirt,
> and this reconnected client has no clear way of figuring out if an unclean migration had previously
> been aborted.

The virDomainGetState API should return VIR_DOMAIN_PAUSED with
VIR_DOMAIN_PAUSED_MIGRATION reason. Is this not enough?

> This patch calls out a "potentially" incomplete migration as a failed
> job, so that a client may

As you say it's potentially incomplete, so marking it as failed is not
completely correct. It's a split brain when the source cannot
distinguish whether the migration was successful or not.

> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index e8e0313..7c60d17 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -4564,6 +4564,22 @@ qemuDomainObjDiscardAsyncJob(virQEMUDriverPtr driver, virDomainObjPtr obj)
>      qemuDomainObjSaveJob(driver, obj);
>  }
>  
> +
> +void
> +qemuDomainObjFailAsyncJob(virQEMUDriverPtr driver, virDomainObjPtr obj)
> +{
> +    qemuDomainObjPrivatePtr priv = obj->privateData;
> +    VIR_FREE(priv->job.completed);
> +    if (VIR_ALLOC(priv->job.completed) == 0) {
> +        priv->job.current->type = VIR_DOMAIN_JOB_FAILED;
> +        priv->job.completed = priv->job.current;

This will just leak the memory allocated for priv->job.completed by
overwriting the pointer to the one from priv->job.current, ...

> +    } else {
> +        VIR_WARN("Unable to allocate job.completed for VM %s", obj->def->name);
> +    }
> +    qemuDomainObjResetAsyncJob(priv);

which will point to a freed memory after this call.

> +    qemuDomainObjEndJob(driver, obj);

And while this is almost certainly (I didn't really check though) not
something you should call from a close callback, you don't save the
changes to the status XML so once libvirtd restarts, it will think the
domain is still being migrated.

Jirka




More information about the libvir-list mailing list