[libvirt] [RFC PATCH] Add new migration flag VIR_MIGRATE_DRY_RUN

Mon Nov 12 18:33:04 UTC 2018

On 11/12/18 4:26 AM, Daniel P. Berrangé wrote:
> On Fri, Nov 02, 2018 at 04:34:02PM -0600, Jim Fehlig wrote:
>> A dry run can be used as a best-effort check that a migration command
>> will succeed. The destination host will be checked to see if it can
>> accommodate the resources required by the domain. DRY_RUN will fail if
>> the destination host is not capable of running the domain. Although a
>> subsequent migration will likely succeed, the success of DRY_RUN does not
>> ensure a future migration will succeed. Resources on the destination host
>> could become unavailable between a DRY_RUN and actual migration.
> 
> I'm not really convinced this is a particularly useful concept,
> as it is only going to catch a very small number of the reasons
> why migration can fail. So you still have to expect the real
> migration invocation to have a strong chance of failing.

I agree it is difficult to reliably check that a migration will succeed. TBH, I
was expecting opposition on the grounds that libvirt already provides info for
applications to do the check themselves, e.g. as nova has done with its
check_can_live_migrate_{source,destination} APIs.

Do you think libvirt provides enough information for an app to determine if a VM 
can be migrated between two hosts? Or maybe better asked: What info is currently 
missing for an app to reliably check if a VM can be migrated between two hosts?

>>
>> Signed-off-by: Jim Fehlig <jfehlig at suse.com>
>> ---
>>
>> If it is agreed this is useful, my thought was to use the begin and
>> prepare phases of migration to implement it. qemuMigrationDstPrepareAny()
>> already does a lot of the heavy lifting wrt checking the host can
>> accommodate the domain. Some of it, and the remaining migration phases,
>> can be short-circuited in the case of dry run.
>>
>> One interesting wrinkle I've observed is the check for cpu compatibility.
>> AFAICT qemu is actually invoked on the dst, "filtered-features" of the cpu
>> are requested via qmp, and results are checked against cpu in domain config.
>> If cpu on dst is insufficient, migration fails in the prepare phase with
>> something like "guest CPU doesn't match specification: missing features: z y z".
>> I was hoping to avoid launching qemu in the case of dry run, but that may
>> be unavoidable if we'd like a dependable dry run result.
> 
> Even launching QEMU isn't good enough - it has to actually process the
> migration data stream for devices to get a good indication of success,
> at which point you're basically doing a real migration.

Bummer. I guess that answers my question above: no. It also implies apps cannot 
reliably check if a migration will succeed and should instead put effort into 
handling errors from an actual migration :-).
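
As a rough illustration of that approach, here is a minimal sketch of an app
that just attempts the real migration and treats failure as a normal outcome,
moving on to the next candidate host. This is not libvirt code:
migrate_with_fallback(), MigrationError, MigrationResult and the host names are
all made up for the example. With the libvirt Python bindings, the callable
would presumably wrap something like virDomain.migrateToURI3() and catch
libvirt.libvirtError instead.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Optional


class MigrationError(Exception):
    """Hypothetical stand-in for the error a real migration call raises."""


@dataclass
class MigrationResult:
    host: Optional[str]  # destination that accepted the domain, or None
    failures: Dict[str, str] = field(default_factory=dict)  # host -> error text


def migrate_with_fallback(migrate: Callable[[str], None],
                          candidates: Iterable[str]) -> MigrationResult:
    """Try a real migration to each candidate host in order.

    No pre-check is attempted: the first host whose migration call returns
    without raising wins, and every error seen on the way is recorded.
    """
    failures: Dict[str, str] = {}
    for host in candidates:
        try:
            migrate(host)
        except MigrationError as err:
            # Record the reason and fall through to the next candidate.
            failures[host] = str(err)
            continue
        return MigrationResult(host=host, failures=failures)
    # Every candidate refused the domain.
    return MigrationResult(host=None, failures=failures)


if __name__ == "__main__":
    # Fake migration call: only "host-b" can run the domain (assumption
    # made purely for the demo).
    def fake_migrate(host: str) -> None:
        if host != "host-b":
            raise MigrationError(f"{host}: guest CPU doesn't match specification")

    result = migrate_with_fallback(fake_migrate, ["host-a", "host-b", "host-c"])
    print(result.host)
    print(result.failures)
```

The recorded failures give the app something to report if no host accepts the
domain, which is about as much as a dry run could have told it anyway.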

Regards,
Jim