[libvirt-users] question about libvirt and suspending guests during live migration

Chris Friesen chris.friesen at windriver.com
Fri Mar 10 23:31:04 UTC 2017


Hi,

I hope someone can help me out.

I'm running into an issue with libvirt 1.2.12 reporting "operation failed: 
domain is no longer running" for a migration, even though qemu reports that 
the migration completed successfully.

The steps are:
1) create a guest with a stress test running in it to dirty memory at a high 
rate (fast enough that live migration would not normally complete)
2) trigger live migration with dom.migrateToURI2()
3) while migration is in progress, call dom.suspend() on the migrating domain.
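
For anyone who wants to try this, the steps above can be sketched roughly as 
follows. This is only a minimal sketch, not my actual test code: the helper 
names (migrate_then_suspend, reproduce), the one-second delay, the use of a 
worker thread, and the VIR_MIGRATE_LIVE | VIR_MIGRATE_PEER2PEER flags are all 
assumptions made for illustration.

```python
import threading
import time


def migrate_then_suspend(dom, dest_uri, flags, delay_s=1.0):
    """Start a migration in a worker thread, then suspend the domain mid-flight.

    Returns (migration_error, suspend_error); each is None on success.
    """
    result = {"migrate": None, "suspend": None}

    def run_migration():
        try:
            # dest_uri is the destination libvirtd URI (dconnuri); the other
            # arguments (miguri, dxml, dname, bandwidth) are left unset.
            dom.migrateToURI2(dest_uri, None, None, flags, None, 0)
        except Exception as exc:  # libvirt.libvirtError in real use
            result["migrate"] = exc

    worker = threading.Thread(target=run_migration)
    worker.start()

    time.sleep(delay_s)  # assumed delay so the migration is under way
    try:
        dom.suspend()
    except Exception as exc:
        result["suspend"] = exc

    worker.join()
    return result["migrate"], result["suspend"]


def reproduce(src_uri, dest_uri, domain_name):
    """Hypothetical driver; needs the libvirt Python bindings and two hosts."""
    import libvirt  # imported lazily so the helper above has no hard dependency

    conn = libvirt.open(src_uri)
    dom = conn.lookupByName(domain_name)
    # Assumed flags; the flags used in the original test are not stated.
    flags = libvirt.VIR_MIGRATE_LIVE | libvirt.VIR_MIGRATE_PEER2PEER
    return migrate_then_suspend(dom, dest_uri, flags)
```

The guest still needs a memory-dirtying workload running inside it, otherwise 
the migration may finish before suspend() is called.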

What I see at this point is the following:

a) At time 50.465 the monitoring code sees a VIR_DOMAIN_EVENT_SUSPENDED event, 
as expected.
b) An instrumented qemu logs the following:
51.143: done transferring state
51.143: done migration
51.144: qmp_query_migrate reporting state completed
c) At time 51.468 the monitoring code sees a VIR_DOMAIN_EVENT_RESUMED event, 
with detail of VIR_DOMAIN_EVENT_RESUMED_UNPAUSED
d) At time 51.469 the monitoring code sees a VIR_DOMAIN_EVENT_RESUMED event, 
with detail of VIR_DOMAIN_EVENT_RESUMED_MIGRATED

e) At time 51.471 the dom.migrateToURI2() call raises an exception (this is 
Python).  The corresponding libvirt log file shows:
	"error : virNetClientProgramDispatchError:177 : operation failed: domain is no longer running"
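
The lifecycle events in the timeline above can be captured with timestamps by 
something along these lines. Again this is a sketch under assumptions rather 
than the real monitoring code: make_lifecycle_logger and watch are hypothetical 
names, and the records are simply appended to a list.

```python
import time


def make_lifecycle_logger(records, clock=time.time):
    """Return a lifecycle callback that appends timestamped events to records."""

    def on_lifecycle(conn, dom, event, detail, opaque):
        # event and detail are the integer codes libvirt passes in, e.g.
        # VIR_DOMAIN_EVENT_RESUMED with detail VIR_DOMAIN_EVENT_RESUMED_MIGRATED.
        records.append((clock(), dom.name(), event, detail))

    return on_lifecycle


def watch(uri, records):
    """Hypothetical driver; needs the libvirt Python bindings."""
    import libvirt  # imported lazily so the logger above has no hard dependency

    # The event loop implementation has to be registered before the
    # connection is opened.
    libvirt.virEventRegisterDefaultImpl()
    conn = libvirt.open(uri)
    conn.domainEventRegisterAny(
        None,  # None means all domains on this connection
        libvirt.VIR_DOMAIN_EVENT_ID_LIFECYCLE,
        make_lifecycle_logger(records),
        None,
    )
    while True:
        libvirt.virEventRunDefaultImpl()  # dispatch pending events
```

Comparing the recorded (event, detail) pairs against the libvirt constants 
shows whether the extra RESUMED events appear around the time the migration 
call fails.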


For what it's worth, the problem seems to be fixed in libvirt 1.2.17.  In that 
version and later I don't see the VIR_DOMAIN_EVENT_RESUMED event; the migration 
just completes.
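
To check which side of that boundary a host is on, the packed integer returned 
by libvirt.getVersion() (major * 1,000,000 + minor * 1,000 + release) can be 
decoded as below. The helper names are hypothetical, and the 1.2.17 threshold 
is only what I observed, not a confirmed changelog entry.

```python
def decode_libvirt_version(number):
    """Split libvirt's packed version number into (major, minor, release)."""
    return (number // 1000000, (number // 1000) % 1000, number % 1000)


def has_observed_fix(number, fixed_in=(1, 2, 17)):
    # fixed_in reflects the behaviour observed in this report only.
    return decode_libvirt_version(number) >= fixed_in
```

For example, has_observed_fix(libvirt.getVersion()) tests the local library 
version.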

I'm looking at the libvirt history, but I figured I'd ask here too...

Thanks,
Chris



