[libvirt-users] question about libvirt and suspending guests during live migration

Martin Kletzander mkletzan at redhat.com
Mon Mar 13 12:38:31 UTC 2017


On Fri, Mar 10, 2017 at 05:31:04PM -0600, Chris Friesen wrote:
>Hi,
>
>I hope someone can help me out.
>
>I'm running into an issue with libvirt 1.2.12 reporting "operation failed:
>domain is no longer running" for a migration when qemu thinks it was fine.
>
>The steps are:
>1) create guest with stress test running in it to dirty memory at a high rate
>(fast enough that it would not normally complete live-migration)
>2) trigger live migration with dom.migrateToURI2()
>3) while migration is in progress, call dom.suspend() on the migrating domain.
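
For anyone trying to reproduce this, the steps above could be sketched
roughly as follows. This is only a minimal sketch, not the reporter's
actual script: the helper name `migrate_then_suspend`, the `suspend_delay`
parameter, the guest name and the destination URI are all hypothetical,
and the migration flags are left to the caller. The `dom` object is
assumed to behave like a `libvirt.virDomain`.

```python
import threading
import time


def migrate_then_suspend(dom, dest_uri, flags, suspend_delay=1.0):
    """Start a migration in a worker thread, then suspend the domain.

    `dom` is assumed to behave like a libvirt.virDomain. The helper name,
    `dest_uri` and `suspend_delay` are illustrative, not from the report.
    Returns the exception raised by migrateToURI2(), or None on success.
    """
    outcome = {"error": None}

    def run_migration():
        try:
            # Step 2: this call blocks until the migration ends or fails.
            dom.migrateToURI2(dest_uri, None, None, flags, None, 0)
        except Exception as exc:  # libvirt.libvirtError with a real domain
            outcome["error"] = exc

    worker = threading.Thread(target=run_migration)
    worker.start()
    try:
        # Give the migration time to get going before suspending.
        time.sleep(suspend_delay)
        if worker.is_alive():
            # Step 3: suspend the guest while the migration is in flight.
            dom.suspend()
    finally:
        worker.join()
    return outcome["error"]


# Hypothetical usage against a real host (names are assumptions):
#   import libvirt
#   conn = libvirt.open("qemu:///system")
#   dom = conn.lookupByName("stress-guest")
#   flags = libvirt.VIR_MIGRATE_LIVE | libvirt.VIR_MIGRATE_PEER2PEER
#   err = migrate_then_suspend(dom, "qemu+ssh://dest-host/system", flags)
#   print("migration error:", err)
```

Running the migration in a separate thread is just one way to have the
suspend land while migrateToURI2() is still blocked; the delay would
need tuning for the guest in question.
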
>
>What I see at this point is the following:
>
>a) At time 50.465 the monitoring code sees a VIR_DOMAIN_EVENT_SUSPENDED event,
>as expected.
>b) An instrumented qemu logs the following:
>51.143: done transferring state
>51.143: done migration
>51.144: qmp_query_migrate reporting state completed
>c) At time 51.468 the monitoring code sees a VIR_DOMAIN_EVENT_RESUMED event,
>with detail of VIR_DOMAIN_EVENT_RESUMED_UNPAUSED
>d) At time 51.469 the monitoring code sees a VIR_DOMAIN_EVENT_RESUMED event,
>with detail of VIR_DOMAIN_EVENT_RESUMED_MIGRATED
>
>e) At time 51.471 the dom.migrateToURI2() call raises an exception (this is
>Python).  The corresponding libvirt log file shows:
>	"error : virNetClientProgramDispatchError:177 : operation failed: domain is no
>longer running"
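
A timeline like the one above can be collected with a lifecycle event
callback. The following is a rough sketch under a few assumptions: the
names `describe` and `lifecycle_cb` are made up for illustration, and
the numeric values are copied from what I believe libvirt's
virDomainEventType and detail enums to be, only so that the snippet runs
without libvirt installed. Real code should use the constants from the
`libvirt` module instead.

```python
import time

# Assumed numeric values of the libvirt enums; prefer the libvirt.*
# constants in real code.
EVENT_NAMES = {
    3: "VIR_DOMAIN_EVENT_SUSPENDED",
    4: "VIR_DOMAIN_EVENT_RESUMED",
}
DETAIL_NAMES = {
    3: {0: "VIR_DOMAIN_EVENT_SUSPENDED_PAUSED",
        1: "VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED"},
    4: {0: "VIR_DOMAIN_EVENT_RESUMED_UNPAUSED",
        1: "VIR_DOMAIN_EVENT_RESUMED_MIGRATED"},
}


def describe(event, detail):
    """Turn an (event, detail) pair into a readable string."""
    event_name = EVENT_NAMES.get(event, "event %d" % event)
    detail_name = DETAIL_NAMES.get(event, {}).get(detail, "detail %d" % detail)
    return "%s / %s" % (event_name, detail_name)


def lifecycle_cb(conn, dom, event, detail, opaque):
    """Lifecycle callback: append (timestamp, description) to `opaque`."""
    opaque.append((time.time(), describe(event, detail)))


# Hypothetical registration with a real connection:
#   import libvirt
#   libvirt.virEventRegisterDefaultImpl()   # before opening the connection
#   conn = libvirt.open("qemu:///system")
#   log = []
#   conn.domainEventRegisterAny(None, libvirt.VIR_DOMAIN_EVENT_ID_LIFECYCLE,
#                               lifecycle_cb, log)
#   while True:
#       libvirt.virEventRunDefaultImpl()
```

Recording the detail alongside the event type is what makes it possible
to tell the RESUMED_UNPAUSED event apart from RESUMED_MIGRATED.
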
>
>
>For what it's worth, the problem seems to be fixed in libvirt 1.2.17.  In that
>version and later I don't see the VIR_DOMAIN_EVENT_RESUMED event, the migration
>just completes.
>
>I'm looking at the libvirt history, but I figured I'd ask here too...
>

I briefly looked at `git log v1.2.12..v1.2.17` and I haven't found
anything fixing this particular bug.  So it was probably fixed as a side
effect of some refactor.

It could've been commit c1a7f199e82e201e4f6f9401f65b9edc80f98349, or the
fact that we started using migration events, or just some of the block
drive refactors (it looks like there were many).  But that version is so
old that I don't think anybody will remember exactly what happened.

Sorry I can't help more than that,
Martin

>Thanks,
>Chris
>