[libvirt] [PATCH] RFC: Support QEMU live upgrade

Zheng Sheng ZS Zhou zhshzhou at cn.ibm.com
Wed Nov 13 04:15:30 UTC 2013


Hi Daniel,

On 2013/11/12 20:23, Daniel P. Berrange wrote:
> On Tue, Nov 12, 2013 at 08:14:11PM +0800, Zheng Sheng ZS Zhou wrote:
>> Hi all,
>>
>> Recently QEMU developers are working on a feature to allow upgrading
>> a live QEMU instance to a new version without restarting the VM. This
>> is implemented as live migration between the old and new QEMU process
>> on the same host [1]. Here is the use case:
>>
>> 1) Guests are running QEMU release 1.6.1.
>> 2) Admin installs QEMU release 1.6.2 via RPM or deb.
>> 3) Admin starts a new VM using the updated QEMU binary, and asks the old
>> QEMU process to migrate the VM to the newly started VM.
>>
>> I think it will be very useful to support QEMU live upgrade in libvirt.
>> After some investigations, I found migrating to the same host breaks
>> the current migration code. I'd like to propose a new work flow for
>> QEMU live migration. It is to implement the above step 3).
> 
> How does it break migration code ? Your patch below is effectively
> re-implementing the multistep migration workflow, leaving out many
> important features (seamless reconnect to SPICE clients for example)
> which is really bad for our ongoing code support burden, so not
> something I want to see.
> 
> Daniel
> 

Actually I wrote another hack patch to investigate how we can re-use the existing framework to do local migration. I found the following problems.

(1) When migrating to a different host, the destination domain uses the same UUID and name as the source, and this is OK. When migrating to localhost, the destination domain's UUID and name conflict with the source's. The QEMU driver maintains a hash table of domain objects keyed by the UUID of the virtual machine. closeCallbacks is also a hash table with the domain UUID as key, and there may be other data structures using the UUID as key. This implies we must use a different name and UUID for the destination domain. In the migration framework, during the Begin and Prepare stages, virDomainDefCheckABIStability prevents us from using a different UUID, and the hostname and host UUID are also checked to be different from the source's. If we want to enable local migration, we have to skip these checks and generate a new UUID and name for the destination domain, then restore the original UUID after migration. The UUID is used by higher level management software to identify virtual machines, so it should stay the same after a QEMU live upgrade.
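To illustrate the key collision, here is a minimal Python sketch (not libvirt code; the `DomainTable` class is hypothetical, standing in for the driver's UUID-keyed hash table):

```python
import uuid

# Toy model of the QEMU driver's domain table, keyed by the VM's UUID.
class DomainTable:
    def __init__(self):
        self._by_uuid = {}

    def add(self, dom_uuid, name):
        if dom_uuid in self._by_uuid:
            raise ValueError("domain %s already registered" % dom_uuid)
        self._by_uuid[dom_uuid] = name

table = DomainTable()
src_uuid = str(uuid.uuid4())
table.add(src_uuid, "guest1")

# Local migration: the destination domain arrives with the same UUID,
# so registering it in the same table conflicts with the source.
try:
    table.add(src_uuid, "guest1")
    conflict = False
except ValueError:
    conflict = True

# Workaround from the text: register under a temporary UUID, then
# restore the original UUID once the migration completes.
tmp_uuid = str(uuid.uuid4())
table.add(tmp_uuid, "guest1")
print("conflict:", conflict)
```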

(2) If I understand the code correctly, libvirt uses a thread pool to handle RPC requests. This means local migration may cause a deadlock in P2P migration mode. Suppose there are several concurrent local migration requests and all the worker threads are occupied by them. When the source libvirtd connects to the destination libvirtd on the same host to negotiate the migration, the negotiation request is queued but never handled: the original migration request from the client is waiting for the negotiation request to finish before it can progress, while the negotiation request is queued behind the original request. This is one deadlock risk I can think of.
I guess in traditional migration mode, in which the client opens two connections to the source and destination libvirtd, there is also a risk of deadlock.
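The starvation pattern can be sketched with an ordinary bounded thread pool (a Python illustration only, not libvirt code; the one-worker pool just makes the exhaustion immediate, and the timeout stands in for what would otherwise hang forever):

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError

# One worker stands in for "all workers busy with migration requests".
pool = ThreadPoolExecutor(max_workers=1)

def migrate():
    # The migration request submits the peer-negotiation work into the
    # same pool and blocks on its result -- but no worker is free to
    # run it, since this very function occupies the only worker.
    negotiation = pool.submit(lambda: "negotiated")
    return negotiation.result(timeout=1)  # a real deadlock blocks forever

try:
    pool.submit(migrate).result()
    deadlocked = False
except TimeoutError:
    deadlocked = True

print("deadlocked:", deadlocked)
pool.shutdown(wait=True)
```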

(3) Libvirt supports a Unix domain socket transport, but only for tunnelled migration. For native migration it only supports TCP, so we need to enable Unix domain socket transport for native migration. We already have a hypervisor migration URI argument in the migration API, but there is no support for parsing and verifying a "unix:/full/path" URI and passing it transparently to QEMU. We could add this to the current migration framework, but direct Unix socket transport looks meaningless for normal migration.
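The parsing we would need amounts to something like the following (a Python sketch, purely illustrative; libvirt's real code is C and the helper name is made up):

```python
from urllib.parse import urlparse

def parse_migration_uri(uri):
    """Accept 'tcp://host:port' and 'unix:/full/path' migration URIs.

    A sketch of the validation libvirt would need before handing the
    URI transparently to QEMU's -incoming / migrate commands.
    """
    parsed = urlparse(uri)
    if parsed.scheme == "unix":
        if not parsed.path.startswith("/"):
            raise ValueError("unix transport needs an absolute path: %s" % uri)
        return ("unix", parsed.path)
    if parsed.scheme == "tcp":
        return ("tcp", parsed.netloc)
    raise ValueError("unsupported migration transport: %s" % uri)

print(parse_migration_uri("unix:/run/upgrade.sock"))   # -> ('unix', '/run/upgrade.sock')
print(parse_migration_uri("tcp://dsthost:49152"))      # -> ('tcp', 'dsthost:49152')
```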

(4) When migration fails, the source domain is resumed, and this may not work if we enable page-flipping in QEMU. With page-flipping enabled, QEMU transfers memory page ownership to the destination QEMU, so the source virtual machine should be restarted rather than resumed when migration fails.
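The recovery rule reduces to a small decision (sketched in Python; `page_flipping` stands for a hypothetical capability flag, not an existing libvirt name):

```python
def recover_source(page_flipping):
    """What to do with the source VM after a failed migration.

    Without page-flipping the source still owns its memory pages and
    can simply be resumed; with page-flipping, page ownership already
    moved to the destination, so the source must be restarted.
    """
    return "restart" if page_flipping else "resume"

print(recover_source(False))  # -> resume
print(recover_source(True))   # -> restart
```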

To summarize, I made a call flow of the migration with the things I hacked to enable local migration in the existing framework. It's a bit long, so I put it at the end of this mail. I found that to re-use the migration framework, I would need to change the interfaces of a few functions, add some flags, and pass them deep into the inner functions.

So I propose a new and compact work flow dedicated to QEMU live upgrade. After all, it's an upgrade operation built on a migration trick. When developing the previous RFC patch for the new API, I focused on the correctness of the work flow, so many other things are missing. I think I can add things like SPICE seamless migration when submitting new versions. I would also be really happy if you could give me some advice on re-using the migration framework, since re-using the current framework would save a lot of effort.


Appendix
Call flow to enable local migration in current migration framework
All conn->XXX() and dconn->XXX() are remote calls to libvirtd, then libvirtd dispatches the request to QEMU driver.
"domain", "conn" means source domain and libvirt connection, "ddomain", "dconn" means destination domain and libvirt connection.
Things I hacked are marked with /* HACKED */.

virDomainMigrate(...)
  virDomainMigrateVersion3(...) -> virDomainMigrateVersion3Full(...)
    dom_xml = conn->domainMigrateBegin3(&cookieout...) => qemuDomainMigrateBegin3
      vm = qemuDomObjFromDomain(domain)
      qemuMigrationBegin(vm, ...)
        qemuMigrationBeginPhase(vm, ...)
          Generate migration cookie: qemuMigrationEatCookie(NULL, ...); qemuMigrationBakeCookie(...)
          if (xmlin) def = virDomainDefParseString(xmlin, ...)
            virDomainDefCheckABIStability(def, vm->def) /* HACKED to skip domain uuid check */
            qemuDomainDefFormatLive(def / vm->def, ...)

    cookiein = cookieout
    dconn->domainMigratePrepare3(uri, &uri_out, cookiein, &cookieout, dname, dom_xml, ...) => qemuDomainMigratePrepare3
      def = qemuMigrationPrepareDef(dom_xml, ...)
      qemuMigrationPrepareDirect(uri, uri_out, &def, ...)
        parse uri and build uri_out;
        qemuMigrationPrepareAny(def, ...)
          new_def = call migration hooks to manipulate def
          if (virDomainDefCheckABIStability(def, new_def, ...)) /* HACKED to skip domain uuid check */
            def = new_def;
          migrateFrom = "tcp:host:port" /* HACKED to "unix:/full/path" */
          vm = virDomainObjListAdd(domains, def, ...)
          Verify cookie: qemuMigrationEatCookie(cookiein, ...)
            qemuMigrationCookieXMLParseStr(...)
              qemuMigrationCookieXMLParse(...)
                verify our hostname and host UUID are different from source /* HACKED to skip host uuid & name check */
          qemuProcessStart(vm, migrateFrom, ...)
            qemu --incoming "tcp:host:port" --hda ...
          qemuMigrationBakeCookie(cookieout, ...)

    uri = uri_out
    cookiein = cookieout
    conn->domainMigratePerform3(uri, cookiein, &cookieout, ...) => qemuDomainMigratePerform3
      vm = qemuDomObjFromDomain(dom)
      qemuMigrationPerform(uri, ...)
        qemuMigrationPerformPhase(uri, ...)
          doNativeMigrate(vm, uri, ...)
            build qemu migration spec /* HACKED to replace tcp transport spec with unix socket spec */
            qemuMigrationRun(vm, spec, ...)
              qemuMigrationEatCookie(...) /* HACKED to skip host uuid & name check */
              qemuMonitorMigrateToHost / qemuMonitorMigrateToUnix / qemuMonitorMigrateToFd
                qemuMonitorJSONMigrate(uri, ...)
                  cmd = qemuMonitorJSONMakeCommand("migrate", "s:uri", uri, ...)
                  qemuMonitorJSONCommand(cmd)
              start tunneling between src and dst unix socket /* HACKED to skip tunneling */
              qemuMigrationWaitForCompletion(...)
                Poll every 50ms for progress, check error, and allow cancellation
              qemuMigrationBakeCookie(cookieout, ...)

    cookiein = cookieout
    ddomain = dconn->domainMigrateFinish3(dname, uri, cookiein, &cookieout...) => qemuDomainMigrateFinish3
      vm = virDomainObjListFindByName(driver->domains, dname)
      qemuMigrationFinish(vm, ...)
        qemuMigrationEatCookie(...) /* HACKED to skip host uuid & name check */
        qemuProcessStartCPUs(...) -> ... Send QMP command "cont" to QEMU
        dom = virGetDomain(dconn, vm->def->name, vm->def->uuid)
        qemuMigrationBakeCookie(cookieout, ...)
        return dom

    cookiein = cookieout
    conn->domainMigrateConfirm3(cookiein, ...) => qemuDomainMigrateConfirm3
      vm = qemuDomObjFromDomain(domain)
      qemuMigrationConfirm(vm, cookiein, ...)
        qemuMigrationConfirmPhase(vm, cookiein, ...)
          qemuMigrationEatCookie(cookiein, ...) /* HACKED to skip host uuid & name check */
          qemuProcessStop(vm, ...)

    return ddomain

Thanks and best regards!
_____________________________
Zhou Zheng Sheng / 周征晟
Software Engineer
E-mail: zhshzhou at cn.ibm.com
Telephone: 86-10-82454397



