[libvirt] using sync_manager with libvirt

Wed Aug 11 21:19:27 UTC 2010

On Wed, Aug 11, 2010 at 04:53:20PM -0400, Chris Lalancette wrote:
> > 1. sm-S holds the lease, and is monitoring qemu
> > 2. migration begins from S to D
> > 3. libvirt-D runs sm-D: sync_manager -c qemu with the addition of a new
> >    sync_manager option --receive-lease
> > 4. sm-D writes its hostid D to the lease area signaling sm-S that it wants
> >    to be the lease owner when S is done with it
> > 5. sm-D begins monitoring the lease owner on disk (which is still S)
> > 6. sm-D forks qemu-D
> > 7. sm-S sees that D wants the lease
> > 8. qemu-S exits with success
> > 9. sm-S sees qemu-S exit with success
> > 10. sm-S writes D as the lease owner into the lease area and exits
> >     (in the non-migration/transfer case, sm-S writes owner=LEASE_FREE)
> > 11. sm-D (still monitoring the lease owner) sees that it has become the
> >     owner, and begins renewing the lease
> > 12. qemu-D runs fully
> 
> Unfortunately, this is not how migration works in qemu/kvm.  Using your
> nomenclature above, it's more like the following:
> 
> A guest is running on S.  A migration is then initiated, at which point D
> fires up a qemu process with a -incoming argument.  

libvirt starts qemu -incoming on D, right?  So with sync_manager, libvirt
would start: sync_manager --receive-lease -c qemu -incoming

> This is sort of
> a container process that will receive all of the migration data.  Crucially
> for sync_manager, though, qemu completely starts up and "attaches" to all of
> the resources (including disks) *while* qemu at S is still running.  Then it
> enters a sort of paused state (where the guest cannot run), and receives
> all of the migration data.  

That should all be fine.

> Once all of the migration data has been received, the guest on S is destroyed,

OK, sm-S sees qemu-S exit at that point.

> and the guest on D is unpaused.  

The critical bit would be ensuring that sm-S has written owner=D into
the lease area before qemu-D is unpaused.  Hooking into the sequence at
that point in time might be too difficult or ugly; I don't know.

> That's why Dan
> mentioned that we need two hosts to access the disk at once.

It would be easiest, of course, if the lease owner always represented where
qemu was running, but that obviously won't work with migration.  So we have
to settle for the lease owner always representing where qemu is unpaused.
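
To make the ordering concrete, here is a rough sketch in Python of the
handoff as a toy in-memory model.  It is not sync_manager or libvirt code:
LeaseArea, Host, request_lease, release_on_qemu_exit and unpause_qemu are
all made-up names, and the shared object only stands in for the on-disk
lease area.  It assumes the unpause on D can be gated on the lease owner,
which is exactly the hook that may turn out to be difficult.

```python
# Toy in-memory model of the lease handoff; every name here is hypothetical
# and none of this is real sync_manager or libvirt code.
LEASE_FREE = None


class LeaseArea:
    """Stands in for the on-disk lease area shared by both hosts."""

    def __init__(self, owner):
        self.owner = owner   # host currently holding the lease
        self.wants = None    # host that has asked to receive it


class Host:
    def __init__(self, name, lease, qemu_unpaused=False):
        self.name = name
        self.lease = lease
        self.qemu_unpaused = qemu_unpaused

    def request_lease(self):
        # sm-D: record that this host wants the lease once the owner is done.
        self.lease.wants = self.name

    def release_on_qemu_exit(self):
        # sm-S: qemu has exited, so hand the lease over, or free it.
        assert self.lease.owner == self.name
        self.qemu_unpaused = False
        if self.lease.wants is not None:
            self.lease.owner = self.lease.wants
        else:
            self.lease.owner = LEASE_FREE
        self.lease.wants = None

    def unpause_qemu(self):
        # The hook point: refuse to unpause until the lease names this host.
        if self.lease.owner != self.name:
            raise RuntimeError(f"{self.name} does not own the lease yet")
        self.qemu_unpaused = True


def migrate(src, dst):
    dst.request_lease()          # qemu -incoming is started (paused) on D
    src.release_on_qemu_exit()   # guest on S destroyed, owner=D written
    dst.unpause_qemu()           # only now may the guest run on D
```

With this ordering there is never a moment where qemu is unpaused on a
host that the lease does not name, even though both qemu processes have
the disks open during the migration.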

Dave