[libvirt] [PATCH 3/6] Introduce yet another migration version in API.

Daniel P. Berrange berrange at redhat.com
Thu Apr 21 11:52:37 UTC 2011


On Wed, Apr 20, 2011 at 10:38:40PM -0500, Christian Benvenuti (benve) wrote:
> > On 04/20/2011 05:28 PM, Christian Benvenuti (benve) wrote:
> > > Daniel,
> > >    I looked at the patch-set you sent out on 2/9/11
> > >
> > >    [libvirt] [PATCH 0/6] Introduce a new migration protocol
> > >                          to QEMU driver
> > >    http://www.mail-archive.com/libvir-list@redhat.com/msg33223.html
> > >
> > > What is the status of this new migration protocol?
> > > Is there any pending issue blocking its integration?
> > >
> > > I would like to propose an RFC enhancement to the migration
> > > algorithm.
> > >
> > > Here is a quick summary of the proposal/idea.
> > >
> > > - finer control on migration result
> > >
> > >    - possibility of specifying what features cannot fail
> > >      their initialization on the dst host during migration.
> > >      Migration should not succeed if any of them fails.
> > >      - optional: each one of those features should be able to
> > >                  provide a deinit function to clean up resources
> > >                  on the dst host if migration fails.
> > >
> > > This functionality would be useful for the (NIC) set port
> > > profile feature VDP (802.1Qbg/1Qbh), but what I propose is
> > > a generic config option / API that can be used by any feature.
> > >
> > > And now the details.
> > >
> > > ----------------------------------------------
> > > enhancement: finer control on migration result
> > > ----------------------------------------------
> > >
> > > There are different reasons why a VM may need (or be forced) to
> > > migrate.
> > > Migrations can also be classified according to their
> > > semantics.
> > > For simplicity I'll classify them into two categories, based on
> > > how important it is for the VM to migrate as fast as possible:
> > >
> > > (1) It IS important
> > >
> > >     In this case, it matters less whether the VM is (temporarily)
> > >     unable to use certain resources (for example the network) on
> > >     the dst host, because the completion of the migration is
> > >     considered higher priority.
> > >     A possible scenario could be a server that must migrate ASAP
> > >     because of a disaster/emergency.
> > >
> > > (2) It IS NOT important
> > >
> > >     I can think of a VM whose applications/servers need a network
> > >     connection in order to work properly. Losing such network
> > >     connectivity as a consequence of a migration would be
> > >     unacceptable (or at least highly undesirable).
> > >
> > > Given case (2) above, I have a comment about the Finish
> > > step, with regard to the port profile (VDP) codepath.
> > >
> > > The call to
> > >
> > >      qemuMigrationVPAssociatePortProfile
> > >
> > > in
> > >      qemuMigrationFinish
> > >
> > > can fail, but its result (success or failure) does not influence
> > > the result of the migration Finish step (it was already like this
> > > in migration V2).
> > 
> > I *believe* the underlying problem is QEMU's switch-over. Once QEMU
> > decides that the migration was successful, QEMU on the source side dies
> > and continues running on the destination side. I don't think any
> > further handshakes with higher layers are foreseen through which this
> > could be reversed or the switch-over delayed, but correct me if I am
> > wrong...
> 
> Actually I think this is not what happens in migration V3.
> My understanding is this:
> 
> - the qemu cmdline built by Libvirt on the dst host during Prepare3
>   includes the "-S" option (ie no autostart)
> 
> - the VM on the dst host does not start running until libvirt
>   calls qemuProcessStartCPUs in the Finish3 step.
>   This fn simply sends the "cont" cmd to the monitor to
>   start the VM/CPUs.
>
> If I am right, libvirt does have full control over how/when to start
> the CPUs on the dst host; QEMU does not do it on its own.

That is correct. It is libvirt that decides when to kill the src
QEMU, and in theory when to start the CPUs on the dst. In practice
we can't reliably determine the latter, until QEMU gives us more
info, so we just start CPUs once src has finished sending data.
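
To make the ordering concrete, here is a toy model of that handover.
It is only a sketch: ToyQemu, State and finish() are made-up names for
illustration, not libvirt or QEMU APIs, and the failure path (resume
the source, discard the paused destination) is an assumption about what
a management layer would reasonably do rather than a description of the
actual code.

```python
from enum import Enum


class State(Enum):
    PAUSED = "paused"
    RUNNING = "running"
    DEAD = "dead"


class ToyQemu:
    """Stand-in for a QEMU process; hypothetical, not a real API."""

    def __init__(self, name, state):
        self.name = name
        self.state = state

    def cont(self):
        # Models the monitor "cont" command, which resumes the guest CPUs.
        if self.state is State.DEAD:
            raise RuntimeError(f"{self.name} is dead")
        self.state = State.RUNNING

    def kill(self):
        self.state = State.DEAD


def finish(src, dst, src_done_sending):
    """The management layer, not QEMU, picks the winner."""
    if not src_done_sending:
        # Assumed failure path: throw away the paused destination and
        # let the guest carry on where it was.
        dst.kill()
        src.cont()
        return False
    dst.cont()  # start the destination CPUs first
    src.kill()  # only then tear down the source
    return True
```

With the destination launched paused (the "-S" case discussed above),
nothing runs on the dst until finish() decides it should.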

> The only thing libvirt does not control is when to pause the VM
> on the src host: QEMU does it during stage 2 of the live RAM copy,
> based on the max_downtime config.
> However I do not think this represents a problem.

Correct, that's no problem. The key thing is that libvirt
decides when to start dst CPUs & kill src QEMU process.
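
Given that control, the "features that must not fail" idea quoted at
the top could in principle be slotted in just before the dst CPUs are
started. A rough sketch follows, with every name in it hypothetical
(nothing here is existing libvirt code): each feature is a
(name, init, deinit) tuple, deinit is optional, and a failure rolls
back the already-initialised features in reverse order.

```python
def init_required_features(features):
    """Run each init in order; return the name of the first failure.

    On failure, the deinit hooks of the features that had already
    succeeded are run in reverse order. Returns None if all succeed.
    """
    initialised = []
    for name, init, deinit in features:
        try:
            ok = init()
        except Exception:
            ok = False
        if not ok:
            for _, prev_deinit in reversed(initialised):
                if prev_deinit is not None:
                    prev_deinit()
            return name
        initialised.append((name, deinit))
    return None


def finish_with_required_features(features, start_cpus):
    """Start the destination CPUs only if every required feature is up."""
    failed = init_required_features(features)
    if failed is not None:
        return False, failed
    start_cpus()
    return True, None
```

For the VDP case, the port profile association would be one entry in
the list, so a failed association would make the Finish step fail
instead of being ignored.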

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|



