[libvirt] [PATCH 3/6] Introduce yet another migration version in API.
Stefan Berger
stefanb at linux.vnet.ibm.com
Thu Apr 21 11:37:30 UTC 2011
On 04/20/2011 11:38 PM, Christian Benvenuti (benve) wrote:
>> On 04/20/2011 05:28 PM, Christian Benvenuti (benve) wrote:
>>> Daniel,
>>> I looked at the patch-set you sent out on the 2/9/11
>>>
>>> [libvirt] [PATCH 0/6] Introduce a new migration protocol
>>> to QEMU driver
>>> http://www.mail-archive.com/libvir-list@redhat.com/msg33223.html
>>>
>>> What is the status of this new migration protocol?
>>> Is there any pending issue blocking its integration?
>>>
>>> I would like to propose an RFC enhancement to the migration
>>> algorithm.
>>>
>>> Here is a quick summary of the proposal/idea.
>>>
>>> - finer control on migration result
>>>
>>> - possibility of specifying what features cannot fail
>>> their initialization on the dst host during migration.
>>> Migration should not succeed if any of them fails.
>>> - optional: each one of those features should be able to
>>> provide a deinit function to cleanup resources
>>> on the dst host if migration fails.
>>>
>>> This functionality would be useful for the (NIC) set port
>>> profile feature VDP (802.1Qbg/1Qbh), but what I propose is
>>> a generic config option / API that can be used by any feature.
>>>
>>> And now the details.
>>>
>>> ----------------------------------------------
>>> enhancement: finer control on migration result
>>> ----------------------------------------------
>>>
>>> There are different reasons why a VM may need (or be forced) to
>>> migrate.
>>> You can classify the types of the migrations also based on
>>> different semantics.
>>> For simplicity I'll classify them into two categories, based on
>>> how important it is for the VM to migrate as fast as possible:
>>>
>>> (1) It IS important
>>>
>>> In this case, whether the VM will be (temporarily) unable to
>>> make use of certain resources (for example the network) on the
>>> dst host is not that important, because the completion of the
>>> migration is considered higher priority.
>>> A possible scenario could be a server that must migrate ASAP
>>> because of a disaster/emergency.
>>>
>>> (2) It IS NOT important
>>>
>>> I can think of a VM whose applications/servers need a network
>>> connection in order to work properly. Losing such network
>>> connectivity as a consequence of a migration would be
>>> unacceptable (or highly undesirable).
>>>
>>> Given the case (2) above, I have a comment about the Finish
>>> step, with regards to the port profile (VDP) codepath.
>>>
>>> The call to
>>>
>>> qemuMigrationVPAssociatePortProfile
>>>
>>> in
>>> qemuMigrationFinish
>>>
>>> can fail, but its result (success or failure) does not influence
>>> the result of the migration Finish step (it was already like this
>>> in migration V2).
>> I *believe* the underlying problem is Qemu's switch-over. Once Qemu
>> decides that the migration was successful, Qemu on the source side dies
>> and continues running on the destination side. I don't think there are
>> more handshakes foreseen with higher layers that this could be reversed
>> or the switch-over delayed, but correct me if I am wrong...
> Actually I think this is not what happens in migration V3.
> My understanding is this:
>
> - the qemu cmdline built by Libvirt on the dst host during Prepare3
> includes the "-S" option (ie no autostart)
>
> - the VM on the dst host does not start running until libvirt
> calls qemuProcessStartCPUs in the Finish3 step.
> This fn simply sends the "cont" cmd to the monitor to
> start the VM/CPUs.
That's correct, but v2 already does this. The no-autostart option
(-S) corresponds to Qemu's autostart flag here (migration.c):
void process_incoming_migration(QEMUFile *f)
{
    if (qemu_loadvm_state(f) < 0) {
        fprintf(stderr, "load of migration failed\n");
        exit(0);
    }
    qemu_announce_self();
    DPRINTF("successfully loaded vm state\n");
    incoming_expected = false;
    if (autostart)
        vm_start();
}
and simply doesn't start the VM. After this function is called, all
sockets are closed and communication with the source host is cut. I
don't think it allows for fall-back at this point.

Rather, we may need a 'wait' option for migration: before the
    qemu_put_byte(f, QEMU_VM_EOF);
in qemu_savevm_state_complete(), sync with the monitor and wait for
something like migrate_finish or migrate_cancel.
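
A minimal sketch of that 'wait' idea follows, written as a standalone
simulation rather than as a Qemu patch. Everything in it is an assumption
made for illustration: FakeFile stands in for QEMUFile, complete_with_wait()
stands in for the tail of qemu_savevm_state_complete(), and the
MIG_DECISION_* values stand in for the hypothetical migrate_finish and
migrate_cancel monitor commands, neither of which exists today. The value
of QEMU_VM_EOF is only a placeholder here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder value for illustration; the real definition lives in Qemu. */
#define QEMU_VM_EOF 0x00

/* Hypothetical decision delivered by the management layer via the monitor. */
typedef enum {
    MIG_DECISION_PENDING,
    MIG_DECISION_FINISH,   /* stands in for a "migrate_finish" command */
    MIG_DECISION_CANCEL    /* stands in for a "migrate_cancel" command */
} MigDecision;

/* Toy stand-in for QEMUFile: a small in-memory byte buffer. */
typedef struct {
    uint8_t buf[16];
    size_t len;
} FakeFile;

static void fake_put_byte(FakeFile *f, uint8_t v)
{
    assert(f != NULL && f->len < sizeof(f->buf));
    f->buf[f->len++] = v;
}

/* Callback that asks the (simulated) monitor for the current decision. */
typedef MigDecision (*PollMonitorFn)(void *opaque);

/*
 * Tail of a "complete" step with an optional wait before the EOF marker.
 * Returns true if EOF was written (switch-over committed) and false if
 * the migration was cancelled before EOF, so the source keeps the guest.
 * A real implementation would be event-driven instead of polling in a loop.
 */
static bool complete_with_wait(FakeFile *f, bool wait,
                               PollMonitorFn poll, void *opaque)
{
    if (wait) {
        MigDecision d;
        do {
            d = poll(opaque);
        } while (d == MIG_DECISION_PENDING);
        if (d == MIG_DECISION_CANCEL) {
            return false;
        }
    }
    fake_put_byte(f, QEMU_VM_EOF);
    return true;
}

/* Scripted monitor: reports PENDING a few times, then a fixed outcome. */
typedef struct {
    int pending_polls;
    MigDecision outcome;
} ScriptedMonitor;

static MigDecision scripted_poll(void *opaque)
{
    ScriptedMonitor *m = opaque;
    if (m->pending_polls > 0) {
        m->pending_polls--;
        return MIG_DECISION_PENDING;
    }
    return m->outcome;
}
```

With a ScriptedMonitor whose outcome is MIG_DECISION_CANCEL, the call
returns false and nothing is written to the stream, which is the fall-back
window that is missing today. With wait set to false, the EOF byte is
written immediately, matching the current behaviour.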
Regards,
Stefan