[libvirt] [RFC v1 4/6] migration: Migration support for ephemeral hostdevs

Kamezawa Hiroyuki kamezawa.hiroyu at jp.fujitsu.com
Fri May 15 01:08:23 UTC 2015


On 2015/05/14 17:38, Daniel P. Berrange wrote:
> On Thu, May 14, 2015 at 10:02:39AM +0800, Chen Fan wrote:
>>
>> On 05/13/2015 10:30 PM, Laine Stump wrote:
>>> On 05/13/2015 05:57 AM, Daniel P. Berrange wrote:
>>>> On Wed, May 13, 2015 at 11:36:30AM +0800, Chen Fan wrote:
>>>>> add migration support for ephemeral host devices; introduce
>>>>> two functions, 'detach' and 'restore', to unplug/plug host devices
>>>>> during migration.
>>>>>
>>>>> Signed-off-by: Chen Fan <chen.fan.fnst at cn.fujitsu.com>
>>>>> ---
>>>>>   src/qemu/qemu_migration.c | 171 ++++++++++++++++++++++++++++++++++++++++++++--
>>>>>   src/qemu/qemu_migration.h |   9 +++
>>>>>   src/qemu/qemu_process.c   |  11 +++
>>>>>   3 files changed, 187 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
>>>>> index 56112f9..d5a698f 100644
>>>>> --- a/src/qemu/qemu_migration.c
>>>>> +++ b/src/qemu/qemu_migration.c
>>>>> +void
>>>>> +qemuMigrationRestoreEphemeralDevices(virQEMUDriverPtr driver,
>>>>> +                                     virConnectPtr conn,
>>>>> +                                     virDomainObjPtr vm,
>>>>> +                                     bool live)
>>>>> +{
>>>>> +    qemuDomainObjPrivatePtr priv = vm->privateData;
>>>>> +    virDomainDeviceDefPtr dev;
>>>>> +    int ret = -1;
>>>>> +    size_t i;
>>>>> +
>>>>> +    VIR_DEBUG("Rum domain restore ephemeral devices");
>>>>> +
>>>>> +    for (i = 0; i < priv->nEphemeralDevices; i++) {
>>>>> +        dev = priv->ephemeralDevices[i];
>>>>> +
>>>>> +        switch ((virDomainDeviceType) dev->type) {
>>>>> +        case VIR_DOMAIN_DEVICE_NET:
>>>>> +            if (live) {
>>>>> +                ret = qemuDomainAttachNetDevice(conn, driver, vm,
>>>>> +                                                dev->data.net);
>>>>> +            } else {
>>>>> +                ret = virDomainNetInsert(vm->def, dev->data.net);
>>>>> +            }
>>>>> +
>>>>> +            if (!ret)
>>>>> +                dev->data.net = NULL;
>>>>> +            break;
>>>>> +        case VIR_DOMAIN_DEVICE_HOSTDEV:
>>>>> +            if (live) {
>>>>> +                ret = qemuDomainAttachHostDevice(conn, driver, vm,
>>>>> +                                                 dev->data.hostdev);
>>>>> +            } else {
>>>>> +                ret = virDomainHostdevInsert(vm->def, dev->data.hostdev);
>>>>> +            }
>>>> This re-attach step is where we actually have far far far worse problems
>>>> than with detach. This is blindly assuming that the guest on the target
>>>> host can use the same hostdev that it was using on the source host.
>>> (kind of pointless to comment on, since pkrempa has changed my opinion
>>> by forcing me to think about the "failure to reattach" condition, but
>>> could be useful info for others)
>>>
>>> For a <hostdev>, yes, but not for <interface type='network'> (which
>>> would point to a libvirt network pool of VFs).
>>>
>>>> This
>>>> is essentially useless in the real world.
>>> Agreed (for plain <hostdev>)
>>>
>>>> Even if the same vendor/model
>>>> device is available on the target host, it is very unlikely to be available
>>>> at the same bus/slot/function that it had on the source. It is quite likely
>>>> necessary to allocate a completely different NIC, or if using SRIOV to allocate
>>>> a different function. It is also not uncommon to have different vendors/models,
>>>> so a completely different NIC may be required.
>>> In the case of a network device, a different brand/model of NIC at a
>>> different PCI address using a different guest driver shouldn't be a
>>> problem for the guest, as long as the MAC address is the same (for a
>>> Linux guest anyway; not sure what a Windows guest would do with a NIC
>>> that had the same MAC but used a different driver). This points out the
>>> folly of trying to do migration with attached hostdevs (managed at *any*
>>> level), for anything other than SRIOV VFs (which can have their MAC
>>> address set before attach, unlike non-SRIOV NICs).
>>>
>>> .
>> So should we focus first on implementing support for migration with SRIOV VFs?
>>
>> I think that is the simpler way to achieve my original target of implementing
>> NIC passthrough device migration, because sometimes we assign a physical NIC
>> to the guest to keep network I/O performance, and given the MAC limitation of
>> non-SRIOV NICs, as Laine said, an SRIOV NIC is cheaper than what we are trying
>> to do.
>
> No, I think you should /not/ attempt to implement this in libvirt at all
> and instead focus on the higher level apps.
>

Hmm, I think there are some roles that libvirt can take in the overall operation.

Let me clarify how things would go for PCI passthrough + migration.

  (1) The user (or higher level apps) defines pairs of PCI devices that can be
     swapped before/after migration.

  (2) The device pairs on the 2 hosts are described somewhere (a hypothetical
      example of such a description follows this list).

  (3) Before starting migration, the migration initiator checks that the other
      side of each device pair is available on the target host.

  (4) Unplug the PCI devices that are described as part of the device pairs.

  (5) Migrate, checking that all PCI passthrough devices have been unplugged.

  (6)  On success, plug in the PCI devices described as the target side of the pairs.
  (6') On failure, plug the unplugged devices back in.
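
For (2), I imagine something as simple as the sketch below. The format is purely
hypothetical (nothing in these patches defines it); it just lists the libvirt
node device names of each replaceable device on both hosts:

    # hypothetical pair description, one replaceable device per line
    # <nodedev on source host>    <nodedev on target host>
    pci_0000_03_10_2              pci_0000_82_10_2
    pci_0000_03_10_3              pci_0000_82_10_3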

I think
  (1) should be done by higher level apps or by the user (by hand).
  (2) should be a generic/vm-independent format.
  (3) should be checked by the migration initiator (rough sketch below).
  (4) should be done by an agent in the guest.
  (5) should be checked by the migration initiator.
  (6) should be done by an agent in the guest.
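
To show what the initiator side of (3) could look like, here is a rough,
untested sketch against the existing libvirt node device API. The pair struct
and the helper name are my own invention, not anything in these patches; only
the virConnect*/virNodeDevice* calls are real:

    /* Hypothetical helper: check that the target half of one device pair
     * exists on the destination host before we start the migration. */
    #include <stdio.h>
    #include <libvirt/libvirt.h>

    struct hostdev_pair {
        const char *src_nodedev;   /* e.g. "pci_0000_03_10_2" on the source */
        const char *dst_nodedev;   /* the replacement device on the target */
    };

    static int
    check_pair_on_target(const char *target_uri, const struct hostdev_pair *pair)
    {
        virConnectPtr conn;
        virNodeDevicePtr dev;

        if (!(conn = virConnectOpenReadOnly(target_uri)))
            return -1;

        /* Is the paired device present on the destination host? */
        if (!(dev = virNodeDeviceLookupByName(conn, pair->dst_nodedev))) {
            fprintf(stderr, "device %s not found on %s\n",
                    pair->dst_nodedev, target_uri);
            virConnectClose(conn);
            return -1;
        }

        virNodeDeviceFree(dev);
        virConnectClose(conn);
        return 0;
    }

(5) would be the same kind of loop on the source side, checking that every
hostdev listed in the pair description has actually been unplugged from the
domain before the RAM migration starts.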

Can't libvirt take a role in (2), (3) and (5)? Should all of it be done by the higher level app?
I wonder whether we should have a virt-tool to do this before going to the higher level.

Thanks,
-Kame
