[libvirt] [RFC v1 0/6] Live Migration with ephemeral host NIC devices

Wed May 13 14:42:38 UTC 2015

* Laine Stump (laine at redhat.com) wrote:
> On 05/13/2015 04:28 AM, Peter Krempa wrote:
> > On Wed, May 13, 2015 at 09:08:39 +0100, Dr. David Alan Gilbert wrote:
> >> * Peter Krempa (pkrempa at redhat.com) wrote:
> >>> On Wed, May 13, 2015 at 11:36:26 +0800, Chen Fan wrote:
> >>>> My main goal is to add support for migration with host NIC
> >>>> passthrough devices while keeping network connectivity.
> >>>>
> >>>> This patch series is based on Shradha's patches at
> >>>> https://www.redhat.com/archives/libvir-list/2012-November/msg01324.html
> >>>> which add migration support for host passthrough devices.
> >>>>
> >>>>  1) unplug the ephemeral devices before migration
> >>>>
> >>>>  2) do native migration
> >>>>
> >>>>  3) when migration has finished, hotplug the ephemeral devices
> >>>
> >>> IMHO this algorithm is something that an upper layer management app
> >>> should do. The device unplug operation is complex and might not
> >>> succeed, which would make the current migration thread hang or fail in
> >>> an intermediate state that is not recoverable.
> >>
> >> However you wouldn't want each of the upper layer management apps implementing
> >> their own hacks for this; so something somewhere needs to standardise
> >> what the guest sees.
> > 
> > The guest will still see a PCI device unplug request and will have to
> > respond to it, then will be paused and after resume a new PCI device
> > will appear. This is standardised. The nonstandardised part (which can't
> > really be standardised) is how the bonding or other guest-dependent
> > stuff will be handled, but that is up to the guest OS to handle.
> > 
> > From libvirt's perspective this is only something that will trigger the
> > device unplug and plug the devices back. And there are a lot of issues
> > here:
> > 
> > 1) the destination of the migration might not have the desired devices
> > 
> >     This will trigger a lot of problems as we will not be able to guarantee
> >     that the devices reappear on the destination, and if we wanted to
> >     check we'd need a new migration protocol AFAIK.
> > 
> > 2) The guest OS might refuse to detach the PCI device (it might be stuck
> > before PCI code is loaded)
> > 
> >     In that case the migration will be stuck forever and abort attempts
> >     will make the domain state basically undefined depending on the
> >     phase where it failed.
> > 
> > Since we can't guarantee that the unplug of the PCI host devices will be
> > atomic or that it will succeed, we basically can't guarantee in any way
> > which state the VM will end up in after a (possibly failed)
> > migration. To recover from such a state there are too many options that
> > a user might want, and they would be hard to implement in a way that is
> > flexible enough.
> 
> 
> In the past I've been on the side of having libvirt automatically do the
> device detach and reattach (but definitely on the side of the guest
> agent and libvirt keeping their hands off of network configuration in
> the guest), with the thinking that 1) libvirt is in a well-situated spot
> to do it, and 2) this would eliminate duplicate code in the upper level
> management.
> 
> However, Peter's points above made me consider the failure cases more
> closely, in particular this one:
> 
> * the destination claims to have the resources required (right type of
> PCI device, enough RAM), so migration is started.
> 
> * device detached on source, guest memory migrated to destination.
> 
> * guest started - no problems. (At this point, since the guest has been
> restarted, it's not really possible for libvirt to fail the migration in
> a recoverable manner (unless you want to implement some sort of
> "unmigration" so that the guest state on the source is updated with
> whatever execution occurred on the destination, and I don't think
> *anyone* wants to go there))
> 
> * libvirt finds the device still available and attempts to attach it but
> (for some odd reason) fails.
> 
> Now libvirt can't tell the application that the migration has succeeded,
> because it didn't (unless the device was marked as "optional"), but it
> also can't fail the migration except to say "this is such a monumental
> failure that your guest has simply died".
> 
> If, on the other hand, the detach and re-attach are implemented in a
> higher layer (ovirt/openstack), they will at least have the guest in a
> state they can deal with - it won't be pretty, but they could for
> example migrate the guest to another host (maybe back to the source) and
> re-attach there.
> 
> So this one message from Peter has nicely pointed out the error in my
> thinking, and I now agree that auto-detach/reattach shouldn't be
> implemented in libvirt - it would work nicely in an error free world,
> but would crumble in the face of some errors. (I just wish I had
> considered the particular failure mode above a year or two ago, so I
> could have been more discouraging in my emails then :-)

It's a shame to limit the utility of this by dealing with an error case
that's not a fatal error.  Does libvirt not have a way of dealing with
non-fatal errors?
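
To make the distinction concrete, here is a minimal sketch of the
unplug / migrate / hotplug flow with a three-way result instead of plain
success or failure. Every name in it (Outcome, MigrationResult,
migrate_with_ephemeral_devices and the four callbacks) is hypothetical and
invented for illustration; none of it is libvirt API, and it assumes each
step simply reports True or False.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class Outcome(Enum):
    OK = "ok"              # guest runs on destination, all devices re-attached
    DEGRADED = "degraded"  # guest runs on destination, some devices missing
    FAILED = "failed"      # migration aborted, guest is still on the source


@dataclass
class MigrationResult:
    outcome: Outcome
    # Devices that could not be re-attached on whichever host the guest is on.
    missing_devices: List[str] = field(default_factory=list)


def migrate_with_ephemeral_devices(
    devices: List[str],
    detach: Callable[[str], bool],            # hypothetical: unplug on source
    migrate: Callable[[], bool],              # hypothetical: native migration
    attach_on_source: Callable[[str], bool],  # hypothetical: rollback hotplug
    attach_on_dest: Callable[[str], bool],    # hypothetical: hotplug after move
) -> MigrationResult:
    detached: List[str] = []

    def roll_back() -> MigrationResult:
        # The guest never left the source, so give back what was unplugged.
        missing = [d for d in detached if not attach_on_source(d)]
        return MigrationResult(Outcome.FAILED, missing)

    # 1) Unplug the ephemeral devices; the guest may refuse.
    for dev in devices:
        if not detach(dev):
            return roll_back()
        detached.append(dev)

    # 2) Native migration.
    if not migrate():
        return roll_back()

    # 3) The guest now runs on the destination, so a failed hotplug can no
    #    longer fail the migration. Report it as a non-fatal, degraded result.
    missing = [d for d in devices if not attach_on_dest(d)]
    return MigrationResult(Outcome.DEGRADED if missing else Outcome.OK, missing)
```

With a result like this the caller can still decide what to do about
DEGRADED (retry the attach, or migrate somewhere else), which is the
recovery Laine describes for the higher layer.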

Dave

--
Dr. David Alan Gilbert / dgilbert at redhat.com / Manchester, UK