[libvirt] libvirt mdev migration, mdevctl integration

Tue Nov 26 11:14:47 UTC 2019

On Tue, Nov 26, 2019 at 12:08:41PM +0100, Cornelia Huck wrote:
> On Tue, 26 Nov 2019 10:54:59 +0100
> Boris Fiuczynski <fiuczy at linux.ibm.com> wrote:
> 
> > On 11/25/19 6:14 PM, Cornelia Huck wrote:
> > > Also, I'm wondering if we need special care for vfio-ap, although I'm
> > > > not sure if it is feasible to add migration support for it at all. We
> > > currently have a matrix device (always same parent) defined by the
> > > UUID, and adapters/domains configured for this matrix device (which is
> > > handled as extra parameters in the mdevctl device config). I'm not sure
> > > how different adapters/domains translate between systems we want to
> > > migrate between. Not sure how much sense it makes to dwell on this at  
> > > Aside from the card preparation with the appropriate masterkeys, the
> > > adapter/domain configuration (including the card types) for an mdev
> > > needs to remain the same, since there is no virtualization of
> > > adapter/domain addresses in the current vfio-ap driver implementation.
> > > As a result, a currently possible migration scenario is cross-CEC.
> 
> Ok, given the non-virtualization of queue addresses, we need an exact
> match on both sides.
> 
> > 
> > > From libvirt's perspective:
> > > Assuming that mdevs on the source and target system exist, would a
> > > matching UUID be enough assurance that these two host resources match
> > > for a migration? If not, is a check performed on the configuration of
> > > the two mdevs? What is in that case considered migration safe? Where are
> > > these checks implemented? Does the checking for migratability go beyond
> > > the configuration data of mdev devices, e.g. vfio-ap: check for
> > > existence of masterkeys, card type equivalency or, as Connie mentioned
> > > before on vfio-ccw, the equivalency of the child ccw device of the
> > > subchannels?
> 
> Entrusting a management layer with setting up the other side probably
> makes the most sense, at the very least for an initial implementation.
> 
> One concern I have: How easy is it to find out that the management
> layer has messed things up? Ideally, we want to find out as early as
> possible that the other side does not match and abort the migration.
> Limping on with subtle errors would be the worst case.

In the general case I think it is impossible to determine whether the
mgmt layer has messed up or not because mdevs are effectively vendor
specific black boxes. Without specific knowledge of the vendors' driver
implementation we can't look at two mdevs and declare that they are
going to be functionally identical, or not, from the guest's POV.

We just have to make sure we expose correct information about what
has been configured, so that if something does go wrong, it is as
easy as possible for humans to diagnose.
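
As a rough illustration of what "exposing what has been configured"
could look like, here is a minimal sketch that flattens two mdev
configs into comparable lines and reports what differs. It assumes the
configs are available as JSON-like dicts with an "mdev_type" key and an
ordered "attrs" list, which is my understanding of the mdevctl config
layout; the function names and the sample values are hypothetical. Note
that it only compares configuration data. It cannot tell us whether the
two devices are functionally identical from the guest's POV.

```python
def describe_mdev(config):
    """Flatten an mdevctl-style config dict into comparable text lines.

    Assumed layout: {"mdev_type": str, "attrs": [{name: value}, ...]}.
    Host-local policy such as "start" is deliberately left out.
    """
    lines = ["mdev_type=%s" % config.get("mdev_type")]
    for attr in config.get("attrs", []):
        for name, value in attr.items():
            lines.append("attr %s=%s" % (name, value))
    return lines


def diff_mdevs(source, target):
    """Return (only_on_source, only_on_target) description lines."""
    src, dst = describe_mdev(source), describe_mdev(target)
    only_src = [line for line in src if line not in dst]
    only_dst = [line for line in dst if line not in src]
    return only_src, only_dst


# Hypothetical vfio-ap configs that differ in one assigned domain.
SOURCE = {
    "mdev_type": "vfio_ap-passthrough",
    "start": "auto",
    "attrs": [{"assign_adapter": "5"}, {"assign_domain": "0xab"}],
}
TARGET = {
    "mdev_type": "vfio_ap-passthrough",
    "start": "manual",
    "attrs": [{"assign_adapter": "5"}, {"assign_domain": "0x4"}],
}

only_src, only_dst = diff_mdevs(SOURCE, TARGET)
for line in only_src:
    print("source only:", line)
for line in only_dst:
    print("target only:", line)
```

An empty result on both sides would mean only that the recorded
configuration matches, which for vfio-ap is a necessary condition given
the lack of adapter/domain address virtualization, but not a guarantee.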

Regards,
Daniel