
Re: [libvirt] [RFC PATCH 0/5] hotplug: fix premature rebinding of VFIO devices to host



On Thu, 2017-06-29 at 15:59 -0500, Michael Roth wrote:
> > > Patches 1-4 address 1) by deferring rebinding of a hostdev to the host driver
> > > until all the devices in the group have been detached, at which point all
> > > the hostdevs are rebound as a group. Until that point, the devices are traced
> > > by the drvManager's inactiveList in a similar manner to hostdevs that are
> > > assigned to VFIO via the nodedev-detach interface.
> > 
> > What happens if libvirtd is restarted during this period? Is the
> > inactiveList rebuilt with all the info necessary to assure that the
> > nodedev-reattach does/doesn't happen (as appropriate) for all devices?
> 
> Hmm, good question.
> 
> The Unbindable() check only needs to know that nothing in the activeList
> belongs to the group we're checking, and that list at least seems to get
> rebuilt appropriately on restart.
> 
> But the Unbind() relies on inactiveList and the behavior there is what
> nodedev-detach currently does, which is to add it to inactive list while
> libvirtd is running, and then just forget about it when libvirtd restarts.
> For nodedev-detach it's fine since virHostdevPreparePCIDevices() re-adds
> it as needed in the device-attach path. But yah, for this purpose it ends
> up losing track of hostdevs that are still pending rebind to the host, and
> that means some devices may not get rebound at the appropriate time if
> there was a libvirtd restart.
> 
> Unlike with device-attach, we can't just re-add it on-demand because we
> actually need to know whether or not it was previously in the list. So
> I think we'd need to add some persistent state to track this. Will look
> into adding handling for that.

FWIW, the last time I tried to attack this issue[1] I got pretty
much as far as this, i.e. the code worked as intended but
restarting libvirtd would result in an inconsistent state
which prevented you from performing some operations.

Unfortunately I got sidetracked by other work and stopped
just short of implementing a way to persist the relevant
information on disk :(
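
To make the missing piece concrete, here is a rough sketch of the
idea in Python. It is not libvirt code: the class, method names and
JSON state file are all hypothetical, and it only models the
bookkeeping (defer the rebind until the whole group is detached, and
keep the pending list on disk so a restart does not lose it). The
active list is passed in by the caller, standing in for the rebuild
on restart mentioned above.

```python
import json
import os
import tempfile


class PendingRebindTracker:
    """Toy model of an inactive list that survives a daemon restart.

    All names here are hypothetical; this is not libvirt's API.
    """

    def __init__(self, state_path, active=None):
        self.state_path = state_path
        # device address -> IOMMU group, for devices in use by a guest
        self.active = dict(active or {})
        # device address -> IOMMU group, for devices awaiting host rebind
        self.pending = {}
        self._load()

    def _load(self):
        # Restore the devices that were pending rebind before the restart.
        if os.path.exists(self.state_path):
            with open(self.state_path) as f:
                self.pending = json.load(f)

    def _save(self):
        # Write to a temporary file and rename it into place, so a crash
        # mid-write does not leave a truncated state file behind.
        dir_name = os.path.dirname(self.state_path) or "."
        fd, tmp = tempfile.mkstemp(dir=dir_name)
        with os.fdopen(fd, "w") as f:
            json.dump(self.pending, f)
        os.replace(tmp, self.state_path)

    def attach(self, addr, group):
        self.pending.pop(addr, None)
        self.active[addr] = group
        self._save()

    def detach(self, addr):
        """Detach from the guest; return the devices to rebind right now."""
        group = self.active.pop(addr)
        self.pending[addr] = group
        if group in self.active.values():
            # The group still has active members: defer the rebind.
            self._save()
            return []
        # Last device of the group: release every pending member together.
        ready = sorted(a for a, g in self.pending.items() if g == group)
        for a in ready:
            del self.pending[a]
        self._save()
        return ready


state = os.path.join(tempfile.mkdtemp(), "pending.json")
t = PendingRebindTracker(state)
t.attach("0000:01:00.0", 7)
t.attach("0000:01:00.1", 7)
first = t.detach("0000:01:00.0")    # deferred: group 7 is still in use

# Simulate a restart: the active list is rebuilt by the caller, the
# pending list is reloaded from disk.
t2 = PendingRebindTracker(state, active={"0000:01:00.1": 7})
second = t2.detach("0000:01:00.1")  # both devices released as a group
```

Without the state file, the second tracker would have no record of
the first device and would only return the second one, which is the
inconsistency described above.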


[1] ~1.5 years ago, according to git log
-- 
Andrea Bolognani / Red Hat / Virtualization

