Add options to device xml to skip reattach of pci passthrough devices.
Alex Williamson
alex.williamson at redhat.com
Fri Jun 18 14:58:08 UTC 2021
On Fri, 18 Jun 2021 10:43:07 -0400
Laine Stump <laine at redhat.com> wrote:
> On 6/16/21 4:15 PM, Daniel Henrique Barboza wrote:
> >
> >
> > On 6/9/21 4:38 PM, Manish Mishra wrote:
> >> Hi Everyone,
> >>
> >> We want to add extra options to device xml to skip reattach of pci
> >> passthrough devices. Following is xml format for pci passthrough
> >> devices added to domain as of now.
> >>
> >> <hostdev mode='subsystem' type='pci' managed='yes'>
> >>
> >> <source>
> >>
> >> <address domain='0x0000' bus='0x00' slot='0x1a' function='0x7'/>
> >>
> >> </source>
> >>
> >> </hostdev>
> >>
> >> When we pass managed=’yes’ flag through xml, libvirt takes
> >> responsibility of detaching device on domain(guest VM) start and
> >> reattaching on domain shutdown. We observed some issues where guest VM
> >> shutdown may take long time, blocked for reattach operation on pci
> >> passthrough device. As domain lock is held during this time it also
> >> makes libvirt mostly inactive as it blocks even basic operations like
> >> (virsh list). Reattaching of device to host can block due to reasons
> >> like buggy driver or initialization of device itself can take long
> >> time in some cases.
> >
> > I am more interested in hearing about the problem with this faulty buggy
> > driver holding domain lock during device reattach and compromising 'virsh'
> > operations, and see if there's something to do to mitigate that, instead
> > of creating a XML workaround for a driver problem.
> >
> >>
> >> We want to pass following extra options to resolve this:
> >>
> >> 1. *skipReAttach*(optional flag)
> >>
> >> In some cases we do not need to reattach device to host as it may be
> >> reserved only for guests, with this flag we can skip reattach
> >> operation on host. We do not want to modify managed flag to avoid
> >> regression, so thinking of adding new optional flag.
> >>
> >> 2. *reAttachDriverName*(optional flag)
> >>
> >> Name of driver to which we want to attach instead of default, to avoid
> >> reattaching to buggy driver. Currently libvirt asks host to auto
> >> selects driver for device.
> >>
> >> Yes we can use managed=’no’ but in that case user has to take
> >> responsibility of detaching device before starting domain which we do
> >> not want. Please let us know your views on this.
> >
> > The case you mentioned above, "we do not need to reattach device to host
> > as it may be reserved only for guests", is one of the most common uses
> > we have for managed='no' AFAIK. The user/sysadm must detach the device
> > from the host, but it's only one time. After that the device can remain
> > detached from the host, and guests can use it freely as long as you
> > don't reboot the host (or reattach the device back). This scenario
> > you described fit the managed='no' mechanics fine IMO.
> >
> > If you want to automate the detach process, you can use a Libvirt QEMU
> > hook (/etc/libvirt/hooks/qemu) to make the device detach when starting
> > the domain, in case the device isn't already detached. Note that
> > this has the same effect of the "skipReAttach" option you proposed.
> >
> > Making a design around faulty drivers isn't ideal. If the driver you're
> > using starts to have problems with the detach operation as well,
> > 'skipReAttach'
> > will do you no good. You'll have to fall back to 'managed=no' to circumvent
> > that.
> >
> > Even if we discard the motivation, I'm not sure about the utility of having
> > more forms of PCI assignment management (e.g
> > managed=yes|no|detach|reattach).
> > managed=yes|no seems to cover most use cases where the device driver works
> > properly.
> >
> >
> > Laine, what do you think?
>
>
> I have a vague memory of someone (may even have been me) proposing
> something similar several years ago, and the idea was shot down. I don't
> remember the exact context or naming, but the general idea was to have
> something like managed='detach-only' in order to have the advantage of
> all configuration being within libvirt, but eliminating the potential
> bad behavior associated with repeated re-binding of devices to drivers.
> I unfortunately also don't recall the reason the idea was nixed. Dan or
> Alex - do either of you have any memory of this?
Sounds very vaguely familiar, but not enough to remember the arguments.
> As for myself, 1) I agree with Daniel's suggestion that it is important
> to find the root cause of the long delay rather than just covering it up
> with another obscure option that will need to be re-discovered by anyone
> who encounters the problem, and 2) every new bit that we add in makes
> the code more complex and so more prone to errors, and also makes
> configuration more complex and also more prone to errors. So while new
> options like this could be useful, they could also be a net loss (or
> not, it's hard to know without actually doing it, but once it's done it
> can't be "un-done" :-))
AFAIK, the generally recommendation for "devices used only for
assignment" is to use driverctl to automatically attach these devices
to vfio-pci on boot/attach and managed="no" in the xml. It might not
be a bad idea if libvirt were to fork a thread for rebinding a device
to the host, but xml workarounds don't seem well justified here.
Thanks,
Alex
More information about the libvir-list
mailing list