[libvirt] [PATCH v3 6/8] qemu: use virDomainPCIAddressIsMulti() to determine multifunction setting

Daniel P. Berrange berrange at redhat.com
Fri Jan 6 16:30:50 UTC 2017


On Mon, Dec 19, 2016 at 10:23:16AM -0500, Laine Stump wrote:
> If the multifunction attribute isn't set in the config for the device
> at function 0 of a slot used for multifunction, it would previously
> have been an error. This patch will instead automatically correct the
> omission (but only if it hasn't been set at all - if someone
> explicitly has "multifunction='off'" on function 0, or
> "multifunction='on'" when function != 0, we have to assume they have a
> reason for that).
> 
> This effectively obsoletes the requirement of specifying
> multifunction='on' in the config, although you're still free to do
> so. Note that if you migrate a domain that needs an implied
> "multifunction='on'" back to any older libvirt that doesn't have it,
> the migration will fail. (Note that this would only be an issue with a
> domain config that was *created* on a newer libvirt; any config
> created on an older libvirt and then later migrated to a newer libvirt
> would necessarily have multifunction explicitly set in the config, and
> that will not be lost during migration).

So the reason we added multifunction=on as an explicit attribute is
because it is an guest ABI change, even if function 0 is the only
function present.

  commit c329db7180d77c8077b9f9cd167a71d7f347227a
  Author: Laine Stump <laine at laine.org>
  Date:   Thu Sep 29 13:00:32 2011 -0400

    qemu: make PCI multifunction support more manual
    
    When support for was added for PCI multifunction cards (in commit
    9f8baf, first included in libvirt 0.9.3), it was done by always
    turning on the multifunction bit for all PCI devices. Since that time
    it has been realized that this is not an ideal solution, and that the
    multifunction bit must be selectively turned on. For example, see
    
      https://bugzilla.redhat.com/show_bug.cgi?id=728174
    
    and the discussion before and after
    
      https://www.redhat.com/archives/libvir-list/2011-September/msg01036.html
    
    This patch modifies multifunction support so that the multifunction=on
    option is only added to the qemu commandline for a device if its PCI
    <address> definition has the attribute "multifunction='on'", e.g.:
    
      <address type='pci' domain='0x0000' bus='0x00'
               slot='0x04' function='0x0' multifunction='on'/>
    
    In practice, the multifunction bit should only be turned on if
    function='0' AND other functions will be used in the same slot - it
    usually isn't needed for functions 1-7 (although there are apparently
    some exceptions, e.g. the Intel X53 according to the QEMU source
    code), and should never be set if only function 0 will be used in the
    slot. The test cases have been changed accordingly to illustrate.
    
    With this patch in place, if a user attempts to assign multiple
    functions in a slot without setting the multifunction bit for function
    0, libvirt will issue an error when the domain is defined, and the
    define operation will fail. In the future, we may decide to detect
    this situation and automatically add multifunction=on to avoid the
    error; even then it will still be useful to have a manual method of
    turning on multifunction since, as stated above, there are some
    devices that excpect it to be turned on for all functions in a slot.
    
    A side effect of this patch is that attempts to use the same PCI
    address for two different devices will now log an error (previously
    this would cause the domain define operation to fail, but there would
    be no log message generated). Because the function doing this log was
    almost completely rewritten, I didn't think it worthwhile to make a
    separate patch for that fix (the entire patch would immediately be
    obsoleted).


You mentioned in your commit message there, that it would be a valid
enhancement to automatically add multifunction=on, if-and-only-if
we saw other devices with non-0 functions in the same slot. 

That all said, the concern I have is that although this change is not
in itself an ABI incompatible change, it does mean that applications
can accidentally trigger ABI incompatible changes in guests.

For example, consider an app has defined a guest with 2 block devs
in the same slot, but not set multifunction=on. Libvirt will silently
set that attribute.  Now the app re-defines the guest, but removes
the second block device. libvirt will now not be setting multifunction=on
on the 1st block device. So technically removing the second device has
triggered a silent ABI change on the 1st device.

IMHO, given that apps have to explicitly decide they're going to be
using multifunction and thus manually assign addresses, it is no
great burden for them to have to add multifunction=on explicitly too.

IOW, I'd tend towards dropping this patch, as I don't think the benefit
is compelling enough, and it can lead to surprises for apps.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|




More information about the libvir-list mailing list