[libvirt] Hot plug multi function PCI devices

Ziviani . jrziviani at gmail.com
Thu Jan 7 01:58:46 UTC 2016


On Wed, Jan 6, 2016 at 4:43 PM, Laine Stump <laine at laine.org> wrote:

> On 12/23/2015 11:01 AM, Ziviani . wrote:
>
> Hi Laine,
>
> This (hot plugging all functions at once) is something I was thinking
> about. What if we could create a xml file passing the IOMMU group instead
> of only one function per time, would it be feasible?
> I could start working on a proof of concept if the community thinks it's a
> valid path.
>
> Do you know who is currently working on it? I could offer some help if
> they need it.
>
>
> (Please reply inline rather than top-posting. It makes it much easier to
> follow the context of the conversation.)
>
> What do you mean by "passing the IOMMU group"? Do you mean *just* the
> iommu group, excluding the information about the devices? This doesn't seem
> like a good idea, since afaik the iommu group number is something just
> conjured up by the kernel at boot time, and isn't necessarily predictable
> or stable between host reboots. Also, it wouldn't allow for assigning only
> some of the devices/functions in a group while leaving others inactive.
>

My first idea was to do something like this:
% virsh nodedev-dumpxml pci_0000_00_16_3
<device>
  <name>pci_0000_00_16_3</name>
[snip]
    <iommuGroup number='4'>
      <address domain='0x0000' bus='0x00' slot='0x16' function='0x0'/>
      <address domain='0x0000' bus='0x00' slot='0x16' function='0x3'/>
    </iommuGroup>
  </capability>
</device>

If a user wants to attach pci_0000_00_16_3, I'd find all devices belonging
to the same iommu group and attach every one of them. A very rough
pseudo-code would be:

slot = get_available_guest_slot();
iommu_group = device_to_be_attached().get_iommu();
for (device : iommu_group.devices()) {
  // 1st iteration:
  //   device_add vfio-pci,host=00:16.0,addr=slot.0,multifunction=on
  // 2nd iteration:
  //   device_add vfio-pci,host=00:16.3,addr=slot.3,multifunction=on
}

So, in this case, we could accept either the device to be attached or
simply its current iommu group#.
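
To make that loop a bit more concrete, here is a minimal Python sketch. It
assumes every address in the iommu group is a function of the same host
slot, which is true for this device but not guaranteed in general. The
helper names (iommu_group_functions, device_add_commands) are made up for
illustration, and the <capability type='pci'> wrapper in the sample is
filled in only so that the sample parses. Function 0 is emitted last on
the assumption that qemu holds functions > 0 until function 0 arrives.

```python
import xml.etree.ElementTree as ET

# Sample nodedev XML; everything that was snipped above is left out.
NODEDEV_XML = """
<device>
  <name>pci_0000_00_16_3</name>
  <capability type='pci'>
    <iommuGroup number='4'>
      <address domain='0x0000' bus='0x00' slot='0x16' function='0x0'/>
      <address domain='0x0000' bus='0x00' slot='0x16' function='0x3'/>
    </iommuGroup>
  </capability>
</device>
"""


def iommu_group_functions(nodedev_xml):
    """Return (bus, slot, function) tuples listed under <iommuGroup>."""
    root = ET.fromstring(nodedev_xml.strip())
    return [
        tuple(int(addr.get(key), 16) for key in ("bus", "slot", "function"))
        for addr in root.findall(".//iommuGroup/address")
    ]


def device_add_commands(functions, guest_slot):
    """Build one monitor command per host function, all on one guest slot."""
    # Assumption: function 0 goes last so that it acts as the trigger.
    ordered = sorted(functions, key=lambda f: f[2], reverse=True)
    return [
        f"device_add vfio-pci,host={bus:02x}:{slot:02x}.{fn:x},"
        f"addr={guest_slot:02x}.{fn:x},multifunction=on"
        for bus, slot, fn in ordered
    ]


for command in device_add_commands(iommu_group_functions(NODEDEV_XML), 0x08):
    print(command)
```

The guest slot (0x08 here) is passed in by hand; in libvirt it would come
from whatever allocates a free guest PCI slot.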


> I think there are two reasonable possibilities:
>
> 1) Follow the apparent path of qemu - accept separate attach calls, one
> for each function, and use the attach of function 0 as the "action" button
> that causes all the functions to be attached.
>
> 2) Enhance the attach API to accept multiple <hostdev> elements in the XML
> for a single call, and do "whatever is proper for the current hypervisor"
> to attach them.
>

​I think my first idea has more to do with you 1st option. But I like the
second one: user specify all devices in the xml, then we assert there is no
missing function, then we go attaching one by one (

​with this another poor pseudo-code):​
​

slot = get_available_guest_slot();
for (device : devices_parsed_from_xml()) {
  // 1st iteration:
  //   device_add vfio-pci,host=00:16.0,addr=slot.0,multifunction=on
  // 2nd iteration:
  //   device_add vfio-pci,host=00:16.3,addr=slot.3,multifunction=on
}
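
The "assert there is no missing function" step could look roughly like the
Python sketch below. The <devices> wrapper holding several <hostdev>
elements is purely hypothetical (it is the API enhancement being
discussed, not something libvirt accepts today), and so are the helper
names. What exactly counts as "missing" is still open; here I only assume
that all functions must come from the same host slot, must not repeat, and
must include function 0.

```python
import xml.etree.ElementTree as ET

# Hypothetical multi-device XML: the <devices> wrapper is an assumption.
MULTI_HOSTDEV_XML = """
<devices>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0x0000' bus='0x00' slot='0x16' function='0x0'/>
    </source>
  </hostdev>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0x0000' bus='0x00' slot='0x16' function='0x3'/>
    </source>
  </hostdev>
</devices>
"""


def parse_hostdev_functions(xml_text):
    """Return (domain, bus, slot, function) for every <hostdev> source."""
    root = ET.fromstring(xml_text.strip())
    keys = ("domain", "bus", "slot", "function")
    return [
        tuple(int(addr.get(key), 16) for key in keys)
        for addr in root.findall("./hostdev/source/address")
    ]


def check_functions(functions):
    """Validate one multifunction request; return sorted function numbers."""
    if not functions:
        raise ValueError("no <hostdev> elements found")
    if len({f[:3] for f in functions}) != 1:
        raise ValueError("all functions must share one host slot")
    numbers = [f[3] for f in functions]
    if len(set(numbers)) != len(numbers):
        raise ValueError("duplicate function in request")
    if 0 not in numbers:
        raise ValueError("function 0 is missing")
    return sorted(numbers)


print(check_functions(parse_hostdev_functions(MULTI_HOSTDEV_XML)))
```

Only after this check passes would the attach loop above run, so a bad
request fails before anything is plugged into the guest.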


>
> As for detach, it's really only possible to detach *all* functions, and it
> would take more bookkeeping to allowing marking each function for removal
> and then removing the device when all functions had been marked, so maybe
> we only allow detach of function 0, and that will always detach everything?
> (not sure, that's just an idea).
>

​I think we can let users detach anyone. We could get the slot and start
detaching all functions from that slot, again another poor example:

device = device_to_be_detached();
for (uint function = 0; function < device.len_slot(); ++function)
    detach(device.slot[function]->addr);


>
> As far as I know, nobody is currently working on anything like this for
> libvirt, so this is your chance to get your hands dirty!
>

​Awesome! :)​


>
> (It just occurred to me that method (1) of multifunction attach method
> outlined above will also need similar extra bookkeeping, just as the "mark
> each function for removal" detach method would, and this extra bookkeeping
> would need to survive a restart of libvirtd in the middle of a series of
> attach/detach calls, making it more complicated, so maybe the 2nd methods
> would be better. I'd love to hear opinions though.)
>

Because it's possible to retrieve the functions belonging to a slot from
qemu itself, I think we can avoid such bookkeeping (of course, my idea could
be totally wrong) :D

(qemu) info pci
...
  Bus  0, device   6, function 0:
    Class 1920: PCI device 8086:9c3a
      IRQ 11.
      BAR0: 64 bit memory at 0x40000000 [0x4000001f].
      id ""
  Bus  0, device   6, function 3:
    Serial port: PCI device 8086:9c3d
      IRQ 6.
      BAR0: I/O at 0x1000 [0x1007].
      BAR1: 32 bit memory at 0x40001000 [0x40001fff].
      id ""

So, building on my code above, the function device_to_be_detached() could
return a struct whose slot[function] entries are filled in from this qemu
info.
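
As a rough illustration of how that lookup might work, the Python sketch
below groups the functions reported by "info pci" per (bus, device). The
helper name and the regular expression are my own, and I'm assuming the
header lines always look like the ones above, with decimal numbers.

```python
import re
from collections import defaultdict

# The "info pci" excerpt from above, used as sample input.
INFO_PCI = '''
  Bus  0, device   6, function 0:
    Class 1920: PCI device 8086:9c3a
      IRQ 11.
      BAR0: 64 bit memory at 0x40000000 [0x4000001f].
      id ""
  Bus  0, device   6, function 3:
    Serial port: PCI device 8086:9c3d
      IRQ 6.
      BAR0: I/O at 0x1000 [0x1007].
      BAR1: 32 bit memory at 0x40001000 [0x40001fff].
      id ""
'''

# Assumed header layout: "Bus <n>, device <n>, function <n>:"
HEADER = re.compile(r"^\s*Bus\s+(\d+), device\s+(\d+), function\s+(\d+):")


def functions_by_slot(info_pci_text):
    """Map (bus, device) to the list of function numbers found there."""
    slots = defaultdict(list)
    for line in info_pci_text.splitlines():
        match = HEADER.match(line)
        if match:
            bus, device, function = (int(g) for g in match.groups())
            slots[(bus, device)].append(function)
    return dict(slots)


# Every function that would have to go when anything on slot 6 is detached.
print(functions_by_slot(INFO_PCI)[(0, 6)])
```

With this, detaching any one function only needs its guest bus and slot;
the full list of functions is read back from qemu each time instead of
being stored in libvirt.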

Thank you for your time and advice. I'm starting to look into it and will
let you know how it goes. My irc nickname is #ziviani.




>
>
>
> Thank you :)
>
> On Mon, Dec 21, 2015 at 3:53 PM, Laine Stump <laine at laine.org> wrote:
>
>> On 12/21/2015 08:29 AM, Ziviani . wrote:
>>
>> Hello list!
>>
>> I'm new here and interested in hot-plug multi-function PCI devices.
>> Basically I'd like to know why Libvirt does not support it. I've been
>> through the archives and basically found this thread:
>>
>> https://www.redhat.com/archives/libvir-list/2011-May/msg00457.html
>>
>> But Qemu seems to handle it accordingly:
>> virsh qemu-monitor-command --hmp fedora-23 'device_add
>> vfio-pci,host=00:16.0,addr=08.0'
>> virsh qemu-monitor-command --hmp fedora-23 'device_add
>> vfio-pci,host=00:16.3,addr=08.3'
>>
>> GUEST:
>> # lspci
>> (snip)
>> 00:08.0 Communication controller: Intel Corporation 8 Series HECI #0 (rev
>> 04)
>> 00:08.3 Serial controller: Intel Corporation 8 Series HECI KT (rev 04)
>>
>> However, using Libvirt:
>>
>> % virsh attach-device fedora-23 pci_0000_00_16_0.xml --live
>> Device attached successfully
>>
>> % virsh attach-device fedora-23 pci_0000_00_16_3.xml --live
>> error: Failed to attach device from pci_0000_00_16_3.xml
>> error: internal error: Only PCI device addresses with function=0 are
>> supported
>>
>> I made some changes on domain_addr.c[1] for testing and it worked.
>>
>> [1]https://gist.github.com/jrziviani/1da184c7fd0b413e0426
>>
>> % virsh attach-device fedora-23 pci_0000_00_16_3.xml --live
>> Device attached successfully
>>
>> GUEST:
>> # lspci
>> (snip)
>> 00:08.0 Communication controller: Intel Corporation 8 Series HECI #0 (rev
>> 04)
>> 00:08.3 Serial controller: Intel Corporation 8 Series HECI KT (rev 04)
>>
>> So is there more to it that I'm not aware of?
>>
>>
>> You're relying on behavior in the guest OS for which there is no standard
>> (and which, by definition, doesn't work on real hardware, so no guest OS
>> will be expecting it; a friend more familiar with this has told me that
>> probably qemu is sending an (acpi?) "device check" to the guest for each
>> function that is added, and in your case it's apparently "doing the right
>> thing" in response to that). But just because it is successful in this one
>> case doesn't mean that it will be successful in all situations; likely it
>> won't be. So while the qemu monitor takes the laissez-faire approach of
>> allowing you to try it and letting you pick up the pieces when it fails,
>> libvirt prevents it because it is bound to fail, and thus not supportable.
>>
>> There has recently been some work in qemu to "save up" any requests to
>> attach devices with function > 0, then present them all to the guest at
>> once when function 0 is attached. This is the only standard way to handle
>> hotplug of multiple functions in a slot. Hot unplug can only happen for all
>> functions in the slot at once. I'm not sure of the current status of that
>> work, but once it is in and stable, libvirt will support it.
>>
>>
>> Thank you!
>>
>>
>> --
>> libvir-list mailing list
>> libvir-list at redhat.com
>> https://www.redhat.com/mailman/listinfo/libvir-list
>>
>>
>>
>
>