<div dir="ltr"><div class="gmail_default" style="font-family:courier new,monospace;font-size:small;color:#0b5394">You're right, I took the cases I'm aware of for granted and forgot to consider all the other possible scenarios. Your hints here were really important in clearing up my misunderstandings. I'll certainly read the sources you pointed me to.</div><div class="gmail_default" style="font-family:courier new,monospace;font-size:small;color:#0b5394"><br></div><div class="gmail_default" style="font-family:courier new,monospace;font-size:small;color:#0b5394">Thank you!</div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Jan 8, 2016 at 4:41 PM, Laine Stump <span dir="ltr"><<a href="mailto:laine@laine.org" target="_blank">laine@laine.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div text="#000000" bgcolor="#FFFFFF"><div><div class="h5">
    <div>On 01/06/2016 08:58 PM, Ziviani .
      wrote:<br>
    </div>
    <blockquote type="cite">
      <div dir="ltr">
        <div><br>
        </div>
        <div class="gmail_extra">
          <div class="gmail_quote">On Wed, Jan 6, 2016 at 4:43 PM, Laine
            Stump <<a href="mailto:laine@laine.org" target="_blank">laine@laine.org</a>>
            wrote:<br>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
              <div text="#000000" bgcolor="#FFFFFF">
                <div>On 12/23/2015 11:01 AM, Ziviani . wrote:<br>
                </div>
                <blockquote type="cite">
                  <div dir="ltr">
                    <div>Hi
                      Laine,</div>
                    <div><br>
                    </div>
                    <div>This
                      (hot plugging all functions at once) is something
                      I was thinking about. What if we could create a
                      xml file passing the IOMMU group instead of only
                      one function per time, would it be feasible?</div>
                    <div>I
                      could start working on a proof of concept if the
                      community thinks it's a valid path.</div>
                    <div><br>
                    </div>
                    <div>Do
                      you know who is currently working on it? I could
                      offer some help if they need it.</div>
                  </div>
                </blockquote>
                <br>
                (Please reply inline rather than top-posting. It makes
                it much easier to follow the context of the
                conversation.)<br>
                <br>
                What do you mean by "passing the IOMMU group"? Do you
                mean *just* the iommu group, excluding the information
                about the devices? This doesn't seem like a good idea,
                since afaik the iommu group number is something just
                conjured up by the kernel at boot time, and isn't
                necessarily predictable or stable between host reboots.
                Also, it wouldn't allow for assigning only some of the
                devices/functions in a group while leaving others
                inactive.<br>
              </div>
            </blockquote>
            <div><br>
            </div>
            <div>
              <div>My first idea was to do something like this:</div>
            </div>
            <div>
              <div style="display:inline">
                <div>% virsh nodedev-dumpxml pci_0000_00_16_3</div>
                <div><device></div>
                <div> 
                  <name>pci_0000_00_16_3</name></div>
                <div>[snip]</div>
                <div>
                  <div><iommuGroup number='4'></div>
                  <div>      <address domain='0x0000' bus='0x00'
                    slot='0x16' function='0x0'/></div>
                  <div>      <address domain='0x0000' bus='0x00'
                    slot='0x16' function='0x3'/></div>
                  <div>    </iommuGroup></div>
                  <div>  </capability></div>
                  <div></device></div>
                  <div><br>
                  </div>
                  <div>If a user wants to attach pci_0000_00_16_3, I'd
                    find all devices belonging to the same iommu group
                    and attach each of them. Very rough pseudo-code would
                    be:</div>
                  <div><br>
                  </div>
                  <div>slot = get_available_guest_slot();</div>
                  <div>iommu_group =
                    device_to_be_attached().get_iommu();</div>
                  <div>for (device : iommu_group.devices()) {</div>
                  <div>&nbsp;&nbsp;(1st iteration) device_add
                    vfio-pci,host=00:16.0,addr=slot.0,multifunction=on<br>
                  </div>
                  <div>&nbsp;&nbsp;(2nd iteration) device_add
                    vfio-pci,host=00:16.3,addr=slot.3,multifunction=on<br>
                  </div>
                  <div>}</div>
                  <div><br>
                  </div>
                  <div>So, in this case, we could accept either the
                    device to be attached or simply its current iommu
                    group#.</div>
                </div>
              </div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br></div></div>
    But "iommu group" is not the same thing as "all functions on a
    single device". Although in some cases they might be the same, that
    isn't necessarily true - one iommu group could span several devices,
    and there could be devices in the group that the user wasn't
    expecting and that could cause unexpected disastrous results (the
    most commonly used example is if the controller for the host's main
    disk happens to be in the same iommu group as some device that
    you're trying to assign).<br>
    <br>
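    The distinction can be illustrated with a minimal sketch that groups the PCI addresses of one iommu group by slot. The helper names are hypothetical and the address 0000:00:1f.2 is made up for the example; this is not libvirt code, only an illustration under those assumptions.<br>

```python
from collections import defaultdict


def parse_pci_address(addr):
    """Split 'dddd:bb:ss.f' into (domain, bus, slot, function) integers."""
    domain, bus, rest = addr.split(":")
    slot, function = rest.split(".")
    return int(domain, 16), int(bus, 16), int(slot, 16), int(function, 16)


def slots_in_group(addresses):
    """Map each (domain, bus, slot) in a group to its sorted function numbers."""
    slots = defaultdict(list)
    for addr in addresses:
        domain, bus, slot, function = parse_pci_address(addr)
        slots[(domain, bus, slot)].append(function)
    return {key: sorted(funcs) for key, funcs in slots.items()}


def group_is_single_device(addresses):
    """True only if every member of the group sits in the same slot."""
    return len(slots_in_group(addresses)) == 1


# The group from the nodedev-dumpxml output quoted above: one slot, functions 0 and 3.
same_slot = ["0000:00:16.0", "0000:00:16.3"]
# Hypothetical group that also holds an unrelated device in another slot
# (for example a disk controller); this extra address is invented.
mixed = ["0000:00:16.0", "0000:00:16.3", "0000:00:1f.2"]

print(group_is_single_device(same_slot))  # True
print(group_is_single_device(mixed))      # False
```

    On a Linux host the member addresses of a group can be read from /sys/kernel/iommu_groups/N/devices, so a check like this could flag a group that reaches beyond the device the user asked for.<br>
    <br>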
    Also, you're making the assumption that only physical hardware
    devices assigned with vfio can/should be put onto multiple functions
    of a single guest slot, but that isn't true. It's also okay (and at
    times desirable) to put multiple emulated devices into different
    functions of the same slot.<span class=""><br>
    <br>
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div>
              <div style="display:inline">
                <div>
                  <div><br>
                  </div>
                </div>
              </div>
            </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
              <div text="#000000" bgcolor="#FFFFFF"> <br>
                I think there are two reasonable possibilities:<br>
                <br>
                1) Follow the apparent path of qemu - accept separate
                attach calls, one for each function, and use the attach
                of function 0 as the "action" button that causes all the
                functions to be attached.<br>
                <br>
                2) Enhance the attach API to accept multiple
                <hostdev> elements in the XML for a single call,
                and do "whatever is proper for the current hypervisor"
                to attach them.<br>
              </div>
            </blockquote>
            <div><br>
            </div>
            <div>
              <div>I think my first idea has more to do with your 1st
                option. But I like the second one: the user specifies all
                devices in the XML, then we assert there is no missing
                function,</div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br></span>
    Why do you assert that there is no missing function? Again, while
    this *can* be used to assign all of the functions of a single
    multi-function host device to functions of a single guest slot, that
    isn't the only use. You can also assign *some* of the functions of a
    single device, or a collection of emulated devices (or possibly even
    a mixture of emulated and assigned devices, although I'm not sure
    what vfio would think about that - it may be prohibited for very
    good reasons).<span class=""><br>
    <br>
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div>
              <div>then
                we attach them one by one (with another rough
                pseudo-code sketch):</div>
            </div>
            <div>
              <div><br>
              </div>
            </div>
            <div>
              <div>slot
                = get_available_guest_slot();<br>
              </div>
            </div>
            <div>
              <div>for (device : devices_parsed_from_xml()) {</div>
              <div>&nbsp;&nbsp;(1st iteration) device_add
                vfio-pci,host=00:16.0,addr=slot.0,multifunction=on<br>
              </div>
              <div>&nbsp;&nbsp;(2nd iteration) device_add
                vfio-pci,host=00:16.3,addr=slot.3,multifunction=on<br>
              </div>
              <div>}</div>
              <br>
            </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
              <div text="#000000" bgcolor="#FFFFFF"> <br>
                <br>
                As for detach, it's really only possible to detach *all*
                functions, and it would take more bookkeeping to
                allow marking each function for removal and then
                removing the device when all functions had been marked,
                so maybe we only allow detach of function 0, and that
                will always detach everything? (not sure, that's just an
                idea). <br>
              </div>
            </blockquote>
            <div><br>
            </div>
            <div>
              <div>I
                think we can let users detach any function. We could
                look up its slot and then detach all functions in that
                slot; again, a rough example:<br>
              </div>
              <div><br>
              </div>
              <div>device
                = device_to_be_detached();</div>
              <div>for
                (uint function = 0; function < device.len_slot();
                ++function)<br>
              </div>
              <div>   
                detach(device.slot[function]->addr);</div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br></span>
    My understanding is that there is no way to inform the guest OS that
    a single function of a device has been detached. The only thing you
    can do is tell it that the entire device has been unplugged from the
    slot.<span class=""><br>
    <br>
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div>
              <div> <br>
              </div>
            </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
              <div text="#000000" bgcolor="#FFFFFF"> <br>
                As far as I know, nobody is currently working on
                anything like this for libvirt, so this is your chance
                to get your hands dirty!<br>
              </div>
            </blockquote>
            <div><br>
            </div>
            <div>
              <div>​
                Awesome! :)​</div>
               </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
              <div text="#000000" bgcolor="#FFFFFF"> <br>
                (It just occurred to me that method (1) for multifunction
                attach outlined above will also need similar
                extra bookkeeping, just as the "mark each function for
                removal" detach method would, and this extra bookkeeping
                would need to survive a restart of libvirtd in the
                middle of a series of attach/detach calls, making it
                more complicated, so maybe the 2nd method would be
                better. I'd love to hear opinions though.)</div>
            </blockquote>
            <div><br>
            </div>
            <div>
              <div>Because
                it's possible to retrieve the functions belonging to a
                slot I think we can avoid such bookkeeping (of course,
                my idea can be totally wrong) :D</div>
              <div><br>
              </div>
              <div>
                <div>(qemu) info pci</div>
                <div>...</div>
                <div>  Bus  0, device   6,
                  function 0:</div>
                <div>    Class 1920: PCI device
                  8086:9c3a</div>
                <div>      IRQ 11.</div>
                <div>      BAR0: 64 bit memory at
                  0x40000000 [0x4000001f].</div>
                <div>      id ""</div>
                <div>  Bus  0, device   6,
                  function 3:</div>
                <div>    Serial port: PCI device
                  8086:9c3d</div>
                <div>      IRQ 6.</div>
                <div>      BAR0: I/O at 0x1000
                  [0x1007].</div>
                <div>      BAR1: 32 bit memory at
                  0x40001000 [0x40001fff].</div>
                <div>      id ""</div>
                <div><br>
                </div>
                <div>Based on my code above,
                  the function device_to_be_detached() could return the
                  struct with slot[functions] filled in from this qemu info.</div>
              </div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br></span>
    It's not that simple. You need to keep track of which devices you've
    told qemu to detach that qemu hasn't yet informed you were
    successfully detached. Also, if we allow it in steps (libvirt
    accepts attach/detach for multiple functions followed by a "make it
    so!" command), the info about pending attach/detach sets would need
    to be maintained.<br>
    <br>
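    As a rough illustration of the bookkeeping described above, here is a minimal sketch of a table of detach requests that have been sent but not yet confirmed, serialized so that it could survive a daemon restart. The class, its methods, and the alias names are all hypothetical; it does not reflect libvirt's actual data structures.<br>

```python
import json


class PendingDetach:
    """Tracks detach requests sent to qemu that have not been confirmed yet."""

    def __init__(self):
        self.pending = {}  # device alias: guest address string such as "08.3"

    def request(self, alias, guest_addr):
        # Record the request before talking to qemu, so a crash cannot lose it.
        self.pending[alias] = guest_addr

    def confirm(self, alias):
        # Called once qemu reports that the device is really gone.
        return self.pending.pop(alias, None)

    def slot_is_busy(self, slot):
        # A slot cannot be reused while any of its functions is still pending.
        return any(addr.split(".")[0] == slot for addr in self.pending.values())

    def dump(self):
        # Serialized form that a daemon could write to its state directory.
        return json.dumps(self.pending, sort_keys=True)

    @classmethod
    def load(cls, text):
        state = cls()
        state.pending = json.loads(text)
        return state


state = PendingDetach()
state.request("hostdev0", "08.0")
state.request("hostdev1", "08.3")
saved = state.dump()                  # persisted before the daemon stops
restored = PendingDetach.load(saved)  # reloaded after a restart
restored.confirm("hostdev0")
print(restored.slot_is_busy("08"))    # True: function 3 is still pending
restored.confirm("hostdev1")
print(restored.slot_is_busy("08"))    # False: the whole slot is free again
```

    The point of the sketch is only that the pending set is separate state: it cannot be reconstructed from "info pci" alone, because qemu still lists a device until the guest has released it.<br>
    <br>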
    You should probably spend some time looking at
    src/qemu/qemu_hotplug.c, src/util/virhostdev.c, and src/util/virpci.c before
    jumping to a lot of conclusions :-)<div><div class="h5"><br>
    <br>
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div>
              <div>
                <div><br>
                </div>
              </div>
              <div>Thank you for your time and advice. I'm starting to
                look into it and will let you know how it goes. My irc
                nickname is #ziviani.</div>
              <br>
            </div>
            <div><br>
            </div>
            <div> </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
              <div text="#000000" bgcolor="#FFFFFF">
                <div>
                  <div><br>
                    <br>
                    <blockquote type="cite">
                      <div dir="ltr">
                        <div><br>
                        </div>
                        <div>Thank
                          you :)</div>
                      </div>
                      <div class="gmail_extra"><br>
                        <div class="gmail_quote">On Mon, Dec 21, 2015 at
                          3:53 PM, Laine Stump <<a href="mailto:laine@laine.org" target="_blank">laine@laine.org</a>>
                          wrote:<br>
                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
                            <div text="#000000" bgcolor="#FFFFFF">
                              <div>
                                <div>
                                  <div>On 12/21/2015 08:29 AM, Ziviani .
                                    wrote:<br>
                                  </div>
                                  <blockquote type="cite">
                                    <div dir="ltr">
                                      <div>Hello list!</div>
                                      <div><br>
                                      </div>
                                      <div>I'm new here and interested
                                        in hot-plugging multi-function PCI
                                        devices. Basically, I'd like to
                                        know why Libvirt does not
                                        support it. I've been through
                                        the archives and the main thing
                                        I found was this thread:</div>
                                      <div><br>
                                      </div>
                                      <div><a href="https://www.redhat.com/archives/libvir-list/2011-May/msg00457.html" target="_blank">https://www.redhat.com/archives/libvir-list/2011-May/msg00457.html</a><br>
                                      </div>
                                      <div><br>
                                      </div>
                                      <div>But Qemu seems to handle it
                                        correctly:</div>
                                      <div>
                                        <div>virsh qemu-monitor-command
                                          --hmp fedora-23 'device_add
                                          vfio-pci,host=00:16.0,addr=08.0'</div>
                                        <div>virsh qemu-monitor-command
                                          --hmp fedora-23 'device_add
                                          vfio-pci,host=00:16.3,addr=08.3'</div>
                                        <div><br>
                                        </div>
                                        <div>GUEST:</div>
                                        <div>
                                          <div># lspci</div>
                                          <div>(snip)</div>
                                          <div>00:08.0 Communication
                                            controller: Intel
                                            Corporation 8 Series HECI #0
                                            (rev 04)<br>
                                          </div>
                                          <div>00:08.3 Serial
                                            controller: Intel
                                            Corporation 8 Series HECI KT
                                            (rev 04)</div>
                                          <div><br>
                                          </div>
                                        </div>
                                        <div>However, using Libvirt:<br>
                                        </div>
                                        <div><br>
                                        </div>
                                        <div>
                                          <div>% virsh attach-device
                                            fedora-23
                                            pci_0000_00_16_0.xml --live</div>
                                          <div>Device attached
                                            successfully</div>
                                          <div><br>
                                          </div>
                                          <div>% virsh attach-device
                                            fedora-23
                                            pci_0000_00_16_3.xml --live<br>
                                          </div>
                                          <div>error: Failed to attach
                                            device from
                                            pci_0000_00_16_3.xml</div>
                                          <div>error: internal error:
                                            Only PCI device addresses
                                            with function=0 are
                                            supported</div>
                                          <div><br>
                                          </div>
                                          <div>I made some changes
                                            to domain_addr.c[1] for
                                            testing and it worked.</div>
                                          <div><br>
                                          </div>
                                          <div>[1]<a href="https://gist.github.com/jrziviani/1da184c7fd0b413e0426" target="_blank">https://gist.github.com/jrziviani/1da184c7fd0b413e0426</a></div>
                                          <div><br>
                                          </div>
                                          <div>
                                            <div>% virsh attach-device
                                              fedora-23
                                              pci_0000_00_16_3.xml
                                              --live</div>
                                            <div>Device attached
                                              successfully</div>
                                          </div>
                                          <div><br>
                                          </div>
                                          <div>
                                            <div>GUEST:</div>
                                            <div>
                                              <div># lspci</div>
                                              <div>(snip)</div>
                                              <div>00:08.0 Communication
                                                controller: Intel
                                                Corporation 8 Series
                                                HECI #0 (rev 04)<br>
                                              </div>
                                              <div>00:08.3 Serial
                                                controller: Intel
                                                Corporation 8 Series
                                                HECI KT (rev 04)</div>
                                              <div><br>
                                              </div>
                                              <div>So is there more to
                                                it that I'm not aware of?</div>
                                            </div>
                                          </div>
                                        </div>
                                      </div>
                                    </div>
                                  </blockquote>
                                  <br>
                                </div>
                              </div>
                              You're relying on behavior in the guest OS
                              for which there is no standard (and which,
                              by definition, doesn't work on real
                              hardware, so no guest OS will be expecting
                              it; a friend more familiar with this has
                              told me that probably qemu is sending an
                              (acpi?) "device check" to the guest for
                              each function that is added, and in your
                              case it's apparently "doing the right
                              thing" in response to that). But just
                              because it is successful in this one case
                              doesn't mean that it will be successful in
                              all situations; likely it won't be. So
                              while the qemu monitor takes the
                              laissez-faire approach of allowing you to
                              try it and letting you pick up the pieces
                              when it fails, libvirt prevents it because
                              it is bound to fail, and thus not
                              supportable.<br>
                              <br>
                              There has recently been some work in qemu
                              to "save up" any requests to attach
                              devices with function > 0, then present
                              them all to the guest at once when
                              function 0 is attached. This is the only
                              standard way to handle hotplug of multiple
                              functions in a slot. Hot unplug can only
                              happen for all functions in the slot at
                              once. I'm not sure of the current status
                              of that work, but once it is in and
                              stable, libvirt will support it.<br>
                              <br>
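                              The "save up, then present on function 0" behaviour can be pictured with a toy model of a single guest slot. This is only a sketch under assumed semantics; the class and method names are hypothetical and it is not qemu's implementation.<br>

```python
class GuestSlot:
    """Toy model of one guest PCI slot with deferred multifunction hotplug."""

    def __init__(self):
        self.queued = {}   # function number: host device, not yet visible
        self.visible = {}  # function number: host device, seen by the guest

    def attach(self, function, host_device):
        if self.visible:
            raise RuntimeError("slot already plugged; unplug it first")
        self.queued[function] = host_device
        if function == 0:
            # Function 0 is the trigger: expose every queued function at once.
            self.visible, self.queued = self.queued, {}
        return sorted(self.visible)

    def unplug(self):
        # Hot unplug is all-or-nothing for the slot.
        removed = sorted(self.visible)
        self.visible = {}
        return removed


slot = GuestSlot()
print(slot.attach(3, "00:16.3"))  # [] because nothing is visible yet
print(slot.attach(0, "00:16.0"))  # [0, 3] because both appear together
print(slot.unplug())              # [0, 3] because the whole slot goes away
```

                              In this model the guest never observes a slot with only a non-zero function populated, which is the property the standard hotplug path relies on.<br>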
                              <blockquote type="cite">
                                <div dir="ltr">
                                  <div>
                                    <div>
                                      <div><br>
                                      </div>
                                      <div>Thank you!<br>
                                      </div>
                                    </div>
                                  </div>
                                </div>
                                <br>
                                <span>
                                  <pre>--
libvir-list mailing list
<a href="mailto:libvir-list@redhat.com" target="_blank">libvir-list@redhat.com</a>
<a href="https://www.redhat.com/mailman/listinfo/libvir-list" target="_blank">https://www.redhat.com/mailman/listinfo/libvir-list</a></pre>
                                </span></blockquote>
                              <br>
                            </div>
                          </blockquote>
                        </div>
                        <br>
                      </div>
                    </blockquote>
                    <br>
                  </div>
                </div>
              </div>
            </blockquote>
          </div>
          <br>
        </div>
      </div>
    </blockquote>
    <br>
  </div></div></div>

</blockquote></div><br></div></div>