[libvirt] [PATCH 00/13] PCIe fixes + new PCI controllers w/RFC

Laine Stump laine at laine.org
Wed Jun 24 16:04:33 UTC 2015


I think the only votes were for option 1 and 4 (interesting how the only
ones that were chosen were those that I *didn't* pick personally :-).
See comments below. In the meantime the other issue Alex pointed out may
cause this to take a slightly different direction.

On 06/22/2015 02:44 PM, Laine Stump wrote:
> ===
> Idea 1: multiplex the meaning of the "model" attribute, so we currently have:
>
>   <controller type='pci' model='dmi-to-pci-bridge'/>
>
> which means "add an i82801b11-bridge device" and when we add a generic
> version of this type of controller, we would do it with something like:
>
>   <controller type='pci' model='generic-dmi-to-pci-bridge'/>
>
> and for another vendor's mythical controller:
>
>   <controller type='pci' model='xyz-dmi-to-pci-bridge'/>
>
> Cons: This will make for ugliness in switch statements where a new
> case will have to be added whenever a different controller with
> similar behavior/usage is supported. And it's generally not a good idea to
> have a single attribute be used for two different functions.

jtomko advocated this one because it would make it easier for an older
libvirt to notice an unsupported class+model of controller during a
migration attempt from a newer libvirt. An example would be if a newer
libvirt had a guest with

    <controller type='pci' model='xyz'/>

(where xyz is some qemu device that implements a dmi-to-pci-bridge)

That would fail on an attempt to migrate to older libvirt, but

   <controller type='pci' model='dmi-to-pci-bridge'/>

would succeed. On the other hand, if we did this:

   <controller type='pci' model='dmi-to-pci-bridge submodel='xyz'/>

that would succeed (and create the wrong device) because submodel would
be ignored.

Since there is currently no alternate implementation of a
dmi-to-pci-bridge available in qemu, and it will likely be awhile until
that happens, I think if we start filling out the XML now as:

   <controller type='pci' model='dmi-to-pci-bridge'
                          submodel='i82801b11-bridge'/>

by the time we get to the point that we do have an alternate 'xyz'
controller, any version of libvirt that doesn't understand "submodel"
will be so far in the past that we wouldn't want to be able to migrate
back to it anyway.


> ===
> Idea 4: Unlike other uses of "model" in libvirt, for pci controllers,
> continue to use "model" to denote the subtype/class/whatever of
> controller, and create a new attribute to denote the different
> specific implementations of each one. So for example:
>
> [4]  <controller type='pci' model='dmi-to-pci-bridge'/>
>
> would become:
>
> [5]  <controller type='pci' model='dmi-to-pci-bridge'
>                             implementation='i82801b11-bridge'/>
>
> (or some other name in place of "implementation" - ideas? I'm horrible
> at thinkin up names)
>
> Pros: wouldn't create compatibility problems when downgrading or
> migrating cross version.
>
> Cons: Is inconsistent with every other use of "model" attribute in
> libvirt, and each new addition of a PCI controller further propagates
> this misuse.
>

mkletzan preferred this one, and danpb was amenable to it in IRC. I
think I now agree, but Alex's comments about needing to keep track of
what we put in the qemu commandline for port and chassis have me
thinking this isn't enough.

The problem is that the "ioh3420" device needs its "port" and "chassis"
options set; Alex recommends setting port to:

   (slot << 3)+function

Likewise, he recommends setting "port" for the xio3130-downstream device
to the same value as slot. So, for the following:

   <controller type='pci model='pci-root-port' index='3'>
     <address type='pci' bus='0' slot='4' function='1'>
   </controller>

we would end up with the following commandline:

   -device ioh3420,chassis=3,port=0x21,id='pcie.3',bus='pcie.0',
           addr=0x4.0x1

(chassis is the same as "index" in the xml (again per Alex's
recommendation), and port is (4 << 3) + 1 = 0x21)

*However* he also says that we may change our mind on these in the
future, so chassis and port may end up being something different. Since
those values are visible to the guest, we can't allow them to change on
an existing machine, as it breaks the guest ABI. So we need to store
them in the XML. They aren't part of the PCI address though, they are
options. So I need to figure out the best way to represent that in the
XML, and fill it in when it is auto-generated when the controller is
defined. A few possible ways to solve both problems at once:

[1] <controller type='pci model='pci-root-port' index='3'>
      <address type='pci' bus='0' slot='4' function='1'>
      <target type='ioh3420' chassis='3' port='0x21'/>
    </controller>

Precedent: <target> is used to store the port number for serial/console
devices, specify guest-side bus and device name for disks, specify
guest-side mount location for filesystems, specify the *host*-side name
of tap devices for network interfaces, memory size for <memory>. So it
seems kind of proper. (slight variation - <target model='ioh3420' .../>
:-/ )

[2] <controller type='pci model='pci-root-port' index='3'>
      <address type='pci' bus='0' slot='4' function='1'>
      <model type='ioh3420' chassis='3' port='0x21'/>
    </controller>

(How's *that* for confusing?!? Both a model attribute *and* a model
subelement.)

Precedent: This is exactly patterned after the use of <model> in <video>
devices though, where the specific card being emulated along with any
memory/etc options are put in <model>, so at least there is precedence)

[3] <controller type='pci model='pci-root-port' index='3'>
      <address type='pci' bus='0' slot='4' function='1'>
      <device type='ioh3420' chassis='3' port='0x21'/>
    </controller>

(This one is completely new, just for a fresh start in case people think
neither of the others are exactly right)

At this point you can see that it all comes down to what name we want to
give the subelement; within that, we give the exact name of the qemu
device, and the exact name/value of any qemu options that we set to
non-default values.

Somebody *please* have an opinion on the name for this, because none of
these strikes me as better or worse (well, I think I dislike <driver>



(P.S. There are other "automagic" qemu options being specified by
libvirt that maybe could use the same treatment as these. Two that come
to mind are chassis_nr for pci-bridge controllers (set to the
controller's index), and "vectors" for virtio-net multiqueue support
(set to (queues*2)+2).




More information about the libvir-list mailing list