[libvirt] [PATCH 00/13] PCIe fixes + new PCI controllers w/RFC

Martin Kletzander mkletzan at redhat.com
Thu Jun 25 06:44:17 UTC 2015


On Wed, Jun 24, 2015 at 12:04:33PM -0400, Laine Stump wrote:
>I think the only votes were for option 1 and 4 (interesting how the only
>ones that were chosen were those that I *didn't* pick personally :-).
>See comments below. In the meantime the other issue Alex pointed out may
>cause this to take a slightly different direction.
>
>On 06/22/2015 02:44 PM, Laine Stump wrote:
>> ===
>> Idea 1: multiplex the meaning of the "model" attribute, so we currently have:
>>
>>   <controller type='pci' model='dmi-to-pci-bridge'/>
>>
>> which means "add an i82801b11-bridge device" and when we add a generic
>> version of this type of controller, we would do it with something like:
>>
>>   <controller type='pci' model='generic-dmi-to-pci-bridge'/>
>>
>> and for another vendor's mythical controller:
>>
>>   <controller type='pci' model='xyz-dmi-to-pci-bridge'/>
>>
>> Cons: This will make for ugliness in switch statements where a new
>> case will have to be added whenever a different controller with
>> similar behavior/usage is supported. And it's generally not a good idea to
>> have a single attribute be used for two different functions.
>
>jtomko advocated this one because it would make it easier for an older
>libvirt to notice an unsupported class+model of controller during a
>migration attempt from a newer libvirt. An example would be if a newer
>libvirt had a guest with
>
>    <controller type='pci' model='xyz'/>
>
>(where xyz is some qemu device that implements a dmi-to-pci-bridge)
>
>That would fail on an attempt to migrate to older libvirt, but
>
>   <controller type='pci' model='dmi-to-pci-bridge'/>
>
>would succeed. On the other hand, if we did this:
>
>   <controller type='pci' model='dmi-to-pci-bridge submodel='xyz'/>
>
>that would succeed (and create the wrong device) because submodel would
>be ignored.
>
>Since there is currently no alternate implementation of a
>dmi-to-pci-bridge available in qemu, and it will likely be awhile until
>that happens, I think if we start filling out the XML now as:
>
>   <controller type='pci' model='dmi-to-pci-bridge'
>                          submodel='i82801b11-bridge'/>
>
>by the time we get to the point that we do have an alternate 'xyz'
>controller, any version of libvirt that doesn't understand "submodel"
>will be so far in the past that we wouldn't want to be able to migrate
>back to it anyway.
>
>
>> ===
>> Idea 4: Unlike other uses of "model" in libvirt, for pci controllers,
>> continue to use "model" to denote the subtype/class/whatever of
>> controller, and create a new attribute to denote the different
>> specific implementations of each one. So for example:
>>
>> [4]  <controller type='pci' model='dmi-to-pci-bridge'/>
>>
>> would become:
>>
>> [5]  <controller type='pci' model='dmi-to-pci-bridge'
>>                             implementation='i82801b11-bridge'/>
>>
>> (or some other name in place of "implementation" - ideas? I'm horrible
>> at thinkin up names)
>>
>> Pros: wouldn't create compatibility problems when downgrading or
>> migrating cross version.
>>
>> Cons: Is inconsistent with every other use of "model" attribute in
>> libvirt, and each new addition of a PCI controller further propagates
>> this misuse.
>>
>
>mkletzan preferred this one, and danpb was amenable to it in IRC. I
>think I now agree, but Alex's comments about needing to keep track of
>what we put in the qemu commandline for port and chassis have me
>thinking this isn't enough.
>
>The problem is that the "ioh3420" device needs its "port" and "chassis"
>options set; Alex recommends setting port to:
>
>   (slot << 3)+function
>
>Likewise, he recommends setting "port" for the xio3130-downstream device
>to the same value as slot. So, for the following:
>
>   <controller type='pci model='pci-root-port' index='3'>
>     <address type='pci' bus='0' slot='4' function='1'>
>   </controller>
>
>we would end up with the following commandline:
>
>   -device ioh3420,chassis=3,port=0x21,id='pcie.3',bus='pcie.0',
>           addr=0x4.0x1
>
>(chassis is the same as "index" in the xml (again per Alex's
>recommendation), and port is (4 << 3) + 1 = 0x21)
>
>*However* he also says that we may change our mind on these in the
>future, so chassis and port may end up being something different. Since
>those values are visible to the guest, we can't allow them to change on
>an existing machine, as it breaks the guest ABI. So we need to store
>them in the XML. They aren't part of the PCI address though, they are
>options. So I need to figure out the best way to represent that in the
>XML, and fill it in when it is auto-generated when the controller is
>defined. A few possible ways to solve both problems at once:
>
>[1] <controller type='pci model='pci-root-port' index='3'>
>      <address type='pci' bus='0' slot='4' function='1'>
>      <target type='ioh3420' chassis='3' port='0x21'/>
>    </controller>
>
>Precedent: <target> is used to store the port number for serial/console
>devices, specify guest-side bus and device name for disks, specify
>guest-side mount location for filesystems, specify the *host*-side name
>of tap devices for network interfaces, memory size for <memory>. So it
>seems kind of proper. (slight variation - <target model='ioh3420' .../>
>:-/ )
>
>[2] <controller type='pci model='pci-root-port' index='3'>
>      <address type='pci' bus='0' slot='4' function='1'>
>      <model type='ioh3420' chassis='3' port='0x21'/>
>    </controller>
>
>(How's *that* for confusing?!? Both a model attribute *and* a model
>subelement.)
>
>Precedent: This is exactly patterned after the use of <model> in <video>
>devices though, where the specific card being emulated along with any
>memory/etc options are put in <model>, so at least there is precedence)
>
>[3] <controller type='pci model='pci-root-port' index='3'>
>      <address type='pci' bus='0' slot='4' function='1'>
>      <device type='ioh3420' chassis='3' port='0x21'/>
>    </controller>
>
>(This one is completely new, just for a fresh start in case people think
>neither of the others are exactly right)
>
>At this point you can see that it all comes down to what name we want to
>give the subelement; within that, we give the exact name of the qemu
>device, and the exact name/value of any qemu options that we set to
>non-default values.
>
>Somebody *please* have an opinion on the name for this, because none of
>these strikes me as better or worse (well, I think I dislike <driver>
>

To be honest, I kinds dislike all of them.  Not that they would be
chosen poorly, no, it's simply because the good sensible choice is
unavailable due to another poor decision in the past (this may be
another point for Michal's talk on KVM Forum).  Thinking about it I
must say I don't like how target (which is supposed to match a place
where the device appears for the guest) is used for the model
specification, on the other hand (ab)using 'model' element for the
specification of an "address" in guest (that's what I understand
chassis and port are) doesn't feel any better.  What if we go with two
of those elements?  Would that be too much pain?  E.g.:

  <controller type='pci model='pci-root-port' index='3'>
    <address type='pci' bus='0' slot='4' function='1'>
    <model type='ioh3420'/>
    <target chassis='3' port='0x21'/>
  </controller>

I understand this might look like an overkill, but I think it's better
safe then sorry, I guess I just see us not so far in the future
regretting any decision made now.

Another idea I briefly thought about, which would also deal with the
migration to older libvirt versions was inventing a new address
type ... and then I realized how stupid^Wugly was that.

Anyway, I must say that I don't fully get the idea of how some of
these controllers are supposed to be specified and what particular HW
they should describe, so I might not had the best idea in town.
<rant> Mainly I don't understand why there's no pci switch.  Or is
<controller type='pci'/> actually a pci switch (I usderstand it like
that)?  Then why is there a downstream and upstream device when
{down,up}stream should just indicate whether the port aims towards the
root complex or the endpoints, respectively. </rant>

>
>
>(P.S. There are other "automagic" qemu options being specified by
>libvirt that maybe could use the same treatment as these. Two that come
>to mind are chassis_nr for pci-bridge controllers (set to the
>controller's index), and "vectors" for virtio-net multiqueue support
>(set to (queues*2)+2).

I guess 'chassis_nr' is something exposed to guest, similarly to
'chassis' in the pci-root-port case, so that might be worth saving in
the XML.  Maybe we can use 'chassis' in the target/model specification
for storing the chassis_nr as a level of abstraction on libvirt level.

But the number of vectors (to my knowledge) is not something exposed
to guest and is only advised to being set according to the formula you
wrote, above.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20150625/1a4d6f4d/attachment-0001.sig>


More information about the libvir-list mailing list