[libvirt] [PATCH 00/13] PCIe fixes + new PCI controllers w/RFC

Laine Stump laine at laine.org
Wed Jun 24 01:06:24 UTC 2015


On 06/23/2015 11:57 AM, Ján Tomko wrote:
> On Mon, Jun 22, 2015 at 02:44:05PM -0400, Laine Stump wrote:
>> The first 4 patches are bugfixes/reorganizations that have no controversy.
>>
>> The sets of 5-7, 8-10, and 11-13 each implement a new model of PCI
>> controller:
>>
>>   5-7 - <controller type='pci' model='pcie-root-port'/>
>>         This is based on qemu's ioh3420.
>>
>>  8-10 - <controller type='pci' model='pcie-switch-upstream-port'/>
>>         Based on qemu's x3130-upstream
>>
>> 11-13 - <controller type='pci' model='pcie-switch-downstream-port'/>
>>         (xio3130-downstream)
>>
>> The first patch of each set adds a capability bit for qemu (again
>> non-controversial), the 2nd adds the new pci controller model, and the
>> 3rd implements that model in qemu (by checking for the capability and
>> adding a commandline arg or failing).
>>
>> The "controversial"/RFC bit is this - talking to Alex Williamson
>> (after I'd rwritten these patches, which is why I'm presenting them in
>> a form that I *don't* want to push) about the possibility of qemu
>> adding generic root-port, switch-upstream-port, and
>> switch-downstream-port controllers, and possibly also a generic
>> dmi-to-pci-bridge (i.e. controllers not tied to any particular
>> hardware implementation as with those currently available), I'm
>> realizing that, while it was a correct decision to make all of the
>> existing pci controllers "type='pci'" (since they share an address
>> space), using the "model" attribute to set the kind of controller was
>> probably a mistake. The problem - if:
>>
>>   <controller type='pci' model='dmi-to-pci-bridge'/>
>>
>> currently means to add an i82801b11-bridge controller to the domain,
>> once qemu implements a generic dmi-to-pci-bridge, how will *that* be
>> denoted, and how will we avoid replacing the existing i81801b11-bridge
>> in a particular domain with the generic version when a guest is
>> restarted following a qemu/libvirt upgrade?
>>
>> In hindsight, it probably would have been better to do something like
>> this with the four existing pci controllers:
>>
>>   <controller type='pci' subType='dmi-to-pci-bridge'
>>                          model='i82801b11-bridge'/>
>>   <controller type='pci' subType='pci-bridge'
>>                          model='pci-bridge'/> (or maybe blank?)
>>   <controller type='pci' subType='pci-root'/> (again maybe model is blank)
>>   <controller type='pci' subType='pcie-root'/>(and again)
>>
>> (instead, what is shown above as "subType" is currently placed in the "model" attribute).
> So far, just the dmi-to-pci-bridge does not match the actual device
> model used by qemu -- pci-bridge is generic and we don't format the
> -roots.

Not yet, but as far as I understand, the pxb is another kind of pci-root.

>
>> Then we could add the 3 new types like this:
>>
>>   <controller type='pci' subType='pcie-root-port' model='ioh3420'/>
>>   <controller type='pci' subType='pcie-switch-upstream-port'
>>                          model='x3130-upstream/>
>>   <controller type='pci' subType='pcie-switch-downstream-port'
>>                          model='xio3130-downstream/>
>>
>> and we would easily be able to add support for new generic controllers
>> that behaved identically, by just adding a new model.  But we haven't
>> done that, and anything we do in the future must be backwards
>> compatible with what's already there (right?). I'm trying to think of
>> how to satisfy backward compatibility while making things work better
>> in the future.
>>
>> Some ideas, in no particular order:
>>
>> ===
>> Idea 1: multiplex the meaning of the "model" attribute, so we currently have:
>>
>>   <controller type='pci' model='dmi-to-pci-bridge'/>
>>
>> which means "add an i82801b11-bridge device" and when we add a generic
>> version of this type of controller, we would do it with something like:
>>
>>   <controller type='pci' model='generic-dmi-to-pci-bridge'/>
>>
>> and for another vendor's mythical controller:
>>
>>   <controller type='pci' model='xyz-dmi-to-pci-bridge'/>
>>
> This could be just 'xyz'. We do not have <model type='rtl8193-pci-network-card'/>
> or <memballoon model='virtio-memballoon'/>

because in those cases the type of hardware is already well known and
the different models have a similar function on the guest (in form, if
not exactly in implementation)

>
> For the new ones, it would be model='ioh3420', 'x3130-upstream', etc.

I like having the XML give a clue to the exact function of the device
without needing to look up exactly what each specific device from some
random chipset does.

>
>> Cons: This will make for ugliness in switch statements where a new
>> case will have to be added whenever a different controller with
>> similar behavior/usage is supported. And it's generally not a good idea to
>> have a single attribute be used for two different functions.
>>
>> ===
>>
>> Idea 2: implement new controllers as suggested in "20/20 hindsight"
>> above. For controllers in existing domains (dmi-to-pci-bridge,
>> pic-bridge, pci-root, and pcie-root) imply it into the controller
>> definition of an existing domain when only model has been given (but
>> don't write it out that way, to preserve the ability to downgrade). So
>> this:
>>
>> [1]  <controller type='pci' model='dmi-to-pci-bridge'/>
>>
>> would internally mean this:
>>
>> [2]  <controller type='pci' subType='dmi-to-pci-bridge'
>>                             model='i82801b11-bridge'/>
>>
> If we want to allow forward migration, we need to treat a missing model
> as 'i82801b11-bridge' anyway.

Correct. Just as we set a network device model of rtl8139 for any
<interface> that has no model (because rtl8139 has always been the
default up until now, but may not be in the future), we would treat a
missing [whatever it is] as "i82801b11-bridge", and also encode it into
the XML whenever this domain is re-saved to disk.

>
>> (but would remain as [1] when config is rewritten/migrated) while
>> this:
>>
>> [3]  <controller type='pci' subType='dmi-to-pci-bridge'
>>                             model='anything whatsoever/>
>>
>> would mean exactly what it says.
>>
>> Cons: Keeping this straight would mean having some sort of
>> "oldStyleCompat" flag in the controller object, to be sure that [1]
>> wasn't sent in migration status as [2] (since the destination might
>> not recognize it). It would also mean keeping the code in the parser
>> and formatter to deal with this flag. Forever.
> So when would the 'subType' be formatted? If it's just internal, this
> seems to be identical with my suggestion to Idea #1.

The idea would be that when it is specified in the XML of define/create,
it will also be formatted. But I think I've soured on this idea anyway.

>
>> ===
>> Idea 3: interpret controllers with missing subType as above, but
>> actually write it out to the config/migration/etc in the new modified
>> format.
>>
>> Cons: This would prevent downgrading libvirt or migrating from a host
>> with newer libvirt to one with older libvirt. (Although preserving
>> compatibility at some level when downgrading may be a stated
>> requirement of some downstream distros' builds of libvirt, I think for
>> upstream it is only a "best effort"; I'm just not certain how much "best" is considered to be :-)
>>
> I do not know of any effort done to make downgrading libvirt work.

You haven't talked to enough people deploying RHEV/oVirt in production -
they want the ability to upgrade some nodes, migrate guests over to
them, then migrate the guests back if the upgrade needs to be undone.

> Any machine configs that use new values for old attributes will
> disappear (and so will running machines, because of new qemu
> capabilities).
>
> Migration is somewhat supported, so the compatible format could be used
> only when the model is default and VIR_DOMAIN_DEF_FORMAT_MIGRATABLE was
> specified?

Isn't VIR_DOMAIN_DEF_FORMAT_MIGRATABLE there in part so that migrations
from newer libvirt to older libvirt might work? (it doesn't guarantee
it, but it does make it work in some situations where it otherwise
wouldn't).

>
>> ===
>> Idea 4: Unlike other uses of "model" in libvirt, for pci controllers,
>> continue to use "model" to denote the subtype/class/whatever of
>> controller, and create a new attribute to denote the different
>> specific implementations of each one. So for example:
>>
>> [4]  <controller type='pci' model='dmi-to-pci-bridge'/>
>>
>> would become:
>>
>> [5]  <controller type='pci' model='dmi-to-pci-bridge'
>>                             implementation='i82801b11-bridge'/>
>>
>> (or some other name in place of "implementation" - ideas? I'm horrible
>> at thinkin up names)
>>
> device? actualModel? :)

hey, I think I might like "device"!

>
>> Pros: wouldn't create compatibility problems when downgrading or
>> migrating cross version.
>>
> If you tried to migrate to older libvirt with:
> model='dmi-to-pci-bridge' impl='generic', older libvirt would not parse
> the impl flag and create a machine with i8201b11-bridge. (Assuming the
> QEMU would know the machine type).

Well, yeah, this does point out a flaw in my thinking :-/

>
>> Cons: Is inconsistent with every other use of "model" attribute in
>> libvirt, and each new addition of a PCI controller further propagates
>> this misuse.
>>
> It is consistent with all the other inconsistencies in libvirt :)
>
>> ====
>>
>> I currently like either option 2 or 3 (depending on how good we want
>> to be about supporting downgrade/intra-version migration), but as is
>> obvious by the fact that it was me that suggested putting the type of
>> pci controller into "model", I am very good at making the wrong
>> decision on matters like this.
> My favorite would be #2 without the subType (AKA #1 without the
> controller type in new model names) - it is implied from the
> model anyway. A more structured XML might be user-frendlier, but this
> way even old libvirt will recognize that it cannot start the machine
> because of an incompatible model.
>
> This way only the dmi-to-pci-bridge -> i82801b11-bridge mapping would be
> odd.

I have to think about this some more...




More information about the libvir-list mailing list