[libvirt PATCH 6/7] nodedev: Handle inactive mdevs with the same UUID

Jonathon Jongsma jjongsma at redhat.com
Tue Aug 3 19:55:47 UTC 2021

On Tue, Aug 3, 2021 at 6:24 AM Boris Fiuczynski <fiuczy at linux.ibm.com> wrote:
> On 8/2/21 5:30 PM, Jonathon Jongsma wrote:
> > On Fri, Jul 30, 2021 at 8:01 AM Boris Fiuczynski <fiuczy at linux.ibm.com> wrote:
> >>
> >> On 7/30/21 9:48 AM, Michal Prívozník wrote:
> >>> On 7/29/21 9:27 PM, Jonathon Jongsma wrote:
> >>>> On Thu, Jul 29, 2021 at 1:35 PM Boris Fiuczynski <fiuczy at linux.ibm.com> wrote:
> >>>>>
> >>>>> On 7/27/21 4:09 PM, Jonathon Jongsma wrote:
> >>>>>> On Tue, Jul 27, 2021 at 3:02 AM Michal Prívozník <mprivozn at redhat.com> wrote:
> >>>>>>>
> >>>>>>> On 7/27/21 12:08 AM, Jonathon Jongsma wrote:
> >>>>>>>> On Mon, Jul 26, 2021 at 9:47 AM Michal Prívozník <mprivozn at redhat.com> wrote:
> >>>>>>>>>
> >>>>>>>>> On 7/23/21 6:40 PM, Jonathon Jongsma wrote:
> >>>>>>>>>> Unfortunately, mdevctl supports defining more than one mdev with the
> >>>>>>>>>> same UUID as long as they have different parent devices. (Only one of
> >>>>>>>>>> these devices can be active at any given time).
> >>>>>>>>>>
> >>>>>>>>>> This means that we can't use the UUID alone as a way to uniquely
> >>>>>>>>>> identify mdev node devices. Append the parent address to ensure
> >>>>>>>>>> uniqueness. For example:
> >>>>>>>>>>
> >>>>>>>>>>        Before: mdev_88a6b868_46bd_4015_8e5b_26107f82da38
> >>>>>>>>>>        After:  mdev_88a6b868_46bd_4015_8e5b_26107f82da38_0000_00_02_0
> >>>>>>>>>>
> >>>>>>>>>> Related: https://bugzilla.redhat.com/show_bug.cgi?id=1979440
> >>>>>>>>>>
> >>>>>>>>>> Signed-off-by: Jonathon Jongsma <jjongsma at redhat.com>
> >>>>>>>>>> ---
> >>>>>>>>>>     src/node_device/node_device_driver.c                   | 3 ++-
> >>>>>>>>>>     src/node_device/node_device_udev.c                     | 2 +-
> >>>>>>>>>>     tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml | 8 ++++----
> >>>>>>>>>>     3 files changed, 7 insertions(+), 6 deletions(-)
> >>>>>>>>>
> >>>>>>>>> The patch looks good, but for some reason it leaves API breakage
> >>>>>>>>> aftertaste. But I guess we don't document anywhere what MDEV <name/>
> >>>>>>>>> looks like, do we? IOW - we are free to change it if needed.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> That is true -- it does have a bit of a bad aftertaste. As far as I
> >>>>>>>> know, we haven't documented or promised any specific naming format for
> >>>>>>>> mdevs. But it could certainly cause some pain for people that are
> >>>>>>>> using mdevs already, which might be reason enough to reject or adjust
> >>>>>>>> this patch. If a person already has assigned a mdev device to their vm
> >>>>>>>> with the old name, the name would change when upgrading libvirt. That
> >>>>>>>> could make the domain definition become invalid.
> >>>>>>>>
> >>>>>>>> But, we have to deal with this situation somehow. We don't have a
> >>>>>>>> choice but to handle the case where mdevctl might return multiple
> >>>>>>>> defined devices with the same UUID. I considered a couple of other
> >>>>>>>> approaches to handling this that I rejected, such as
> >>>>>>>> - only add a suffix to the second such device with the same UUID
> >>>>>>>>        - rejected because the name of a given device depends on the order
> >>>>>>>> in which it was observed and the presence of other devices. For
> >>>>>>>> example, if we have two devices: mdev_$UUID and mdev_$UUID_$PARENT,
> >>>>>>>> and the first device was undefined, then when libvirt was restarted,
> >>>>>>>> mdev_$UUID_$PARENT would change its name to mdev_$UUID because it is
> >>>>>>>> now the only device with that UUID
> >>>>>>>
> >>>>>>> Agreed, this would not make situation any better, in fact worse.
> >>>>>>>
> >>>>>>>> - name all *active* mdevs mdev_$UUID and use a different scheme for
> >>>>>>>> inactive, defined mdevs
> >>>>>>>>       - rejected because the device would change names when it changed
> >>>>>>>> from inactive to active.
> >>>>>>>
> >>>>>>> Yeah, this is equally horrible.
> >>>>>>>
> >>>>>>>> - ignore this situation because probably almost nobody uses multiple
> >>>>>>>> devices with the same UUID
> >>>>>>>>       - but do they?
> >>>>>>>
> >>>>>>> Frankly, I don't have enough experience with MDEVs, but since it's
> >>>>>>> possible to define an MDEV with an existing UUID, it is possible to
> >>>>>>> define it also for the same parent? I mean, if I'd have an MDEV capable
> >>>>>>> graphics card and I'd define two MDEVs with the same UUID they would
> >>>>>>> both have the same parent, wouldn't they? Because in that case,
> >>>>>>> appending PCI address to libvirt's name is not enough.
> >>>>>>
> >>>>>>
> >>>>>> Nope, it's the opposite in fact. mdevctl allows you to define two or
> >>>>>> more mdevs with the same UUID, but only if they have different
> >>>>>> parents. Only a single mdev with a given UUID can be activated at the
> >>>>>> same time. So among *active* devices, the UUID is indeed unique. But
> >>>>>> not among persistent device definitions. For defined devices, the UUID
> >>>>>> is guaranteed unique per parent.
> >>>>>>
> >>>>>> I discussed this with Alex a while ago and he thought that the reason
> >>>>>> that mdevctl supported this was to allow people to assign an mdev with
> >>>>>> a specific UUID to a vm, but that UUID could represent one of several
> >>>>>> (equivalent?) devices, depending on which parent device was present in
> >>>>>> the host or which device was instantiated. That feature doesn't seem
> >>>>>> very compelling to me, and Alex wasn't sure that anybody was actually
> >>>>>> using it that way. But the fact remains that mdevctl does allow people
> >>>>>> to configure things this way and it seems like we need to be prepared
> >>>>>> to deal with it.
> >>>>>
> >>>>>
> >>>>> I am questioning how a user is supposed to correlate the nodedev
> >>>>> representation with mdev_{uuid}_{parent} to a domain using the mdev like
> >>>>> <devices>
> >>>>>      <hostdev mode='subsystem' type='mdev' model='vfio-ccw'>
> >>>>>        <source>
> >>>>>          <address uuid='{uuid}'/>
> >>>>>        </source>
> >>>>>      </hostdev>
> >>>>> </devices>
> >>>>
> >>>> Sorry for potentially confusing the impact of this change with my
> >>>> previous comments. I had said that domains with mdevs assigned might
> >>>> become invalid if the nodedev name changes. But that's not actually
> >>>> true. The naming that we're discussing is only related to the name of
> >>>> the device within the libvirt nodedev driver. When adding a mediated
> >>>> device to a domain, you don't use this nodedev name, you use the UUID
> >>>> directly. This is similar to the way you add a physical passthrough
> >>>> PCI device to a domain using its PCI address rather than its
> >>>> 'pci_$domain_$bus_$slot_$fn' nodedev name.
> >>>>
> >>>>> Reading uuid it tells me its universal unique
> >>>>> (https://en.wikipedia.org/wiki/Universally_unique_identifier) but now
> >>>>> this is not true for definitions which breaks the term universal in my
> >>>>> small universe since for the defined state of an mdev it becomes
> >>>>> "parental" unique...
> >>>>
> >>>> Yes, that's true. Which is why I think this is a questionable feature. But:
> >>>> A) defined devices are only *potential* devices.This is not all that
> >>>> different from the fact that a user can manually instantiate a mdev on
> >>>> parent A with a specific UUID and then later instantiate a totally
> >>>> different mdev on parent B with the exact same UUID. One UUID refers
> >>>> to different devices at different times.
> >>>> B) the question we're discussing is not whether mdevctl should allow
> >>>> duplicate UUIDs or not. mdevctl *already* allows it, so we need to
> >>>> figure out how to deal with it in libvirt.
> >>>>
> >>>> I suppose one potential possibility might be to change upstream
> >>>> mdevctl to disallow duplicate UUIDs and then require this new mdevctl
> >>>> version for libvirt. But that's a discussion we'd have to have in
> >>>> upstream mdevctl first.
> >>>
> >>> Maybe we can start the discussion, but even in libvirt you are allowed
> >>> to have two guests with the same name and UUID if they are in different
> >>> drivers (e.g. one in QEMU and one in LXC).
> >>>
> >>> Meanwhile, I think these can be pushed as we don't have much options anyway.
> >>>
> >>> Michal
> >>>
> >>
> >> I think that we have at least one option.
> >> We could rethink the approach to create nodedev objects based on mdevctl
> >> mdev definitions which actually are not a usable host resource for domains.
> >
> > If I understand your suggestion, I don't think it solves any problems.
> > It seems to me that you're saying that libvirt should ignore all mdev
> > definitions from mdevctl if the parent does not (yet?) exist on the
> > host. (At least that's my interpretation of your phrase "definitions
> > which are not a usable host resource"). But it's still possible to
> > have definitions with duplicate UUIDs for devices which have parents
> > that *do* exist on the host but are simply not instantiated yet. If
> > I'm misinterpreting your comment, feel free to clarify.
> I used to think of the nodedev objects as libvirt showing the user its
> view of devices it can understand in the sysfs. This changed when mdev
> nodedev objects got created based on mdevctl's mediated device definition.
> These definition objects are different from the sysfs objects.

Yes, but in a similar way, a defined storage pool is different from an
active storage pool. Or a defined domain that is different from an
active domain. There is plenty of precedent in libvirt of this kind of

> One difference is state active/inactive which caused e.g. the
> introduction of the option "inactive" on list operation.
> When we now talk about an nodedev mdev object it could be
>   a) an instantiated mediated devices without mdev definition,
>   b) an instantiated mediated devices with mdev definition or
>   c) just an mdev definition,
> but just an instantiated mediated device [part of a) and b)] is actually
> a host resource one can use in a domain.
> So what I meant by "...not usable host resources" was the definition
> itself is not usable for a domain. The sysfs based host resources are.

Those three options are also true of pretty much every other libvirt
object as well. A domain can be transient or defined, active or
inactive. Same for pools. I can define a domain in libvirt that uses
devices that are not (currently) present on the host system. I can
define a storage pool that refers to a nonexistent directory. When I
attempt to instantiate these objects, libvirt will return an error,
but I am allowed to define objects that refer to unavailable host
resources. So all of those things mentioned above seem pretty
analogous to existing libvirt objects.

> I agree that mdev definitions and operations on these definitions have a
> value but I am getting the feeling that they do not fit so well into the
> nodedev objects especially with the new twist of the loss of identity
> uniqueness which results from my perspective in a very messy nodedev
> object space.

I don't think we've lost any identity uniqueness. It's just that we
initially chose a device name schema for mdevs that turned out to not
be unique when mdevctl is used as a backend.

Also, I don't think this nodedev object space is any more "messy" than
the other libvirt objects (e.g. pools and domains mentioned above).
Are you arguing that we should not provide a way to define new
persistent nodedevs in libvirt?

> There is one other thing I noticed. I can change an mdev definition
> after I initiated/started the mdev definition. This is kind of starting
> a domain and afterwards change/edit its persisted definition. So when
> dumping the xml one would need to be able to chose running/started or
> defined. This currently is not a big issue since there are not many
> actually no attributes that can be retrieved via mdevctl from a
> running/started mdev but it just adds to the disparity since currently
> the user of nodedev gets the mdev definition and running/started mdev in
> a merged nodedev object.

Again, this is not all that different from existing libvirt objects. I
can also edit a persistent definition of a storage pool after it is
started, for example. It's possible that there are some issues unique
to mdevs that I need to fix in this area, but those are just details.
In general it seems pretty consistent with other libvirt objects.


More information about the libvir-list mailing list