[libvirt PATCH 6/7] nodedev: Handle inactive mdevs with the same UUID

Boris Fiuczynski fiuczy at linux.ibm.com
Wed Aug 11 11:47:38 UTC 2021

On 8/3/21 9:55 PM, Jonathon Jongsma wrote:
> On Tue, Aug 3, 2021 at 6:24 AM Boris Fiuczynski <fiuczy at linux.ibm.com> wrote:
>> On 8/2/21 5:30 PM, Jonathon Jongsma wrote:
>>> On Fri, Jul 30, 2021 at 8:01 AM Boris Fiuczynski <fiuczy at linux.ibm.com> wrote:
>>>> On 7/30/21 9:48 AM, Michal Prívozník wrote:
>>>>> On 7/29/21 9:27 PM, Jonathon Jongsma wrote:
>>>>>> On Thu, Jul 29, 2021 at 1:35 PM Boris Fiuczynski <fiuczy at linux.ibm.com> wrote:
>>>>>>> On 7/27/21 4:09 PM, Jonathon Jongsma wrote:
>>>>>>>> On Tue, Jul 27, 2021 at 3:02 AM Michal Prívozník <mprivozn at redhat.com> wrote:
>>>>>>>>> On 7/27/21 12:08 AM, Jonathon Jongsma wrote:
>>>>>>>>>> On Mon, Jul 26, 2021 at 9:47 AM Michal Prívozník <mprivozn at redhat.com> wrote:
>>>>>>>>>>> On 7/23/21 6:40 PM, Jonathon Jongsma wrote:
>>>>>>>>>>>> Unfortunately, mdevctl supports defining more than one mdev with the
>>>>>>>>>>>> same UUID as long as they have different parent devices. (Only one of
>>>>>>>>>>>> these devices can be active at any given time).
>>>>>>>>>>>> This means that we can't use the UUID alone as a way to uniquely
>>>>>>>>>>>> identify mdev node devices. Append the parent address to ensure
>>>>>>>>>>>> uniqueness. For example:
>>>>>>>>>>>>         Before: mdev_88a6b868_46bd_4015_8e5b_26107f82da38
>>>>>>>>>>>>         After:  mdev_88a6b868_46bd_4015_8e5b_26107f82da38_0000_00_02_0
>>>>>>>>>>>> Related: https://bugzilla.redhat.com/show_bug.cgi?id=1979440
>>>>>>>>>>>> Signed-off-by: Jonathon Jongsma <jjongsma at redhat.com>
>>>>>>>>>>>> ---
>>>>>>>>>>>>      src/node_device/node_device_driver.c                   | 3 ++-
>>>>>>>>>>>>      src/node_device/node_device_udev.c                     | 2 +-
>>>>>>>>>>>>      tests/nodedevmdevctldata/mdevctl-list-multiple.out.xml | 8 ++++----
>>>>>>>>>>>>      3 files changed, 7 insertions(+), 6 deletions(-)
>>>>>>>>>>> The patch looks good, but for some reason it leaves API breakage
>>>>>>>>>>> aftertaste. But I guess we don't document anywhere what MDEV <name/>
>>>>>>>>>>> looks like, do we? IOW - we are free to change it if needed.
>>>>>>>>>> That is true -- it does have a bit of a bad aftertaste. As far as I
>>>>>>>>>> know, we haven't documented or promised any specific naming format for
>>>>>>>>>> mdevs. But it could certainly cause some pain for people that are
>>>>>>>>>> using mdevs already, which might be reason enough to reject or adjust
>>>>>>>>>> this patch. If a person already has assigned a mdev device to their vm
>>>>>>>>>> with the old name, the name would change when upgrading libvirt. That
>>>>>>>>>> could make the domain definition become invalid.
>>>>>>>>>> But, we have to deal with this situation somehow. We don't have a
>>>>>>>>>> choice but to handle the case where mdevctl might return multiple
>>>>>>>>>> defined devices with the same UUID. I considered a couple of other
>>>>>>>>>> approaches to handling this that I rejected, such as
>>>>>>>>>> - only add a suffix to the second such device with the same UUID
>>>>>>>>>>         - rejected because the name of a given device depends on the order
>>>>>>>>>> in which it was observed and the presence of other devices. For
>>>>>>>>>> example, if we have two devices: mdev_$UUID and mdev_$UUID_$PARENT,
>>>>>>>>>> and the first device was undefined, then when libvirt was restarted,
>>>>>>>>>> mdev_$UUID_$PARENT would change its name to mdev_$UUID because it is
>>>>>>>>>> now the only device with that UUID
>>>>>>>>> Agreed, this would not make situation any better, in fact worse.
>>>>>>>>>> - name all *active* mdevs mdev_$UUID and use a different scheme for
>>>>>>>>>> inactive, defined mdevs
>>>>>>>>>>        - rejected because the device would change names when it changed
>>>>>>>>>> from inactive to active.
>>>>>>>>> Yeah, this is equally horrible.
>>>>>>>>>> - ignore this situation because probably almost nobody uses multiple
>>>>>>>>>> devices with the same UUID
>>>>>>>>>>        - but do they?
>>>>>>>>> Frankly, I don't have enough experience with MDEVs, but since it's
>>>>>>>>> possible to define an MDEV with an existing UUID, it is possible to
>>>>>>>>> define it also for the same parent? I mean, if I'd have an MDEV capable
>>>>>>>>> graphics card and I'd define two MDEVs with the same UUID they would
>>>>>>>>> both have the same parent, wouldn't they? Because in that case,
>>>>>>>>> appending PCI address to libvirt's name is not enough.
>>>>>>>> Nope, it's the opposite in fact. mdevctl allows you to define two or
>>>>>>>> more mdevs with the same UUID, but only if they have different
>>>>>>>> parents. Only a single mdev with a given UUID can be activated at the
>>>>>>>> same time. So among *active* devices, the UUID is indeed unique. But
>>>>>>>> not among persistent device definitions. For defined devices, the UUID
>>>>>>>> is guaranteed unique per parent.
>>>>>>>> I discussed this with Alex a while ago and he thought that the reason
>>>>>>>> that mdevctl supported this was to allow people to assign an mdev with
>>>>>>>> a specific UUID to a vm, but that UUID could represent one of several
>>>>>>>> (equivalent?) devices, depending on which parent device was present in
>>>>>>>> the host or which device was instantiated. That feature doesn't seem
>>>>>>>> very compelling to me, and Alex wasn't sure that anybody was actually
>>>>>>>> using it that way. But the fact remains that mdevctl does allow people
>>>>>>>> to configure things this way and it seems like we need to be prepared
>>>>>>>> to deal with it.
>>>>>>> I am questioning how a user is supposed to correlate the nodedev
>>>>>>> representation with mdev_{uuid}_{parent} to a domain using the mdev like
>>>>>>> <devices>
>>>>>>>       <hostdev mode='subsystem' type='mdev' model='vfio-ccw'>
>>>>>>>         <source>
>>>>>>>           <address uuid='{uuid}'/>
>>>>>>>         </source>
>>>>>>>       </hostdev>
>>>>>>> </devices>
>>>>>> Sorry for potentially confusing the impact of this change with my
>>>>>> previous comments. I had said that domains with mdevs assigned might
>>>>>> become invalid if the nodedev name changes. But that's not actually
>>>>>> true. The naming that we're discussing is only related to the name of
>>>>>> the device within the libvirt nodedev driver. When adding a mediated
>>>>>> device to a domain, you don't use this nodedev name, you use the UUID
>>>>>> directly. This is similar to the way you add a physical passthrough
>>>>>> PCI device to a domain using its PCI address rather than its
>>>>>> 'pci_$domain_$bus_$slot_$fn' nodedev name.
>>>>>>> Reading uuid it tells me its universal unique
>>>>>>> (https://en.wikipedia.org/wiki/Universally_unique_identifier) but now
>>>>>>> this is not true for definitions which breaks the term universal in my
>>>>>>> small universe since for the defined state of an mdev it becomes
>>>>>>> "parental" unique...
>>>>>> Yes, that's true. Which is why I think this is a questionable feature. But:
>>>>>> A) defined devices are only *potential* devices.This is not all that
>>>>>> different from the fact that a user can manually instantiate a mdev on
>>>>>> parent A with a specific UUID and then later instantiate a totally
>>>>>> different mdev on parent B with the exact same UUID. One UUID refers
>>>>>> to different devices at different times.
>>>>>> B) the question we're discussing is not whether mdevctl should allow
>>>>>> duplicate UUIDs or not. mdevctl *already* allows it, so we need to
>>>>>> figure out how to deal with it in libvirt.
>>>>>> I suppose one potential possibility might be to change upstream
>>>>>> mdevctl to disallow duplicate UUIDs and then require this new mdevctl
>>>>>> version for libvirt. But that's a discussion we'd have to have in
>>>>>> upstream mdevctl first.
>>>>> Maybe we can start the discussion, but even in libvirt you are allowed
>>>>> to have two guests with the same name and UUID if they are in different
>>>>> drivers (e.g. one in QEMU and one in LXC).
>>>>> Meanwhile, I think these can be pushed as we don't have much options anyway.
>>>>> Michal
>>>> I think that we have at least one option.
>>>> We could rethink the approach to create nodedev objects based on mdevctl
>>>> mdev definitions which actually are not a usable host resource for domains.
>>> If I understand your suggestion, I don't think it solves any problems.
>>> It seems to me that you're saying that libvirt should ignore all mdev
>>> definitions from mdevctl if the parent does not (yet?) exist on the
>>> host. (At least that's my interpretation of your phrase "definitions
>>> which are not a usable host resource"). But it's still possible to
>>> have definitions with duplicate UUIDs for devices which have parents
>>> that *do* exist on the host but are simply not instantiated yet. If
>>> I'm misinterpreting your comment, feel free to clarify.
>> I used to think of the nodedev objects as libvirt showing the user its
>> view of devices it can understand in the sysfs. This changed when mdev
>> nodedev objects got created based on mdevctl's mediated device definition.
>> These definition objects are different from the sysfs objects.
> Yes, but in a similar way, a defined storage pool is different from an
> active storage pool. Or a defined domain that is different from an
> active domain. There is plenty of precedent in libvirt of this kind of
> distinction.
>> One difference is state active/inactive which caused e.g. the
>> introduction of the option "inactive" on list operation.
>> When we now talk about an nodedev mdev object it could be
>>    a) an instantiated mediated devices without mdev definition,
>>    b) an instantiated mediated devices with mdev definition or
>>    c) just an mdev definition,
>> but just an instantiated mediated device [part of a) and b)] is actually
>> a host resource one can use in a domain.
>> So what I meant by "...not usable host resources" was the definition
>> itself is not usable for a domain. The sysfs based host resources are.
> Those three options are also true of pretty much every other libvirt
> object as well. A domain can be transient or defined, active or
> inactive. Same for pools. I can define a domain in libvirt that uses
> devices that are not (currently) present on the host system. I can
> define a storage pool that refers to a nonexistent directory. When I
> attempt to instantiate these objects, libvirt will return an error,
> but I am allowed to define objects that refer to unavailable host
> resources. So all of those things mentioned above seem pretty
> analogous to existing libvirt objects.
>> I agree that mdev definitions and operations on these definitions have a
>> value but I am getting the feeling that they do not fit so well into the
>> nodedev objects especially with the new twist of the loss of identity
>> uniqueness which results from my perspective in a very messy nodedev
>> object space.
> I don't think we've lost any identity uniqueness. It's just that we
> initially chose a device name schema for mdevs that turned out to not
> be unique when mdevctl is used as a backend.
> Also, I don't think this nodedev object space is any more "messy" than
> the other libvirt objects (e.g. pools and domains mentioned above).
> Are you arguing that we should not provide a way to define new
> persistent nodedevs in libvirt?
>> There is one other thing I noticed. I can change an mdev definition
>> after I initiated/started the mdev definition. This is kind of starting
>> a domain and afterwards change/edit its persisted definition. So when
>> dumping the xml one would need to be able to chose running/started or
>> defined. This currently is not a big issue since there are not many
>> actually no attributes that can be retrieved via mdevctl from a
>> running/started mdev but it just adds to the disparity since currently
>> the user of nodedev gets the mdev definition and running/started mdev in
>> a merged nodedev object.
> Again, this is not all that different from existing libvirt objects. I
> can also edit a persistent definition of a storage pool after it is
> started, for example. It's possible that there are some issues unique
> to mdevs that I need to fix in this area, but those are just details.
> In general it seems pretty consistent with other libvirt objects.
> Jonathon

My arguing was based on looking at the node device driver.
I know that the concepts exist elsewhere in libvirt but apparently not 
in the node device driver. I always though that the reason for it is/was 
that it is based on sysfs.

Mit freundlichen Grüßen/Kind regards
    Boris Fiuczynski

IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Gregor Pillen
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

More information about the libvir-list mailing list