[RFC] exposing 'nodedev assigned to domain' info to users
Daniel Henrique Barboza
danielhb413 at gmail.com
Wed Jan 6 17:24:35 UTC 2021
On 1/6/21 8:13 AM, Erik Skultety wrote:
> On Wed, Jan 06, 2021 at 08:00:52AM -0300, Daniel Henrique Barboza wrote:
>> On 1/6/21 7:09 AM, Daniel P. Berrangé wrote:
>>> On Tue, Jan 05, 2021 at 05:18:13PM -0300, Daniel Henrique Barboza wrote:
>>>> This is something I've been thinking about after working on Gitlab issue
>>>> #72, and I decided to run it through the ML before hitting the code.
>>>> We don't have an easy way to retrieve the domain that is using a specific
>>>> hostdev. Let's say that I want to know which domain is using the PCI card
>>>> pci_0000_01_00_2. 'nodedev-dumpxml' will return the hardware/driver capabilities
>>>> of the device, such as IOMMU group, driver and so on, but will not inform
>>>> which domain is using the hostdev, if any. 'nodedev-list' will simply list
>>>> all nodedev names known to Libvirt, without outputting any other information.
>>>> IIUC, the only existing way I can reliably tell whether a hostdev is being
>>>> used by a domain, aside from recording the information myself
>>>> during domain definition of course, is to run 'virsh dumpxml <domain>' for each
>>>> running domain and match the nodedev name against the source.address
>>>> element of the XML.
>>>> When we consider SR-IOV devices that can have 28+ VFs each (and have lots of
>>>> fun caveats, like Gitlab #72 showed us), the ability to hot plug/unplug
>>>> hostdevs freely, and lots of running domains, it is clear that we're putting
>>>> considerable pressure on the upper layers (OVirt, or a poor human admin) to
>>>> keep track of the nodedevs each running domain is using. This is information
>>>> that we already have internally and can simply expose.
>>>> I have a few ideas to make this happen:
>>>> 1 - upgrade 'nodedev-list' to add an extra 'assigned to' column
>>>> This is the most straightforward way of exposing the info. A simple 'nodedev-list'
>>>> call can retrieve which domain is using which nodedev. To preserve the existing
>>>> usage we can add an "--show-assigned-domains" option to control whether we
>>>> will display this info.
>>> That would mean nodedev-list has to fetch XML for every running guest
>>> and parse and extract it. That's not a scalable solution.
>>>> 2 - add an '<assigned_to>' element in nodedev XML definition
>>>> I'm not a fan of exposing this in this particular XML because we would mix
>>>> host/hw related attributes with domain info. But it would be easier to pull
>>>> this off compared to (1), so I'm mentioning it for the record.
>>> This is similar to what we do for the nwfilter-binding and net-port XML
>>> where we have an <owner> element present.
>>> The complication here is that right now we don't ever touch the nodedev
>>> driver when doing host device assignment, and so don't especially want
>>> to introduce a dependency.
>> One possible alternative would be a new API that operates on hostdevs instead
>> of nodedevs. "hostdev-list" would list the devices assigned to any domain, as
>> opposed to "nodedev-list" that lists all nodedevs of the host. I'm not sure if this
>> differentiation between hostdev and nodedev (i.e. hostdev is a nodedev that is
>> assigned to a domain) would be clear enough to users though. We would need to
>> explain it more clearly in the docs.
> Wasn't this about the connection to the nodedev though? E.g. with mdevs we only
> have a UUID in the domain XML which doesn't tell you anything about the device
> nor its parent and you also can't take the uuid and try finding the
> corresponding nodedev entry for it (well, you can hack it so that you construct
> the resulting nodedev name). Maybe I'm just misunderstanding the use case.
The particular case I'm asking for comments on is related to PCI hostdevs (namely,
SR-IOV virtual functions) that might get removed from the host while being
assigned to a running domain. We don't support that (albeit I posted patches
that try to alleviate the issue in Libvirt), and at the same time we don't
provide easy tools for the user to check whether a specific hostdev is
assigned to a domain. The user must query the running domains to find out.
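To make that "query the running domains" workaround concrete, here is a rough
sketch of what an upper layer has to do today. It assumes the caller has already
collected the XML of each running domain (e.g. the output of 'virsh dumpxml'),
and that PCI nodedev names follow the pci_DDDD_BB_SS_F pattern seen in
pci_0000_01_00_2. The function names and the sample XML are made up for
illustration and are not part of any Libvirt API.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a trimmed domain XML with one PCI hostdev.
SAMPLE_DOMAIN_XML = """
<domain type='kvm'>
  <name>vm1</name>
  <devices>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x2'/>
      </source>
    </hostdev>
  </devices>
</domain>
"""


def pci_nodedev_names(domain_xml):
    """Return nodedev-style names for the PCI hostdevs found in one domain XML."""
    root = ET.fromstring(domain_xml)
    names = []
    for addr in root.findall("./devices/hostdev[@type='pci']/source/address"):
        # Address attributes are hex strings such as '0x01'; a missing
        # attribute is treated as 0 here (an assumption of this sketch).
        dom, bus, slot, func = (
            int(addr.get(key, "0"), 16)
            for key in ("domain", "bus", "slot", "function")
        )
        names.append("pci_%04x_%02x_%02x_%x" % (dom, bus, slot, func))
    return names


def find_owners(nodedev_name, domain_xmls):
    """domain_xmls maps domain name -> XML text; return the domains using the device."""
    return [
        name
        for name, xml_text in domain_xmls.items()
        if nodedev_name in pci_nodedev_names(xml_text)
    ]
```

Every lookup has to walk the XML of every running domain, which is exactly the
bookkeeping I'd like to spare the user from.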
About mdevs, isn't an mdev device created on demand via hypercalls to the physical
device driver, and thrown away after use, the only real device being the parent?
I'm not sure whether there is a use case/requirement for knowing the parent
nodedev device. The parent device can be retrieved via sysfs AFAIC.
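As a sketch of the sysfs lookup I have in mind: this assumes the usual kernel
layout where /sys/bus/mdev/devices/<uuid> is a symlink into the parent device's
directory, so the parent is the directory that contains the resolved path. The
helper name and the sysfs_root parameter are hypothetical; the parameter only
exists so the function can be pointed at a fake tree.

```python
import os


def mdev_parent(uuid, sysfs_root="/sys"):
    """Return the parent device name of an mdev, or None if the mdev is unknown."""
    link = os.path.join(sysfs_root, "bus", "mdev", "devices", uuid)
    if not os.path.exists(link):
        return None
    # Resolve the symlink; the mdev directory sits directly under its parent.
    real_path = os.path.realpath(link)
    return os.path.basename(os.path.dirname(real_path))
```

For an mdev created on a PCI device this would return something like the PCI
address of the parent, without involving the nodedev driver at all.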
>> Yet another alternative is a new API under "Device Commands". We already have
>> attach-device, detach-device and so on, so we might as well have a new "list-devices"
>> that does the deed. This fits with the "The following commands manipulate
>> devices associated to domains." claim that we make about this class of commands.