[libvirt] mdevctl: A shoestring mediated device management and persistence utility

Sylvain Bauza sbauza at redhat.com
Wed Jun 19 12:48:02 UTC 2019


On Wed, Jun 19, 2019 at 11:57 AM Cornelia Huck <cohuck at redhat.com> wrote:

> On Wed, 19 Jun 2019 11:04:15 +0200
> Sylvain Bauza <sbauza at redhat.com> wrote:
>
> > On Wed, Jun 19, 2019 at 12:27 AM Alex Williamson <
> alex.williamson at redhat.com>
> > wrote:
> >
> > > On Tue, 18 Jun 2019 14:48:11 +0200
> > > Sylvain Bauza <sbauza at redhat.com> wrote:
> > >
> > > > On Tue, Jun 18, 2019 at 1:01 PM Cornelia Huck <cohuck at redhat.com>
> wrote:
>
> > > > > I think we need to reach consensus about the actual scope of the
> > > > > mdevctl tool.
> > > > >
> > > > >
> > > > Thanks Cornelia, my thoughts:
> > > >
> > > > - Is it supposed to be responsible for managing *all* mdev devices
> in
> > > > >   the system, or is it more supposed to be a convenience helper for
> > > > >   users/software wanting to manage mdevs?
> > > > >
> > > >
> > > > The latter. If an operator (or some software) wants to create mdevs
> by
> > > not
> > > > using mdevctl (and rather directly calling the sysfs), I think it's
> OK.
> > > > That said, mdevs created by mdevctl would be supported by
> systemctl,
> > > while
> > > > the others not but I think it's okay.
> > >
> > > I agree (sort of), and I'm hearing that we should drop any sort of
> > > automatic persistence of mdevs created outside of mdevctl.  The problem
> > > comes when we try to draw the line between unmanaged and manged
> > > devices.  For instance, if we have a command to list mdevs it would
> > > feel incomplete if it didn't list all mdevs both those managed by
> > > mdevctl and those created elsewhere.  For managed devices, I expect
> > > we'll also have commands that allow the mode of the device to be
> > > switched between transient, saved, and persistent.  Should a user then
> > > be allowed to promote an unmanaged device to one of these modes via the
> > > same command?  Should they be allowed to stop an unmanaged device
> > > through driverctl?  Through systemctl?  These all seem like reasonable
> > > things to do, so what then is the difference between transient and
> > > unmanaged mdev and is mdevctl therefore managing all mdevs, not just
> > > those it has created?
> > >
> > >
> > Well, IMHO, mdevs created by mdevctl could all be persisted or transient
> > just by adding an option when calling mdevctl, like :
> > "mdevctl create-mdev [--transient] <uuid> <pci_id> <type>" where default
> > would be persisting the mdev.
>
> This sounds useful; the caller can avoid fiddling with sysfs entries
> directly, while not committing to a permanent configuration.
>
> >
> > For mdevs *not* created by mdevctl, then a usecase could be "I'd like to
> > ask mdevctl to manage mdevs I already created" and if so, a mdevctl
> command
> > like :
> > "mdevctl manage-mdev [--transient] <mdev_uuid>"
>
> What kind of 'managing' would this actually enable? If we rely on
> mdevctl working with sysfs directly for transient devices, I can't
> really think of anything...
>
>
Just for getting the list of mdevs and see whether they are persistent.

>
> > Of course, that would mean that when you list mdevs by "mdev list-all"
> you
> > wouldn't get mdevs managed by mdevctl.
> > Thoughts ?
> >
> > > - Do we want mdevctl to manage config files for individual mdevs, or
> > > > >   are they supposed to be in a common format that can also be
> managed
> > > > >   by e.g. libvirt?
> > > > >
> > > >
> > > > Unless I misunderstand, I think mdevctl just helps to create mdevs
> for
> > > > being used by guests created either by libvirt or QEMU or even
> others.
> > > > How a guest would allocate a mdev (ie. saying "I'll use this
> specific
> > > mdev
> > > > UUID") is IMHO not something for mdevctl.
> > >
> > > Right, mdevctl isn't concerned with how a specific mdev is used, but I
> > > think what Connie is after is more the proposal from Daniel where
> > > libvirt can essentially manage mdevctl config files itself and then
> > > only invoke mdevctl for the dirty work of creating and deleting
> > > devices.  In fact, assuming systemd, libvirt could avoid direct
> > > interaction with mdevctl entirely, instead using systemctl device units
> > > to start and stop the mdevs.  Maybe where that proposal takes a turn is
> > > when we again consider non-systemd hosts, where maybe mdevctl needs to
> > > write out an init script per mdev and libvirt injecting itself into
> > > manipulation of the config files would either need to perform the same
> > > or fall back to mdevctl.  Unfortunately there seems to be an ultimatum
> > > to either condone external config file manipulation or expand the scope
> > > of the project into becoming a library.
> > >
> > >
> > Well, like I said, I think it's maybe another user case : just using
> > libvirt when you want a guest having vGPUs and then libvirt would create
> > mdevs (so users wouln't need to know at that).
> > That said, for the moment, I think we don't really need it so maybe a new
> > RFE once we at least have mdevctl packaged and supported by RHEL ?
>
> If we allow config file handling directly, libvirt could start using it
> even without mdevctl present? (Not sure if that makes sense.)
>
>
Well, sure.

>
> >
> > > - Should mdevctl be a stand-alone tool, provide library functions, or
> > > > >   both? Related: should it keep any internal state that is not
> written
> > > > >   to disk? (I think that also plays into the transient vs.
> persistent
> > > > >   question.)
> > >
> > > I don't think we want an mdevctld, if that's what you mean by internal
>
> Yeah, mdevctld--.
>
> > > state not written to disk.  I think we ideally want all state in the
> > > mdev config files or discerned through sysfs.  How we handle
> > > non-systemd hosts may throw a wrench in that though since currently the
> > > systemd integration relies on a template to support arbitrary mdevs and
> > > I'm not sure how to replicate that in other init services.  If we need
> > > to dynamically manage per mdev init files in addition to config files,
> > > we're not so self contained.
> > >
> > > > FWIW, I'd love using mdevctl for OpenStack (Nova) just at least for
> > > > creating persisted mdevs (ie. mdevs that would be recreated after
> > > rebooting
> > > > using systemctl). That's the real use case I need.
> > > > Whether libvirt would internally support mdevctl would be nice but
> that's
> > > > not really something Nova needs, so I leave others providing their
> own
> > > > thoughts.
> > > >
> > > >
> > > > My personal opinion is that mdevctl should be able to tolerate
> mdevs
> > > > > being configured by other means, but probably should not try to
> impose
> > > > > its own configuration if it detects that (unless explicitly asked
> to do
> > > > > so). Not sure how feasible that goal is.
> > > > >
> > > > > That's what I misunderstand : in order to have a guest using a
> vGPU,
> > > you
> > > > need to do two things :
> > > > 1/ create the mdev
> > > > 2/ allocate this created dev to a specific guest config
> > > >
> > > > Of course, we could imagine a way to have both steps to be done
> directly
> > > by
> > > > libvirt, but from my opinion, mdevctl is really helping 1/ and not
> 2/.
> > >
> > > Yep, we also don't want to presume libvirt is the only consumer here.
> > > mdevctl should also support other VM management tools, users who write
> > > their own management scripts, and even non-VM related use cases.
> > >
> > >
> > Oh yes, please don't premuse mdevctl is only needed by libvirt.
> > FWIW, once mdevctl is supported by RHEL, I'd love to use it for OpenStack
> > Nova at least because I want to persist the mdevs.
> > At the moment, Nova just creates mdevs directly by sysfs and look the
> > existing onces up by sysfs, but upstream community in Nova thinks the
> > mission statement is not about managing mdevs so we don't really want to
> > add in Nova some kind of DB persistence just for mdevs.
>
> So, Nova would basically poke mdevctl, but not interact with the config
> files directly? Or am I misunderstanding?
>
>
Correct, instead of doing something like
https://github.com/openstack/nova/blob/master/nova/privsep/libvirt.py#L207-L216
That said, Nova could do like libvirt and create a config file, for sure.


>
> > > > A well-defined config file format is probably a win, even if it
> only
> > > > > ends up being used by mdevctl itself.
> > >
> > > Yes, regardless of whether others touch them, conversion scripts on
> > > upgrade should be avoided.  Do we need something beyond a key=value
> > > file?  So far we're only storing the mdev type and startup mode, but
> > > vfio-ap clearly needs more, apparently key=value1,value2,... type
> > > representation.  Still, I think I'd prefer simple over jumping to xml
> > > or json or yaml.  Thanks,
> > >
> > >
> > Heh, in OpenStack discussing about a file format is possibly one of the
> > longest arguments we already have, so I leave others to say their own
> > opinions but FWIW, as we use Python we tend to prefer YAML files. I don't
> > care about the format tho, just take the most convenient for libvirt I'd
> > say.
>
> Aww, and here I was looking forward to a nice file format discussion...
>
> More seriously, as I said in my other reply, .ini style would be good,
> but using JSON probably gives us more flexibility in the long run.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20190619/7a88a094/attachment-0001.htm>


More information about the libvir-list mailing list