[libvirt] RFC: An "embedded" mode for QEMU/LXC drivers

Thu Jan 3 04:55:21 UTC 2013

On Wed, Jan 02, 2013 at 03:36:54PM +0000, Daniel P. Berrange wrote:
> This is something I was thinking about a little over the Christmas
> break. I've no intention of implementing this in the immediate
> future, but wanted to post it while it was fresh in my mind.
> 
> Historically we have had 2 ways of using the stateful drivers like
> QEMU/LXC/UML/etc.
> 
>  - "system mode"  - privileged libvirtd, one per host, started at boot
>  - "session mode" - unprivileged libvirtd, one per non-root user, autostarted
> 
> Within the context of each daemon, VM name uniqueness is enforced. Operating
> via the daemon means that all applications connected to libvirtd get the
> same world view. This single world view is exactly what you want when
> dealing with server / cloud / desktop virtualization, because it means
> tools like 'virt-top', 'virt-viewer', 'virsh' can see the same VMs as
> virt-manager / oVirt / OpenStack / Boxes / etc.
> 
> Recently we've seen increasing importance of a new use case which I will
> refer to as "embedded" virtualization. The best example of this use case
> is libguestfs which has long run a dedicated QEMU instance, but just now
> switched to using libvirtd. The other use case is virt-sandbox which is
> doing application confinement using LXC/KVM.
> 
> In both these cases, operating via libvirtd is sub-optimal. Users of
> so-called "embedded" virtualization explicitly don't want to interact
> with other libvirt applications. They likely don't even want to expose the
> concept of virtualization to their users. For them virtualization is intended
> to be just a hidden implementation detail of their application.
> 
> Some issues which arise when using embedded virtualization
> 
>  - Need to invent sensible unique names for each VM launched. This
>    leads to pollution of logfiles for QEMU instances run.
> 
>  - User sees libguestfs / virt-sandbox VMs in virt-manager / oVirt
>    which they may then try to "manage", breaking libguestfs / etc

I didn't realize this before, but yes this is bad.

>  - Disassociated process context, so if 'virt-sandbox' is placed in
>    a cgroup, the VMs it launches are in a different cgroup. Likewise
>    if custom env variables are set, work is needed to propagate those
>    to VMs.
> 
> This leads me to wonder whether it is worth exploring the idea of a new
> type of libvirt connection.
> 
>  - "embed mode" - no libvirtd, driver runs in application context

Seems like an excellent idea.

> The idea here is to take libvirtd out of the equation and directly use
> the QEMU driver code in the libvirt.so client / application. Since
> libvirtd (mostly) uses the same APIs as the public libvirt.so clients,
> there isn't much required to make this work.
> 
>  - A way for the application to invoke virStateInit for the driver
>  - Application must provide an event loop impl
>  - A way to specify alternative dirs for logs/state/config/etc
> 
> An application would access this mode using a different path for the
> driver, and specifying the path to use for logs/state/config etc.
> eg libguestfs might use
> 
>    qemu:///embed?statedir=/tmp/libguestfsXXXXX/
> 
> to get an instance of the QEMU driver that is completely private
> to itself. One question is whether there should be a single embed
> instance per process, or whether an application should be allowed
> to open multiple completely isolated embed instances. The latter
> might require us to eliminate more static variables in our code.
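
To make the URI idea concrete, here is a minimal Python sketch of how a
caller might build and parse such a URI with a private state directory.
The "embed" path and "statedir" parameter are only the proposal quoted
above, not an existing libvirt API, and make_embed_uri / parse_embed_uri
are hypothetical helper names invented for this illustration.

```python
import os
import tempfile
from urllib.parse import parse_qs, urlencode, urlsplit


def make_embed_uri(driver, statedir):
    """Build a URI in the proposed form, e.g. qemu:///embed?statedir=/tmp/x/."""
    # safe="/" keeps the directory path readable instead of percent-encoded.
    return f"{driver}:///embed?" + urlencode({"statedir": statedir}, safe="/")


def parse_embed_uri(uri):
    """Split an embed URI into (driver, statedir); reject anything else."""
    parts = urlsplit(uri)
    if parts.path != "/embed":
        raise ValueError(f"not an embed URI: {uri}")
    params = parse_qs(parts.query)
    if "statedir" not in params:
        raise ValueError("embed URI needs a statedir parameter")
    return parts.scheme, params["statedir"][0]


# Each caller creates its own private directory, so VM names, logs and
# state cannot collide with those of any other instance.
statedir = tempfile.mkdtemp(prefix="libguestfs") + "/"
uri = make_embed_uri("qemu", statedir)
driver, parsed_dir = parse_embed_uri(uri)

# Remove the (still empty) directory again; a real caller would keep it
# for the lifetime of the embedded driver.
os.rmdir(statedir)
```

Nothing here talks to libvirt. It only shows that the per-instance
directory carried in the URI is enough to keep two callers apart.
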
> 
> This kind of embedded mode is not without its downsides though
> 
>  - How to access virtual network / storage / node device APIs?

libguestfs only uses (optionally) user networking.  We also don't
access any storage or node APIs, and don't intend to AFAIK.

>  - Extra SELinux policy work to allow each app to have the same
>    kind of privileges that libvirtd has to let it start VMs

Right, this is important, and probably tricky.  How about still
running libvirtd, but per connection/process?  (I think you mentioned
before that this is in fact already possible).  It costs 1 extra fork,
but in the libguestfs scheme of things this won't make much
difference.

>  - How to troubleshoot - can't use things like
> 
>     'virsh qemu-monitor-command'
>
>    since the embedded instance is private to the application
>    in question.

For libguestfs this last one isn't important.

> One answer to the latter question might be to actually allow the
> application to expose the same RPC service as libvirtd does. So
> virsh could connect to libguestfs using
> 
>     qemu:///embed?socketdir=/tmp/libguestfsXXXXX/libvirt-sock
>
> For the question of network/storage/node device access, the long
> term answer is probably to split up the system libvirtd instance
> into separate pieces. eg a virtnodedeviced, virtnetworkd,
> virtstoraged, virtqemud, virtlxcd, etc. Now a client app would
> connect to their embedded QEMU instances, but to the shared
> nodedevice/network/storage daemons.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top