[libvirt] adding a new libvirt xml element for File Descriptor backed memory for use with vhost-user

Daniel P. Berrange berrange at redhat.com
Thu May 12 16:28:02 UTC 2016


On Thu, May 12, 2016 at 04:00:29PM +0000, Mooney, Sean K wrote:
> > > Today it is possible to use Libvirt to spawn a VM without hugepage
> > > memory and with a file descriptor backed memdev via the use of the
> > > qemu:commandline element.
> > >
> > >   <qemu:commandline>
> > >     <qemu:arg value='-object'/>
> > >     <qemu:arg value='memory-backend-file,id=mem,size=1024M,mem-path=/var/lib/libvirt/qemu,share=on'/>
> > >     <qemu:arg value='-numa'/>
> > >     <qemu:arg value='node,memdev=mem'/>
> > >     <qemu:arg value='-mem-prealloc'/>
> > >   </qemu:commandline>
> > >
> > > I created a proof of concept patch to Nova to demonstrate that this
> > > works; however, to support this use case in Nova a new XML element is
> > > required.
> > > https://review.openstack.org/#/c/309565/1
> > >
> > > I would like to propose the introduction of a new subelement to the
> > > memoryBacking element to request file descriptor backed memory
> > >
> > > <memoryBacking>
> > >    <filedescriptor size_mb="1024" path="/var/lib/libvirt/qemu"
> > >                    prealloc="true" shared="on" />
> > > </memoryBacking>
> > 
> > Specifying a size is not required - we already know how big memory must
> > be for the guest.
> > 
> > We already have a memAccess='shared' attribute against the <numa>
> > element that is used to determine if the underlying memory should be
> > set up as shared.  We could define a further element that lets us control
> > memory access mode for guests without NUMA topology specified.
> [Mooney, Sean K] Hi, yes, the reason I added the shared attribute was to cater for
> the case of guests without NUMA topology. For guests with NUMA topology I agree that
> using memAccess='shared' on the cell is better for consistency with hugepage memory.
>
> >   <memoryBacking>
> >      <access mode="shared"/>
> >   </memoryBacking>
> > 
> > For huge pages it seems we unconditionally pass -mem-prealloc. I'm
> > thinking we could perhaps make that configurable via an element
> > 
> > 
> >   <memoryBacking>
> >      <allocation mode="immediate|ondemand"/>
> >   </memoryBacking>
> > 
> > to control use of -mem-prealloc or not.
> [Mooney, Sean K] For the vhost-user case, -mem-prealloc is required
> because you are basically doing DMA, so you really want the memory to be allocated.
> Generically though, from a Libvirt point of view, I do think it makes sense for this
> to be configurable to allow oversubscription of memory for higher density.
> > 
> > So all that remains is a way to request file based backing of RAM. As
> > with huge pages, I think we should hide the actual path from the user.
> > We should just use /dev/shm as the backing for non-hugepage RAM. For
> > this we could define something like
> > 
> >    <memoryBacking>
> >        <source type="file|anonymous"/>
> >    </memoryBacking>
> > 
> [Mooney, Sean K] For some reason, when I used /dev/shm I could only boot one instance at a time.
> That was my first choice, but maybe we would have to create a file per instance under /dev/shm to make it work.

QEMU should create the file itself - it's no different to our use
of hugetlbfs in fact. Possibly you hit a limit on the amount of memory
allowed to be used via /dev/shm - IIRC the mount point is limited
to 50% of RAM by default.
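
A quick way to test that theory is to compare the size of the /dev/shm
mount with the RAM the guests would need. The following is only a rough
Python sketch: the helper names (tmpfs_capacity_bytes, guests_fit) are
made up for illustration, and it assumes each guest's RAM is fully
backed by a file on that mount.

```python
import os

MIB = 1024 * 1024


def tmpfs_capacity_bytes(path="/dev/shm"):
    """Total size of the filesystem mounted at `path`, in bytes."""
    st = os.statvfs(path)
    return st.f_frsize * st.f_blocks


def guests_fit(guest_sizes_mib, capacity_bytes):
    """True if all guest RAM backing files fit within the mount's size."""
    return sum(guest_sizes_mib) * MIB <= capacity_bytes


if __name__ == "__main__":
    capacity = tmpfs_capacity_bytes()
    print("tmpfs size: %d MiB" % (capacity // MIB))
    print("two 1024M guests fit:", guests_fit([1024, 1024], capacity))
```

With preallocation enabled each guest claims its full size up front, so
a second guest would fail to start as soon as the sum exceeds the mount
size.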

If you use /var/lib/libvirt/ as the location you get a real file
backed by disk, so akin to putting the VM on swap, IIUC!

> > Putting that all together, to get what you want we'd have
> > 
> >    <memoryBacking>
> >        <source type="file"/>
> >        <access mode="shared"/>
> >        <allocation mode="immediate"/>
> >    </memoryBacking>
> > 
> [Mooney, Sean K] 
> Yes this seems like it would be a clean way to address this use case.
> Can you gauge how small/large a change this would be? It's been
> a while since I worked with C directly, but if you could point me in the
> right direction in the Libvirt codebase I would be happy to look at
> creating an RFC patch.

First there's defining the XML extensions - needs docs/schemas/domaincommon.rng
and src/conf/domain_conf.{c,h} to be changed.

Then there's wiring up QEMU XML -> ARGV conversion - src/qemu/qemu_command.c
and adding test cases in tests/qemuxml2argvtest.c
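
As a rough illustration of what the XML -> ARGV step has to produce,
here is a Python sketch (not libvirt code, which is C) that maps the
proposed <memoryBacking> children onto the QEMU arguments quoted at the
start of this thread. The function name, the "mem" object id and the
/dev/shm default path are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def memory_backing_to_argv(xml_text, size_mib, mem_path="/dev/shm"):
    """Map the proposed <memoryBacking> children to QEMU arguments."""
    root = ET.fromstring(xml_text)
    source = root.find("source")
    access = root.find("access")
    allocation = root.find("allocation")

    argv = []
    # <source type="file"/> selects a file-backed memory object.
    if source is not None and source.get("type") == "file":
        props = [
            "memory-backend-file",
            "id=mem",
            "size=%dM" % size_mib,
            "mem-path=%s" % mem_path,
        ]
        # <access mode="shared"/> becomes share=on on the backend.
        if access is not None and access.get("mode") == "shared":
            props.append("share=on")
        argv += ["-object", ",".join(props), "-numa", "node,memdev=mem"]
    # <allocation mode="immediate"/> requests preallocation.
    if allocation is not None and allocation.get("mode") == "immediate":
        argv.append("-mem-prealloc")
    return argv


EXAMPLE = """
<memoryBacking>
  <source type="file"/>
  <access mode="shared"/>
  <allocation mode="immediate"/>
</memoryBacking>
"""

if __name__ == "__main__":
    print(" ".join(memory_backing_to_argv(EXAMPLE, 1024)))
```

The size is passed in separately because, as noted earlier, the guest
memory size is already known and does not need to be repeated in the
new elements.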

> From the Nova side, assuming Libvirt was extended for this feature, should
> I open a blueprint to extend the existing guest memory backing support
> in parallel to the Libvirt implementation, or wait until after it is
> supported in Libvirt to start the Nova discussion? In either case I think
> we agree that any support in Nova would depend on Libvirt support to be
> accepted in upstream Nova.

You're going to hit the deadline for approval of Newton specs in Nova
fairly soon, and unless the libvirt impl is done before then, I think
it is unlikely you'd get a spec approved. So by all means work on this
in parallel, but be realistic about chances of approval in Nova for
this cycle.


Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|



