[libvirt] [PATCH 3/5] qemu: prefer memfd for anonymous memory

Dr. David Alan Gilbert dgilbert at redhat.com
Tue Sep 11 11:49:12 UTC 2018


* Marc-André Lureau (marcandre.lureau at redhat.com) wrote:
> Hi
> 
> On Tue, Sep 11, 2018 at 3:26 PM, Dr. David Alan Gilbert
> <dgilbert at redhat.com> wrote:
> > * Marc-André Lureau (marcandre.lureau at redhat.com) wrote:
> >> Hi
> >>
> >> On Tue, Sep 11, 2018 at 2:32 PM, Dr. David Alan Gilbert
> >> <dgilbert at redhat.com> wrote:
> >> > * Marc-André Lureau (marcandre.lureau at redhat.com) wrote:
> >> >> Hi
> >> >>
> >> >> On Tue, Sep 11, 2018 at 12:37 PM, Michal Privoznik <mprivozn at redhat.com> wrote:
> >> >> > On 09/11/2018 12:46 AM, John Ferlan wrote:
> >> >> >>
> >> >> >> On 09/07/2018 07:32 AM, marcandre.lureau at redhat.com wrote:
> >> >> >>> From: Marc-André Lureau <marcandre.lureau at redhat.com>
> >> >> >>>
> >> >> >>
> >> >> >> Would be nice to have a few more words here. If you provide them I can
> >> >> >> add them... The if statement is difficult to read unless you know what
> >> >> >> each field really means.
> >> >> >>
> >> >> >> secondary question - should we document what gets used?, e.g.:
> >> >> >>
> >> >> >> https://libvirt.org/formatdomain.html#elementsMemoryBacking
> >> >> >>
> >> >> >> Seems to me the preference to use memfd is for memory backing using
> >> >> >> anonymous source for nvdimm's without a defined path, but sometimes my
> >> >> >> wording doesn't match reality.
> >> >> >
> >> >> > I don't think we want to tell users what backend we are going to use
> >> >> > under what conditions. Firstly, these conditions will change (as they
> >> >> > did in the past). Secondly, what backend libvirt decides to use is no
> >> >> > business of users. I mean, they care about providing XML that matches
> >> >> > their demands. It's libvirt's job to fulfil them.
> >> >> >
> >> >> > Look at this from the other way: if a user wants to have
> >> >> > memory-backend-file for their domain, how would they enforce it once memfd
> >> >> > is merged? Sure, they can tweak their memoryBacking settings, but that
> >> >> > would work only until we decide to change the decision process for mem
> >> >> > backend.
> >> >> >
> >> >> > What I am more worried about is migration. What happens if I migrate a
> >> >> > hugepages domain from older libvirt to a newer one (the former doesn't
> >> >> > support memfd, the latter does). On the source the domain was started
> >> >> > with memory-backend-file (or memory-backend-ram with -mem-path). And
> >> >> > during migration, the generated cmd line would use memfd. And I don't
> >> >> > think qemu is capable of dealing with this discrepancy, is it?
> >> >>
> >> >>
> >> >> Actually, qemu doesn't care about the hostmem backend kind, it should
> >> >> handle the migration ok.
> >> >>
> >> >> However, there seems to be a bug in qemu, and hostmem backends don't
> >> >> use the right qom object name.
> >> >
> >> > Can you give me the command lines you're using?
> >>
> >> qemu -m 4096 -object memory-backend-ram,id=mem,size=4G -numa
> >> node,memdev=mem -monitor stdio
> >> qemu -m 4096 -object
> >> memory-backend-file,id=mem,size=4G,mem-path=/tmp/foo -numa
> >> node,memdev=mem -monitor stdio
> >> qemu -m 4096 -object memory-backend-memfd,id=mem,size=4G -numa
> >> node,memdev=mem -monitor stdio
> >
> > There seem to be two different problems (at least); there's that
> > escaping problem where the /'s are shown as \x2f in info qom-tree,
> 
> That's not a problem, this is done in memory_region_escape_name()
> 
> > but info ramblock looks saner, yet still shows the difference:
> >
> > ./x86_64-softmmu/qemu-system-x86_64 -m 1024 -object memory-backend-ram,id=mem,size=1G -numa node,memdev=mem -monitor stdio
> > (qemu) info ramblock
> >               Block Name    PSize              Offset               Used              Total
> >                      mem    4 KiB  0x0000000000000000 0x0000000040000000 0x0000000040000000
> >
> > ./x86_64-softmmu/qemu-system-x86_64 -m 1024 -object memory-backend-file,id=mem,size=1G,mem-path=/tmp/foo -numa node,memdev=mem -monitor stdio
> > (qemu) info ramblock
> >               Block Name    PSize              Offset               Used              Total
> >             /objects/mem    4 KiB  0x0000000000000000 0x0000000040000000 0x0000000040000000
> >
> >  ./x86_64-softmmu/qemu-system-x86_64 -m 1024 -object memory-backend-memfd,id=mem,size=1G -numa node,memdev=mem -monitor stdio
> > QEMU 3.0.50 monitor - type 'help' for more information
> > (qemu) info ramblock
> >               Block Name    PSize              Offset               Used              Total
> >             /objects/mem    4 KiB  0x0000000000000000 0x0000000040000000 0x0000000040000000
> >
> > hostmem-file.c is using object_get_canonical_path to get the RAMBlock
> > name, whereas hostmem-ram.c is using object_get_canonical_path_**component**
> >
> > The problem is if we change either of them then again we break
> > migration compatibility.
> 
> Yes, that was the object of my question :)
> 
> > We could wire it to a machine type and/or property, so that
> > memory-backend-ram would use the long name on newer qemus with an
> > appropriate flag?
> 
> Good idea, I can prepare a patch.

Great; if you add the property to use the long name, then turn that
property on in the newer machine type, it should work.  A qemu that has
the property can then be assumed to do the right thing when it is set.
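
As a rough sketch of the idea (a Python model, not QEMU code): the
property name long_name and the two machine type names below are made
up for illustration, and the real patch may look quite different.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    kind: str                 # "ram", "file" or "memfd"
    obj_id: str               # the id= given on -object
    long_name: bool = False   # hypothetical property, only consulted for "ram"

    def ramblock_name(self) -> str:
        # file and memfd already name the RAMBlock after the canonical path
        if self.kind in ("file", "memfd") or self.long_name:
            return f"/objects/{self.obj_id}"
        # ram keeps the short component name unless the property is set
        return self.obj_id


# Hypothetical machine types: only the newer one turns the property on.
MACHINE_DEFAULTS = {"old-machine": False, "new-machine": True}


def make_backend(kind: str, obj_id: str, machine: str) -> Backend:
    return Backend(kind, obj_id, long_name=MACHINE_DEFAULTS[machine])
```

With this model, memory-backend-ram on the old machine type keeps the
short name (so existing migration streams still match), while on the
new machine type it gets the same name as -file and -memfd.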

> However, libvirt will have to learn of this migration issue with older
> versions; it's probably not worth trying to add more workarounds.

Yeah, I'm not sure what your heuristics look like for these choices.
But for a VM without this fix, you can't convert from backend-ram to
memfd.
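
A small Python model of which conversions work today, assuming the
incoming RAM is matched purely by RAMBlock name. The names are the ones
reported by 'info ramblock' for id=mem in this thread; the can_migrate
helper is illustrative only and exists in neither qemu nor libvirt.

```python
# RAMBlock names as reported by 'info ramblock' for id=mem.
RAMBLOCK_NAME = {
    "memory-backend-ram": "mem",
    "memory-backend-file": "/objects/mem",
    "memory-backend-memfd": "/objects/mem",
}


def can_migrate(src_backend: str, dst_backend: str) -> bool:
    """Hypothetical check: the RAMBlock names on both sides must match."""
    return RAMBLOCK_NAME[src_backend] == RAMBLOCK_NAME[dst_backend]


def compatible_pairs():
    """All (source, destination) backend pairs whose names line up."""
    kinds = sorted(RAMBLOCK_NAME)
    return [(s, d) for s in kinds for d in kinds if can_migrate(s, d)]
```

So -file and -memfd can be swapped for each other, but anything started
with backend-ram has to stay on backend-ram.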

Dave

> 
> > Dave
> >
> >
> >
> >> >
> >> > Dave
> >> >
> >> >> with memory-backend-ram:
> >> >>
> >> >> (qemu) info qom-tree /objects
> >> >> /objects (container)
> >> >>   /mem (memory-backend-ram)
> >> >>     /mem[0] (qemu:memory-region)
> >> >>
> >> >> But with memory-backend-file or memory-backend-memfd:
> >> >>
> >> >> (qemu) info qom-tree /objects
> >> >> /objects (container)
> >> >>   /mem (memory-backend-file)
> >> >>     /\x2fobjects\x2fmem[0] (qemu:memory-region)
> >> >>
> >> >>
> >> >> This causes migration to fail because of the object naming mismatch.
> >> >>
> >> >> It can migrate from/to -file and -memfd, since they use the same
> >> >> "broken" name, but not with -ram.
> >> >>
> >> >> I don't know how we can solve this migration issue without breaking
> >> >> things further. Any idea David?
> >> >>
> >> >> > Or is memfd going to be used only for hugepages + <source
> >> >> > type='anonymous'/> case (which is not allowed now and thus migration
> >> >> > scenario I'm describing can't happen)?
> >> >>
> >> >> With those patches, memfd is used for anonymous memory (shared or not,
> >> >> hpt or not) with an explicit numa configuration.
> >> >>
> >> >> thanks
> >> > --
> >> > Dr. David Alan Gilbert / dgilbert at redhat.com / Manchester, UK
> > --
> > Dr. David Alan Gilbert / dgilbert at redhat.com / Manchester, UK
--
Dr. David Alan Gilbert / dgilbert at redhat.com / Manchester, UK



