[libvirt] [PATCH 3/5] qemu: prefer memfd for anonymous memory

Marc-André Lureau marcandre.lureau at redhat.com
Tue Sep 11 11:33:52 UTC 2018


Hi

On Tue, Sep 11, 2018 at 3:26 PM, Dr. David Alan Gilbert
<dgilbert at redhat.com> wrote:
> * Marc-André Lureau (marcandre.lureau at redhat.com) wrote:
>> Hi
>>
>> On Tue, Sep 11, 2018 at 2:32 PM, Dr. David Alan Gilbert
>> <dgilbert at redhat.com> wrote:
>> > * Marc-André Lureau (marcandre.lureau at redhat.com) wrote:
>> >> Hi
>> >>
>> >> On Tue, Sep 11, 2018 at 12:37 PM, Michal Privoznik <mprivozn at redhat.com> wrote:
>> >> > On 09/11/2018 12:46 AM, John Ferlan wrote:
>> >> >>
>> >> >> On 09/07/2018 07:32 AM, marcandre.lureau at redhat.com wrote:
>> >> >>> From: Marc-André Lureau <marcandre.lureau at redhat.com>
>> >> >>>
>> >> >>
>> >> >> Would be nice to have a few more words here. If you provide them I can
>> >> >> add them... The if statement is difficult to read unless you know what
>> >> >> each field really means.
>> >> >>
>> >> >> secondary question - should we document what gets used?, e.g.:
>> >> >>
>> >> >> https://libvirt.org/formatdomain.html#elementsMemoryBacking
>> >> >>
>> >> >> Seems to me the preference to use memfd is for memory backing using
>> >> >> anonymous source for nvdimm's without a defined path, but sometimes my
>> >> >> wording doesn't match reality.
>> >> >
>> >> > I don't think we want to tell users which backend we are going to use
>> >> > under what conditions. Firstly, these conditions will change (as they
>> >> > did in the past). Secondly, what backend libvirt decides to use is no
>> >> > business of users. I mean, they care about providing XML that matches
>> >> > their demands. It's libvirt's job to fulfil them.
>> >> >
>> >> > Look at this from the other way: if a user wants to have
>> >> > memory-backend-file for their domain, how would they enforce it once memfd
>> >> > is merged? Sure, they can tweak their memoryBacking settings, but that
>> >> > would work only until we decide to change the decision process for mem
>> >> > backend.
>> >> >
>> >> > What I am more worried about is migration. What happens if I migrate a
>> >> > hugepages domain from older libvirt to a newer one (the former doesn't
>> >> > support memfd, the latter does). On the source the domain was started
>> >> > with memory-backend-file (or memory-backend-ram with -mem-path). And
>> >> > during migration, the generated cmd line would use memfd. And I don't
>> >> > think qemu is capable of dealing with this discrepancy, is it?
>> >>
>> >>
>> >> Actually, qemu doesn't care about the hostmem backend kind, it should
>> >> handle the migration ok.
>> >>
>> >> However, there seems to be a bug in qemu: the hostmem backends don't
>> >> all use the right QOM object name.
>> >
>> > Can you give me the command lines you're using?
>>
>> qemu -m 4096 -object memory-backend-ram,id=mem,size=4G -numa node,memdev=mem -monitor stdio
>> qemu -m 4096 -object memory-backend-file,id=mem,size=4G,mem-path=/tmp/foo -numa node,memdev=mem -monitor stdio
>> qemu -m 4096 -object memory-backend-memfd,id=mem,size=4G -numa node,memdev=mem -monitor stdio
>
> There seem to be two different problems (at least); there's that
> escaping problem where the /'s are shown as \x2f in info qom-tree,

That's not a problem; the escaping is done by memory_region_escape_name().
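
For illustration only, here is a small Python sketch of that kind of
escaping. It is not QEMU's C code; the set of escaped characters
('/', '[', ']' and '\') is an assumption written from memory, and the
function name escape_region_name is made up for this example.

```python
# Characters assumed to be escaped (assumption, not checked against QEMU).
NEEDS_ESCAPE = {"/", "[", "]", "\\"}


def escape_region_name(name: str) -> str:
    """Replace each special character with \\x followed by two hex digits."""
    return "".join(
        f"\\x{ord(c):02x}" if c in NEEDS_ESCAPE else c
        for c in name
    )


# The long name picks up escapes, the short name is left alone.
print(escape_region_name("/objects/mem"))
print(escape_region_name("mem"))
```

So the \x2f in qom-tree is only the display form of a name that
contains '/'.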

> but info ramblock looks saner, but is still showing the difference:
>
> ./x86_64-softmmu/qemu-system-x86_64 -m 1024 -object memory-backend-ram,id=mem,size=1G -numa node,memdev=mem -monitor stdio
> (qemu) info ramblock
>               Block Name    PSize              Offset               Used              Total
>                      mem    4 KiB  0x0000000000000000 0x0000000040000000 0x0000000040000000
>
> ./x86_64-softmmu/qemu-system-x86_64 -m 1024 -object memory-backend-file,id=mem,size=1G,mem-path=/tmp/foo -numa node,memdev=mem -monitor stdio
> (qemu) info ramblock
>               Block Name    PSize              Offset               Used              Total
>             /objects/mem    4 KiB  0x0000000000000000 0x0000000040000000 0x0000000040000000
>
>  ./x86_64-softmmu/qemu-system-x86_64 -m 1024 -object memory-backend-memfd,id=mem,size=1G -numa node,memdev=mem -monitor stdio
> QEMU 3.0.50 monitor - type 'help' for more information
> (qemu) info ramblock
>               Block Name    PSize              Offset               Used              Total
>             /objects/mem    4 KiB  0x0000000000000000 0x0000000040000000 0x0000000040000000
>
> hostmem-file.c is using object_get_canonical_path to get the RAMBlock
> name, whereas hostmem-ram.c is using object_get_canonical_path_**component**
>
> The problem is if we change either of them then again we break
> migration compatibility.

Yes, that was the object of my question :)
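
To spell the mismatch out, here is a toy Python model of the names seen
in the info ramblock output quoted above. It only mirrors that output
for id=mem; the helper names ramblock_name and names_match are
hypothetical, and the assumption that a migration needs identical
block names on both sides is a simplification.

```python
def ramblock_name(backend: str, obj_id: str) -> str:
    """Return the RAMBlock name observed for each backend kind."""
    if backend == "memory-backend-ram":
        # Short form: only the last component of the canonical path.
        return obj_id
    if backend in ("memory-backend-file", "memory-backend-memfd"):
        # Long form: the full canonical path of the object.
        return f"/objects/{obj_id}"
    raise ValueError(f"unknown backend: {backend}")


def names_match(src: str, dst: str, obj_id: str = "mem") -> bool:
    """Simplified check: both sides must agree on the block name."""
    return ramblock_name(src, obj_id) == ramblock_name(dst, obj_id)


print(names_match("memory-backend-file", "memory-backend-memfd"))
print(names_match("memory-backend-ram", "memory-backend-memfd"))
```

-file and -memfd agree with each other, -ram agrees with neither.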

> We could wire it to a machine type and/or property, so that
> memory-backend-ram would use the long name on newer qemus with an
> appropriate flag?

Good idea, I can prepare a patch.

However, libvirt will have to learn about this migration issue with older
versions; it's probably not worth trying to add more workarounds.
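
A rough Python sketch of the machine-type idea, just to show the
intended behaviour. Nothing here exists in qemu: the machine type
names, the "ram-long-name" property and the helper
ram_backend_block_name are all hypothetical.

```python
# Hypothetical per-machine-type defaults for a made-up compat property.
COMPAT_DEFAULTS = {
    "machine-old": {"ram-long-name": False},  # keep the short name
    "machine-new": {"ram-long-name": True},   # switch to the long name
}


def ram_backend_block_name(machine: str, obj_id: str) -> str:
    """Pick the memory-backend-ram block name from the machine type."""
    long_name = COMPAT_DEFAULTS[machine]["ram-long-name"]
    return f"/objects/{obj_id}" if long_name else obj_id


print(ram_backend_block_name("machine-old", "mem"))
print(ram_backend_block_name("machine-new", "mem"))
```

Guests started on an old machine type would keep the short name, so
they stay migratable with existing memory-backend-ram setups, while new
machine types would line up with -file and -memfd.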


> Dave
>
>
>
>> >
>> > Dave
>> >
>> >> with memory-backend-ram:
>> >>
>> >> (qemu) info qom-tree /objects
>> >> /objects (container)
>> >>   /mem (memory-backend-ram)
>> >>     /mem[0] (qemu:memory-region)
>> >>
>> >> But with memory-backend-file or memory-backend-memfd:
>> >>
>> >> (qemu) info qom-tree /objects
>> >> /objects (container)
>> >>   /mem (memory-backend-file)
>> >>     /\x2fobjects\x2fmem[0] (qemu:memory-region)
>> >>
>> >>
>> >> This causes migration to fail because of the object naming mismatch.
>> >>
>> >> It can migrate from/to -file and -memfd, since they use the same
>> >> "broken" name, but not with -ram.
>> >>
>> >> I don't know how we can solve this migration issue without breaking
>> >> things further. Any idea David?
>> >>
>> >> > Or is memfd going to be used only for hugepages + <source
>> >> > type='anonymous'/> case (which is not allowed now and thus migration
>> >> > scenario I'm describing can't happen)?
>> >>
>> >> With those patches, memfd is used for anonymous memory (shared or not,
>> >> hpt or not) with an explicit numa configuration.
>> >>
>> >> thanks
>> > --
>> > Dr. David Alan Gilbert / dgilbert at redhat.com / Manchester, UK
> --
> Dr. David Alan Gilbert / dgilbert at redhat.com / Manchester, UK



