[libvirt] [Qemu-devel] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option

Igor Mammedov imammedo at redhat.com
Mon Mar 4 15:03:13 UTC 2019


On Mon, 4 Mar 2019 15:24:28 +0100
Michal Privoznik <mprivozn at redhat.com> wrote:

> [Thanks Igor for bringing this onto my radar. I don't follow qemu-devel 
> that closely]
> 
> On 3/4/19 11:19 AM, Daniel P. Berrangé wrote:
> > On Mon, Mar 04, 2019 at 08:13:53AM +0100, Markus Armbruster wrote:  
> >> Daniel P. Berrangé <berrange at redhat.com> writes:
> >>  
> >>> On Fri, Mar 01, 2019 at 06:33:28PM +0100, Igor Mammedov wrote:  
> >>>> On Fri, 1 Mar 2019 15:49:47 +0000
> >>>> Daniel P. Berrangé <berrange at redhat.com> wrote:
> >>>>  
> >>>>> On Fri, Mar 01, 2019 at 04:42:15PM +0100, Igor Mammedov wrote:  
> >>>>>> The parameter allows configuring a fake NUMA topology where the
> >>>>>> guest VM simulates NUMA but does not actually get any performance
> >>>>>> benefit from it. The same or better results can be achieved using
> >>>>>> the 'memdev' parameter. In light of that, any VM that uses NUMA
> >>>>>> for its benefits should use 'memdev'. To allow transitioning
> >>>>>> initial RAM to the device-based model, deprecate the 'mem'
> >>>>>> parameter, since its ad-hoc partitioning of the initial RAM
> >>>>>> MemoryRegion can't be translated to a memdev-based backend
> >>>>>> transparently to users and in a migration-compatible manner.
> >>>>>>
> >>>>>> That will also allow us to clean up our NUMA code a bit, leaving
> >>>>>> only the 'memdev' implementation in place plus the several boards
> >>>>>> that use node_mem to generate an FDT/ACPI description from it.  
> >>>>>
> >>>>> Can you confirm that the 'mem' and 'memdev' parameters to -numa
> >>>>> are 100% live-migration compatible in both directions? Libvirt
> >>>>> would need this to be the case in order to use the 'memdev' syntax
> >>>>> instead.  
> >>>> Unfortunately they are not migration compatible in either direction;
> >>>> if it were possible to translate them to each other I'd alias 'mem'
> >>>> to 'memdev' without deprecation. The former sends over only one
> >>>> MemoryRegion to the target, while the latter sends over several (one
> >>>> per memdev).  
> >>>
> >>> If we can't migrate from one to the other, then we cannot deprecate
> >>> the existing 'mem' syntax. Even if libvirt were to provide a config
> >>> option to let apps opt in to the new syntax, we need to be able to
> >>> support live migration of existing running VMs indefinitely. Effectively
> >>> this means we need to keep 'mem' support forever, or at least for such
> >>> a long time that it effectively means forever.  
> 
> I'm with Daniel on this. The reason why libvirt still defaults to '-numa 
> node,mem=' is exactly backward compatibility. Since a machine 
> can't be migrated from '-numa node,mem=' to '-numa node,memdev= + 
> -object memory-backend-*', libvirt has to play it safe and choose the 
> combination that is the most widely usable.
> 
> If you remove this, how would you expect older machines to migrate to 
> the newer command line?
> 
> I'm all for deprecating old stuff. In fact, I've suggested that in 
> libvirt(!) here and there, but I'm afraid we can't just remove 
> functionality unless we give users a way to migrate to the alternative 
> we prefer now.
Agreed, it's clear now that I can't simply remove 'mem' for OLD machine
types (even though this "safe" variant is broken and doesn't actually do
what it should). Libvirt should use 'memdev' for new VMs so that they
actually benefit from the NUMA configuration; a sketch of such an
invocation is below.
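
Here is a minimal sketch of the memdev-based command line (the node
layout, sizes, CPU ranges and backend ids such as 'ram-node0' are purely
illustrative):

  # each NUMA node is backed by an explicit memory backend object
  qemu-system-x86_64 -m 4G \
    -object memory-backend-ram,id=ram-node0,size=2G \
    -object memory-backend-ram,id=ram-node1,size=2G \
    -numa node,nodeid=0,cpus=0-1,memdev=ram-node0 \
    -numa node,nodeid=1,cpus=2-3,memdev=ram-node1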

Currently we are talking about disabling 'mem' for new machine types
only (a pity that I have to keep the legacy code around, but at least we
would be able to move on to normal device modeling for initial memory
on new machines).
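
For contrast, the legacy 'mem' form that would remain accepted only on
old machine types looks like this (again just an illustrative sketch
with made-up sizes and CPU ranges):

  # legacy syntax: initial RAM is partitioned ad hoc, no backend objects
  qemu-system-x86_64 -m 4G \
    -numa node,nodeid=0,cpus=0-1,mem=2G \
    -numa node,nodeid=1,cpus=2-3,mem=2G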

> And if libvirt doesn't follow qemu's warnings then it definitely should. 
> It's a libvirt bug if it doesn't follow the best practices (well, if it can).
> 
> Michal




