[libvirt] [Qemu-devel] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option

Michal Privoznik mprivozn at redhat.com
Mon Mar 4 14:34:06 UTC 2019


On 3/4/19 3:16 PM, Igor Mammedov wrote:
> On Mon, 4 Mar 2019 12:39:08 +0000
> Daniel P. Berrangé <berrange at redhat.com> wrote:
> 
>> On Mon, Mar 04, 2019 at 01:25:07PM +0100, Igor Mammedov wrote:
>>> On Mon, 04 Mar 2019 08:13:53 +0100
>>> Markus Armbruster <armbru at redhat.com> wrote:
>>>    
>>>> Daniel P. Berrangé <berrange at redhat.com> writes:
>>>>    
>>>>> On Fri, Mar 01, 2019 at 06:33:28PM +0100, Igor Mammedov wrote:
>>>>>> On Fri, 1 Mar 2019 15:49:47 +0000
>>>>>> Daniel P. Berrangé <berrange at redhat.com> wrote:
>>>>>>      
>>>>>>> On Fri, Mar 01, 2019 at 04:42:15PM +0100, Igor Mammedov wrote:
>>>>>>>> The parameter allows configuring a fake NUMA topology where the guest
>>>>>>>> VM simulates a NUMA topology but does not actually get any performance
>>>>>>>> benefits from it. The same or better results could be achieved
>>>>>>>> using the 'memdev' parameter. In light of that, any VM that uses NUMA
>>>>>>>> to get its benefits should use 'memdev'. To allow transitioning
>>>>>>>> initial RAM to a device based model, deprecate the 'mem' parameter, as
>>>>>>>> its ad-hoc partitioning of the initial RAM MemoryRegion can't be
>>>>>>>> translated to a memdev based backend transparently to users and in a
>>>>>>>> compatible manner (migration wise).
>>>>>>>>
>>>>>>>> That will also allow to clean up a bit our numa code, leaving only
>>>>>>>> 'memdev' impl. in place and several boards that use node_mem
>>>>>>>> to generate FDT/ACPI description from it.
>>>>>>>
>>>>>>> Can you confirm that the  'mem' and 'memdev' parameters to -numa
>>>>>>> are 100% live migration compatible in both directions ?  Libvirt
>>>>>>> would need this to be the case in order to use the 'memdev' syntax
>>>>>>> instead.
>>>>>> Unfortunately they are not migration compatible in any direction,
>>>>>> if it were possible to translate them to each other I'd alias 'mem'
>>>>>> to 'memdev' without deprecation. The former sends over only one
>>>>>> MemoryRegion to target, while the latter sends over several (one per
>>>>>> memdev).
>>>>>
>>>>> If we can't migrate from one to the other, then we can not deprecate
>>>>> the existing 'mem' syntax. Even if libvirt were to provide a config
>>>>> option to let apps opt-in to the new syntax, we need to be able to
>>>>> support live migration of existing running VMs indefinitely. Effectively
>>>>> this means we need to keep 'mem' support forever, or at least such
>>>>> a long time that it effectively means forever.
>>>>>
>>>>> So I think this patch has to be dropped & replaced with one that
>>>>> simply documents that memdev syntax is preferred.
>>>>
>>>> We have this habit of postulating absolutes like "can not deprecate"
>>>> instead of engaging with the tradeoffs.  We need to kick it.
>>>>
>>>> So let's have an actual look at the tradeoffs.
>>>>
>>>> We don't actually "support live migration of existing running VMs
>>>> indefinitely".
>>>>
>>>> We support live migration to any newer version of QEMU that still
>>>> supports the machine type.
>>>>
>>>> We support live migration to any older version of QEMU that already
>>>> supports the machine type and all the devices the machine uses.
>>>>
>>>> Aside: "support" is really an honest best effort here.  If you rely on
>>>> it, use a downstream that puts in the (substantial!) QA work real
>>>> support takes.
>>>>
>>>> Feature deprecation is not a contract to drop the feature after two
>>>> releases, or even five.  It's a formal notice that users of the feature
>>>> should transition to its replacement in an orderly manner.
>>>>
>>>> If I understand Igor correctly, all users should transition away from
>>>> outdated NUMA configurations at least for new VMs in an orderly manner.
>>> Yes, we can postpone removing options until there are machines type
>>> versions that were capable to use it (unfortunate but probably
>>> unavoidable unless there is a migration trick to make transition
>>> transparent) but that should not stop us from disabling broken
>>> options on new machine types at least.
>>>
>>> This series can serve as formal notice with follow up disabling of
>>> deprecated options for new machine types. (As Thomas noted, just warnings
>>> do not work and users continue to use broken features regardless of whether
>>> they don't know about the issues or are aware of them [*])
>>>
>>> Hence suggested deprecation approach and enforced rejection of legacy
>>> numa options for new machine types in 2 releases so users would stop
>>> using them eventually.
>>
>> When we deprecate something, we need to have a way for apps to use the
>> new alternative approach *at the same time*.  So even if we only want to
>> deprecate for new machine types, we still have to first solve the problem
>> of how mgmt apps will introspect QEMU to learn which machine types expect
>> the new options.
> I'm not aware of any mechanism to introspect machine type options (existing
> or something being developed). Are/were there any ideas about it that were
> discussed in the past?
> 
> Aside from developing a new mechanism, what are the alternative approaches?
> I mean when we delete a deprecated CLI option, how is it solved on the
> libvirt side currently?

Libvirt queries QEMU capabilities via QMP, and wherever it can it
prefers the latest recommended command line options (well, those known
to a libvirt developer at the time he/she is writing the code). So as
long as you remove only old stuff and libvirt keeps up with current
best practices, we're okay.
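
A rough Python sketch of that "probe, then prefer the newest" idea. It is
not libvirt code: the helper names and the option labels are made up for
illustration, and only the QMP command names (qmp_capabilities,
query-command-line-options) are real.

```python
import json


def qmp_command(name, arguments=None):
    """Serialize one QMP command as a JSON string."""
    message = {"execute": name}
    if arguments is not None:
        message["arguments"] = arguments
    return json.dumps(message)


def pick_preferred(candidates, supported):
    """Return the first candidate (ordered newest first) that is supported."""
    for option in candidates:
        if option in supported:
            return option
    raise LookupError("no candidate option is supported")


# Messages a client would send after connecting to the QMP socket.
handshake = qmp_command("qmp_capabilities")
probe = qmp_command("query-command-line-options", {"option": "numa"})

# Hypothetical capability set, as if distilled from earlier QMP replies.
supported = {"old-style-option", "new-style-option"}
choice = pick_preferred(["new-style-option", "old-style-option"], supported)
```

With both labels in the capability set the newer one wins; against a QEMU
that only reports the old one, the same call falls back to it.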

> 
> For example I don't see anything introspection related when we have been
> removing deprecated options recently.
> 
> A more exact question, specific to this series' use case:
> how does libvirt currently decide when to use -numa node,memdev or not?

It has a mechanism to tell if '-numa node,memdev=' is needed (i.e. there 
is no other way to satisfy user requested configuration) and only then 
it uses ,memdev. For all other cases it defaults to -numa node,mem= 
simply to keep backwards compatibility (as I'm explaining in another 
e-mail I've just sent to this list).
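
To illustrate, here is a simplified Python sketch of that decision. It is
not the actual libvirt logic: build_numa_args and the needs_memdev flag
are hypothetical stand-ins for the real checks, and the sizes are
arbitrary. Only the QEMU option syntax is real.

```python
def build_numa_args(node_id, size_mib, needs_memdev):
    """Build QEMU arguments for one guest NUMA node.

    needs_memdev is a hypothetical flag meaning "the requested
    configuration cannot be expressed with ,mem= alone".
    """
    if needs_memdev:
        backend_id = f"ram-node{node_id}"
        return [
            "-object",
            f"memory-backend-ram,id={backend_id},size={size_mib}M",
            "-numa",
            f"node,nodeid={node_id},memdev={backend_id}",
        ]
    # Default path: keep the legacy syntax for backwards compatibility.
    return ["-numa", f"node,nodeid={node_id},mem={size_mib}M"]


legacy = build_numa_args(0, 1024, needs_memdev=False)
backed = build_numa_args(1, 2048, needs_memdev=True)
```

The legacy form yields a single -numa argument, while the memdev form
needs a separate memory backend object per node.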

Anyway, in the libvirt code you want to be looking at:

src/qemu/qemu_command.c: qemuBuildNumaArgStr
src/qemu/qemu_command.c: qemuBuildMemoryCellBackendStr

Michal



