[libvirt] [PATCH RFC v2] qemu: Support numad

Bill Gray bgray at redhat.com
Wed Mar 7 15:47:13 UTC 2012


Note that numad will attempt to manage / balance processes after they're
launched, but the ideal case is that libvirt pre-places them in a good spot
and they never move...

On 03/07/2012 10:38 AM, Osier Yang wrote:
> On 2012-03-07 21:48, Daniel P. Berrange wrote:
>> On Wed, Mar 07, 2012 at 09:55:16PM +0800, Osier Yang wrote:
>>> numad is a user-level daemon that monitors NUMA topology and
>>> process resource consumption to facilitate good NUMA resource
>>> alignment of applications/virtual machines to improve performance
>>> and minimize cost of remote memory latencies. It provides a
>>> pre-placement advisory interface, so significant processes can
>>> be pre-bound to nodes with sufficient available resources.
>>>
>>> More details: http://fedoraproject.org/wiki/Features/numad
>>>
>>> "numad -w ncpus:memory_amount" is the advisory interface numad
>>> provides currently.
>>>
>>> This patch adds support by introducing a boolean XML element:
>>> <numatune>
>>> <autonuma/>
>>> </numatune>
>>>
>>> If it's specified, the number of vcpus and the current memory
>>> amount specified in domain XML will be used for numad command
>>> line (numad uses MB for memory amount):
>>> numad -w $num_of_vcpus:$current_memory_amount / 1024
>>>
>>> The advisory nodeset returned from numad will then be used to set
>>> the domain process CPU affinity (e.g. in qemuProcessInitCpuAffinity).
>>>
>>> If the user specifies both a CPU affinity policy (e.g.
>>> <vcpu cpuset="1-10,^7,^8">4</vcpu>) and XML indicating to use
>>> numad for the advisory nodeset, the specified CPU affinity will be
>>> ignored.
>>
>> I'm not sure that's a good idea. When we do dynamic generation
>> of parts of libvirt XML, we tend to report in the XML what was
>> generated, and if 2 parts contradict each other we shouldn't
>> silently ignore it.
>
> Agreed, v1 overrides the cpuset="1-10,^6", but I thought it
> could be confusing for the user to see that things are different
> after the domain is started.
>
>>
>> eg, with VNC with autoport=yes we then report the generated
>> port number.
>>
>> Similarly with <cpu> mode=host, we then report what the host
>> CPU features were.
>>
>> So, if we want to auto-set placement for a guest we should
>> likely do this via the <vcpu> element
>>
>> eg, Current mode where placement is completely static
>>
>> - Input XML:
>>
>> <vcpu placement="static" cpuset="1-10" />
>>
>> - Output XML:
>>
>> <vcpu placement="static" cpuset="1-10" />
>>
>> Or where we want to use numad:
>>
>> - Input XML:
>>
>> <vcpu placement="auto"/>
>>
>> - Output XML:
>>
>> <vcpu placement="auto" cpuset="1-10" />
>>
>
> I must admit this is much better. :-)
>
>>
>> The current numad functionality you propose only sets the initial guest
>> placement. Are we likely to have a mode in the future where numad will
>> be called to update the placement periodically for existing guests?
>
> Very possible. For now numad just provides the advisory interface;
> in the future I guess it could manage the placement dynamically.
>
>> If so, then "placement" would need to have more enum values.
>>
>
> I will post a v3 tomorrow.
>
> Regards,
> Osier
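
For reference, the advisory query described in the quoted patch description could be sketched roughly as below. This is only an illustration, not the patch's code: the function name is hypothetical, and it assumes the current memory amount from the domain XML is in KiB, so dividing by 1024 gives the MB value numad expects. It only builds the argument list and does not run numad.

```python
def numad_advisory_argv(num_vcpus, current_memory_kib):
    """Build the argv for a 'numad -w ncpus:memory_amount' advisory query.

    Hypothetical helper; assumes current_memory_kib is in KiB.
    """
    if num_vcpus < 1:
        raise ValueError("need at least one vcpu")
    if current_memory_kib < 0:
        raise ValueError("memory amount cannot be negative")
    # numad takes the memory amount in MB, hence the division by 1024.
    memory_mb = current_memory_kib // 1024
    return ["numad", "-w", "%d:%d" % (num_vcpus, memory_mb)]


if __name__ == "__main__":
    # A 4-vcpu guest with 4 GiB of current memory.
    print(" ".join(numad_advisory_argv(4, 4 * 1024 * 1024)))
```

Passing the arguments as a list rather than a single shell string avoids any quoting issues if this were handed to something like subprocess.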

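Since the thread uses cpuset strings such as "1-10,^7,^8", here is a small sketch of how such a string expands to a CPU list. It is a simplification and not libvirt's parser: the function name is hypothetical, a caret is assumed to exclude a single CPU only, and exclusions are applied to the whole set regardless of where they appear in the string.

```python
def parse_cpuset(spec):
    """Expand a cpuset string like "1-10,^7,^8" into a sorted CPU list.

    Simplified, hypothetical helper; not libvirt's own implementation.
    """
    included = set()
    excluded = set()
    for item in spec.split(","):
        item = item.strip()
        if not item:
            raise ValueError("empty entry in cpuset %r" % spec)
        if item.startswith("^"):
            # Assumption: '^N' excludes exactly one CPU number.
            excluded.add(int(item[1:]))
            continue
        first, sep, last = item.partition("-")
        low = int(first)
        high = int(last) if sep else low
        if high < low:
            raise ValueError("reversed range %r" % item)
        included.update(range(low, high + 1))
    return sorted(included - excluded)


if __name__ == "__main__":
    print(parse_cpuset("1-10,^7,^8"))
```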


