[Libvir] Extending libvirt to probe NUMA topology
Daniel P. Berrange
berrange at redhat.com
Wed Jun 13 19:03:24 UTC 2007
On Wed, Jun 13, 2007 at 01:48:21PM -0400, Daniel Veillard wrote:
> On Wed, Jun 13, 2007 at 10:40:40AM -0500, Ryan Harper wrote:
> > Hello all,
> Hello Ryan,
> > I wanted to start a discussion on how we might get libvirt to be able to
> > probe the NUMA topology of Xen and Linux (for QEMU/KVM). In Xen, I've
> > recently posted patches for exporting topology into the physinfo
> > hypercall, as well as adding a hypercall to probe the Xen heap. I
> > believe the topology and memory info is already available in Linux.
> > With these, we have enough information to be able to write some simple
> > policy above libvirt that can create guests in a NUMA-aware fashion.
> > I'd like to suggest the following for discussion:
> > (1) A function to discover topology
> > (2) A function to check available memory
> > (3) Specifying which cpus to use prior to domain start
> Okay, but let's start by defining the scope a bit. Historically NUMA
> has explored various paths, and I assume we are gonna work in a rather
> small subset of what NUMA (Non Uniform Memory Access) has meant over time.
> I assume the following, tell me if I'm wrong:
> - we are just considering memory and processor affinity
> - the topology, i.e. the affinity between the processors and the various
> memory areas is fixed and the kind of mapping is rather simple
> to get into more specifics:
> - we will need to expand the model of libvirt http://libvirt.org/intro.html
> to split the Node resources into separate sets containing processors
> and memory areas which are highly connected together (assuming the
> model is a simple partition of the resources between the equivalent
> of sub-Nodes)
> - the function (2) would for a given processor tell how much of its memory
> is already allocated (to existing running or paused domains)
> Right ? Is the partition model sufficient for the architectures ?
> If yes then we will need a new definition and terminology for those sub-Nodes.
We have 3 core models we should refer to when deciding how to present
the topology:
- Linux/Solaris Xen - hypercalls
- Linux non-Xen - libnuma
- Solaris non-Xen - liblgrp
The Xen & Linux modelling seems reasonably similar IIRC, but Solaris takes
a slightly different representational approach.
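
To make the partition model discussed above a bit more concrete, here is
a minimal Python sketch. It is only an illustration: the names `Cell`,
`parse_cpulist`, `cell_for_cpu`, `free_kb_for_cpu` and
`allocated_kb_for_cpu` are hypothetical and not libvirt API, the "0-3,8"
CPU list notation is an assumption, and the two-cell host at the end is
made-up data. It assumes each sub-Node owns a disjoint set of CPUs plus
one local memory area.

```python
from dataclasses import dataclass


def parse_cpulist(text):
    """Parse a "0-3,8"-style CPU list into a sorted list of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


@dataclass
class Cell:
    """Hypothetical sub-Node: a set of CPUs plus the memory local to them."""
    cell_id: int
    cpus: list
    total_kb: int
    free_kb: int


def cell_for_cpu(cells, cpu):
    """Return the cell owning `cpu`; a strict partition has exactly one."""
    owners = [c for c in cells if cpu in c.cpus]
    if len(owners) != 1:
        raise ValueError(f"CPU {cpu} is owned by {len(owners)} cells, expected 1")
    return owners[0]


def free_kb_for_cpu(cells, cpu):
    """Memory still available in the cell local to `cpu`, in kB."""
    return cell_for_cpu(cells, cpu).free_kb


def allocated_kb_for_cpu(cells, cpu):
    """Memory already handed out in the cell local to `cpu`, in kB."""
    cell = cell_for_cpu(cells, cpu)
    return cell.total_kb - cell.free_kb


# Made-up two-cell host, for illustration only.
cells = [
    Cell(0, parse_cpulist("0-3"), total_kb=8388608, free_kb=2097152),
    Cell(1, parse_cpulist("4-7"), total_kb=8388608, free_kb=6291456),
]
```

A policy layer above libvirt could then call `free_kb_for_cpu(cells, 5)`
to see how much room is left in the cell it is considering before placing
a guest there. Whether this simple partition is enough for the Solaris
representation is left open here.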
> For 3 we already have support for pinning the domain virtual CPUs to physical
> CPUs but I guess it's not sufficient because you want this to be activated
> from the definition of the domain:
> So the XML format would have to be extended to allow specifying the subset
> of processors the domain is supposed to start on:
Yeah, I've previously argued against including VCPU pinning information
in the XML since it's a tunable, not a hardware description. Reluctantly,
though, we'll have to add this VCPU info, since NUMA support absolutely
requires it to be provided at the time of domain creation.
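
As a rough sketch of what carrying the CPU subset in the domain XML could
look like, the Python below reads a `cpuset` attribute from the `<vcpu>`
element. The attribute name, its "0-3,8" syntax and the helper names
`parse_cpuset` and `requested_cpus` are assumptions for illustration, not
an agreed format. When the attribute is absent the function returns
None, meaning placement is left to the hypervisor.

```python
import xml.etree.ElementTree as ET


def parse_cpuset(text):
    """Parse a "0-3,8"-style string into a set of physical CPU numbers."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def requested_cpus(domain_xml):
    """Return the CPU set requested in the XML, or None if unspecified."""
    root = ET.fromstring(domain_xml)
    vcpu = root.find("vcpu")
    if vcpu is None:
        return None
    cpuset = vcpu.get("cpuset")  # hypothetical attribute
    if cpuset is None:
        return None
    return parse_cpuset(cpuset)


# Illustrative documents only; the cpuset attribute is an assumption.
PINNED = "<domain type='xen'><name>demo</name><vcpu cpuset='0-3,8'>4</vcpu></domain>"
UNPINNED = "<domain type='xen'><name>demo</name><vcpu>4</vcpu></domain>"
```

With these inputs `requested_cpus(PINNED)` yields the set of CPUs 0-3
and 8, while `requested_cpus(UNPINNED)` yields None, so the caller would
not pass any placement constraint at creation time.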
> I would assume that if nothing is specified, the underlying Hypervisor
> (in libvirt terminology, that could be a linux kernel in practice) will
> by default try to do the optimal placement by itself, i.e. (3) is only
> useful if you want to override the default behaviour.
Yes, that is correct. We should not change the default - let the OS apply
whatever policy it sees fit, since over time OSes are tending towards
being able to automagically determine & apply NUMA policy.
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|