[Libvir] Extending libvirt to probe NUMA topology
Daniel Veillard
veillard at redhat.com
Thu Sep 6 15:17:57 UTC 2007
On Thu, Sep 06, 2007 at 03:40:23PM +0100, Richard W.M. Jones wrote:
> Daniel Veillard wrote:
> >1) Provide a function describing the topology as an XML instance:
> >
> > char * virNodeGetTopology(virConnectPtr conn);
>
> >which would return an XML instance as in virConnectGetCapabilities. I
> >toyed with the idea of extending virConnectGetCapabilities() to add a
> >topology section in case of NUMA support at the hypervisor level, but
> >it was looking to me that the two might be used at different times
> >and separating both might be a bit cleaner, but I could be convinced
> >otherwise.
>
> I'd definitely prefer to extend virConnectGetCapabilities XML. It
> avoids changing the remote driver and language bindings, and really
> callers only need to pull capabilities once per connection.
Yeah, I understand that concern; it simplifies a lot of things internally. But
the goal at the library level is to simplify the user code, even if that
means a more complex implementation. However, if people think they don't
need a separate call, then I'm really fine with this.
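
To make the single-call option concrete, here is a minimal sketch, assuming
the capabilities XML grows a <topology> section shaped like the example in
this thread. The helper name extract_topology is hypothetical, and plain
string search stands in for a real XML parser such as libxml2.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of libvirt): returns a newly allocated
 * copy of the <topology>...</topology> section found in a capabilities
 * XML string, or NULL if there is none. The caller frees the result. */
char *extract_topology(const char *caps_xml)
{
    static const char open_tag[] = "<topology>";
    static const char close_tag[] = "</topology>";
    const char *start, *end;
    size_t len;
    char *out;

    if (caps_xml == NULL)
        return NULL;
    start = strstr(caps_xml, open_tag);
    if (start == NULL)
        return NULL;            /* no NUMA topology reported */
    end = strstr(start, close_tag);
    if (end == NULL)
        return NULL;            /* truncated or malformed section */

    len = (size_t)(end - start) + strlen(close_tag);
    out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, start, len);
    out[len] = '\0';
    return out;
}
```

A caller would fetch the capabilities string once per connection and hand it
to this helper; a NULL result would then simply mean "no NUMA information".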
> >---------------------------------
> ><topology>
> > <cells num='2'>
> > <cell id='0'>
> > <cpus num='2'>
> > <cpu id='0'/>
> > <cpu id='1'/>
> > </cpus>
> > <memory size='2097152'/>
> > </cell>
> > <cell id='1'>
> > <cpus num='2'>
> > <cpu id='2'/>
> > <cpu id='3'/>
> > </cpus>
> > <memory size='2097152'/>
> > </cell>
> > </cells>
> ></topology>
> >---------------------------------
> >
> > A few things to note:
> > - the <cells> element lists the top sibling cells
>
> Not <nodes>?
A Node in libvirt terminology is a single physical machine; cell is,
I think, a well-accepted term for a sub-node within a NUMA box.
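
As a sketch of how that terminology could map to code, assuming a caller has
already parsed the example XML into memory: one node (the physical machine)
holds an array of cells, each with its own CPUs and local memory. The struct
and function names are hypothetical, not proposed API.

```c
#include <stddef.h>

/* Hypothetical in-memory mirror of one <cell> element. */
struct example_cell {
    int id;                  /* value of the cell's id attribute */
    int ncpus;               /* number of <cpu> children */
    const int *cpu_ids;      /* ids of the CPUs local to this cell */
    unsigned long mem_size;  /* <memory size='...'/>, unit as reported */
};

/* Returns the id of the cell that owns the given CPU, or -1 if no
 * cell in the array lists that CPU. */
int example_cell_of_cpu(const struct example_cell *cells, int ncells, int cpu)
{
    int i, j;

    if (cells == NULL)
        return -1;
    for (i = 0; i < ncells; i++) {
        for (j = 0; j < cells[i].ncpus; j++) {
            if (cells[i].cpu_ids[j] == cpu)
                return cells[i].id;
        }
    }
    return -1;
}
```

With the two-cell example from this thread, CPU 3 would resolve to cell 1.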
> > - the <cell> element describes, as children, the resources available
> > such as the list of CPUs and the size of the local memory; that could
> > be extended by disk descriptions too
> > <disk dev='/dev/sdb'/>
> > and possibly other special devices (no idea what ATM).
> >
> > - in case of deeper hierarchical topology one may need to be able to
> > name sub-cells and the format could be extended for example as
> > <cells num='2'>
> > <cells num='2'>
> > <cell id='1'>
> > ...
> > </cell>
> > <cell id='2'>
> > ...
> > </cell>
> > </cells>
> > <cells num='2'>
> > <cell id='3'>
> > ...
> > </cell>
> > <cell id='4'>
> > ...
> > </cell>
> > </cells>
> > </cells>
> > But that can be discussed/changed when the need arises :-)
>
> Especially note that 4 (or more) socket AMDs have a topology like this,
> with two different penalties for reaching nodes which are one and two
> hops away. Do we have a way to describe the penalties along different
> paths?
As hinted in my mail, I think the access costs will have to be added
separately, probably as an array map, unless people come up with a more
intelligent way of exposing that information.
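
One possible shape for such an array map, purely as a sketch: a square table
indexed by cell id, where entry [i][j] is the relative cost for a CPU in
cell i to reach memory in cell j. The cost values (10 local, 20 one hop,
30 two hops) and all names below are made up for illustration; nothing here
reflects what a hypervisor actually reports.

```c
enum { EXAMPLE_COST_CELLS = 4 };

/* Hypothetical access-cost map for a 4-cell box wired as a square
 * (0-1, 0-2, 1-3, 2-3), so cells 0/3 and 1/2 are two hops apart. */
static const int example_cost[EXAMPLE_COST_CELLS][EXAMPLE_COST_CELLS] = {
    { 10, 20, 20, 30 },
    { 20, 10, 30, 20 },
    { 20, 30, 10, 20 },
    { 30, 20, 20, 10 },
};

/* Returns the cost of reaching cell 'to' from cell 'from',
 * or -1 if either index is out of range. */
int example_access_cost(int from, int to)
{
    if (from < 0 || from >= EXAMPLE_COST_CELLS ||
        to < 0 || to >= EXAMPLE_COST_CELLS)
        return -1;
    return example_cost[from][to];
}

/* Returns the cheapest cell other than 'from' (lowest id wins ties),
 * or -1 if 'from' is out of range. */
int example_nearest_cell(int from)
{
    int best = -1;
    int i;

    if (from < 0 || from >= EXAMPLE_COST_CELLS)
        return -1;
    for (i = 0; i < EXAMPLE_COST_CELLS; i++) {
        if (i == from)
            continue;
        if (best < 0 || example_cost[from][i] < example_cost[from][best])
            best = i;
    }
    return best;
}
```

A flat table like this handles the one-hop versus two-hop penalties without
needing nested <cells> elements, at the price of growing quadratically with
the number of cells.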
> >2) Function to get the free memory of a given cell:
> >
> > unsigned long virNodeGetCellFreeMemory(virConnectPtr conn, int cell);
> >
> >that's relatively simple, would match the request from the initial mail
> >but I'm wondering a bit. If the program tries to do a best placement it
> >will usually run that request for a number of cells no ? Maybe a call
> >returning the memory amounts for a range of cells would be more
> >appropriate.
>
> Yes, I guess they'd want to get the free memory for all nodes. But IBM
> will have a better idea about this.
Well I'm looking for feedback :-)
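
To give that feedback something to chew on, here is a sketch of what a
range-based variant could look like from the caller's side. Everything in it
is an assumption: the function names, the signature, and the fixed table of
made-up values that stands in for a real hypervisor query.

```c
#include <stddef.h>

enum { EXAMPLE_MEM_CELLS = 4 };

/* Made-up per-cell free memory, standing in for a hypervisor query. */
static const unsigned long example_free_mem[EXAMPLE_MEM_CELLS] = {
    1048576UL, 524288UL, 2097152UL, 262144UL
};

/* Hypothetical range variant: fills out[] with the free memory of up to
 * max_cells cells starting at start_cell. Returns the number of entries
 * filled, or -1 on invalid arguments. */
int example_get_cells_free_memory(unsigned long *out, int start_cell,
                                  int max_cells)
{
    int i;

    if (out == NULL || max_cells <= 0 ||
        start_cell < 0 || start_cell >= EXAMPLE_MEM_CELLS)
        return -1;
    for (i = 0; i < max_cells && start_cell + i < EXAMPLE_MEM_CELLS; i++)
        out[i] = example_free_mem[start_cell + i];
    return i;
}

/* Picks the cell with the most free memory using a single range call
 * instead of one call per cell. Returns -1 if the query fails. */
int example_best_cell(void)
{
    unsigned long mem[EXAMPLE_MEM_CELLS];
    int n = example_get_cells_free_memory(mem, 0, EXAMPLE_MEM_CELLS);
    int best = 0;
    int i;

    if (n <= 0)
        return -1;
    for (i = 1; i < n; i++) {
        if (mem[i] > mem[best])
            best = i;
    }
    return best;
}
```

The single-cell call from point 2 would then just be the max_cells == 1 case,
which is one argument for only exposing the range form.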
Daniel
--
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard | virtualization library http://libvirt.org/
veillard at redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/