[PATCH v2 09/10] capabilities: Expose NUMA interconnects

Martin Kletzander mkletzan at redhat.com
Mon Jun 14 18:18:51 UTC 2021


On Thu, Jun 10, 2021 at 03:57:18PM +0200, Michal Privoznik wrote:
>Links between NUMA nodes can have different latencies and
>bandwidths. This info is newly defined in ACPI 6.2 in the
>Heterogeneous Memory Attribute Table (HMAT). The Linux kernel
>learned how to report these values under sysfs and thus we can
>expose them in our capabilities XML. The sysfs interface is
>documented in the kernel's Documentation/admin-guide/mm/numaperf.rst.
>
>Long story short, two nodes can be in an initiator-target
>relationship. A node can be an initiator if it has a CPU or a
>device that's capable of initiating memory transfers. Therefore a
>node that has just memory can only be a target. An
>initiator-target link can then have any combination of
>{bandwidth, latency} x {access, read, write} attributes (6 in
>total). However, the standard says access is applicable iff the
>read and write values are the same. Therefore, we really have
>just four combinations of attributes: bandwidth-read,
>bandwidth-write, latency-read, latency-write.
>
>These are the combinations that the kernel reports anyway.
>
>Then, under /sys/devices/system/node/nodeX/accessN/initiators we
>find values for those 4 attributes and also symlinks named
>"nodeN" which represent the initiators of nodeX. For instance:
>
>  /sys/devices/system/node/node1/access1/initiators/node0 -> ../../node0
>  /sys/devices/system/node/node1/access1/initiators/read_bandwidth
>  /sys/devices/system/node/node1/access1/initiators/read_latency
>  /sys/devices/system/node/node1/access1/initiators/write_bandwidth
>  /sys/devices/system/node/node1/access1/initiators/write_latency
>
>This means that node0 is the initiator and node1 is the target,
>and the values of the interconnect can be read from these files.
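
For readers who want to poke at this layout by hand, here is a minimal
Python sketch of how such an "initiators" directory could be walked. It
is not the libvirt implementation (the patch itself is C); the names
read_interconnects and ATTRS are hypothetical, and the sketch assumes
every attribute file holds a single decimal integer.

```python
import os
import re

# The four attributes the commit message says the kernel reports.
ATTRS = ("read_bandwidth", "write_bandwidth", "read_latency", "write_latency")


def read_interconnects(target_dir, access="access1"):
    """Return (initiator node ids, {attribute: value}) for one target node.

    target_dir is the target's node directory (a "nodeX" directory);
    access selects the access class subdirectory to inspect.
    """
    initiators_dir = os.path.join(target_dir, access, "initiators")
    initiators = []
    values = {}
    for name in os.listdir(initiators_dir):
        match = re.fullmatch(r"node(\d+)", name)
        if match:
            # Entries named "nodeN" identify the initiators of this target.
            initiators.append(int(match.group(1)))
        elif name in ATTRS:
            # Assumption: each attribute file contains one decimal integer.
            with open(os.path.join(initiators_dir, name)) as f:
                values[name] = int(f.read().strip())
        # Anything else in the directory is ignored.
    return sorted(initiators), values
```

On a host that exposes HMAT data, calling
read_interconnects("/sys/devices/system/node/node1") would be expected
to return the initiator node ids of node1 together with the four
values, assuming the directory exists.
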
>
>In theory, there can be separate links to memory side caches too
>(e.g. one link from node X to node Y's main memory, another from
>node X to node Y's L1 cache, another one to L2 cache and so on).
>But sysfs does not express this relationship just yet.
>
>The "accessN" means either "access0" or "access1". The difference
>is that while the former expresses the best interconnect between
>two nodes including CPUs and I/O devices (such as GPUs and NICs),
>the latter includes only CPUs and thus is what we need.
>
>Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1786309
>Signed-off-by: Michal Privoznik <mprivozn at redhat.com>

Reviewed-by: Martin Kletzander <mkletzan at redhat.com>