[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [libvirt] [PATCH 1/3] nodeinfo: Add check and workaround to guarantee valid cpu topologies

On 11/08/12 23:55, Eric Blake wrote:
On 11/07/2012 08:22 AM, Peter Krempa wrote:
Lately there were a few reports of the output of the virsh nodeinfo
command being inaccurate. This patch ties to avoid that by checking if


the topology actualy makes sense. If it doesn't we then report a


synthetic topology that indicates to the user that the host capabilities
should be checked for the actual topology.
  src/nodeinfo.c | 37 +++++++++++++++++++++++++++++++------
  1 file changed, 31 insertions(+), 6 deletions(-)

You've convinced me that we need this; and we have already documented
that nodes=1 is a hint that the user must verify with the capability XML
for accurate results anyway.  ACK if you can answer one question:

@@ -531,6 +539,23 @@ done:
          goto cleanup;

+    /* Now check if the topology makes sense. There are machines that don't
+     * expose their real number of nodes or for example the AMD Bulldozer
+     * architecture that exposes their Clustered integer core modules as both
+     * threads and cores. This approach throws off our detection. Unfortunately
+     * the nodeinfo structure isn't designed to carry the full topology so
+     * we're going to lie about the detected topology to notify the user
+     * to check the host capabilities for the actual topology. */
+    if ((nodeinfo->nodes *
+         nodeinfo->sockets *
+         nodeinfo->cores *
+         nodeinfo->threads) != (nodeinfo->cpus + offline)) {
+        nodeinfo->nodes = 1;
+        nodeinfo->sockets = nodeinfo->cpus;
+        nodeinfo->cores = 1;

Would it be any better to swap these and expose sockets == 1 and cores
== cpus when we fake things?

Hm the first version actually had it that way. I think we should tweak the docs in that case.


When we're lying it's not that much important in which field we're doing so. Normally it could have some performance implications but in this case everything is off anyways.

Viktor in his review is having the same opinion plus a good catch about reporting the topology also with offline processors, so that thing should be changed.

Do you have any suggestions about tweaking the docs? I can't think of anything good right now.


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]