[libvirt] [PATCH] Fix nodeinfo output on PPC64 KVM hosts

Andrea Bolognani abologna at redhat.com
Mon Jun 29 12:11:54 UTC 2015


On Thu, 2015-06-25 at 12:03 +0530, Shivaprasad G Bhat wrote:

Hi Shivaprasad,

here are a few comments about your patch.

>
[...]
> +# if HAVE_LINUX_KVM_H
> +#  include <linux/kvm.h>

This line should be moved to the top of the file, with all the other
#include directives.
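
I.e. something along these lines, next to the other conditional
includes at the top of the file (exact placement will of course depend
on the guards that are already there):

    #include <config.h>
    ...
    #if HAVE_LINUX_KVM_H
    # include <linux/kvm.h>
    #endif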

> +        kvmfd = open("/dev/kvm", O_RDONLY);
> +        if (kvmfd >= 0) {
> +#  ifdef KVM_CAP_PPC_SMT
> +            valid_ppc64_kvmhost_mode = true;
> +            /* For Phyp and KVM based guests the ioctl for KVM_CAP_PPC_SMT
> +             * returns zero and both primary and secondary threads will be
> +             * online. Dont try to alter or interpret anything here.
> +             */
> +            threads_per_subcore = ioctl(kvmfd, KVM_CHECK_EXTENSION, KVM_CAP_PPC_SMT);
> +            if (threads_per_subcore == 0)
> +                valid_ppc64_kvmhost_mode = false;
> +#  endif
> +        }
> +        VIR_FORCE_CLOSE(kvmfd);
> +# endif
> +        rewinddir(cpudir);
> +        lastonline = -1;
> +        while (valid_ppc64_kvmhost_mode &&
> +               ((direrr = virDirRead(cpudir, &cpudirent, node)) > 0)) {
> +            if (sscanf(cpudirent->d_name, "cpu%u", &cpu) != 1)
> +                continue;
> +            if ((online = virNodeGetCpuValue(node, cpu, "online", 1)) < 0)
> +                goto cleanup;
> +
> +            if (online) {
> +                if (lastonline != -1 &&
> +                    (cpu - lastonline) < threads_per_subcore) {
> +                    valid_ppc64_kvmhost_mode = false; /* if any of secondaries
> +                                                       * online */

If a comment is long enough to span two lines, it should probably not
start at the end of a statement but rather be above it.
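
E.g. something like:

    /* Some of the secondary threads are online, so this is
     * not a valid PPC64 KVM host configuration */
    valid_ppc64_kvmhost_mode = false;
    break;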

> +                    break;
> +                }
> +                lastonline = cpu;
> +            }
> +        }
> +    }
> +

Enclosing this whole block within the #if and #ifdef directives would
make the code cleaner; on the other hand, since the skeleton of the
loop is identical to the one a few lines above, maybe you could move
the part reading from /dev/kvm closer to the top and merge the two
loops together?
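
A rough, untested sketch of what I have in mind, reusing the variable
names from your patch:

    # if HAVE_LINUX_KVM_H && defined(KVM_CAP_PPC_SMT)
        /* Probe KVM once, before walking the cpuN entries */
        if ((kvmfd = open("/dev/kvm", O_RDONLY)) >= 0) {
            threads_per_subcore = ioctl(kvmfd, KVM_CHECK_EXTENSION,
                                        KVM_CAP_PPC_SMT);
            /* Zero means PHyp or a KVM guest: leave the flag false */
            valid_ppc64_kvmhost_mode = threads_per_subcore > 0;
        }
        VIR_FORCE_CLOSE(kvmfd);
    # endif

and then, inside the existing loop over the cpuN entries:

        if (valid_ppc64_kvmhost_mode && online && lastonline != -1 &&
            (cpu - lastonline) < threads_per_subcore) {
            /* A secondary thread is online */
            valid_ppc64_kvmhost_mode = false;
        }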

>      /* allocate cpu maps for each socket */
>      if (VIR_ALLOC_N(core_maps, sock_max) < 0)
>          goto cleanup;
> @@ -473,6 +539,7 @@ virNodeParseNode(const char *node,
>
>      /* iterate over all CPU's in the node */
>      rewinddir(cpudir);
> +    lastonline = -1;
>      while ((direrr = virDirRead(cpudir, &cpudirent, node)) > 0) {
>          if (sscanf(cpudirent->d_name, "cpu%u", &cpu) != 1)
>              continue;
> @@ -480,6 +547,17 @@ virNodeParseNode(const char *node,
>          if ((online = virNodeGetCpuValue(node, cpu, "online", 1)) < 0)
>              goto cleanup;
>
> +        if (ARCH_IS_PPC64(arch) && valid_ppc64_kvmhost_mode) {

valid_ppc64_kvmhost_mode can only be true when ARCH_IS_PPC64(arch) is
true, so checking both here is redundant.
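
I.e. checking the flag alone should be enough:

    if (valid_ppc64_kvmhost_mode) {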

> +            if (online) {
> +                lastonline = cpu;
> +            } else if (lastonline != -1 &&
> +                       (cpu-lastonline) < threads_per_subcore) {
> +                processors++; /* Count only those secondaries whose primary
> +                                 subcore is online */

Same as above.

> +                continue;
> +            }
> +        }
> +
>          if (!online) {
>              (*offline)++;
>              continue;
> @@ -528,6 +606,13 @@ virNodeParseNode(const char *node,
>              *cores = core;
>      }
>
> +    if (ARCH_IS_PPC64(arch) && valid_ppc64_kvmhost_mode) {

Same as above.

> +        /* The actual host threads per core is
> +         * multiple of the subcores_per_core(i.e variable *threads in this case)
> +         * and threads_per_subcore.
> +         */
> +        *threads = *threads * threads_per_subcore;
> +    }
>      ret = processors;
>
>   cleanup:
>
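
(If I'm reading the comment right: with subcores_per_core=2 the loop
above would have counted *threads as 2 and, with threads_per_subcore=4,
the reported value would be 2 * 4 = 8 threads per core.)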

Overall, the code makes sense and seems to work from my limited testing.

However, this change completely breaks the topology information
reported by `virsh capabilities`. Here's the output on a host with
subcores_per_core=1, without the patch:

    <topology>
      <cells num='4'>
        <cell id='0'>
          <memory unit='KiB'>67108864</memory>
          <pages unit='KiB' size='64'>1048576</pages>
          <pages unit='KiB' size='16384'>0</pages>
          <pages unit='KiB' size='16777216'>0</pages>
          <distances>
            <sibling id='0' value='10'/>
            <sibling id='1' value='20'/>
            <sibling id='16' value='40'/>
            <sibling id='17' value='40'/>
          </distances>
          <cpus num='5'>
            <cpu id='0' socket_id='0' core_id='32' siblings='0'/>
            <cpu id='8' socket_id='0' core_id='40' siblings='8'/>
            <cpu id='16' socket_id='0' core_id='48' siblings='16'/>
            <cpu id='24' socket_id='0' core_id='96' siblings='24'/>
            <cpu id='32' socket_id='0' core_id='104' siblings='32'/>
          </cpus>
        </cell>
        <cell id='1'>
          <memory unit='KiB'>67108864</memory>
          <pages unit='KiB' size='64'>1048576</pages>
          <pages unit='KiB' size='16384'>0</pages>
          <pages unit='KiB' size='16777216'>0</pages>
          <distances>
            <sibling id='0' value='20'/>
            <sibling id='1' value='10'/>
            <sibling id='16' value='40'/>
            <sibling id='17' value='40'/>
          </distances>
          <cpus num='5'>
            <cpu id='40' socket_id='1' core_id='160' siblings='40'/>
            <cpu id='48' socket_id='1' core_id='168' siblings='48'/>
            <cpu id='56' socket_id='1' core_id='176' siblings='56'/>
            <cpu id='64' socket_id='1' core_id='224' siblings='64'/>
            <cpu id='72' socket_id='1' core_id='232' siblings='72'/>
          </cpus>
        </cell>
        <cell id='16'>
          <memory unit='KiB'>67108864</memory>
          <pages unit='KiB' size='64'>1048576</pages>
          <pages unit='KiB' size='16384'>0</pages>
          <pages unit='KiB' size='16777216'>0</pages>
          <distances>
            <sibling id='0' value='40'/>
            <sibling id='1' value='40'/>
            <sibling id='16' value='10'/>
            <sibling id='17' value='20'/>
          </distances>
          <cpus num='5'>
            <cpu id='80' socket_id='16' core_id='2088' siblings='80'/>
            <cpu id='88' socket_id='16' core_id='2096' siblings='88'/>
            <cpu id='96' socket_id='16' core_id='2144' siblings='96'/>
            <cpu id='104' socket_id='16' core_id='2152' siblings='104'/>
            <cpu id='112' socket_id='16' core_id='2160' siblings='112'/>
          </cpus>
        </cell>
        <cell id='17'>
          <memory unit='KiB'>67108864</memory>
          <pages unit='KiB' size='64'>1048576</pages>
          <pages unit='KiB' size='16384'>0</pages>
          <pages unit='KiB' size='16777216'>0</pages>
          <distances>
            <sibling id='0' value='40'/>
            <sibling id='1' value='40'/>
            <sibling id='16' value='20'/>
            <sibling id='17' value='10'/>
          </distances>
          <cpus num='5'>
            <cpu id='120' socket_id='17' core_id='2208' siblings='120'/>
            <cpu id='128' socket_id='17' core_id='2216' siblings='128'/>
            <cpu id='136' socket_id='17' core_id='2272' siblings='136'/>
            <cpu id='144' socket_id='17' core_id='2280' siblings='144'/>
            <cpu id='152' socket_id='17' core_id='2288' siblings='152'/>
          </cpus>
        </cell>
      </cells>
    </topology>

With the patch:

    <topology>
      <cells num='1'>
        <cell id='0'>
          <memory unit='KiB'>263635136</memory>
          <cpus num='160'>
            <cpu id='0' socket_id='0' core_id='0' siblings='0'/>
            <cpu id='8' socket_id='0' core_id='1' siblings='8'/>
            <cpu id='16' socket_id='0' core_id='2' siblings='16'/>
            <cpu id='24' socket_id='0' core_id='3' siblings='24'/>
            <cpu id='32' socket_id='0' core_id='4' siblings='32'/>
            <cpu id='0'/>
            <!-- The above line is repeated 155 times -->
          </cpus>
        </cell>
      </cells>
    </topology>

This issue needs to be addressed before the changes can be considered
for inclusion. I'd also highly recommend writing test cases in order
to prevent regressions in the future.

Cheers.

--
Andrea Bolognani
Software Engineer - Virtualization Team



