[libvirt] [PATCHv7 17/18] qemu: Report cache occupancy (CMT) with domstats

John Ferlan jferlan at redhat.com
Mon Nov 5 20:04:09 UTC 2018



On 10/22/18 4:01 AM, Wang Huaqiang wrote:
> Adding the interface in qemu to report CMT statistic information
> through command 'virsh domstats --cpu-total'.
> 
> Below is a typical output:
> 
>          # virsh domstats 1 --cpu-total
>          Domain: 'ubuntu16.04-base'
>            ...
>            cpu.cache.monitor.count=2
>            cpu.cache.monitor.0.name=vcpus_1
>            cpu.cache.monitor.0.vcpus=1
>            cpu.cache.monitor.0.bank.count=2
>            cpu.cache.monitor.0.bank.0.id=0
>            cpu.cache.monitor.0.bank.0.bytes=4505600
>            cpu.cache.monitor.0.bank.1.id=1
>            cpu.cache.monitor.0.bank.1.bytes=5586944
>            cpu.cache.monitor.1.name=vcpus_4-6
>            cpu.cache.monitor.1.vcpus=4,5,6
>            cpu.cache.monitor.1.bank.count=2
>            cpu.cache.monitor.1.bank.0.id=0
>            cpu.cache.monitor.1.bank.0.bytes=17571840
>            cpu.cache.monitor.1.bank.1.id=1
>            cpu.cache.monitor.1.bank.1.bytes=29106176
> 
> Signed-off-by: Wang Huaqiang <huaqiang.wang at intel.com>
> ---
>  src/libvirt-domain.c     |   9 ++
>  src/libvirt_private.syms |   1 +
>  src/qemu/qemu_driver.c   | 229 +++++++++++++++++++++++++++++++++++++++++++++++
>  src/util/virresctrl.c    | 130 +++++++++++++++++++++++++++
>  src/util/virresctrl.h    |  12 +++
>  5 files changed, 381 insertions(+)
> 

I have a feeling I'll be asking for this to be split up a bit, but let's
see...  There's *util, *qemu, and *API code in the same patch.

> diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c
> index 7690339..4895f9f 100644
> --- a/src/libvirt-domain.c
> +++ b/src/libvirt-domain.c
> @@ -11345,6 +11345,15 @@ virConnectGetDomainCapabilities(virConnectPtr conn,
>   *     "cpu.user" - user cpu time spent in nanoseconds as unsigned long long.
>   *     "cpu.system" - system cpu time spent in nanoseconds as unsigned long
>   *                    long.
> + *     "cpu.cache.monitor.count" - tocal cache monitoring groups
> + *     "cpu.cache.monitor.M.name" - name of cache monitoring group 'M'
> + *     "cpu.cache.monitor.M.vcpus" - vcpus for cache monitoring group 'M'
> + *     "cpu.cache.monitor.M.bank.count" - total bank number of cache monitoring
> + *                    group 'M'
> + *     "cpu.cache.monitor.M.bank.N.id" - OS assigned cache bank id for cache
> + *                    'N' in cache monitoring group 'M'
> + *     "cpu.cache.monitor.M.bank.N.bytes" - monitor's cache occupancy of cache
> + *                    bank 'N' in cache monitoring group 'M'
>   *
>   * VIR_DOMAIN_STATS_BALLOON:
>   *     Return memory balloon device information.
> diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
> index 91801ed..4551767 100644
> --- a/src/libvirt_private.syms
> +++ b/src/libvirt_private.syms
> @@ -2683,6 +2683,7 @@ virResctrlInfoNew;
>  virResctrlMonitorAddPID;
>  virResctrlMonitorCreate;
>  virResctrlMonitorDeterminePath;
> +virResctrlMonitorGetCacheOccupancy;
>  virResctrlMonitorGetID;
>  virResctrlMonitorIsRunning;
>  virResctrlMonitorNew;
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index 12a5f8f..9828118 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -102,6 +102,7 @@
>  #include "virnuma.h"
>  #include "dirname.h"
>  #include "netdev_bandwidth_conf.h"
> +#include "c-ctype.h"

Ahh.. this one's a red flag...  Says to me that something should move to
a util function...

>  
>  #define VIR_FROM_THIS VIR_FROM_QEMU
>  
> @@ -19698,6 +19699,230 @@ typedef enum {
>  #define HAVE_JOB(flags) ((flags) & QEMU_DOMAIN_STATS_HAVE_JOB)
>  
>  
> +/* In terms of the output of virBitmapFormat, both '1-3' and '1,3' are valid
> + * outputs and represent different vcpu set. But it is not easy to
> + * differentiate these two vcpu set formats at first glance.
> + *
> + * This function could be used to clear this ambiguity, it substitutes all '-'
> + * with ',' while generates semantically correct vcpu set.
> + * e.g. vcpu set string '1-3' will be replaced by string '1,2,3'. */
> +static char *
> +qemuDomainVcpuFormatHelper(const char *vcpus)
> +{
> +    size_t i = 0;
> +    int last = 0;
> +    int start = 0;
> +    char * tmp = NULL;
> +    bool firstnum = true;
> +    const char *cur = vcpus;
> +    virBuffer buf = VIR_BUFFER_INITIALIZER;
> +    char *ret = NULL;
> +
> +    if (virStringIsEmpty(cur))
> +        return NULL;
> +
> +    while (*cur != '\0') {
> +        if (!c_isdigit(*cur))

Explains the #include, but in the long run, I don't think this method is
worth the effort since all you're doing is printing the format in the
output. Is there other libvirt generated output for cpu stats that does
a similar conversion?  If so, share code, if not drop this.

No need to rearrange the range someone else entered and we've
succesfully parsed in some manner. I think the output should look like
the XML output unless there's some compelling reason to change it which
I don't see.

>From above the:

>            cpu.cache.monitor.1.name=vcpus_4-6
>            cpu.cache.monitor.1.vcpus=4,5,6

would then be:

>            cpu.cache.monitor.1.name=vcpus_4-6
>            cpu.cache.monitor.1.vcpus=4-6

right?

I don't even want to think about someone who does "7,4-6"...

> +            goto cleanup;
> +
> +        if (virStrToLong_i(cur, &tmp, 10, &start) < 0)
> +            goto cleanup;
> +        if (start < 0)
> +            goto cleanup;
> +
> +        cur = tmp;
> +
> +        virSkipSpaces(&cur);
> +
> +        if (*cur == ',' || *cur == 0) {
> +            if (!firstnum)
> +                virBufferAddChar(&buf, ',');
> +            virBufferAsprintf(&buf, "%d", start);
> +            firstnum = false;
> +        } else if (*cur == '-') {
> +            cur++;
> +            virSkipSpaces(&cur);
> +
> +            if (virStrToLong_i(cur, &tmp, 10, &last) < 0)
> +                goto cleanup;
> +
> +            if (last < start)
> +                goto cleanup;
> +            cur = tmp;
> +
> +            for (i = start; i <= last; i++) {
> +                if (!firstnum)
> +
> +                    virBufferAddChar(&buf, ',');
> +                virBufferAsprintf(&buf, "%ld", i);
> +                firstnum = 0;
> +            }
> +
> +            virSkipSpaces(&cur);
> +        }
> +
> +        if (*cur == ',') {
> +            cur++;
> +            virSkipSpaces(&cur);
> +        } else if (*cur == 0) {
> +            break;
> +        } else {
> +            goto cleanup;
> +        }
> +    }
> +
> +    ret = virBufferContentAndReset(&buf);
> + cleanup:
> +    virBufferFreeAndReset(&buf);
> +    return ret;
> +}
> +
> +
> +static int
> +qemuDomainGetStatsCpuResource(virQEMUDriverPtr driver ATTRIBUTE_UNUSED,
> +                              virDomainObjPtr dom,
> +                              virDomainStatsRecordPtr record,
> +                              int *maxparams,
> +                              unsigned int privflags ATTRIBUTE_UNUSED,
> +                              virResctrlMonitorType restag)

Why bother passing ATTRIBUTE_UNUSED ?

> +{
> +    char param_name[VIR_TYPED_PARAM_FIELD_LENGTH];
> +    virDomainResctrlMonDefPtr domresmon = NULL;
> +    virResctrlMonitorStatsPtr stats = NULL;
> +    size_t nstats = 0;

since these are local to the:

+        for (j = 0; j < resctrl->nmonitors; j++) {

loop and there's local's in that for loop, maybe these two should move
there too...

> +    virDomainResctrlDefPtr resctrl = NULL;
> +    unsigned int nmonitors = 0;
> +    unsigned int imonitor = 0;
> +    const char *restype = NULL;
> +    char *rawvcpus = NULL;
> +    char *vcpus = NULL;
> +    size_t i = 0;
> +    size_t j = 0;
> +    int ret = -1;
> +
> +    if (!virDomainObjIsActive(dom))
> +        return 0;
> +
> +    if (restag == VIR_RESCTRL_MONITOR_TYPE_CACHE) {
> +        restype = "cache";
> +    } else {
> +        VIR_DEBUG("Invalid CPU resource type");

VIR_DEBUG? virReportError would seem to be more appropriate. As I've
stated before there will be an error for some reason generated.

> +        return -1;
> +    }
> +
> +    for (i = 0; i < dom->def->nresctrls; i++) {
> +        resctrl = dom->def->resctrls[i];
> +
> +        for (j = 0; j < resctrl->nmonitors; j++) {
> +            domresmon = resctrl->monitors[j];
> +            if (virResctrlMonitorIsRunning(domresmon->instance) &&
> +                domresmon->tag == restag)

Knowing how heavy wait *IsRunning is, the order of checking should be
reversed at the very least.

> +                nmonitors++;
> +        }
> +    }
> +
> +    if (nmonitors) {
> +        snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH,
> +                 "cpu.%s.monitor.count", restype);
> +        if (virTypedParamsAddUInt(&record->params,
> +                                  &record->nparams,
> +                                  maxparams,
> +                                  param_name,
> +                                  nmonitors) < 0)
> +            goto cleanup;
> +    }

All that above to ensure nmonitors == resctrl->nmonitors??? I think
there's a lot of unnecessary stuff.
> +
> +    for (i = 0; i < dom->def->nresctrls; i++) {
> +        resctrl = dom->def->resctrls[i];
> +
> +        for (j = 0; j < resctrl->nmonitors; j++) {
> +            size_t l = 0;
> +            virResctrlMonitorPtr monitor = resctrl->monitors[j]->instance;
> +            const char *id = virResctrlMonitorGetID(monitor);
> +
> +            if (!id)
> +                goto cleanup;

I would think you have other problems in this case. Besides there's no
error message associated with this.

> +
> +            domresmon = resctrl->monitors[j];
> +
> +            if (!virResctrlMonitorIsRunning(domresmon->instance))
> +                continue;

Oh my one call for each iteration, ouch - highly inefficient.

> +
> +            if (!(rawvcpus = virBitmapFormat(domresmon->vcpus)))
> +                goto cleanup;
> +
> +            vcpus = qemuDomainVcpuFormatHelper(rawvcpus);
> +            if (!vcpus)
> +                goto cleanup;
> +
> +            if (virResctrlMonitorGetCacheOccupancy(monitor, &stats,
> +                                                   &nstats) < 0)
> +                goto cleanup;
> +
> +            snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH,
> +                     "cpu.%s.monitor.%d.name", restype, imonitor);
> +            if (virTypedParamsAddString(&record->params,
> +                                        &record->nparams,
> +                                        maxparams,
> +                                        param_name,
> +                                        id) < 0)
> +                goto cleanup;
> +
> +            snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH,
> +                     "cpu.%s.monitor.%d.vcpus", restype, imonitor);
> +
> +            if (virTypedParamsAddString(&record->params,
> +                                        &record->nparams,
> +                                        maxparams,
> +                                        param_name,
> +                                        vcpus) < 0)
> +                goto cleanup;
> +
> +            snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH,
> +                     "cpu.%s.monitor.%d.bank.count", restype, imonitor);
> +            if (virTypedParamsAddUInt(&record->params,
> +                                      &record->nparams,
> +                                      maxparams,
> +                                      param_name,
> +                                      nstats) < 0)
> +                goto cleanup;
> +
> +            for (l = 0; l < nstats; l++) {
> +                snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH,
> +                         "cpu.%s.monitor.%d.bank.%ld.id",
> +                         restype, imonitor, l);
> +                if (virTypedParamsAddUInt(&record->params,
> +                                          &record->nparams,
> +                                          maxparams,
> +                                          param_name,
> +                                          stats[l].id) < 0)
> +                    goto cleanup;
> +
> +
> +                snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH,
> +                         "cpu.%s.monitor.%d.bank.%ld.bytes",
> +                         restype, imonitor, l);
> +                if (virTypedParamsAddUInt(&record->params,
> +                                          &record->nparams,
> +                                          maxparams,
> +                                          param_name,
> +                                          stats[l].val) < 0)
> +                    goto cleanup;
> +            }
> +
> +            VIR_FREE(stats);

               nstats = 0;

too!

> +            VIR_FREE(vcpus);
> +            imonitor++;
> +        }
> +    }
> +
> +    ret = 0;
> + cleanup:
> +    VIR_FREE(vcpus);
> +    return ret;
> +}
> +
> +
>  static int
>  qemuDomainGetStatsCpu(virQEMUDriverPtr driver ATTRIBUTE_UNUSED,
>                        virDomainObjPtr dom,
> @@ -19735,6 +19960,10 @@ qemuDomainGetStatsCpu(virQEMUDriverPtr driver ATTRIBUTE_UNUSED,
>              return -1;
>      }
>  
> +    if (qemuDomainGetStatsCpuResource(driver, dom, record, maxparams, privflags,
> +                                      VIR_RESCTRL_MONITOR_TYPE_CACHE) < 0)
> +        return -1;
> +
>      return 0;
>  }
>  
> diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
> index fa3e6e9..02e7095 100644
> --- a/src/util/virresctrl.c
> +++ b/src/util/virresctrl.c
> @@ -2713,3 +2713,133 @@ virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor)
>  
>      return ret;
>  }
> +
> +
> +static int
> +virResctrlMonitorStatsSorter(const void *a,
> +                             const void *b)
> +{
> +    return ((virResctrlMonitorStatsPtr)a)->id
> +        - ((virResctrlMonitorStatsPtr)b)->id;
> +}
> +
> +
> +/*
> + * virResctrlMonitorGetStats
> + *
> + * @monitor: The monitor that the statistic data will be retrieved from.
> + * @resource: The name for resource name. 'llc_occupancy' for cache resource.
> + * "mbm_total_bytes" and "mbm_local_bytes" for memory bandwidth resource.
> + * @stats: Array of virResctrlMonitorStatsPtr for holding cache or memory
> + * bandwidth usage data.
> + * @nstats: A size_t pointer to hold the returned array length of @stats
> + *
> + * Get cache or memory bandwidth utilization information.
> + *
> + * Returns 0 on success, -1 on error.
> + */
> +static int
> +virResctrlMonitorGetStats(virResctrlMonitorPtr monitor,
> +                          const char *resource,
> +                          virResctrlMonitorStatsPtr *stats,
> +                          size_t *nstats)
> +{
> +    int rv = -1;
> +    int ret = -1;
> +    DIR *dirp = NULL;
> +    char *datapath = NULL;
> +    struct dirent *ent = NULL;
> +    virResctrlMonitorStatsPtr stat = NULL;
> +
> +    if (!monitor) {
> +        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> +                       _("Invalid resctrl monitor"));
> +        return -1;
> +    }
> +
> +    if (virAsprintf(&datapath, "%s/mon_data", monitor->path) < 0)
> +        return -1;
> +
> +    if (virDirOpen(&dirp, datapath) < 0)
> +        goto cleanup;
> +
> +    *nstats = 0;
> +    while (virDirRead(dirp, &ent, datapath) > 0) {
> +        char *node_id = NULL;
> +
> +        if (VIR_ALLOC(stat) < 0)
> +            goto cleanup;
> +
> +        /* Looking for directory that contains resource utilization
> +         * information file. The directory name is arranged in format
> +         * "mon_<node_name>_<node_id>". For example, "mon_L3_00" and
> +         * "mon_L3_01" are two target directories for a two nodes system
> +         * with resource utilization data file for each node respectively.
> +         */
> +        if (ent->d_type != DT_DIR)
> +            continue;
> +
> +        /* Looking for directory has a prefix 'mon_L' */
> +        if (!(node_id = STRSKIP(ent->d_name, "mon_L")))
> +            continue;
> +
> +        /* Looking for directory has another '_' */
> +        node_id = strchr(node_id, '_');
> +        if (!node_id)
> +            continue;
> +
> +        /* Skip the character '_' */
> +        if (!(node_id = STRSKIP(node_id, "_")))
> +            continue;
> +
> +        /* The node ID number should be here, parsing it. */
> +        if (virStrToLong_uip(node_id, NULL, 0, &stat->id) < 0)
> +            goto cleanup;
> +
> +        rv = virFileReadValueUint(&stat->val, "%s/%s/%s", datapath,
> +                                  ent->d_name, resource);
> +        if (rv == -2) {
> +            virReportError(VIR_ERR_INTERNAL_ERROR,
> +                           _("File '%s/%s/%s' does not exist."),
> +                           datapath, ent->d_name, resource);
> +        }
> +        if (rv < 0)
> +            goto cleanup;
> +
> +        if (VIR_APPEND_ELEMENT(*stats, *nstats, *stat) < 0)
> +            goto cleanup;
> +    }
> +
> +    /* Sort in id's ascending order */
> +    qsort(*stats, *nstats, sizeof(*stat), virResctrlMonitorStatsSorter);

Coverity notes that *stats could be NULL - if the above while loop gets
nothing...

> +
> +    ret = 0;
> + cleanup:
> +    VIR_FREE(datapath);
> +    VIR_FREE(stat);
> +    VIR_DIR_CLOSE(dirp);
> +    return ret;
> +}
> +
> +
> +/*
> + * virResctrlMonitorGetCacheOccupancy
> + *
> + * @monitor: The monitor that the statistic data will be retrieved from.
> + * @caches: Array of virResctrlMonitorStatsPtr for receiving cache occupancy
> + * data. Caller is responsible to free this array.
> + * @ncaches: A size_t pointer to hold the returned array length of @caches

They're @stats and @nstats elsewhere, be consistent.

> + *
> + * Get cache or memory bandwidth utilization information.
> + *
> + * Returns 0 on success, -1 on error.
> + */
> +
> +int
> +virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor,
> +                                   virResctrlMonitorStatsPtr *caches,
> +                                   size_t *ncaches)

They're @stats and @nstats elsewhere, be consistent.

> +{
> +    return virResctrlMonitorGetStats(monitor, "llc_occupancy",
> +                                     caches, ncaches);
> +}
> diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h
> index 8d8fdc2..004c40e 100644
> --- a/src/util/virresctrl.h
> +++ b/src/util/virresctrl.h
> @@ -191,6 +191,13 @@ virResctrlInfoGetMonitorPrefix(virResctrlInfoPtr resctrl,
>  typedef struct _virResctrlMonitor virResctrlMonitor;
>  typedef virResctrlMonitor *virResctrlMonitorPtr;
>  
> +typedef struct _virResctrlMonitorStats virResctrlMonitorStats;
> +typedef virResctrlMonitorStats *virResctrlMonitorStatsPtr;
> +struct _virResctrlMonitorStats {
> +    unsigned int id;
> +    unsigned int val;
> +};
> +
>  virResctrlMonitorPtr
>  virResctrlMonitorNew(void);
>  
> @@ -222,4 +229,9 @@ virResctrlMonitorRemove(virResctrlMonitorPtr monitor);
>  
>  bool
>  virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor);
> +
> +int
> +virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor,
> +                                   virResctrlMonitorStatsPtr *caches,
> +                                   size_t *ncaches);

They're @stats and @nstats elsewhere, be consistent.

>  #endif /*  __VIR_RESCTRL_H__ */
> 

I'm sure there's more I'll pick up later, but you should have plenty for
now. As to splitting - I think some splitting could be done, I'll leave
it up to you... Lots of change across multiple areas - ask yourself -
can something be separated out?

John




More information about the libvir-list mailing list