[PATCH v4 1/1] qemu: add per-vcpu delay stats

Aleksei Zakharov zaharov at selectel.ru
Tue Mar 9 13:15:12 UTC 2021


вт, 9 мар. 2021 г. в 15:35, Michal Privoznik <mprivozn at redhat.com>:

> On 2/19/21 9:08 PM, Aleksei Zakharov wrote:
> > This patch adds delay time (steal time inside guest) to libvirt
> > domain per-vcpu stats. Delay time is an important performance metric.
> > It is a consequence of the overloaded CPU. Knowledge of the delay
> > time of a virtual machine helps to understand if it is affected and
> > estimate the impact.
> >
> > As a result, it is possible to react exactly when needed and
> > rebalance the load between hosts. This is used by cloud providers
> > to provide quality of service, especially when the CPU is
> > oversubscribed.
> >
> > It's more convenient to work with this metric in a context of a
> > libvirt domain. Any monitoring software may use this information.
> >
> > Signed-off-by: Aleksei Zakharov <zaharov at selectel.ru>
> > ---
> >   docs/manpages/virsh.rst |  4 ++++
> >   src/libvirt-domain.c    |  4 ++++
> >   src/qemu/qemu_driver.c  | 44 +++++++++++++++++++++++++++++++++++++++--
> >   3 files changed, 50 insertions(+), 2 deletions(-)
> >
> > diff --git a/docs/manpages/virsh.rst b/docs/manpages/virsh.rst
> > index e3afa48f7b..a2b83ddfaa 100644
> > --- a/docs/manpages/virsh.rst
> > +++ b/docs/manpages/virsh.rst
> > @@ -2295,6 +2295,10 @@ When selecting the *--state* group the following
> fields are returned:
> >   * ``vcpu.<num>.halted`` - virtual CPU <num> is halted: yes or
> >     no (may indicate the processor is idle or even disabled,
> >     depending on the architecture)
> > +* ``vcpu.<num>.delay`` - time the vCPU <num> thread was enqueued by the
> > +  host scheduler, but was waiting in the queue instead of running.
> > +  Exposed to the VM as a steal time.
> > +
> >
> >
> >   *--interface* returns:
> > diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c
> > index 4af0166872..e54d11e513 100644
> > --- a/src/libvirt-domain.c
> > +++ b/src/libvirt-domain.c
> > @@ -11693,6 +11693,10 @@ virConnectGetDomainCapabilities(virConnectPtr
> conn,
> >    *     "vcpu.<num>.halted" - virtual CPU <num> is halted, may indicate
> the
> >    *                           processor is idle or even disabled,
> depending
> >    *                           on the architecture)
> > + *     "vcpu.<num>.delay" - time the vCPU <num> thread was enqueued by
> the
> > + *                          host scheduler, but was waiting in the queue
> > + *                          instead of running. Exposed to the VM as a
> steal
> > + *                          time.
> >    *
> >    * VIR_DOMAIN_STATS_INTERFACE:
> >    *     Return network interface statistics (from domain point of view).
> > diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> > index b9bbdf8d48..9ec3c8fce7 100644
> > --- a/src/qemu/qemu_driver.c
> > +++ b/src/qemu/qemu_driver.c
> > @@ -1332,6 +1332,34 @@ static char
> *qemuConnectGetCapabilities(virConnectPtr conn) {
> >       return virCapabilitiesFormatXML(caps);
> >   }
> >
> > +static int
> > +qemuGetSchedstatDelay(unsigned long long *cpudelay,
> > +                 pid_t pid, pid_t tid)
> > +{
>
> I know you took inspiration in existing code, but our standards have
> moved since that code was written, a bit.
>
> > +    g_autofree char *proc = NULL;
> > +    unsigned long long oncpu = 0;
> > +    g_autofree FILE *schedstat = NULL;
> > +
> > +    if (tid)
> > +        proc = g_strdup_printf("/proc/%d/task/%d/schedstat", (int)pid,
> (int)tid);
> > +    else
> > +        proc = g_strdup_printf("/proc/%d/schedstat", (int)pid);
> > +
> > +    schedstat = fopen(proc, "r");
> > +    if (!schedstat) {
> > +        virReportError(VIR_ERR_INTERNAL_ERROR,
> > +                       _("Unable to open schedstat file at '%s'"),
> > +                       proc);
>
> I'd expect this to be an error. Except for the case when the file
> doesn't exist.
>
> > +    }
> > +    if (fscanf(schedstat, "%llu %llu", &oncpu, cpudelay) < 2) {
> > +        virReportError(VIR_ERR_INTERNAL_ERROR,
> > +                       _("Unable to parse schedstat info at '%s'"),
> > +                       proc);
>
>
> And this has to be error always.
>
> > +    }
> > +
> > +    VIR_FORCE_FCLOSE(schedstat);
> > +    return 0;
>
> I'm fixing this function a bit.
>
> > +}
>
> The rest looks okay. I'm pushing it.
>
> Reviewed-by: Michal Privoznik <mprivozn at redhat.com>
>
> Congratulations on your first libvirt contribution!
>
> Michal
>
> Awesome, thank you!

-- 
Best Regards,
Aleksei Zakharov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20210309/ff629103/attachment-0001.htm>


More information about the libvir-list mailing list