[PATCH 3/3] qemu_driver: use qemuMonitorQueryStats to extract halt poll time

Martin Kletzander mkletzan at redhat.com
Tue Jun 28 20:25:54 UTC 2022


On Tue, Jun 28, 2022 at 10:15:28PM +0530, Amneesh Singh wrote:
>On Tue, Jun 28, 2022 at 05:23:11PM +0200, Michal Prívozník wrote:
>> On 6/24/22 10:14, Amneesh Singh wrote:
>> > Related: https://gitlab.com/libvirt/libvirt/-/issues/276
>> >
>> > This patch uses qemuMonitorQueryStats to query "halt_poll_success_ns"
>> > and "halt_poll_fail_ns" for every vCPU. The respective values for each
>> > vCPU are then added together.
>> >
>> > Signed-off-by: Amneesh Singh <natto at weirdnatto.in>
>> > ---
>> >  src/qemu/qemu_driver.c | 48 ++++++++++++++++++++++++++++++++++++++----
>> >  1 file changed, 44 insertions(+), 4 deletions(-)
>> >
>> > diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>> > index 3b5c3db6..0a2be6d3 100644
>> > --- a/src/qemu/qemu_driver.c
>> > +++ b/src/qemu/qemu_driver.c
>> > @@ -18057,10 +18057,50 @@ qemuDomainGetStatsCpuHaltPollTime(virDomainObj *dom,
>> >  {
>> >      unsigned long long haltPollSuccess = 0;
>> >      unsigned long long haltPollFail = 0;
>> > -    pid_t pid = dom->pid;
>> > +    qemuDomainObjPrivate *priv = dom->privateData;
>> > +    bool canQueryStats = virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_QUERY_STATS);
>>
>> Is this variable really needed? I mean, we can just:
>>
>> if (virQEMUCapsGet(...) {
>>
>> } else {
>>
>> }
>>
>> But if you want to avoid long lines, then perhaps rename the variable to
>> queryStatsCap? This way it's more obvious what the variable reflects.
>> Stats can be queried in both cases ;-)
>Sure, that sounds doable :)
>>
>> >
>> > -    if (virHostCPUGetHaltPollTime(pid, &haltPollSuccess, &haltPollFail) < 0)
>> > -        return 0;
>> > +    if (!canQueryStats) {
>> > +        pid_t pid = dom->pid;
>> > +
>> > +        if (virHostCPUGetHaltPollTime(pid, &haltPollSuccess, &haltPollFail) < 0)
>> > +            return 0;
>> > +    } else {
>> > +        size_t i;
>> > +        qemuMonitorQueryStatsTargetType target = QEMU_MONITOR_QUERY_STATS_TARGET_VCPU;
>> > +        qemuMonitorQueryStatsProvider *provider = NULL;
>> > +        g_autoptr(GPtrArray) providers = NULL;
>> > +        g_autoptr(GPtrArray) queried_stats = NULL;
>> > +        const char *success_str = "halt_poll_success_ns";
>> > +        const char *fail_str = "halt_poll_fail_ns";
>> > +
>> > +        provider = qemuMonitorQueryStatsProviderNew(QEMU_MONITOR_QUERY_STATS_PROVIDER_KVM);
>> > +        provider->names = g_new0(char *, 3);
>> > +        provider->names[0] = g_strdup(success_str), provider->names[1] = g_strdup(fail_str);
>>
>> I'm starting to wonder whether this is a good interface. These ->names[]
>> array is never changed, so maybe we can have it as a NULL terminated
>> list of constant strings? For instance:
>>
>> provider = qemuMonitorQueryStatsProviderNew();
>> provider->names = {"halt_poll_success_ns", "halt_poll_fail_ns", NULL};
>Well, cannot really assign char ** with a scalar initialization, but
>what might work is something like
>
>const char *names[] = { success_str, fail_str, NULL };
>
>provider = qemuMonitorQueryStatsProviderNew(QEMU_MONITOR_QUERY_STATS_PROVIDER_KVM);
>provider->names = g_strdupv((char **) names);
>

I think what Michal was trying to say is that since you do not change
the array anywhere, there is no need for that to be a dynamically
allocated array that needs to be freed.  I, however, am not 100% if you
are going to need this to be dynamic and whether you will be changing
these arrays.

>Another thought was using GStrvBuilder but it is not avaiable as per
>GLIB_VERSION_MAX.
>
>I too think that the current approach is not very nice. A variadic
>function similar to g_strv_builder_add_many that initializes a
>char ** would be nice.
>>

If that is something that helps, then we can write it ourselves and only
use our implementation if glib is too old.

>> > +
>> > +        providers = g_ptr_array_new_full(1, (GDestroyNotify) qemuMonitorQueryStatsProviderFree);
>> > +        g_ptr_array_add(providers, provider);
>> > +
>> > +        queried_stats = qemuMonitorQueryStats(priv->mon, target, NULL, providers);
>>
>> This issues monitor command without proper checks. Firstly, you need to
>> check whether job acquiring (that s/false/true/ change done in the last
>> hunk) was successful and the domain is still running:
>>
>> if (!HAVE_JOB(privflags) || !virDomainObjIsActive(dom)) {
>>    /* code here */
>> }
>>
>> Then, you need to so called "enter the monitor" and "exit the monitor".
>> So what happens here is that @vm is when this function is called. Now,
>> issuing a monitor command with the domain object lock held is
>> potentially very dangerous because if QEMU takes long time to reply (or
>> it doesn't reply at all) the domain object is locked an can't be
>> destroyed (virsh destroy) - because the first thing that
>> qemuDomainDestroyFlags() does is locking the domain object. Also, you
>> want to make sure no other thread is currently talking on the monitor -
>> so the monitor has to be locked too. We have two convenient functions
>> for these operations:
>>
>> qemuDomainObjEnterMonitor()
>> qemuDomainObjExitMonitor()
>>
>> This is all described in more detail in
>> docs/kbase/internals/qemu-threads.rst.
>>
>> Having said all of this, I think we can fallback to the current behavior
>> if job wasn't acquired successfully. Therefore the condition at the very
>> beginning might look like this:
>>
>> if (queryStatsCap && HAVE_JOB() && virDomainObjIsActive()) {
>>   ...
>>   qemuDomainEnterMonitor();
>>   qemuMonitorQueryStats();
>>   qemuDomainExitMonitor();
>> } else {
>>   virHostCPUGetHaltPollTime();
>> }
>>
>I see that makes sense. I shall do the necessary changes. Thanks for the
>detailed explanation.
>> > +
>> > +        if (!queried_stats)
>> > +            return 0;
>> > +
>> > +        for (i = 0; i < queried_stats->len; i++) {
>> > +            unsigned long long curHaltPollSuccess, curHaltPollFail;
>> > +            GHashTable *cur_table = queried_stats->pdata[i];
>> > +            virJSONValue *success, *fail;
>> > +
>> > +            success = g_hash_table_lookup(cur_table, success_str);
>> > +            fail = g_hash_table_lookup(cur_table, fail_str);
>> > +
>> > +            ignore_value(virJSONValueGetNumberUlong(success, &curHaltPollSuccess));
>> > +            ignore_value(virJSONValueGetNumberUlong(fail, &curHaltPollFail));
>>
>> I don't think this is a safe thing to do. What if either of @success or
>> @fail is NULL? That can happen when QEMU didn't return requested member.
>> True, at that point the 'query-stats' command should have failed, but I
>> think it's more defensive if we checked these pointers. My worry is that
>> virJSONValueGetNumberUlong() dereferences the pointer right away.
>>
>I do understand that this is not an unfounded worry. I shall put the
>checks there.
>> > +
>> > +            haltPollSuccess += curHaltPollSuccess;
>> > +            haltPollFail += curHaltPollFail;
>> > +        }
>> > +    }
>> >
>> >      if (virTypedParamListAddULLong(params, haltPollSuccess, "cpu.haltpoll.success.time") < 0 ||
>> >          virTypedParamListAddULLong(params, haltPollFail, "cpu.haltpoll.fail.time") < 0)
>> > @@ -18915,7 +18955,7 @@ static virQEMUCapsFlags queryDirtyRateRequired[] = {
>> >
>> >  static struct qemuDomainGetStatsWorker qemuDomainGetStatsWorkers[] = {
>> >      { qemuDomainGetStatsState, VIR_DOMAIN_STATS_STATE, false, NULL },
>> > -    { qemuDomainGetStatsCpu, VIR_DOMAIN_STATS_CPU_TOTAL, false, NULL },
>> > +    { qemuDomainGetStatsCpu, VIR_DOMAIN_STATS_CPU_TOTAL, true, NULL },
>> >      { qemuDomainGetStatsBalloon, VIR_DOMAIN_STATS_BALLOON, true, NULL },
>> >      { qemuDomainGetStatsVcpu, VIR_DOMAIN_STATS_VCPU, true, NULL },
>> >      { qemuDomainGetStatsInterface, VIR_DOMAIN_STATS_INTERFACE, false, NULL },
>>
>> Please feel free to object to anything I wrote. Maybe you have more
>> patches written on a local branch that justify your choices. I don't know.
>>
>> Michal
>>
>Thanks for taking the time to review the patches.
>Amneesh


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20220628/a4bb5099/attachment-0001.sig>


More information about the libvir-list mailing list