[libvirt] [PATCH 1/2] qemu: Resolve issue with virsh schedinfo for non running domain

John Ferlan jferlan at redhat.com
Fri Jun 7 15:05:32 UTC 2013


On 06/07/2013 06:14 AM, Martin Kletzander wrote:
> On 05/30/2013 02:24 PM, John Ferlan wrote:
>> Since commit '632f78ca' the 'virsh schedinfo <domain>' command returns:
>>
>> Scheduler      : Unknown
>> error: Requested operation is not valid: cgroup CPU controller is not mounted
>>
>> Prior to that change a non running domain would return:
>>
>> Scheduler      : posix
>> cpu_shares     : 0
>> vcpu_period    : 0
>> vcpu_quota     : 0
>> emulator_period: 0
>> emulator_quota : 0
>>
>> This change will result in the following:
>>
>> Scheduler      : posix
>> cpu_shares     : 0
>>
>> The code sequence is a 'virGetSchedulerType()' which returns the "*params"
>> for a subsequent 'virDomainGetSchedulerParameters[Flags]()' call. The latter
>> requires at least 1 'nparam' to be provided/returned.
> 
> I can't find where the nparams is required to be >= 1, but that might
> have changed since you posted this patch.  I changed the '*nparams = 1' to
> '0' and it works perfectly (returns only the scheduler type, so it's
> visible that nothing is set).
> 

The documentation for virDomainGetSchedulerParameters[Flags] has:

*
* Get all scheduler parameters. On input, @nparams gives the size of the
* @params array; on output, @nparams gives how many slots were filled
* with parameter information, which might be less but will not exceed
* the input value.  @nparams cannot be 0.
*

Also, the virDomainGetSchedulerParameters checks nparams > 0:

    virCheckPositiveArgGoto(*nparams, error);
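
For reference, the public entry point in src/libvirt.c validates the
arguments before ever dispatching to the driver; roughly (a sketch from
memory, not the verbatim source):

    int
    virDomainGetSchedulerParameters(virDomainPtr domain,
                                    virTypedParameterPtr params, int *nparams)
    {
        /* ... reset last error, validate the domain object ... */
        virCheckNonNullArgGoto(params, error);
        virCheckNonNullArgGoto(nparams, error);
        virCheckPositiveArgGoto(*nparams, error);

        /* ... dispatch to the driver's domainGetSchedulerParameters ... */

    error:
        virDispatchError(domain->conn);
        return -1;
    }

So any caller that passes along *nparams == 0 is rejected here before
the driver is ever reached.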


The reason virsh schedinfo printed just the scheduler type is that
virsh-domain.c checks 'nparams' before calling the
virDomainGetSchedulerParametersFlags() API.
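
I.e., something along these lines in cmdSchedinfo (paraphrased from
virsh-domain.c, not the exact code):

    /* The type query also fills in nparams; the parameter fetch is
     * guarded by it.  The else branch is why the failing case above
     * prints "Scheduler      : Unknown" and stops. */
    schedulertype = virDomainGetSchedulerType(dom, &nparams);
    if (schedulertype != NULL) {
        vshPrint(ctl, "%-15s: %s\n", _("Scheduler"), schedulertype);
        VIR_FREE(schedulertype);
    } else {
        vshPrint(ctl, "%-15s: %s\n", _("Scheduler"), _("Unknown"));
        goto cleanup;
    }

    if (nparams) {
        params = vshMalloc(ctl, sizeof(*params) * nparams);
        if (virDomainGetSchedulerParametersFlags(dom, params,
                                                 &nparams, flags) != 0)
            goto cleanup;
        /* ... print each returned typed parameter ... */
    }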


>> ---
>>  src/qemu/qemu_driver.c | 11 ++++++++++-
>>  1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>> index 4a76f14..7292149 100644
>> --- a/src/qemu/qemu_driver.c
>> +++ b/src/qemu/qemu_driver.c
>> @@ -6899,12 +6899,20 @@ static char *qemuDomainGetSchedulerType(virDomainPtr dom,
>>      }
>>      priv = vm->privateData;
>>  
>> +    /* Domain not running or no cgroup */
>> +    if (!priv->cgroup) {
>> +        if (nparams)
>> +            *nparams = 1;
>> +        goto cleanup;
>> +    }
>> +
>>      if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) {
>>          virReportError(VIR_ERR_OPERATION_INVALID,
>>                         "%s", _("cgroup CPU controller is not mounted"));
>>          goto cleanup;
>>      }
>>  
>> +
>>      if (nparams) {
>>          rc = qemuGetCpuBWStatus(priv->cgroup);
>>          if (rc < 0)
>> @@ -6915,11 +6923,12 @@ static char *qemuDomainGetSchedulerType(virDomainPtr dom,
>>              *nparams = 5;
>>      }
>>  
>> +cleanup:
>>      ignore_value(VIR_STRDUP(ret, "posix"));
>>  
> 
> You can get to here even when the domain doesn't exist and in that case
> NULL should be returned to indicate an error.  It would also skip an
> error on problem in qemuGetCpuBWStatus(), but there is none printed, so
> at least we would return the scheduler type.
> 

Oh yeah, right, duh.  I was trying to cover the !priv->cgroup case
without duplicating the VIR_STRDUP line for it.

Setting *nparams = 1 was taken from the viewpoint of the return 0 case
from the [lxc|qemu]GetCpuBWStatus() calls (since cgroups aren't
started). For a non-running guest, I assumed that none of the 5 values
really meant anything, but I can see the logic that would get/set that
count once prior to vm startup and then repeatedly use that value to
call virDomainGetSchedulerParametersFlags() during the lifetime of
whatever is monitoring the vm, so I understand why setting it to 5 is
valid/desired. Once the guest is running, if there were an issue with
the [lxc|qemu]GetCpuBWStatus() call, then only 'cpu_shares' would get
returned due to checks in the GetSchedulerParametersFlags() APIs.

Thus, if I'm reading the responses thus far correctly, then:

When cgroups aren't enabled for the domain (e.g., not active):

    qemuDomainGetSchedulerType() would return "posix" and 5 for *nparams
    lxcDomainGetSchedulerType() would return "posix" and 3 for *nparams

e.g., for qemu:

    /* Domain not running or no cgroup */
    if (!priv->cgroup) {
        if (nparams)
            *nparams = 5;
        ignore_value(VIR_STRDUP(ret, "posix"));
        goto cleanup;
    }

(and moving the ignore_value(VIR_STRDUP(ret, "posix")); call to before
the cleanup: label)
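
With that, 'virsh schedinfo' on a shutoff domain should once again
list all five parameters, as in the pre-632f78ca output shown above,
assuming the ParametersFlags side is adjusted as discussed below.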

For the [lxc|qemu]DomainGetSchedulerType() APIs, the only time
[lxc|qemu]GetCpuBWStatus() would be called is if the domain is active,
i.e., we get past the virCgroupPathOfController() call.
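
For reference, qemuGetCpuBWStatus() is essentially a probe for the
cpu.cfs_period_us file; a paraphrase of the helper from memory:

    /* Returns 1 when CFS bandwidth is supported, 0 when it is not,
     * -1 on error.  With no cgroup there is nothing to probe. */
    static int
    qemuGetCpuBWStatus(virCgroupPtr cgroup)
    {
        char *cfs_period_path = NULL;
        int ret = 0;

        if (!cgroup)
            return 0;

        if (virCgroupPathOfController(cgroup, VIR_CGROUP_CONTROLLER_CPU,
                                      "cpu.cfs_period_us",
                                      &cfs_period_path) < 0)
            goto cleanup;

        if (access(cfs_period_path, F_OK) == 0)
            ret = 1;

    cleanup:
        VIR_FREE(cfs_period_path);
        return ret;
    }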


The [qemu|lxc]DomainGetSchedulerParametersFlags() APIs need adjustments
as well because currently they make the same call to
[lxc|qemu]GetCpuBWStatus() regardless of whether the domain is active
and regardless of which 'flags' are set (current, live, config).
For this code I was less sure how to handle what gets returned.

Each has a call such as (from qemu):

    if (*nparams > 1) {
        rc = qemuGetCpuBWStatus(priv->cgroup);
        if (rc < 0)
            goto cleanup;
        cpu_bw_status = !!rc;
    }

which sets 'cpu_bw_status' from the rc; the return value is documented
as follows:

 * Return 1 when CFS bandwidth is supported, 0 when CFS bandwidth is not
 * supported, -1 on error.

For the non-active case, this returns 0 and thus clears the
cpu_bw_status flag, which affects the code deciding how many
configuration parameters to return:


    if (flags & VIR_DOMAIN_AFFECT_CONFIG) {
        shares = persistentDef->cputune.shares;
        if (*nparams > 1 && cpu_bw_status) {
...


Again, given how I'm reading the responses, the check here for
cpu_bw_status could then be removed: if the vm is not active or the
"--config" flag is used, we return all 5 parameters regardless of
whether the vm is running and regardless of whether the cgroup is
found. Another option would be to add a check just prior that sets
cpu_bw_status = true when we don't have a cgroup and we're just
getting config data, such as:

    if (!priv->cgroup && (flags & VIR_DOMAIN_AFFECT_CONFIG))
        cpu_bw_status = true;

One way or another, 'cpu_bw_status' needs to be set to true before the
"goto out" so that all the data points are copied into the output
"params[]" array.

For "current" or "live" data of a running vm with cgroups mounted,
subsequent checks will return an error, 1, 3, or 5 elements depending on
success or failure of live checks.
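
Roughly, the live path looks like this (helper names paraphrased from
memory, so treat them as illustrative):

    /* shares alone -> 1 element; vcpu period/quota -> 3; emulator
     * period/quota -> 5; any failure along the way stops short */
    if (virCgroupGetCpuShares(priv->cgroup, &shares) < 0)
        goto cleanup;

    if (*nparams > 1 && cpu_bw_status) {
        if (qemuGetVcpusBWLive(vm, &period, &quota) < 0)
            goto cleanup;
    }

    if (*nparams > 3 && cpu_bw_status) {
        if (qemuGetEmulatorBandwidthLive(vm, priv->cgroup,
                                         &emulator_period,
                                         &emulator_quota) < 0)
            goto cleanup;
    }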


Obviously, I'll post a v2, but I would prefer to get it "more right"
without too much more "churn".

Tks,

John


>> -cleanup:
>>      if (vm)
>>          virObjectUnlock(vm);
>> +
>>      return ret;
>>  }
>>  
>>
> 
> This patch fixes the problem, but may create a new one.  Also 'virsh
> schedinfo --config <domain>' is different for running and shutoff
> domain, but I'm willing to overlook that since it is less problematic
> than what happened before.
> 
> Martin
> 



