[libvirt] [PATCH v3 4/6] qemu: Implement period and quota tunable XML configuration and parsing

Adam Litke agl at us.ibm.com
Tue Jul 19 13:27:07 UTC 2011



On 07/18/2011 07:51 PM, Wen Congyang wrote:
> At 07/19/2011 04:59 AM, Adam Litke wrote:
>>
>>
>> On 07/18/2011 04:42 AM, Wen Congyang wrote:
>>> +int qemuSetupCgroupForVcpu(struct qemud_driver *driver, virDomainObjPtr vm)
>>> +{
>>> +    virCgroupPtr cgroup = NULL;
>>> +    virCgroupPtr cgroup_vcpu = NULL;
>>> +    qemuDomainObjPrivatePtr priv = vm->privateData;
>>> +    int rc;
>>> +    unsigned int i;
>>> +    unsigned long long period = vm->def->cputune.period;
>>> +    long long quota = vm->def->cputune.quota;
>>> +
>>> +    if (driver->cgroup == NULL)
>>> +        return 0; /* Not supported, so claim success */
>>> +
>>> +    rc = virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0);
>>> +    if (rc != 0) {
>>> +        virReportSystemError(-rc,
>>> +                             _("Unable to find cgroup for %s"),
>>> +                             vm->def->name);
>>> +        goto cleanup;
>>> +    }
>>> +
>>> +    if (priv->nvcpupids == 0 || priv->vcpupids[0] == vm->pid) {
>>> +        /* If we does not know VCPU<->PID mapping or all vcpu runs in the same
>>> +         * thread, we can not control each vcpu.
>>> +         */
>>> +        if (period || quota) {
>>> +            if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU)) {
>>> +                if (qemuSetupCgroupVcpuBW(cgroup, period, quota) < 0)
>>> +                    goto cleanup;
>>> +            }
>>> +        }
>>> +        return 0;
>>> +    }
>>
>> I found a problem above.  In the case where we are controlling quota at
>> the domain level cgroup we must multiply the user-specified quota by the
>> number of vcpus in the domain in order to get the same performance as we
>> would with per-vcpu cgroups.  As written, the vm will be essentially
>> capped at 1 vcpu worth of quota regardless of the number of vcpus.  You
>> will also have to apply this logic in reverse when reporting the
>> scheduler statistics so that the quota number is a per-vcpu quantity.
> 
> When quota is 1000, and per-vcpu thread is not active, we can start
> vm successfully. When the per-vcpu thread is active, and the num of
> vcpu is more than 1, we can not start vm if we multiply the user-specified
> quota. It will confuse the user: sometimes the vm can be started, but
> sometimes the vm can not be started with the same configuration.

I am not sure I understand what you mean.  When vcpu threads are active,
the patches work correctly.  It is only when you disable vcpu threads
that you need to multiply the quota by the number of vcpus (since you
are now applying it globally).  A 4 vcpu guest that is started using an
emulator with vcpu threads active will get 4 times the cpu bandwidth as
compared to starting the identical configuration using an emulator
without vcpu threads.  This is because you currently apply the same
quota setting to the full process as you were applying to a single vcpu.
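
To make the 4x difference concrete, here is a minimal sketch of the
arithmetic.  The helper name total_bandwidth() and its parameters are
hypothetical and not part of the patch; it only assumes that one cgroup
allows quota/period worth of a host CPU per period.

```c
#include <assert.h>

/* Hypothetical helper (not in the patch): total CPU bandwidth a guest
 * can consume, expressed in units of "host CPUs".
 *
 * With vcpu threads, each vcpu gets its own cgroup with quota/period,
 * so the guest as a whole gets nvcpus times that.  Without vcpu
 * threads, the same quota/period is applied once to the whole process.
 */
static double total_bandwidth(long long quota,
                              unsigned long long period,
                              unsigned int nvcpus,
                              int vcpu_threads)
{
    assert(period > 0);

    double per_cgroup = (double)quota / (double)period;

    return vcpu_threads ? per_cgroup * nvcpus : per_cgroup;
}
```

With period=500000 and quota=250000, a 4 vcpu guest comes out at 2.0
CPUs when vcpu threads are active but only 0.5 CPUs when they are not.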

I know that what I am asking for is confusing at first.  The quota value
in a domain XML may not match up with the value actually written to the
cgroup filesystem.  The same applies to the schedinfo API vs. cgroupfs.
However, my suggestion will result in quotas that match user
expectations.  For a 4 vcpu guest with a 50% cpu quota, it is more
logical to set period=500000,quota=250000 without having to know whether
my qemu supports vcpu threads.

For example, to limit a guest to 50% CPU we would have these settings
(when period == 500000):

                  1 VCPU    2 VCPU    4 VCPU    8 VCPU
VCPU-threads ON   250000    250000    250000    250000
VCPU-threads OFF  250000    500000   1000000   2000000

With VCPU threads on, the value is applied to each VCPU whereas with
VCPU threads off it is applied globally.  This will yield roughly
equivalent performance regardless of whether the underlying qemu process
enables vcpu threads.
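
A rough sketch of the scaling I have in mind, in both directions, is
below.  The function names are hypothetical and this is not code from
the patch.  It assumes a quota of zero or below means "no limit set"
and should be passed through unchanged, and it does not guard against
overflow for very large quotas.

```c
#include <assert.h>

/* Hypothetical helper: convert the per-vcpu quota from the domain XML
 * into the value to write to the cgroup.  With vcpu threads the value
 * goes into each per-vcpu cgroup unchanged; without them it goes into
 * the domain-level cgroup and must cover all vcpus.
 */
static long long quota_for_cgroup(long long quota,
                                  unsigned int nvcpus,
                                  int vcpu_threads)
{
    assert(nvcpus > 0);

    /* Assumption: quota <= 0 means "unset/unlimited"; do not scale it. */
    if (vcpu_threads || quota <= 0)
        return quota;

    return quota * (long long)nvcpus;
}

/* Hypothetical helper: the reverse mapping, for reporting scheduler
 * statistics so that the user always sees a per-vcpu quantity.
 */
static long long quota_for_report(long long cgroup_quota,
                                  unsigned int nvcpus,
                                  int vcpu_threads)
{
    assert(nvcpus > 0);

    if (vcpu_threads || cgroup_quota <= 0)
        return cgroup_quota;

    return cgroup_quota / (long long)nvcpus;
}
```

Feeding quota=250000 through quota_for_cgroup() with vcpu threads off
reproduces the second row of the table, and quota_for_report() maps
each of those values back to 250000.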

-- 
Adam Litke
IBM Linux Technology Center



