[libvirt] [PATCH 1/2] qemu: fix recording of vCPU pids for MTTCG

John Ferlan jferlan at redhat.com
Tue Oct 30 11:30:52 UTC 2018



On 10/30/18 5:25 AM, Daniel P. Berrangé wrote:
> On Mon, Oct 29, 2018 at 05:55:36PM -0400, John Ferlan wrote:
>>
>>
>> On 10/17/18 10:15 AM, Daniel P. Berrangé wrote:
>>> MTTCG is the new multi-threaded impl of TCG which follows
>>> KVM in having one host OS thread per vCPU. Historically
>>> we have discarded all PIDs reported for TCG guests, but
>>> we must now selectively honour this data.
>>>
>>> We don't have anything in the domain XML that indicates
>>> whether a guest is using TCG or MTTCG. While QEMU does
>>> have an option (-accel tcg,thread=single|multi), it is
>>> not desirable to expose this in libvirt. QEMU will
>>> automatically use MTTCG when the host/guest architecture
>>> pairing is known to be safe. Only developers of QEMU TCG
>>> have a strong reason to override this logic.
>>>
>>> Thus we use two sanity checks to decide if the vCPU
>>> PID information is usable. First we see if the PID
>>> duplicates the main emulator PID, and second we see
>>> if the PID duplicates any other vCPUs.
>>>
>>> Signed-off-by: Daniel P. Berrangé <berrange at redhat.com>
>>> ---
>>>  src/qemu/qemu_domain.c | 81 ++++++++++++++++++++++++++----------------
>>>  1 file changed, 51 insertions(+), 30 deletions(-)
>>>
>>> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
>>> index f00f1b3fdb..c7a0c03e3f 100644
>>> --- a/src/qemu/qemu_domain.c
>>> +++ b/src/qemu/qemu_domain.c
>>> @@ -10326,9 +10326,10 @@ qemuDomainRefreshVcpuInfo(virQEMUDriverPtr driver,
>>>      qemuDomainVcpuPrivatePtr vcpupriv;
>>>      qemuMonitorCPUInfoPtr info = NULL;
>>>      size_t maxvcpus = virDomainDefGetVcpusMax(vm->def);
>>> -    size_t i;
>>> +    size_t i, j;
>>>      bool hotplug;
>>>      bool fast;
>>> +    bool validTIDs = true;
>>>      int rc;
>>>      int ret = -1;
>>>  
>>> @@ -10336,6 +10337,8 @@ qemuDomainRefreshVcpuInfo(virQEMUDriverPtr driver,
>>>      fast = virQEMUCapsGet(QEMU_DOMAIN_PRIVATE(vm)->qemuCaps,
>>>                            QEMU_CAPS_QUERY_CPUS_FAST);
>>>  
>>> +    VIR_DEBUG("Maxvcpus %zu hotplug %d fast query %d", maxvcpus, hotplug, fast);
>>> +
>>>      if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0)
>>>          return -1;
>>>  
>>> @@ -10348,39 +10351,57 @@ qemuDomainRefreshVcpuInfo(virQEMUDriverPtr driver,
>>>      if (rc < 0)
>>>          goto cleanup;
>>>  
>>> +    /*
>>> +     * The query-cpus[-fast] commands return information
>>> +     * about the vCPUs, including the OS level PID that
>>> +     * is executing the vCPU.
>>> +     *
>>> +     * For KVM there is always a 1-1 mapping between
>>> +     * vCPUs and host OS PIDs.
>>> +     *
>>> +     * For TCG things are a little more complicated.
>>> +     *
>>> +     *  - In some cases the vCPUs will all have the same
>>> +     *    PID as the main emulator thread.
>>> +     *  - In some cases the first vCPU will have a distinct
>>> +     *    PID, but other vCPUs will share the emulator thread
>>> +     *
>>> +     * For MTTCG, things work the same as KVM, with each
>>> +     * vCPU getting its own PID.
>>> +     *
>>> +     * We use the Host OS PIDs for doing vCPU pinning
>>> +     * and reporting. The TCG data reporting will result
>>> +     * in bad behaviour such as pinning the wrong PID.
>>> +     * We must thus detect and discard bogus PID info
>>> +     * from TCG, while still honouring the modern MTTCG
>>> +     * impl which we can support.
>>> +     */
>>> +    for (i = 0; i < maxvcpus && validTIDs; i++) {
>>> +        if (info[i].tid == vm->pid) {
>>> +            VIR_DEBUG("vCPU[%zu] PID %llu duplicates process",
>>> +                      i, (unsigned long long)info[i].tid);
>>> +            validTIDs = false;
>>> +        }
>>> +
>>
>> If !validTIDs does the next loop matter? IOW:
>>
>> Should the above section add a "continue;" since the loop exit would
>> force the exit?
> 
> Mostly I wanted to ensure that we logged all the vCPU pids and it
> won't impose an unreasonable performance impact by doing so.
> 

Those are only VIR_DEBUG's though, so who really sees them?  Perhaps
using VIR_WARN would be different.

>> Beyond that the logic and comments look reasonable. I assume since
>> domain XML doesn't care whether MTTCG or TCG is used and things are
>> handled under the covers by QEMU that means there's no migration or
>> save/restore issues. Of course you have a much deeper understanding of
>> the QEMU code than I do!
>>
>> The one other question I'd have is should validTIDs setting be done just
>> once and saved perhaps in the domain private block? There is more than 1
>> caller (and *Launch can call twice). It's not like it's going to change,
>> right? So doing the same loop from a hotplug path won't matter nor would
>> subsequent reconnects or attaches. So perhaps the validTIDs should be a
>> tristate that only needs to be checked when the value is UNKNOWN. It's
>> not like the loop is that expensive since it's only numeric comparisons,
>> so it doesn't matter.  I suppose I can be easily convinced taking this
>> route would be fine, but figured I'd ask.
> 
> AFAIK, this can only be called once per running VM for each libvirtd
> run, so I dont see need to do any more advanced caching.
> 

I looked at/for callers of qemuDomainRefreshVcpuInfo - this one stuck
out once the domain is started:

....
qemuProcessLaunch()
...
    VIR_DEBUG("setting up hotpluggable cpus");
    if (qemuDomainHasHotpluggableStartupVcpus(vm->def)) {
        if (qemuDomainRefreshVcpuInfo(driver, vm, asyncJob, false) < 0)
            goto cleanup;

        if (qemuProcessValidateHotpluggableVcpus(vm->def) < 0)
            goto cleanup;

        if (qemuProcessSetupHotpluggableVcpus(driver, vm, asyncJob) < 0)
            goto cleanup;
    }

    VIR_DEBUG("Refreshing VCPU info");
    if (qemuDomainRefreshVcpuInfo(driver, vm, asyncJob, false) < 0)
        goto cleanup;

...

but this is also called during hotplug add/remove and of course
attach/reconnect (which I'd expect).

John




More information about the libvir-list mailing list