[libvirt] [PATCH 1/2] qemu: fix recording of vCPU pids for MTTCG

John Ferlan jferlan at redhat.com
Mon Oct 29 21:55:36 UTC 2018



On 10/17/18 10:15 AM, Daniel P. Berrangé wrote:
> MTTCG is the new multi-threaded impl of TCG which follows
> KVM in having one host OS thread per vCPU. Historically
> we have discarded all PIDs reported for TCG guests, but
> we must now selectively honour this data.
> 
> We don't have anything in the domain XML that indicates
> whether a guest is using TCG or MTTCG. While QEMU does
> have an option (-accel tcg,thread=single|multi), it is
> not desirable to expose this in libvirt. QEMU will
> automatically use MTTCG when the host/guest architecture
> pairing is known to be safe. Only developers of QEMU TCG
> have a strong reason to override this logic.
> 
> Thus we use two sanity checks to decide if the vCPU
> PID information is usable. First we see if the PID
> duplicates the main emulator PID, and second we see
> if the PID duplicates any other vCPUs.
> 
> Signed-off-by: Daniel P. Berrangé <berrange at redhat.com>
> ---
>  src/qemu/qemu_domain.c | 81 ++++++++++++++++++++++++++----------------
>  1 file changed, 51 insertions(+), 30 deletions(-)
> 
> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index f00f1b3fdb..c7a0c03e3f 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -10326,9 +10326,10 @@ qemuDomainRefreshVcpuInfo(virQEMUDriverPtr driver,
>      qemuDomainVcpuPrivatePtr vcpupriv;
>      qemuMonitorCPUInfoPtr info = NULL;
>      size_t maxvcpus = virDomainDefGetVcpusMax(vm->def);
> -    size_t i;
> +    size_t i, j;
>      bool hotplug;
>      bool fast;
> +    bool validTIDs = true;
>      int rc;
>      int ret = -1;
>  
> @@ -10336,6 +10337,8 @@ qemuDomainRefreshVcpuInfo(virQEMUDriverPtr driver,
>      fast = virQEMUCapsGet(QEMU_DOMAIN_PRIVATE(vm)->qemuCaps,
>                            QEMU_CAPS_QUERY_CPUS_FAST);
>  
> +    VIR_DEBUG("Maxvcpus %zu hotplug %d fast query %d", maxvcpus, hotplug, fast);
> +
>      if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0)
>          return -1;
>  
> @@ -10348,39 +10351,57 @@ qemuDomainRefreshVcpuInfo(virQEMUDriverPtr driver,
>      if (rc < 0)
>          goto cleanup;
>  
> +    /*
> +     * The query-cpus[-fast] commands return information
> +     * about the vCPUs, including the OS level PID that
> +     * is executing the vCPU.
> +     *
> +     * For KVM there is always a 1-1 mapping between
> +     * vCPUs and host OS PIDs.
> +     *
> +     * For TCG things are a little more complicated.
> +     *
> +     *  - In some cases the vCPUs will all have the same
> +     *    PID as the main emulator thread.
> +     *  - In some cases the first vCPU will have a distinct
> +     *    PID, but other vCPUs will share the emulator thread
> +     *
> +     * For MTTCG, things work the same as KVM, with each
> +     * vCPU getting its own PID.
> +     *
> +     * We use the Host OS PIDs for doing vCPU pinning
> +     * and reporting. The TCG data reporting will result
> +     * in bad behaviour such as pinning the wrong PID.
> +     * We must thus detect and discard bogus PID info
> +     * from TCG, while still honouring the modern MTTCG
> +     * impl which we can support.
> +     */
> +    for (i = 0; i < maxvcpus && validTIDs; i++) {
> +        if (info[i].tid == vm->pid) {
> +            VIR_DEBUG("vCPU[%zu] PID %llu duplicates process",
> +                      i, (unsigned long long)info[i].tid);
> +            validTIDs = false;
> +        }
> +

If validTIDs is already false, does the next loop matter? IOW:

Should the above section add a "continue;" to skip the inner loop for
this iteration, since the outer loop condition will force the exit
anyway?
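To make the question concrete, here's a simplified, self-contained
version of the validity scan (not the libvirt code itself — a stand-in
with made-up names). Once a duplicate is found, nothing after it
matters, so the early exit can be a plain return/break rather than
relying on the loop condition:

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the patch's TID validity check: report
 * false as soon as any vCPU TID duplicates the emulator PID or
 * another vCPU's TID.  The early exit John asks about becomes an
 * immediate "return false" here, since no useful work remains once
 * a duplicate is seen. */
static bool
checkVcpuTIDs(unsigned long long emulatorPid,
              const unsigned long long *tids,
              size_t ntids)
{
    for (size_t i = 0; i < ntids; i++) {
        if (tids[i] == emulatorPid)
            return false;           /* duplicates the main process */

        for (size_t j = 0; j < i; j++) {
            if (tids[i] == tids[j])
                return false;       /* duplicates another vCPU */
        }
    }
    return true;
}
```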

Beyond that the logic and comments look reasonable. I assume that since
the domain XML doesn't care whether MTTCG or TCG is used, and things are
handled under the covers by QEMU, there are no migration or save/restore
issues. Of course you have a much deeper understanding of the QEMU code
than I do!

The one other question I'd have is: should the validTIDs determination
be done just once and saved, perhaps in the domain private block? There
is more than 1 caller (and *Launch can call twice). It's not like the
value is going to change, right? So repeating the loop from a hotplug
path, or on subsequent reconnects or attaches, gains nothing. Perhaps
validTIDs should be a tristate that only needs to be recomputed while
the value is UNKNOWN. Granted, the loop is cheap since it's only
numeric comparisons, so it may not matter in practice. I suppose I can
be easily convinced the current approach is fine, but figured I'd ask.
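For illustration, the tristate caching idea could look something like
the sketch below. All names here (tidState, demoDomainPrivate,
demoRefreshTIDs) are hypothetical, not actual libvirt API — just a
minimal model of "compute once while UNKNOWN, then reuse":

```c
#include <stdbool.h>

/* Hypothetical tristate cached in the domain private data, so that
 * repeat callers of the refresh path skip the duplicate-TID scan. */
typedef enum {
    TID_STATE_UNKNOWN = 0,
    TID_STATE_VALID,
    TID_STATE_INVALID,
} tidState;

typedef struct {
    tidState validTIDs;   /* persists across refresh calls */
} demoDomainPrivate;

static bool
demoRefreshTIDs(demoDomainPrivate *priv, bool scanSaysValid)
{
    /* Only the first call (state UNKNOWN) consults the scan result;
     * later calls reuse the cached answer. */
    if (priv->validTIDs == TID_STATE_UNKNOWN)
        priv->validTIDs = scanSaysValid ? TID_STATE_VALID
                                        : TID_STATE_INVALID;

    return priv->validTIDs == TID_STATE_VALID;
}
```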

John


> +        for (j = 0; j < i; j++) {
> +            if (info[i].tid == info[j].tid) {
> +                VIR_DEBUG("vCPU[%zu] PID %llu duplicates vCPU[%zu]",
> +                          i, (unsigned long long)info[i].tid, j);
> +                validTIDs = false;
> +            }
> +        }
> +
> +        if (validTIDs)
> +            VIR_DEBUG("vCPU[%zu] PID %llu is valid",
> +                      i, (unsigned long long)info[i].tid);
> +    }
> +
> +    VIR_DEBUG("Extracting vCPU information validTIDs=%d", validTIDs);
>      for (i = 0; i < maxvcpus; i++) {
>          vcpu = virDomainDefGetVcpu(vm->def, i);
>          vcpupriv = QEMU_DOMAIN_VCPU_PRIVATE(vcpu);
>  
> -        /*
> -         * Current QEMU *can* report info about host threads mapped
> -         * to vCPUs, but it is not in a manner we can correctly
> -         * deal with. The TCG CPU emulation does have a separate vCPU
> -         * thread, but it runs every vCPU in that same thread. So it
> -         * is impossible to setup different affinity per thread.
> -         *
> -         * What's more the 'query-cpus[-fast]' command returns bizarre
> -         * data for the threads. It gives the TCG thread for the
> -         * vCPU 0, but for vCPUs 1-> N, it actually replies with
> -         * the main process thread ID.
> -         *
> -         * The result is that when we try to set affinity for
> -         * vCPU 1, it will actually change the affinity of the
> -         * emulator thread :-( When you try to set affinity for
> -         * vCPUs 2, 3.... it will fail if the affinity was
> -         * different from vCPU 1.
> -         *
> -         * We *could* allow vcpu pinning with TCG, if we made the
> -         * restriction that all vCPUs had the same mask. This would
> -         * at least let us separate emulator from vCPUs threads, as
> -         * we do for KVM. It would need some changes to our cgroups
> -         * CPU layout though, and error reporting for the config
> -         * restrictions.
> -         *
> -         * Just disable CPU pinning with TCG until someone wants
> -         * to try to do this hard work.
> -         */
> -        if (vm->def->virtType != VIR_DOMAIN_VIRT_QEMU)
> +        if (validTIDs)
>              vcpupriv->tid = info[i].tid;
>  
>          vcpupriv->socket_id = info[i].socket_id;
> 



