[libvirt] [PATCH 24/34] qemu: Refactor qemuDomainHotplugVcpus

Thu Nov 26 12:02:14 UTC 2015

On Mon, Nov 23, 2015 at 16:39:54 -0500, John Ferlan wrote:
> 
> 
> On 11/20/2015 10:22 AM, Peter Krempa wrote:
> > Refactor the code flow so that 'exit_monitor:' can be removed.
> > 
> > This patch also moves the auditing and setting of the new vCPU count
> > right to the place where the hotplug happens, since it's possible that
> > the hotplug succeeds and adds a cpu while other stuff fails.
> > 
> > Lastly, failures of qemuMonitorGetCPUInfo are now reported rather than
> > ignored. The function retuns 0 if it "successfully" detected 0 threads.
> > ---
> >  src/qemu/qemu_driver.c | 54 +++++++++++++++++++++++---------------------------
> >  1 file changed, 25 insertions(+), 29 deletions(-)
> > 
> 
> Damn - should have peeked ahead...
> 
> > diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> > index 9f0e3a3..9e0e334 100644
> > --- a/src/qemu/qemu_driver.c
> > +++ b/src/qemu/qemu_driver.c
> > @@ -4698,41 +4698,46 @@ qemuDomainHotplugVcpus(virQEMUDriverPtr driver,
> >  {
> >      qemuDomainObjPrivatePtr priv = vm->privateData;
> >      int ret = -1;
> > +    int rc;
> >      int oldvcpus = virDomainDefGetVCpus(vm->def);
> > -    int vcpus = oldvcpus;
> >      pid_t *cpupids = NULL;
> > -    int ncpupids;
> > +    int ncpupids = 0;
> >      virCgroupPtr cgroup_vcpu = NULL;
> >      char *mem_mask = NULL;
> >      virDomainNumatuneMemMode mem_mode;
> > 
> >      qemuDomainObjEnterMonitor(driver, vm);
> > 
> > -    if (qemuMonitorSetCPU(priv->mon, vcpu, true) < 0)
> > -        goto exit_monitor;
> > -
> > -    vcpus++;
> > +    rc = qemuMonitorSetCPU(priv->mon, vcpu, true);
> > 
> > -    /* After hotplugging the CPUs we need to re-detect threads corresponding
> > -     * to the virtual CPUs. Some older versions don't provide the thread ID
> > -     * or don't have the "info cpus" command (and they don't support multiple
> > -     * CPUs anyways), so errors in the re-detection will not be treated
> > -     * fatal */
> > -    if ((ncpupids = qemuMonitorGetCPUInfo(priv->mon, &cpupids)) <= 0) {
> > -        virResetLastError();
> > -        ret = 0;
> > -        goto exit_monitor;
> > -    }
> > +    if (rc == 0)
> > +        ncpupids = qemuMonitorGetCPUInfo(priv->mon, &cpupids);
> > 
> >      if (qemuDomainObjExitMonitor(driver, vm) < 0)
> 
> This could overwrite a qemuMonitorGetCPUInfo error, but that's no

Since that would happen only if the domain managed to die when we were
in the monitor, the reported error will probably be even more
descriptive than the IO error from sending the message to the VM.

By the way qemuDomainObjExitMonitor() does not overwrite the error in
this case since it does check whether we have a error reported at this
point so it will actually report the error from qemuMonitorGetCPUInfo if
it happened.

> different than we current could do (though we don't ResetLast on
> GetCPUInfo failure).
> 
> >          goto cleanup;
> 
> But if we leave here, then we know we're still active thus the
> adjustment from cleanup to call SetVcpus only if still active is
> satisfied...
> 
> > 
> > -    if (ncpupids != vcpus) {
> > +    virDomainAuditVcpu(vm, oldvcpus, oldvcpus + 1, "update", rc == 0);
> 
> If the ExitMonitor fails, then we won't Audit

I'm not entirely sure it would make sense to audit anything at the point
when the VM is dead. The audit log already contains the entry that the
VM died and freed all resources so auditing that the vCPU count change
failed is somewhat irrelevant IMO.

> 
> > +
> > +    if (rc < 0)
> > +        goto cleanup;
> > +
> > +    ignore_value(virDomainDefSetVCpus(vm->def, oldvcpus + 1));
> 
> Why not just :
> 
>     if (virDomainDefSetVCpus(vm->def, oldvcpus + 1) < 0 ||
>         ncpupids < 0)
>         goto cleanup;
> 
> I would *hope* that we don't fail SetVcpus at this point - at least we
> can avoid the pointless ignore_value

Currently and also with the planed changes virDomainDefSetVcpus can't
fail at this point, since we already checked that the requested CPU
count was in bounds of when it would fail.

> 
> ACK - as long as we can audit on failure to ExitMonitor... Whether you
> feel the GetCPUInfo error is worth saving/sending is up to you. The last
> comment is purely an observation.
> 
> John
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: Digital signature
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20151126/8ab5caa1/attachment-0001.sig>