[libvirt] [PATCH v3] lxc: fix show the wrong xml when guest start failed

John Ferlan jferlan at redhat.com
Thu Feb 12 20:45:25 UTC 2015



On 02/04/2015 08:42 AM, Luyao Huang wrote:
> https://bugzilla.redhat.com/show_bug.cgi?id=1176503
> 
> When guest start failed, libvirt will keep the current vm->def,
> this will make a issue that we cannot get a right xml after guest
> start failed. And don't call the stop/release hook to do some
> other clean work.
> 
> Call virLXCProcessCleanup to help us clean the source and call
> the hooks if start a vm failed
> 
> Signed-off-by: Luyao Huang <lhuang at redhat.com>
> ---
> v2: use virLXCProcessCleanup to free the source and call the hook.
> v3: rework the patch to suit the virLXCProcessStart code changed.
> 
>  src/lxc/lxc_process.c | 76 ++++++++++++++++++++++-----------------------------
>  1 file changed, 32 insertions(+), 44 deletions(-)
> 



FWIW:

v1 review starts here:

http://www.redhat.com/archives/libvir-list/2015-January/msg00112.html


At first I was 'concerned' about the number of changes, but after doing
more investigation - I think the patch is correct in general; however, I
think it needs some adjustments as there are really two bugs here...
I'll send an update shortly for comparison/review...

Essentially though - I think the console check*s* could be done earlier
before we get into other setup that requires going thru "cleanup:" (or
what error: was). That's "one bug" - which is a configuration error
which will prevent startup. Since it's easier to check early, provide an
error, and return -1 - that's what I think is cleaner solution.

Second, the other bug which you were trying to clean up at the same time
is that existing cleanup paths don't properly clean all things up. The
error path worked only when/once the container started and some
pre-container start cleanup was done, but a few important ones were
missing - so that's a separate patch.

I also will add a patch to add/modify the debugging to help future such
efforts...

BTW:
It seems commit id '88a1b542' from Cédric Bosdonnat caused yet another
merge conflict and seemingly from the description might have been in the
same area, but alas a quick check shows, it wasn't the same issue.

I'll leave the notes I was developing on this patch prior to just biting
the bullet and reformatting so you can perhaps "see" my thought process.

> diff --git a/src/lxc/lxc_process.c b/src/lxc/lxc_process.c
> index 01da344..1a6cfbb 100644
> --- a/src/lxc/lxc_process.c
> +++ b/src/lxc/lxc_process.c
> @@ -1022,6 +1022,7 @@ int virLXCProcessStart(virConnectPtr conn,
>      virCgroupPtr selfcgroup;
>      int status;
>      char *pidfile = NULL;
> +    bool need_stop = false;
>  


I think the check:

    if (vm->def->nconsoles == 0) {
        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
                       _("At least one PTY console is required"));
        goto cleanup;
    }

Should perhaps be moved to just before this code:

    nttyFDs = vm->def->nconsoles;
    if (VIR_ALLOC_N(ttyFDs, nttyFDs) < 0)
        goto cleanup;
    for (i = 0; i < vm->def->nconsoles; i++)
        ttyFDs[i] = -1;

and that would "fix" the bug from the bz... as well as ensuring we don't
have a "if (VIR_ALLOC_N(ttdFDs, 0) < 0)"... In fact is there a reason
why that check cannot move much higher and be after the cgroup checks
which return -1?  While t it, why not move the following too:

    for (i = 0; i < vm->def->nconsoles; i++) {
        if (vm->def->consoles[i]->source.type != VIR_DOMAIN_CHR_TYPE_PTY) {
            virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
                           _("Only PTY console types are supported"));
            goto cleanup;
        }
    }

and then remove that from the loop later on.

This way checks that go to cleanup are a result of some error in
processing rather than some container configuration error...

Then the remainder of the code could be perhaps patch 2 which is fixing
a different, although related problem.

>      if (virCgroupNewSelf(&selfcgroup) < 0)
>          return -1;
> @@ -1259,6 +1260,7 @@ int virLXCProcessStart(virConnectPtr conn,
>          goto cleanup;
>      }
>  
> +    need_stop = true;
>      priv->stopReason = VIR_DOMAIN_EVENT_STOPPED_FAILED;
>      priv->wantReboot = false;
>      vm->def->id = vm->pid;

I was going to suggest using vm->pid (e.g. > 0) instead of 'need_stop';
however, I couldn't convince myself that would be 'safe enough'...

> @@ -1272,20 +1274,20 @@ int virLXCProcessStart(virConnectPtr conn,
>  
>      if (VIR_CLOSE(handshakefds[1]) < 0) {
>          virReportSystemError(errno, "%s", _("could not close handshake fd"));
> -        goto error;
> +        goto cleanup;
>      }
>  
>      if (virCommandHandshakeWait(cmd) < 0)
> -        goto error;
> +        goto cleanup;
>  
>      /* Write domain status to disk for the controller to
>       * read when it starts */
>      if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm) < 0)
> -        goto error;
> +        goto cleanup;
>  
>      /* Allow the child to exec the controller */
>      if (virCommandHandshakeNotify(cmd) < 0)
> -        goto error;
> +        goto cleanup;
>  
>      if (virAtomicIntInc(&driver->nactive) == 1 && driver->inhibitCallback)
>          driver->inhibitCallback(true, driver->inhibitOpaque);
> @@ -1298,7 +1300,7 @@ int virLXCProcessStart(virConnectPtr conn,
>                             _("guest failed to start: %s"), out);
>          }
>  
> -        goto error;
> +        goto cleanup;
>      }
>  
>      /* We know the cgroup must exist by this synchronization
> @@ -1310,13 +1312,13 @@ int virLXCProcessStart(virConnectPtr conn,
>                                    vm->def->resource->partition :
>                                    NULL,
>                                    -1, &priv->cgroup) < 0)
> -        goto error;
> +        goto cleanup;
>  
>      if (!priv->cgroup) {
>          virReportError(VIR_ERR_INTERNAL_ERROR,
>                         _("No valid cgroup for machine %s"),
>                         vm->def->name);
> -        goto error;
> +        goto cleanup;
>      }
>  
>      /* And we can get the first monitor connection now too */
> @@ -1329,17 +1331,17 @@ int virLXCProcessStart(virConnectPtr conn,
>              virReportError(VIR_ERR_INTERNAL_ERROR,
>                             _("guest failed to start: %s"), ebuf);
>          }
> -        goto error;
> +        goto cleanup;
>      }
>  
>      if (autoDestroy &&
>          virCloseCallbacksSet(driver->closeCallbacks, vm,
>                               conn, lxcProcessAutoDestroy) < 0)
> -        goto error;
> +        goto cleanup;
>  
>      if (virDomainObjSetDefTransient(caps, driver->xmlopt,
>                                      vm, false) < 0)
> -        goto error;
> +        goto cleanup;
>  
>      /* We don't need the temporary NIC names anymore, clear them */
>      virLXCProcessCleanInterfaces(vm->def);
> @@ -1358,47 +1360,38 @@ int virLXCProcessStart(virConnectPtr conn,
>           * If the script raised an error abort the launch
>           */
>          if (hookret < 0)
> -            goto error;
> +            goto cleanup;
>      }
>  
>      rc = 0;
>  
>   cleanup:
> -    if (rc != 0 && !err)
> -        err = virSaveLastError();
> -    virCommandFree(cmd);
>      if (VIR_CLOSE(logfd) < 0) {
>          virReportSystemError(errno, "%s", _("could not close logfile"));
>          rc = -1;
>      }
> -    for (i = 0; i < nveths; i++) {
> -        if (rc != 0 && veths[i])
> -            ignore_value(virNetDevVethDelete(veths[i]));
> -        VIR_FREE(veths[i]);
> -    }
>      if (rc != 0) {
> -        if (vm->newDef) {
> -            virDomainDefFree(vm->newDef);
> -            vm->newDef = NULL;
> -        }
> -        if (priv->monitor) {
> -            virObjectUnref(priv->monitor);
> -            priv->monitor = NULL;
> -        }
> -        virDomainConfVMNWFilterTeardown(vm);
> -
> -        virSecurityManagerRestoreAllLabel(driver->securityManager,
> -                                          vm->def, false);
> -        virSecurityManagerReleaseLabel(driver->securityManager, vm->def);
> -        /* Clear out dynamically assigned labels */
> -        if (vm->def->nseclabels &&
> -            vm->def->seclabels[0]->type == VIR_DOMAIN_SECLABEL_DYNAMIC) {
> -            VIR_FREE(vm->def->seclabels[0]->model);
> -            VIR_FREE(vm->def->seclabels[0]->label);
> -            VIR_FREE(vm->def->seclabels[0]->imagelabel);
> -            VIR_DELETE_ELEMENT(vm->def->seclabels, 0, vm->def->nseclabels);
> +        err = virSaveLastError();
> +        if (need_stop) {
> +            virLXCProcessStop(driver, vm, VIR_DOMAIN_SHUTOFF_FAILED);
> +        } else {
> +            virSecurityManagerRestoreAllLabel(driver->securityManager,
> +                                              vm->def, false);
> +            virSecurityManagerReleaseLabel(driver->securityManager, vm->def);
> +            /* Clear out dynamically assigned labels */
> +            if (vm->def->nseclabels &&
> +                vm->def->seclabels[0]->type == VIR_DOMAIN_SECLABEL_DYNAMIC) {

NOTE: This is where the main conflict arises with the now new check for
'|| clearSeclabel' ...

> +                VIR_FREE(vm->def->seclabels[0]->model);
> +                VIR_FREE(vm->def->seclabels[0]->label);
> +                VIR_FREE(vm->def->seclabels[0]->imagelabel);
> +                VIR_DELETE_ELEMENT(vm->def->seclabels, 0, vm->def->nseclabels);
> +            }
> +            virLXCProcessCleanup(driver, vm, VIR_DOMAIN_SHUTOFF_FAILED);

Somewhat related - virLXCProcessCleanup has a VIR_DEBUG to start
"Stopping..." - perhaps that should be changed to "Cleanup..." since
"virLXCProcessStop already has "Stopping..." - that might help
differentiate for someone debugging in the future...


John
>          }
>      }
> +    virCommandFree(cmd);
> +    for (i = 0; i < nveths; i++)
> +        VIR_FREE(veths[i]);
>      for (i = 0; i < nttyFDs; i++)
>          VIR_FORCE_CLOSE(ttyFDs[i]);
>      VIR_FREE(ttyFDs);
> @@ -1415,11 +1408,6 @@ int virLXCProcessStart(virConnectPtr conn,
>      }
>  
>      return rc;
> -
> - error:
> -    err = virSaveLastError();
> -    virLXCProcessStop(driver, vm, VIR_DOMAIN_SHUTOFF_FAILED);
> -    goto cleanup;
>  }
>  
>  struct virLXCProcessAutostartData {
> 




More information about the libvir-list mailing list