[libvirt] [PATCH] vz: handle gracefully races on undefining domain

Nikolay Shirokovskiy nshirokovskiy at virtuozzo.com
Mon Jun 20 08:56:01 UTC 2016



On 20.06.2016 10:40, Nikolay Shirokovskiy wrote:
>   This patch is not critical but nice to have. The original motivation
> was an error message in the logs on undefining a domain through the vz
> driver. The undefine procedure drops the domain lock while waiting for
> the detach-disks vz sdk call. Meanwhile the vz sdk event
> domain-config-changed arrives; its handler finds the domain and blocks
> waiting for the job condition. After the undefine API call finishes,
> event processing proceeds and tries to refresh the domain config through
> the existing vz sdk domain handle. The domain does not exist anymore and
> event processing fails. Everything is fine; we just don't want to see
> an error message in the log for this particular case.
> 
>   Fortunately the domain has a flag indicating that it has been removed
> from the list. This also implies that the vz sdk domain is undefined.
> Thus if we check this flag right after the domain is locked again on
> acquiring the job condition, we handle this situation gracefully.
> 
>   Actually the race can happen in other situations too. Any time we
> wait for the job condition in a mutually exclusive job, the vz sdk
> domain can cease to exist by the time we acquire it. So instead of a
> general internal error we can return domain not found, which is easier
> to handle. We don't need to patch other places in mutually exclusive
> jobs where the domain lock is dropped, because once a job is started
> the domain can't be undefined by the mutually exclusive undefine job.
> 
>   In case of API calls that are not jobs (load snapshots etc.) we had
> better check whether the domain exists every time the domain lock is
> reacquired. Fortunately these calls drop the domain lock only once, to
> make the appropriate vz sdk API call.
> 
>   The code of this patch is quite similar to the qemu driver's checks
> that a domain is active after acquiring a job. The only difference is
> that a qemu domain is operational while its process is active, whereas
> a vz domain is operational while the domain exists.
> 
> Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy at virtuozzo.com>
> ---
>  src/vz/vz_driver.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
>  src/vz/vz_sdk.c    | 32 +++++++++++++++++++++++++++++---
>  2 files changed, 75 insertions(+), 3 deletions(-)
> 

[snip]

Another case of hindsight being 20/20 )) I think the changes below are
not good and should be dropped from the patch. We can't rely on the
'removing' flag in cases other than mutually exclusive jobs.

>  
> +static void
> +vzDomainObjCheckExist(virDomainObjPtr dom)
> +{
> +    char uuidstr[VIR_UUID_STRING_BUFLEN];
> +
> +    if (!dom->removing)
> +        return;
> +
> +    virUUIDFormat(dom->def->uuid, uuidstr);
> +    virReportError(VIR_ERR_NO_DOMAIN,
> +                   _("no domain with matching uuid '%s' (%s)"),
> +                   uuidstr, dom->def->name);
> +}
> +
>  int
>  prlsdkDomainManagedSaveRemove(virDomainObjPtr dom)
>  {
> @@ -3959,8 +3979,10 @@ prlsdkDomainManagedSaveRemove(virDomainObjPtr dom)
>      PRL_HANDLE job;
>  
>      job = PrlVm_DropSuspendedState(privdom->sdkdom);
> -    if (PRL_FAILED(waitDomainJob(job, dom)))
> +    if (PRL_FAILED(waitDomainJob(job, dom))) {
> +        vzDomainObjCheckExist(dom);
>          return -1;
> +    }
>  
>      return 0;
>  }
> @@ -4394,8 +4416,10 @@ prlsdkLoadSnapshots(virDomainObjPtr dom)
>      char *treexml = NULL;
>  
>      job = PrlVm_GetSnapshotsTreeEx(privdom->sdkdom, PGST_WITHOUT_SCREENSHOTS);
> -    if (PRL_FAILED(getDomainJobResult(job, dom, &result)))
> +    if (PRL_FAILED(getDomainJobResult(job, dom, &result))) {
> +        vzDomainObjCheckExist(dom);
>          goto cleanup;
> +    }
>  
>      if (!(treexml = prlsdkGetStringParamVar(PrlResult_GetParamAsString, result)))
>          goto cleanup;
> @@ -4427,8 +4451,10 @@ int prlsdkDeleteSnapshot(virDomainObjPtr dom, const char *uuid, bool children)
>      PRL_HANDLE job;
>  
>      job = PrlVm_DeleteSnapshot(privdom->sdkdom, uuid, children);
> -    if (PRL_FAILED(waitDomainJob(job, dom)))
> +    if (PRL_FAILED(waitDomainJob(job, dom))) {
> +        vzDomainObjCheckExist(dom);
>          return -1;
> +    }
>  
>      return 0;
>  }
> 
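
For contrast, the check that the commit message describes for mutually
exclusive jobs (test the flag right after the job condition is acquired)
could look roughly like the sketch below. This is only an illustration
under stated assumptions: struct dom, dom_begin_job, dom_end_job,
dom_undefine, dom_handle_config_changed and the JOB_* codes are all
hypothetical names built on plain pthreads, not the actual libvirt or
vz driver code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a domain object; not the libvirt type. */
struct dom {
    pthread_mutex_t lock;
    pthread_cond_t job_cond;
    bool job_active;   /* a mutually exclusive job is running */
    bool removing;     /* set once the domain is removed from the list */
};

enum { JOB_OK = 0, JOB_NO_DOMAIN = -1 };

/* Must be called with d->lock held. */
static int dom_begin_job(struct dom *d)
{
    /* pthread_cond_wait releases the lock while waiting, so the
     * domain may be undefined before we get the lock back. */
    while (d->job_active)
        pthread_cond_wait(&d->job_cond, &d->lock);

    /* The lock is held again: report "no domain" instead of letting
     * a later SDK call fail with a generic error. */
    if (d->removing)
        return JOB_NO_DOMAIN;

    d->job_active = true;
    return JOB_OK;
}

/* Must be called with d->lock held and a job started. */
static void dom_end_job(struct dom *d)
{
    assert(d->job_active);
    d->job_active = false;
    pthread_cond_broadcast(&d->job_cond);
}

/* Undefine runs as a job and marks the domain as removed. */
static int dom_undefine(struct dom *d)
{
    int rc;

    pthread_mutex_lock(&d->lock);
    rc = dom_begin_job(d);
    if (rc == JOB_OK) {
        d->removing = true;
        dom_end_job(d);
    }
    pthread_mutex_unlock(&d->lock);
    return rc;
}

/* Event handler: refreshes the config only if the domain still exists. */
static int dom_handle_config_changed(struct dom *d)
{
    int rc;

    pthread_mutex_lock(&d->lock);
    rc = dom_begin_job(d);
    if (rc == JOB_OK) {
        /* ... the config refresh would go here ... */
        dom_end_job(d);
    }
    pthread_mutex_unlock(&d->lock);
    return rc;
}
```

In this model an event handler that gets the job only after the undefine
has finished sees JOB_NO_DOMAIN and can return quietly. The flag is only
tested at the point where the job is acquired, which is the case the
commit message argues is safe.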



