[libvirt] [PATCH] storage: netfs: Handle backend errors

Mon Apr 14 18:13:55 UTC 2014

On 04/14/2014 12:47 PM, Ján Tomko wrote:
> On 04/09/2014 08:42 PM, John Ferlan wrote:
>> Commit id '18642d10' caused a virt-test regression for NFS backend
>> storage error path checks when running the command:
>>
>>     'virsh find-storage-pool-sources-as netfs Unknown  '
>>
>> when the host did not have Gluster installed. Prior to the commit,
>> the test would fail with the error:
>>
>>     error: internal error: Child process (/usr/sbin/showmount --no-headers
>>     --exports Unknown) unexpected exit status 1: clnt_create: RPC: Unknown host
>>
>> After the commit, the error would be ignored, the call would succeed,
>> and an empty list of pool sources returned. This was tucked into the
>> commit message as an expected outcome.
>>
>> When the target host does not have a GLUSTER_CLI this is a regression
>> over the previous release. Furthermore, even if Gluster CLI was present,
>> but had a failure to get devices, the API would return a failure even if
>> the NFS backend had found devices.
>>
>> Add code to handle the error conditions.
>>   - If NFS fails
>>     - If there is no Gluster CLI, then jump to cleanup/failure.
>>     - If there is a Gluster CLI, then message the NFS
>>       failure and continue with the Gluster checks.
>>
>>   - If Gluster fails
>>     - If NFS had failed, then jump to cleanup/failure.
>>     - Message the Gluster failure and continue on.
>>
>>   - If only one fails, fetch and return a list of source devices
>>     even if it's empty
>>
>> Signed-off-by: John Ferlan <jferlan at redhat.com>
>> ---
>>  src/storage/storage_backend.c    | 14 +++++++++++++
>>  src/storage/storage_backend.h    |  1 +
>>  src/storage/storage_backend_fs.c | 43 +++++++++++++++++++++++++++++++++++-----
>>  3 files changed, 53 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/storage/storage_backend.c b/src/storage/storage_backend.c
>> index b56fefe..4d73902 100644
>> --- a/src/storage/storage_backend.c
>> +++ b/src/storage/storage_backend.c
>> @@ -1643,6 +1643,13 @@ virStorageBackendStablePath(virStoragePoolObjPtr pool,
>>  }
>>  
>>  #ifdef GLUSTER_CLI
>> +bool
>> +virStorageBackendHaveGlusterCLI(void)
>> +{
>> +    return true;
>> +}
>> +
>> +
>>  int
>>  virStorageBackendFindGlusterPoolSources(const char *host,
>>                                          int pooltype,
>> @@ -1721,6 +1728,13 @@ virStorageBackendFindGlusterPoolSources(const char *host,
>>      return ret;
>>  }
>>  #else /* #ifdef GLUSTER_CLI */
>> +bool
>> +virStorageBackendHaveGlusterCLI(void)
>> +{
>> +    return false;
>> +}
>> +
>> +
>>  int
>>  virStorageBackendFindGlusterPoolSources(const char *host ATTRIBUTE_UNUSED,
>>                                          int pooltype ATTRIBUTE_UNUSED,
> 
> IMHO it's more readable to use GLUSTER_CLI directly, since we're not doing any
> runtime detection.
> 

My initial pass did use GLUSTER_CLI in storage_backend_fs.c; however, I
wasn't sure if there was a desire to limit it to storage_backend.c only.
If not, then I'm fine with removing these convenience functions. The
aesthetics of multiple places where #ifdef's occur vs. overhead of
additional API to support.  I suppose part of the answer is how much
more code is there that may need to know whether the CLI is available or
not.

>> diff --git a/src/storage/storage_backend.h b/src/storage/storage_backend.h
>> index 4f95000..41eed97 100644
>> --- a/src/storage/storage_backend.h
>> +++ b/src/storage/storage_backend.h
>> @@ -87,6 +87,7 @@ int virStorageBackendFindFSImageTool(char **tool);
>>  virStorageBackendBuildVolFrom
>>  virStorageBackendFSImageToolTypeToFunc(int tool_type);
>>  
>> +bool virStorageBackendHaveGlusterCLI(void);
>>  int virStorageBackendFindGlusterPoolSources(const char *host,
>>                                              int pooltype,
>>                                              virStoragePoolSourceListPtr list);
>> diff --git a/src/storage/storage_backend_fs.c b/src/storage/storage_backend_fs.c
>> index 4e4a7ae..f3d4a6d 100644
>> --- a/src/storage/storage_backend_fs.c
>> +++ b/src/storage/storage_backend_fs.c
>> @@ -238,9 +238,11 @@ virStorageBackendFileSystemNetFindPoolSourcesFunc(char **const groups,
>>  }
>>  
>>  
>> -static void
>> +static int
>>  virStorageBackendFileSystemNetFindNFSPoolSources(virNetfsDiscoverState *state)
>>  {
>> +    int ret = -1;
>> +
>>      /*
>>       *  # showmount --no-headers -e HOSTNAME
>>       *  /tmp   *
>> @@ -267,9 +269,13 @@ virStorageBackendFileSystemNetFindNFSPoolSources(virNetfsDiscoverState *state)
>>      if (virCommandRunRegex(cmd, 1, regexes, vars,
>>                             virStorageBackendFileSystemNetFindPoolSourcesFunc,
>>                             state, NULL) < 0)
>> -        virResetLastError();
>> +        goto cleanup;
>>  
>> +    ret = 0;
>> +
>> + cleanup:
>>      virCommandFree(cmd);
>> +    return ret;
>>  }
>>  
>>  
>> @@ -289,6 +295,7 @@ virStorageBackendFileSystemNetFindPoolSources(virConnectPtr conn ATTRIBUTE_UNUSE
>>      virStoragePoolSourcePtr source = NULL;
>>      char *ret = NULL;
>>      size_t i;
>> +    bool failNFS = false;
>>  
>>      virCheckFlags(0, NULL);
>>  
>> @@ -310,12 +317,38 @@ virStorageBackendFileSystemNetFindPoolSources(virConnectPtr conn ATTRIBUTE_UNUSE
>>  
>>      state.host = source->hosts[0].name;
>>  
>> -    virStorageBackendFileSystemNetFindNFSPoolSources(&state);
>> +    if (virStorageBackendFileSystemNetFindNFSPoolSources(&state) < 0) {
> 
> 
>> +        virErrorPtr err;
>> +        /* If no Gluster CLI, then force error right here */
>> +        if (!virStorageBackendHaveGlusterCLI())
>> +            goto cleanup;
>> +
>> +        /* If we have a Gluster CLI, then message the error, clean it out,
>> +         * and move onto the Gluster code
>> +         */
>> +        err = virGetLastError();
>> +        VIR_ERROR(_("Failed to get NFS pool sources: '%s'"),
>> +                  err != NULL ? err->message: _("unknown error"));
> 
> This takes the last logged error and logs it again with a prefix.
> 

Hmm - ah I see. Guess it wasn't clear to me when the message was logged
since getting the last error is used in other places where errors are
logged and we save/restore the last error for propagation.

>> +        virResetLastError();
> 
> IMO we can leave the error set even if we return 0.
> 

Which seems to be what I'm striving for - being able to message the user
- especially (and if) the gluster cli does exist...

>> +        failNFS = true;
>> +    }
>>  
> 
>>      if (virStorageBackendFindGlusterPoolSources(state.host,
>>                                                  VIR_STORAGE_POOL_NETFS_GLUSTERFS,
>> -                                                &state.list) < 0)
>> -        goto cleanup;
>> +                                                &state.list) < 0) {
>> +        virErrorPtr err;
>> +        /* If NFS failed as well, then force the error right here */
>> +        if (failNFS)
>> +            goto cleanup;
>> +
>> +        /* Otherwise, NFS passed, so we message the Gluster error, clean
>> +         * it out, and generate the source list (even if it's empty)
>> +         */
> 
> FindGlusterPoolSources doesn't report errors when the command returns non-zero.
> 

I see a virReportError() in virStorageBackendFindGlusterPoolSources().
There's also the less conspicuous memory allocation errors possible...
or am I misreading what you've typed?

>> +        err = virGetLastError();
>> +        VIR_ERROR(_("Failed to get Gluster pool sources: '%s'"),
>> +                  err != NULL ? err->message: _("unknown error"));
>> +        virResetLastError();
>> +    }
>>  
> 
> The whole hunk would IMHO look simpler as:
> 
> bool fail = false;
> 
> if (FindNFSPoolSources() < 0)
>   fail = true;
> 
> #ifdef GLUSTER_CLI
>   if (FindGlusterPoolSources() < 0)
>      fail = true;
>   else
>      fail = false;
> #endif
> 
> if (fail)
>   goto cleanup;

ahh, but if NFS succeeds and Gluster exists and fails, then we fail even
though we could return sources from NFS?  If NFS fails and Gluster
succeeds we succeed, but how do we let someone know that NFS failed?  Or
should we? I would think so. That's what I was attempting to do with all
the message stuff - make sure it's logged somewhere.

> /* optionally */
> else
>   virResetLastError();
> 
> (Though if we add more network backends, we would need a virBitmap to store
> all the fails :-))

yeah - oh joy.

John