[libvirt] [PATCH v2 1/3] virDomainGetDiskErrors public API and remote protocol

Eric Blake eblake at redhat.com
Mon Jan 30 22:56:11 UTC 2012


On 01/30/2012 09:00 AM, Jiri Denemark wrote:
> We already provide ways to detect when a domain has been paused as a
> result of I/O error, but there was no way of getting the exact error or
> even the device that experienced it.  This new API may be used for both.
> ---
>  daemon/remote.c              |   40 ++++++++++++++++++++++++++++++++
>  include/libvirt/libvirt.h.in |   32 +++++++++++++++++++++++++
>  python/generator.py          |    3 +-
>  src/driver.h                 |    7 +++++
>  src/libvirt.c                |   52 ++++++++++++++++++++++++++++++++++++++++++
>  src/libvirt_public.syms      |    1 +
>  src/remote/remote_driver.c   |   34 +++++++++++++++++++++++++++
>  src/remote/remote_protocol.x |   22 +++++++++++++++++-
>  src/remote_protocol-structs  |   16 +++++++++++++
>  src/rpc/gendispatch.pl       |   47 +++++++++++++++++++++++++++++++++++++
>  10 files changed, 252 insertions(+), 2 deletions(-)

This is big enough that I might have split it into public API (include,
python, src/{driver.h,libvirt.c,libvirt_public.syms}) and RPC
implementation (daemon, src/remote, src/rpc).  But I'll go ahead and
review this, whether or not you split it.

> +++ b/include/libvirt/libvirt.h.in
> @@ -1967,6 +1967,38 @@ virDomainGetBlockIoTune(virDomainPtr dom,
>                          int *nparams,
>                          unsigned int flags);
>  
> +/**
> + * virDomainDiskErrorCode:
> + *
> + * Disk I/O error.
> + */
> +typedef enum {
> +    VIR_DOMAIN_DISK_ERROR_NONE      = 0, /* no error */
> +    VIR_DOMAIN_DISK_ERROR_UNSPEC    = 1, /* unspecified I/O error */
> +    VIR_DOMAIN_DISK_ERROR_NO_SPACE  = 2, /* no space left on the device */
> +
> +#ifdef VIR_ENUM_SENTINELS
> +    VIR_DOMAIN_DISK_ERROR_LAST
> +#endif
> +} virDomainDiskErrorCode;

Looks reasonable and extensible.

> +
> +/**
> + * virDomainDiskError:
> + *
> + */
> +typedef struct _virDomainDiskError virDomainDiskError;
> +typedef virDomainDiskError *virDomainDiskErrorPtr;
> +
> +struct _virDomainDiskError {
> +    char *disk; /* disk target */
> +    int error;  /* virDomainDiskErrorCode */
> +};
> +
> +int virDomainGetDiskErrors(virDomainPtr dom,
> +                           virDomainDiskErrorPtr errors,
> +                           int maxerrors,

I guess inertia is on our side for using int maxerrors, even though the
value should never be negative.

> +++ b/python/generator.py
> @@ -423,7 +423,8 @@ skip_impl = (
>      'virDomainGetBlockIoTune',
>      'virDomainSetInterfaceParameters',
>      'virDomainGetInterfaceParameters',
> -    'virDomainGetCPUStats'  # not implemented now.
> +    'virDomainGetCPUStats',  # not implemented now.
> +    'virDomainGetDiskErrors',

Oops - blame me for not using a trailing comma, so that your hunk had to
touch more lines than one :)

> +/**
> + * virDomainGetDiskErrors:
> + * @dom: a domain object
> + * @errors: array to populate on output
> + * @maxerrors: size of @errors array
> + * @flags: extra flags; not used yet, so callers should always pass 0
> + *
> + * The function populates @errors array with all disks that encountered an
> + * I/O error.  Each disk is identified by its target (the dev attribute of
> + * target subelement in domain XML), such as "vda", and accompanied with
> + * the error that was seen on it.

Suppose I have a domain with 3 disks, and disk A and C have both
encountered an error.  It is not clear to me whether calling
virDomainGetDiskErrors(dom, array, 3, 0) will return 2 (info on A and C,
since those were the only ones with errors) or 3 (info on A, B, and C,
with B explicitly stating that there were no errors).  I guess
virDomainDiskErrorPtr has a reserved value for no error, so I'm assuming
the latter, but I'm wondering how best to word this.  Maybe:

s~array with all disks that encountered an I/O error~array with
information on all disks, and whether they have encountered an I/O error~

Or, if you really did mean to return only the disks with errors while
omitting working disks, then we should add a shortcut, where calling
with errors==NULL and maxerrors==0 returns how many errors would be
returned, in order for the caller to know how big to allocate things.
For that matter, even if we return no error, it would _still_ be handy
to support the special case as a way to count how many disks are being
tracked for errors.

> + * Returns number of disks with errors filled in the @errors array or -1 on
> + * error.

Please also document that the caller is responsible for calling free()
on each disk name returned.

> +int
> +virDomainGetDiskErrors(virDomainPtr dom,
> +                       virDomainDiskErrorPtr errors,
> +                       int maxerrors,
> +                       unsigned int flags)
> +{
> +    VIR_DOMAIN_DEBUG(dom, "errors=%p, maxerrors=%d, flags=%x",
> +                     errors, maxerrors, flags);
> +
> +    virResetLastError();
> +
> +    if (!VIR_IS_DOMAIN(dom)) {
> +        virLibDomainError(VIR_ERR_INVALID_DOMAIN, __FUNCTION__);
> +        virDispatchError(NULL);
> +        return -1;
> +    }
> +
> +    if (dom->conn->flags & VIR_CONNECT_RO) {
> +        virLibDomainError(VIR_ERR_OPERATION_DENIED, __FUNCTION__);
> +        goto error;
> +    }

Missing a sanity check that errors is non-NULL and maxerrors is > 0.
It's a common check, so it should be here rather than making each driver
repeat it, and we have precedence for that sort of sanity check in this
file.

> +struct remote_domain_get_disk_errors_ret {
> +    remote_domain_disk_error errors<REMOTE_DOMAIN_DISK_ERRORS_MAX>; /* insert at 1 */
> +};

If we implement the shorthand of virDomainGetDiskErrors(dom, NULL, 0, 0)
as a way to tell how big to allocate an array for the next call, then
this struct also needs to have a return length independent from the
array.  Quite a few of the virTypedParameters functions should serve as
a model.

> +
>  
>  /*----- Protocol. -----*/
>  
> @@ -2708,7 +2727,8 @@ enum remote_procedure {
>      REMOTE_PROC_STORAGE_VOL_RESIZE = 260, /* autogen autogen */
>  
>      REMOTE_PROC_DOMAIN_PM_SUSPEND_FOR_DURATION = 261, /* autogen autogen */
> -    REMOTE_PROC_DOMAIN_GET_CPU_STATS = 262 /* skipgen skipgen */
> +    REMOTE_PROC_DOMAIN_GET_CPU_STATS = 262, /* skipgen skipgen */
> +    REMOTE_PROC_DOMAIN_GET_DISK_ERRORS = 263 /* autogen autogen */

Nice that you figured out how to autogen the serialize/deserialize
calls.  We should probably do the same for a lot of the
virTypedParameter clients someday (then again, I think the reason the
virTypedParameter clients don't autogen is because we don't yet have the
shorthand for determining allocation limits wired up in the generator,
so I may have just broken that for you by requesting the shorthand).

Since adding a shorthand for returning the allocation size would imply
an ABI break for RPC, we need to do it now, before the 0.9.10 freeze, so
I think it's best to post a v3.

-- 
Eric Blake   eblake at redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 620 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20120130/6e1a1480/attachment-0001.sig>


More information about the libvir-list mailing list