[Libguestfs] [PATCH v3] python: Fix UnicodeError in inspect_list_applications2() (RHBZ#1684004)

Pino Toscano ptoscano at redhat.com
Tue Jun 30 08:42:46 UTC 2020


On Sunday, 26 April 2020 20:14:03 CEST Sam Eiderman wrote:
> The python3 bindings create PyUnicode objects from application strings
> on the guest (i.e. installed rpm, deb packages).
> It is documented that rpm package fields such as description should be
> utf8 encoded - however in some cases they are not a valid unicode
> string, on SLES11 SP4 the encoding of the description of the following
> packages is latin1 and they fail to be converted to unicode using
> guestfs_int_py_fromstring() (which invokes PyUnicode_FromString()):

Sorry for the delay: I wanted to ask our resident Python maintainers for
their feedback, but have not had time yet. I will do so shortly.

BTW, do you have a reproducer that I can freely try?

> diff --git a/python/handle.c b/python/handle.c
> index 2fb8c18f0..fe89dc58a 100644
> --- a/python/handle.c
> +++ b/python/handle.c
> @@ -387,7 +387,7 @@ guestfs_int_py_fromstring (const char *str)
>  #if PY_MAJOR_VERSION < 3
>    return PyString_FromString (str);
>  #else
> -  return PyUnicode_FromString (str);
> +  return guestfs_int_py_fromstringsize (str, strlen (str));
>  #endif
>  }
>  
> @@ -397,7 +397,12 @@ guestfs_int_py_fromstringsize (const char *str, size_t size)
>  #if PY_MAJOR_VERSION < 3
>    return PyString_FromStringAndSize (str, size);
>  #else
> -  return PyUnicode_FromStringAndSize (str, size);
> +  PyObject *s = PyUnicode_FromString (str);
> +  if (s == NULL) {
> +    PyErr_Clear ();
> +    s = PyUnicode_Decode (str, strlen(str), "latin1", "strict");

Minor nit: add a space between "strlen" and the opening parenthesis.

Also, isn't there a specific error we can check to detect this
situation, rather than attempting the latin1 decode on every failure?
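
To illustrate the kind of check I mean, here is a rough Python-level
sketch (not the C binding code itself): fall back to latin1 only when
UTF-8 decoding fails with UnicodeDecodeError, and let any other error
propagate. The helper name decode_guest_string is hypothetical. In the
C code, the analogous test would presumably be
PyErr_ExceptionMatches (PyExc_UnicodeDecodeError) before calling
PyErr_Clear ().

```python
def decode_guest_string(raw: bytes) -> str:
    """Decode bytes read from the guest, preferring UTF-8.

    Hypothetical helper for illustration only; not part of libguestfs.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Only this specific failure triggers the fallback.  latin1 maps
        # every byte value to a code point, so this decode cannot fail.
        return raw.decode("latin1")


if __name__ == "__main__":
    print(decode_guest_string("café".encode("utf-8")))  # valid UTF-8 input
    print(decode_guest_string(b"caf\xe9"))              # latin1-encoded input
```

Since latin1 accepts any byte sequence, the fallback never raises, but it
can silently produce wrong characters if the data is in some other
legacy encoding.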

Thanks,
-- 
Pino Toscano