<div dir="ltr">I uploaded a v2 that does what you requested, and does it more globally (across all Python bindings). Tell me what you think.</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Apr 20, 2020 at 2:42 PM Daniel P. Berrangé &lt;<a href="mailto:berrange@redhat.com">berrange@redhat.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Mon, Apr 20, 2020 at 01:17:35PM +0300, Sam Eiderman wrote:<br>
> The python3 bindings create unicode objects from application strings<br>
> on the guest (i.e. installed rpm, deb packages).<br>
> It is documented that rpm package fields such as description should be<br>
> utf8 encoded. However, in some cases they are not valid unicode<br>
> strings. On SLES11 SP4 the following packages fail to be converted to<br>
> unicode using guestfs_int_py_fromstring() (which invokes<br>
> PyUnicode_FromString()):<br>
> <br>
>  PackageKit<br>
>  aaa_base<br>
>  coreutils<br>
>  dejavu<br>
>  desktop-data-SLED<br>
>  gnome-utils<br>
>  hunspell<br>
>  hunspell-32bit<br>
>  hunspell-tools<br>
>  libblocxx6<br>
>  libexif<br>
>  libgphoto2<br>
>  libgtksourceview-2_0-0<br>
>  libmpfr1<br>
>  libopensc2<br>
>  libopensc2-32bit<br>
>  liborc-0_4-0<br>
>  libpackagekit-glib10<br>
>  libpixman-1-0<br>
>  libpixman-1-0-32bit<br>
>  libpoppler-glib4<br>
>  libpoppler5<br>
>  libsensors3<br>
>  libtelepathy-glib0<br>
>  m4<br>
>  opensc<br>
>  opensc-32bit<br>
>  permissions<br>
>  pinentry<br>
>  poppler-tools<br>
>  python-gtksourceview<br>
>  splashy<br>
>  syslog-ng<br>
>  tar<br>
>  tightvnc<br>
>  xorg-x11<br>
>  xorg-x11-xauth<br>
>  yast2-mouse<br>
> <br>
> This is a surgical fix for inspect_list_applications2()'s description<br>
> field.<br>
> <br>
> Signed-off-by: Sam Eiderman <<a href="mailto:sameid@google.com" target="_blank">sameid@google.com</a>><br>
> ---<br>
>  generator/python.ml | 8 ++++++++<br>
>  1 file changed, 8 insertions(+)<br>
> <br>
> diff --git a/generator/python.ml b/generator/python.ml<br>
> index f0d6b5d96..7394a943a 100644<br>
> --- a/generator/python.ml<br>
> +++ b/generator/python.ml<br>
> @@ -170,6 +170,14 @@ and generate_python_structs () =<br>
>          function<br>
>          | name, FString -><br>
>              pr "  value = guestfs_int_py_fromstring (%s->%s);\n" typ name;<br>
> +            (match typ, name with<br>
> +            | "application", "app_description"<br>
> +            | "application2", "app2_description" -><br>
> +                pr "  if (value == NULL) {\n";<br>
> +                pr "    value = guestfs_int_py_fromstring (\"\");\n";<br>
> +                pr "    PyErr_Clear ();\n";<br>
> +                pr "  }\n";<br>
<br>
I don't think this is especially friendly/helpful to users.<br>
<br>
I'm assuming that there's just a handful of characters that are not<br>
valid UTF-8. I think we really want a graceful conversion that will<br>
convert as much as possible, replacing any invalid UTF-8 with some<br>
generic placeholder character.<br>
<br>
Regards,<br>
Daniel<br>
<br>
</blockquote></div>
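<div dir="ltr">For illustration, here is a minimal Python sketch of the graceful conversion suggested in the quoted review: decode as much as possible and substitute a placeholder (U+FFFD) for any invalid UTF-8. The sample bytes and the helper names are made up for this example and are not part of libguestfs. At the C level, a comparable effect could presumably be had by calling PyUnicode_DecodeUTF8() with the "replace" error handler instead of PyUnicode_FromString(), but that is an assumption about one possible approach, not a description of what v2 does.</div>

```python
# Hypothetical sample: a Latin-1 encoded package description, similar to
# what an old RPM header might contain. The lone byte 0xE9 is not valid UTF-8.
raw_description = b"Caf\xe9 menu editor"


def decode_strict_or_none(raw):
    """Strict decoding: fails on the first invalid byte.

    Returning None here mirrors a conversion that gives up entirely.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


def decode_lenient(raw):
    """Lenient decoding: keep every valid character and replace each
    undecodable byte sequence with U+FFFD (the replacement character)."""
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_strict_or_none(raw_description))
    print(decode_lenient(raw_description))
```

<div dir="ltr">With the lenient variant the caller still gets the readable part of the description, and the placeholder character marks where the original bytes were not valid UTF-8.</div>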