[Libguestfs] [v2v PATCH 6/9] convert/libosinfo: wrap osinfo_os_get_all_devices()

Richard W.M. Jones rjones at redhat.com
Fri Jan 7 10:56:58 UTC 2022


On Fri, Jan 07, 2022 at 11:10:35AM +0100, Laszlo Ersek wrote:
> On 01/06/22 15:58, Richard W.M. Jones wrote:
> > 
> > 
> > On Thu, Jan 06, 2022 at 03:09:07PM +0100, Laszlo Ersek wrote:
> >> +value
> >> +v2v_osinfo_os_get_all_devices (value osv)
> >> +{
> >> +  CAMLparam1 (osv);
> >> +  CAMLlocal3 (retval, link, props);
> > 
> > It's a matter of style but I would tend to name variables which are
> > OCaml values (C type: value) as "<something>v", just to remind myself
> > that they have to follow the rules; see for example:
> > 
> > https://github.com/libguestfs/virt-v2v/blob/f9d5448d2efec106ab452e7b219ca179752983c3/convert/libosinfo-c.c#L108
> 
> So that's the explanation behind these "v" suffixes. :)
> 
> OK, I can do that, but I still very much want the *prefixes* to make sense. What about:
> 
> - retval_v, link_v, props_v;
> - or: retvalv, linkv, propsv?

The second one looks more consistent with what we have before ...

> I obviously based my function on v2v_osinfo_os_get_device_drivers(), and this line:
> 
>   CAMLlocal3 (rv, v, vi);

This is a bit obscure indeed.

> could not have been more obscure -- it almost felt like intentional obfuscation :)
> 
> > 
> >> +  g_autoptr (OsinfoDeviceList) dev_list = NULL;
> >> +  OsinfoList *ent_list;
> >> +  gint ent_nr;
> >> +
> >> +  retval = Val_emptylist;
> >> +  dev_list = osinfo_os_get_all_devices (OsinfoOs_t_val (osv), NULL);
> >> +  ent_list = OSINFO_LIST (dev_list);
> >> +  ent_nr = osinfo_list_get_length (ent_list);
> >> +
> >> +  while (ent_nr > 0) {
> >> +    OsinfoEntity *ent;
> >> +    size_t prop_nr;
> >> +
> >> +    --ent_nr;
> >> +    ent = osinfo_list_get_nth (ent_list, ent_nr);
> >> +
> >> +    props = caml_alloc (NUM_DEVICE_PROPS, 0);
> >> +    for (prop_nr = 0; prop_nr < NUM_DEVICE_PROPS; ++prop_nr) {
> >> +      const gchar *prop_val;
> >> +
> >> +      prop_val = osinfo_entity_get_param_value (ent, device_prop[prop_nr]);
> >> +      if (prop_val == NULL)
> >> +        prop_val = "";
> >> +      Store_field (props, prop_nr, caml_copy_string (prop_val));
> > 
> > This works because the struct only contains strings:
> > 
> > type osinfo_device = {
> >   id : string;
> >   vendor : string;
> >   vendor_id : string;
> >   product : string;
> >   product_id : string;
> >   name : string;
> >   class_ : string;
> >   bus_type : string;
> >   subsystem : string;
> > }
> > 
> > which is OK,
> 
> Yes, the code intentionally relies on this. See the comment on "device_prop":
> 
> + * All of the above properties have string values. Thus, for uniformity, access
> + * all these properties by their names at the OsinfoEntity level (i.e., forego
> + * the class- and property-specific, dedicated property getter functions).
> 
> > but doesn't libosinfo let us convert them to more natural
> > types?
> 
> Wait, I understand your question better now. Here's the thing: the *natural type* for each of these properties is *genuinely* string. It's not like I'm accessing things that are actually integers and booleans through their internal (low-level) string representation -- they are *genuinely* strings.

Fair enough.  I thought maybe libosinfo stored a type, but if they are
strings, then it's best to represent them as strings.

> I defined the "osinfo_device" OCaml type with purely string fields *because* all of the OsInfoDevice properties are strings in the first place!
> 
> I wish I could provide you a URL with http(s) scheme; however, the only place where I managed to find libosinfo documentation is my hard disk, from the "libosinfo-devel" package. So please open this URL;
> 
>   file:///usr/share/gtk-doc/html/Libosinfo/OsinfoDevice.html
> 
> and you'll see that all the properties are strings.
> 
> In fact, the first version of my array here did not contain strings (property names), but pointers to the following getter functions (which is "all of the existent getters"):
> 
> osinfo_device_get_vendor ()
> osinfo_device_get_vendor_id ()
> osinfo_device_get_product ()
> osinfo_device_get_product_id ()
> osinfo_device_get_bus_type ()
> osinfo_device_get_class ()
> osinfo_device_get_name ()
> osinfo_device_get_subsystem () 
> 
> This was possible because all of them return "const gchar *".
> 
> My first version worked fine, but then I noticed that I missed the most important property: the ID.
> 
> That property is defined by the abstract base class OsinfoEntity however:
> 
>   const gchar * osinfo_entity_get_id ()
> 
>   file:///usr/share/gtk-doc/html/Libosinfo/OsinfoEntity.html
> 
> So that was a complication: I would have to add some kind of dynamic casting, because osinfo_entity_get_id () would act at a different level than the other getters.
> 
> Here's the simplification: if I work solely at the OsinfoEntity level, and use osinfo_entity_get_param_value() exclusively, I can access all the properties uniformly, by name, regardless of whether they come from the base class or from the derived class. The property names themselves are public API; there are no tricks being pulled.
> 
> And given the actual properties at hand, the string representation returned by osinfo_entity_get_id() matches the *actual* property types -- because all those types are genuinely strings.

OK.

> > 
> > Another thing which is not actually a bug but is potentially
> > dangerous is:
> > 
> >   Store_field (props, prop_nr, caml_copy_string (prop_val));
> >                                ^^^^
> > 
> > A temporary variable is allocated to store the result of
> > caml_copy_string, so it's equivalent to:
> > 
> >   value tv = caml_copy_string (prop_val);
> >   Store_field (props, prop_nr, tv);
> > 
> > Now it turns out that Store_field (which calls caml_modify) is
> > documented to never call the garbage collector, so the block pointed
> > to by "tv" could never move, so we're OK!
> > 
> > But the code is nevertheless a bit tricky and you might want to make
> > the temporary explicit (and include it in the CAMLlocal expression at
> > the top of the function so it is a global root).
> 
> The
> 
>   Store_field (..., caml_copy_string (prop_val))
> 
> pattern was copied verbatim from the existent function v2v_osinfo_os_get_device_drivers():
> 
>     str = osinfo_device_driver_get_architecture (driver);
>     Store_field (vi, 0, caml_copy_string (str));
>     str = osinfo_device_driver_get_location (driver);
>     Store_field (vi, 1, caml_copy_string (str));
> 
> In fact, when writing my function, I meticulously reviewed the transfer (ownership) annotations of all the other (prior) libosinfo APIs used in this C source file.

Right - I don't think there's an actual bug here, but the implicit
temporary had me worried.

Thankfully, I found no bugs. I initially suspected that we had a leak here, in v2v_osinfo_os_get_device_drivers():
> 
>   OsinfoDeviceDriverList *list;
>   [...]
>   list = osinfo_os_get_device_drivers (OsinfoOs_t_val (osv));
> 
> I thought that it should have been wrapped into g_autoptr().
> 
> However, surprisingly, osinfo_os_get_device_drivers() is annotated with "transfer none" (so the code is correct!):
> 
>   file:///usr/share/gtk-doc/html/Libosinfo/OsinfoOs.html#osinfo-os-get-device-drivers
> 
> while the function I needed myself is annotated with "transfer full":
> 
>   file:///usr/share/gtk-doc/html/Libosinfo/OsinfoOs.html#osinfo-os-get-all-devices
> 
> > 
> >> +    }
> >> +
> >> +    link = caml_alloc (2, 0);
> >> +    Store_field (link, 0, props);
> >> +    Store_field (link, 1, retval);
> >> +    retval = link;
> >> +  }
> >> +
> >> +  CAMLreturn (retval);
> > 
> > So the code is fine as it turns out, but:
> > 
> > (a) Might consider using <something>v for value names
> 
> Sure, I'll do that (please advise me on the "v" vs. "_v" suffix)

"v" is more consistent.

> > 
> > (b) It'd be nice if libosinfo gave us real types
> 
> It does -- but the string representations I use here already matches the actual types in question! (That's why I went with this loop-based approach in the first place, rather than the explicit property-wise copying seen in v2v_osinfo_os_get_device_drivers().)
> 
> > 
> > (c) Better IMHO to make the temporary variable explicit
> 
> Hmmm, I'm not attracted to this; it will create a discrepancy between v2v_osinfo_os_get_device_drivers() and the new function. I took
> 
>   Store_field (..., caml_copy_string (prop_val))
> 
> as a known-safe, canned pattern.
> 
> The pattern has three *prior* uses in this source file, plus we have one
> 
>   CAMLreturn (caml_copy_string (id));
> 
> as well.
>
> If the pattern is obscure (or risky), we should eliminate all uses of it.

While it's not actually a bug, I think it would be worth eliminating
these implicit temporaries.  I'm sure we've had problems related to
this kind of thing in the past, but I couldn't find a particular
instance by looking through the git logs.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top




More information about the Libguestfs mailing list