[fedora-virt] Re: RFC: libosinfo: Library for virt OS/distro metadata 3

Daniel P. Berrange berrange at redhat.com
Mon Jun 15 12:48:05 UTC 2009

On Sun, Jun 14, 2009 at 06:50:02PM -0400, Cole Robinson wrote:
> The public API looks like:
> /**
>  * Values stored in the OS dictionary
>  */
> enum _os_value_type {
>     OS_VALUE_NAME = 1,          /** Human readable family/distro... name */
>     OS_VALUE_MEDIA_INSTALL_URL, /** URL to an install tree */
> };
> typedef enum _os_value_type os_value_t;
> int     os_init();
> void    os_close();
> int     os_find_families        (char ***list);
> int     os_find_distros         (const char *parent_id, char ***list);
> int     os_find_releases        (const char *parent_id, char ***list);
> int     os_find_updates         (const char *parent_id, char ***list);
> int     os_lookup_value         (os_value_t value_type,
>                                  const char *os_id,
>                                  char **value);
> The unique identifier for each distro is its 'id', which is a simple human
> readable string, similar to values we use for virt-install --os-variant today.

As John suggested, I think we'd be safer having opaque structs for the
conceptual objects. One for the library itself, and another for an OS

Perhaps have an 

   'os_info_t'    as a handle for a library itself returned by os_init
   'os_distro_t'  as a handle for a single OS distro instance

    os_info_t os_info_new()
    os_info_init(os_info_t *info, char *uri);   /* loads the XML data */
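
To make the handle idea a bit more concrete, here is a rough sketch of how
it could look in C. Treat it as an illustration only: I return a pointer
from os_info_new() rather than the struct by value, os_info_free() and the
struct fields are made-up names, and the XML loading is stubbed out.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: only os_info_t, os_info_new and os_info_init are
 * named in the proposal; everything else here is an assumption. */
typedef struct os_info os_info_t;   /* callers only ever see this name */

/* In a real library this definition would stay in a private .c file. */
struct os_info {
    char *uri;     /* where the XML data would be loaded from */
    int   loaded;  /* non-zero once os_info_init has succeeded */
};

/* Allocate an empty, uninitialized library handle. */
os_info_t *os_info_new(void)
{
    return calloc(1, sizeof(os_info_t));
}

/* Initialize the handle. Returns 0 on success, -1 on bad arguments,
 * allocation failure, or if the handle was already initialized. */
int os_info_init(os_info_t *info, const char *uri)
{
    size_t len;

    if (info == NULL || uri == NULL || info->loaded)
        return -1;

    len = strlen(uri) + 1;
    info->uri = malloc(len);
    if (info->uri == NULL)
        return -1;
    memcpy(info->uri, uri, len);

    /* XML parsing would happen here; this sketch only records the URI. */
    info->loaded = 1;
    return 0;
}

/* Release the handle and everything it owns (hypothetical name). */
void os_info_free(os_info_t *info)
{
    if (info == NULL)
        return;
    free(info->uri);
    free(info);
}
```

An app would call os_info_new(), then os_info_init() with whatever URI it
wants, and must hold on to the pointer until os_info_free().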

For OS distros I think we need APIs to:

 - List all OS distros
 - Find OS distros, matching a specific set of properties
 - Read a property from an OS distro
 - Read all properties from an OS distro
 - List unique values for a property across all distros
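
As a rough illustration of that list, the sketch below models distros as a
flat array with key/value properties. All function names, the struct
layout, the "kernel-type" key and the sample data are assumptions made for
the example, not a proposed final API. Listing all distros is just a find
with an empty property set, and reading all properties is a walk over the
props array.

```c
#include <stddef.h>
#include <string.h>

#define MAX_PROPS 4

struct os_prop {
    const char *key;
    const char *value;
};

struct os_distro {
    const char *name;
    struct os_prop props[MAX_PROPS];
    size_t nprops;
};

/* Illustrative data only; a real library would load this from XML. */
static const struct os_distro distros[] = {
    { "Red Hat Enterprise Linux 4.7",
      { { "publisher", "Red Hat" }, { "kernel-type", "linux" } }, 2 },
    { "Red Hat Enterprise Linux 5.0",
      { { "publisher", "Red Hat" }, { "kernel-type", "linux" } }, 2 },
    { "Fedora 10",
      { { "publisher", "Fedora Project" }, { "kernel-type", "linux" } }, 2 },
};

#define N_DISTROS (sizeof(distros) / sizeof(distros[0]))

/* Read one property from a distro; returns NULL if it is not set. */
const char *os_distro_get_prop(const struct os_distro *d, const char *key)
{
    size_t i;

    for (i = 0; i < d->nprops; i++)
        if (strcmp(d->props[i].key, key) == 0)
            return d->props[i].value;
    return NULL;
}

/* Find distros matching every property in 'want'. With nwant == 0 this
 * lists all distros. Stores at most 'max' matches in 'out' and returns
 * the number stored. */
size_t os_find_distros(const struct os_prop *want, size_t nwant,
                       const struct os_distro **out, size_t max)
{
    size_t i, j, found = 0;

    for (i = 0; i < N_DISTROS && found < max; i++) {
        for (j = 0; j < nwant; j++) {
            const char *v = os_distro_get_prop(&distros[i], want[j].key);
            if (v == NULL || strcmp(v, want[j].value) != 0)
                break;
        }
        if (j == nwant)
            out[found++] = &distros[i];
    }
    return found;
}

/* Collect the unique values of one property across all distros. Stores
 * at most 'max' values in 'out' and returns the number stored. */
size_t os_unique_values(const char *key, const char **out, size_t max)
{
    size_t i, k, n = 0;

    for (i = 0; i < N_DISTROS; i++) {
        const char *v = os_distro_get_prop(&distros[i], key);
        if (v == NULL)
            continue;
        for (k = 0; k < n; k++)
            if (strcmp(out[k], v) == 0)
                break;
        if (k == n && n < max)
            out[n++] = v;
    }
    return n;
}
```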

> The user will ask the API for available families/distros/releases/updates,
> which will return a list of ids. We then pass an id to os_lookup_value to
> actually retrieve data. The family/distro/... separation will likely be
> removed pretty soon, in favor of an arbitrary hierarchy, where every OS
> can have child OSes: no doubt hardcoding the family/distro/... split would
> come back to bite us in the ass.

I agree, the fixed hierarchy I described really doesn't seem very nice
looking back on it. The names I gave them are rather contrived and only
really map nicely onto the RHEL/Fedora release process. I think we're better
off being more flexible and allowing for arbitrary relationships in the
data files and API. I don't think we necessarily want to force a single
rooted tree structure here. The key factor with the hierarchy
is the concept of sharing metadata. 

I think we should take a hint from the way RDF works and define the API
and XML format as a flat list, but allow relationships to be defined,
and also allow tagging. 

 - Flat list of OS distros with their full name,  as defined by their

     "Red Hat Enterprise Linux 4.7"
     "Red Hat Enterprise Linux 5.0"
     "Fedora 10"
     "Fedora 10"
     "Debian Sarge"

 - A 'derived' property. Allows derived distros to declare
    they should inherit metadata (eg Scientific Linux derives from 

 - A 'clone' property. Allows functionally identical rebuilds
   to declare they use exactly same metadata. (eg CentOS / RHEL)

 - An 'upgrades' property. Allows indicating that 'Fedora 11' is the
   release following on from 'Fedora 10'. 

 - A 'publisher' property to give name of entity producing the
   distro eg  'Fedora Project', 'Red Hat', 'Microsoft'

 - A 'kernel type' and 'kernel version' property, eg 'linux'
   and '2.6.26'.

An application UI might simulate a hierarchy by using the 'publisher'
property at the first level, and then filtering the flat list of OS
distros at the 2nd level according to the selected publisher. This
satisfies the key 'UI' reason for the hierarchy. The 'derived'
and 'clone' properties allow for inheritance of metadata. 
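
To show how the two ideas fit together, here is a small sketch of a flat
list where a clone inherits an unset property by following its 'clone' or
'derived' link, plus a publisher filter for the second UI level. The
struct fields, function names and the "CentOS 5.0" entry (with its
publisher value) are invented for illustration.

```c
#include <stddef.h>
#include <string.h>

struct distro {
    const char *name;
    const char *publisher;
    const char *kernel_type;  /* NULL means "not set here, inherit it" */
    const char *clone;        /* name of the distro this rebuilds, or NULL */
    const char *derived;      /* name of the distro this derives from, or NULL */
};

/* Illustrative flat list; no tree structure is stored anywhere. */
static const struct distro flat_list[] = {
    { "Red Hat Enterprise Linux 5.0", "Red Hat", "linux", NULL, NULL },
    { "CentOS 5.0", "CentOS", NULL, "Red Hat Enterprise Linux 5.0", NULL },
    { "Fedora 10", "Fedora Project", "linux", NULL, NULL },
};

#define N_FLAT (sizeof(flat_list) / sizeof(flat_list[0]))

/* Look up a distro by its full name; returns NULL if unknown. */
const struct distro *find_by_name(const char *name)
{
    size_t i;

    for (i = 0; i < N_FLAT; i++)
        if (strcmp(flat_list[i].name, name) == 0)
            return &flat_list[i];
    return NULL;
}

/* Resolve the kernel type, following 'clone' and then 'derived' links
 * when the distro does not set it itself. The depth limit guards
 * against cycles in bad data. */
const char *resolve_kernel_type(const struct distro *d)
{
    int depth;

    for (depth = 0; d != NULL && depth < 8; depth++) {
        const char *parent;

        if (d->kernel_type != NULL)
            return d->kernel_type;
        parent = d->clone != NULL ? d->clone : d->derived;
        d = parent != NULL ? find_by_name(parent) : NULL;
    }
    return NULL;
}

/* Second level of a simulated hierarchy: the distros of one publisher.
 * Stores at most 'max' entries in 'out' and returns the number stored. */
size_t list_by_publisher(const char *publisher,
                         const struct distro **out, size_t max)
{
    size_t i, n = 0;

    for (i = 0; i < N_FLAT && n < max; i++)
        if (strcmp(flat_list[i].publisher, publisher) == 0)
            out[n++] = &flat_list[i];
    return n;
}
```

The clone only stores what is unique to it (name, publisher, and in
practice its download URLs), and everything else comes from the distro it
points at.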

> So, things that I'm interested in feedback on:
> - How do we expect apps to list OS choices? Currently, virt-manager lists
>   type (linux, windows, unix, etc.) and associated distros (Fedora 8, RHEL4,
>   Debian Lenny, etc.). The linux/windows/unix info isn't represented in the
>   xml (should it be?) so the best way seems to be:
>   Distro
>     |
>     --> Release
>           |
>           --> Update
>   Ex.
>   RHEL
>     |
>     -> RHEL5
>         |
>         -> 5.0
>            5.1
>            5.2
>   If we do away with the family/distro/... distinction, the user won't have
>   much choice in the matter, but the 'family' concept (e.g. value of
>   'Red Hat') isn't very useful to expose to a user.

We should try to avoid forcing one representation onto apps. I think the
flat OS list + sets of properties will allow apps to build a variety of
UI models for this, either search based, tree based or filter based.

> - How should we handle derivatives like Scientific Linux + CentOS: should we
>   expect users to understand they are based on RHEL, or give them explicit
>   IDs?

They need explicit IDs, since they have unique download URLs that have
to be stored. The 'clone' and 'derived' properties will allow us to avoid
duplicating other metadata, and also allow apps to show/hide clones as

> - Querying for device values (supported buses, models, etc.). Dan's original
>   proposal talks about this; to recommend a default with the best chance of
>   actually working, we need to know:
>   - OS being installed
>   - Virt type ('hvm' vs. 'xen')
>   - Guest Architecture (i386, x86_64, ...)
>   - Hypervisor (kvm, qemu, xen, vbox, ...)
>   - Hypervisor version
>   - Libvirt version
>   We would need to find the intersection of what the OS, the hypervisor,
>   and libvirt support, and return what we decide is the best choice.
>   How to expose this in the API? We could simply have one long function
>   os_lookup_device_value(char *os_id, char *virt_type, char *arch, ...)
>   It works, but its pretty tedious, and I'm afraid that we would need
>   even more info to make a correct choice in the future, and the above
>   isn't flexible. We may also need some of the above info for other values
>   (ACPI/APIC settings, returning a proper install url may depend on arch).
>   Any suggestions?

The more I think about this, the more I think we should avoid any specific
named attributes in the API. Supported devices are just another type of
property we can associate with a distro, in addition to the ones I already
listed earlier. This could be useful in the UI too, for example, if you
know the hypervisor requires support for 'Xen paravirt disk', then when
browsing OS, you can filter on this property just as you would with the

> - os_init and os_close: Any better ideas for this? os_init just parses the
>   xml document, os_close frees it. We could run os_init with the first API
>   call, but I think that makes it less clear that the user would then
>   need to call os_close().

I think it's good to keep the initializer explicit, and if you add an
opaque type representing a handle to the library, this will force apps
to call it and track it.

|: Red Hat, Engineering, London   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
