[Libvir] Proposal for dealing with host devices

Daniel P. Berrange berrange at redhat.com
Wed Apr 2 14:00:09 UTC 2008

On Wed, Apr 02, 2008 at 08:48:17AM -0400, Daniel Veillard wrote:
> On Wed, Apr 02, 2008 at 02:05:58AM +0100, Daniel P. Berrange wrote:
> > 
> > The following document illustrates an API for host device enumeration,
> > creation and deletion. This has a number of use cases:
> [...]
> > This all sounds like a lot of data / stuff to manage, but the good news
> > is that there is already an application which does most of this for
> > us. ie HAL.  So at the basic level all we need to do is map the HAL
> > properties into a libvirt XML format for describing devices.
>   Having gone through the description a couple of times, this makes sense
> to me; I just have a couple of remarks:
>    - basically we are now extending libvirt to be a generic accessor
>      for host physical data it's fine because we need it but ...

Yes, that is basically correct. We have some limited host physical data
like NUMA topology, CPU capabilities. This is extending that idea to
cover devices too.

>    - if hald daemons (I have 5 hal related daemons running on my F8 
>      desktops) had exported things in a secure way most of this would not
>      be needed, right ?

Yes & no. While this API is mostly virtualization agnostic, we'd still
need to provide a different impl for VMWare or any other hypervisor where
you don't have a Dom0 + HAL host OS available. Fortunately all our drivers
do have that currently, so it's easy in the short term, & by minimizing the
amount of data in the XML it's practical to provide an impl for VMWare in
the future if desired.

> But having a remote secure way to access hardware data is needed, if libvirt
> is the first one to provide it, why not ! It will certainly make things
> easier for libvirt users. 

Adding remote support to HAL directly would involve adding Kerberos, SSL,
x509 support to DBus, and then defining a new security mechanism for
DBus to replace the local SELinux controls, or getting them to work cross
network. Not to mention the extra administrative setup step. So I think 
it is simpler all round to proxy it in libvirt.

> [...]
> > Now some example XML descriptions....
> [..]
> > 
> > Notice how the specific functional devices like NICs, HBAs, are
> > children of the physical USB or PCI device. This is where the
> > hierarchy comes in.
>   Well except the hierarchy is not reflected at the XML structure level.
> But I understand we need to be able to isolate devices descriptions and
> this could get too complex to be represented by a tree, so that's fine.

We could define an official way to nest the <device> XML fragments, but
I can't think of any application use case (yet) where I'd want them all
nested. So I figure it's best to keep it simple now.
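
To illustrate how an application could still recover the hierarchy from
un-nested <device> fragments, here is a minimal sketch. It assumes each
device record carries the name of its parent; the flatDevice struct, the
helper names and the device names used with it are all hypothetical and
not part of the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flat record: one entry per <device> fragment. */
typedef struct {
    const char *name;
    const char *parent;   /* NULL for the root of the hierarchy */
} flatDevice;

/* Linear search by name; returns NULL if the device is unknown. */
static const flatDevice *
findDevice(const flatDevice *devs, int n, const char *name)
{
    int i;
    for (i = 0; i < n; i++)
        if (strcmp(devs[i].name, name) == 0)
            return &devs[i];
    return NULL;
}

/* Walks parent links upwards. Returns the depth (0 for a root device),
 * or -1 if the device or one of its ancestors is missing, or if the
 * chain is longer than n (which indicates a cycle). */
static int
deviceDepth(const flatDevice *devs, int n, const char *name)
{
    const flatDevice *dev;
    int depth = 0;

    assert(devs != NULL && name != NULL);
    dev = findDevice(devs, n, name);
    while (dev != NULL && dev->parent != NULL) {
        if (++depth > n)
            return -1;
        dev = findDevice(devs, n, dev->parent);
    }
    return dev == NULL ? -1 : depth;
}
```

With a table holding a root, a PCI NIC whose parent is the root, and a
network interface whose parent is the NIC, deviceDepth() on the interface
walks two parent links, so the tree can be rebuilt on the client side
without nesting in the XML.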

> > There are some devices HAL does not represent so we'll have to augment
> > the HAL information. Specifically devices which don't correspond to
> > a physical device, eg
> > 
> >   - Bonding NICs
> >   - Bridges
> >   - VLANs
> how much of that could be separated and considered a local network
> topology and extracted with the network APIs instead ?

I'm not entirely sure to be honest - that's certainly an option I've
considered. Even with this host device API I think we'll need some 
extra APIs to deal with network device configs because they get very
complex & there's a fair amount of stateful configuration / lifecycle
transitions to track which isn't something that can be expressed in
this generic device enumeration API. So perhaps we should leave out
the bonding/bridge/vlan stuff from this API for now...

> >   int
> >   virNodeNumOfDevices(virConnectPtr conn)
> > 
> >   int
> >   virNodeListDevices(virConnectPtr conn,
> >                      char **const names,
> >                      int maxnames)
>   I would add a flags for future extensibility of those 2 entry points.
> For example to be able to query 'active' devices.

There isn't really a concept of 'active' in these APIs, since they're
really expressing hardware devices & functional capabilities exposed
by the hardware. Each different type of device has its own 'lifecycle'
which may or may not involve a concept of 'active', as well as a number
of other states. So I think it's best to keep lifecycle tracking out
of this API & concentrate just on device enumeration and metadata, which
is basically what HAL does.
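
As a rough sketch of how a caller might use the count-then-list pair quoted
above: the code below does not use libvirt at all. mockConn,
mockNodeNumOfDevices() and mockNodeListDevices() are hypothetical stand-ins
that only mirror the shape of the proposed prototypes. That the caller frees
each returned name is also an assumption, not something the proposal states.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a connection: a fixed table of device names. */
typedef struct {
    const char **names;
    int ndevices;
} mockConn;

/* Mirrors the shape of the proposed virNodeNumOfDevices(): a plain count,
 * with no notion of 'active' or any other lifecycle state. */
static int
mockNodeNumOfDevices(const mockConn *conn)
{
    assert(conn != NULL);
    return conn->ndevices;
}

/* Mirrors the shape of the proposed virNodeListDevices(): copies at most
 * maxnames names into the caller's array and returns how many were stored,
 * or -1 on allocation failure. The caller frees each returned string. */
static int
mockNodeListDevices(const mockConn *conn, char **const names, int maxnames)
{
    int i, n;

    assert(conn != NULL && names != NULL);
    n = conn->ndevices < maxnames ? conn->ndevices : maxnames;
    for (i = 0; i < n; i++) {
        size_t len = strlen(conn->names[i]) + 1;
        names[i] = malloc(len);
        if (names[i] == NULL) {
            while (i-- > 0)       /* undo the copies made so far */
                free(names[i]);
            return -1;
        }
        memcpy(names[i], conn->names[i], len);
    }
    return n;
}
```

A caller would typically ask for the count first, allocate an array of that
size, then call the list function and use its return value (not the earlier
count) as the number of valid entries, since devices can come and go between
the two calls.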

> >   int
> >   virNodeNumOfDevicesByCap(virConnectPtr conn,
> > 			   const char *cap)
> > 
> >   int
> >   virNodeListDevicesByCap(virConnectPtr conn,
> > 			  const char *cap,
> > 			  char **const names,
> > 			  int maxnames)
>   How do you know the proper values for cap (or bus below) ?

They're mostly defined in the HAL spec, but there'll be some extra
ones that we add - eg for FibreChannel / NPIV vports.  Off the top of
my head there is





> >   int
> >   virNodeNumOfDevicesByBus(virConnectPtr conn,
> > 			   const char *bus)
> >   int
> >   virNodeListDevicesByBus(virConnectPtr conn,
> > 			  const char *bus,
> > 			  char **const names,
> > 			  int maxnames)
>   Okay that's server side filtering why not make it a bit more generic ?
>    int
>    virNodeNumOfDevicesSubset(virConnectPtr conn,
>                              const char *selector);
>    int
>    virNodeListDevicesSubset( virConnectPtr conn,
>                              const char *selector,
> 			     char **const names,
> 			     int maxnames)
> where the selector could be something like:
>      bus='usb'
>      cap='net'
>      bus='pci' and cap='net'
> To me the problem may quickly become to filter the available information
> while still allowing the API to be 1/ extensive 2/ fast (limited round trip)
> when you know what you want.

Yep, that's an option - on a typical machine I expect you'd have 100 - 200
core devices + 1 or more devices for each block device. So if you have a
SAN exporting hundreds of LUNs, each with several partitions, you could get
many hundreds of devices listed via this API. So filtering is definitely

> Ultimately, you may want to move a domain from one machine with a very
> specific kind of device to another one in the pool with the same hardware
> (and not used yet by a running domain), and querying for this may need
> a relatively specific selector, that's why I think just selecting on bus 
> or on capability may not be sufficient.
> Maybe HAL has better API for this.

The HAL API is basically matching the ByBus/ByCapability stuff I showed
above, but we don't have to map 1-to-1 here, because we don't have to
query HAL in real time. I expect we'll query HAL & cache the data in a
more suitable intermediate format in libvirt.
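
A minimal sketch of what filtering against such a cache could look like,
covering the bus='pci' and cap='net' style of query from the selector
suggestion. The cachedDevice layout and the function names are purely
hypothetical; the real intermediate format is undecided, and no selector
string parsing is attempted here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CAPS 4

/* Hypothetical cached record built from HAL properties. */
typedef struct {
    const char *name;
    const char *bus;
    const char *caps[MAX_CAPS];   /* unused slots are NULL */
} cachedDevice;

/* Returns 1 if the device advertises the given capability, else 0. */
static int
deviceHasCap(const cachedDevice *dev, const char *cap)
{
    int i;
    for (i = 0; i < MAX_CAPS && dev->caps[i] != NULL; i++)
        if (strcmp(dev->caps[i], cap) == 0)
            return 1;
    return 0;
}

/* Stores up to maxnames names of devices matching both criteria and
 * returns how many were stored. A NULL bus or cap acts as a wildcard,
 * so ByBus, ByCap and combined queries share one code path. */
static int
cacheListDevices(const cachedDevice *cache, int ndevices,
                 const char *bus, const char *cap,
                 const char **names, int maxnames)
{
    int i, n = 0;

    assert(cache != NULL || ndevices == 0);
    for (i = 0; i < ndevices && n < maxnames; i++) {
        if (bus != NULL &&
            (cache[i].bus == NULL || strcmp(cache[i].bus, bus) != 0))
            continue;
        if (cap != NULL && !deviceHasCap(&cache[i], cap))
            continue;
        names[n++] = cache[i].name;
    }
    return n;
}
```

Because the filtering runs over the cache on the libvirtd side, a client
that knows what it wants needs only one round trip, whichever combination
of bus and capability it asks for.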

> >   virNodeDevicePtr
> >   virNodeDeviceLookupByName(virConnectPtr conn, const char *name)
> > 
> >   virNodeDevicePtr
> >   virNodeDeviceLookupByName(virConnectPtr conn, const char *key)
>   you mean virNodeDeviceLookupByKey there

Yes. Cut & paste mistake.

> > 
> >   virNodeDevicePtr
> >   virNodeDeviceCreate(virConnectPtr conn,
> >                       const char *xml)
> > 
> >   int
> >   virNodeDeviceDestroy(virNodeDevicePtr dev)
>   Except for the querying capability this sounds fine to me. Of course
> at some point people may need a way to look up changed information (I'm not
> sure we want to provide a callback based API; something querying for changes
> since the last check might be more useful, and state could be preserved in
> libvirtd).

HAL lets you get notifications on property changes, so we can definitely
provide some form of callback to be notified when a device changes, or
when a device is created / deleted.
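
One possible shape for such a callback mechanism, purely as a sketch: the
event enum, the callback signature and the registry below are all
hypothetical and are not a proposed libvirt API. The dispatch function
stands in for whatever code would translate HAL notifications into events.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event types matching the cases mentioned above. */
typedef enum {
    DEV_EVENT_CREATED,
    DEV_EVENT_DELETED,
    DEV_EVENT_CHANGED
} devEventType;

typedef void (*devEventCallback)(const char *name,
                                 devEventType type,
                                 void *opaque);

#define MAX_CALLBACKS 8

typedef struct {
    devEventCallback cb[MAX_CALLBACKS];
    void *opaque[MAX_CALLBACKS];
    int ncallbacks;
} devEventRegistry;

/* Adds a callback; returns 0 on success, -1 if the registry is full. */
static int
devEventRegister(devEventRegistry *reg, devEventCallback cb, void *opaque)
{
    assert(reg != NULL && cb != NULL);
    if (reg->ncallbacks >= MAX_CALLBACKS)
        return -1;
    reg->cb[reg->ncallbacks] = cb;
    reg->opaque[reg->ncallbacks] = opaque;
    reg->ncallbacks++;
    return 0;
}

/* Invokes every registered callback in registration order. */
static void
devEventDispatch(const devEventRegistry *reg, const char *name,
                 devEventType type)
{
    int i;
    for (i = 0; i < reg->ncallbacks; i++)
        reg->cb[i](name, type, reg->opaque[i]);
}

/* Example callback: counts events per type in an int[3] passed as opaque. */
static void
countEvents(const char *name, devEventType type, void *opaque)
{
    int *counts = opaque;
    (void)name;
    counts[type]++;
}
```

An application would register once and then update its own device list as
events arrive, instead of re-enumerating everything. A polling "changes
since last check" call could be layered on the same events by queueing them
in libvirtd.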

> > I don't propose to expose all the data - only specific properties we have
> > immediate need for. The HAL spec describes the meaning of various props
>   The only thing I would like to avoid is the immediate-need viewpoint
> leading to an API which we would have to modify and deprecate once used
> for a couple of years :-)

Most of the stuff we'd have to add over time is likely to be just new
elements/attributes in the XML. The HAL api itself hasn't really changed
in any significant way in the last few years - they aim to provide a stable
API themselves which is good for us.

|: Red Hat, Engineering, Boston   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
