[Libvir] using gnulib: starting with the physmem and getaddrinfo modules

Wed Dec 5 02:36:49 UTC 2007

On Wed, Dec 05, 2007 at 12:33:24AM +0100, Jim Meyering wrote:
> The first portability problem was to determine the total physical memory
> available on the current system.  Currently the code works only on
> Linux-like systems that have /proc/meminfo of an expected form.  However,
> the gnulib physmem module handles 13 distinct types of systems and is
> well tested:
> 
>     http://www.gnu.org/software/gnulib/MODULES.html#module=physmem

No argument there. I'm not fond of my current code code physical mem.

> The second portability problem was to find a robust and LGPL-compatible
> getaddrinfo function to be used on systems lacking it.  Here's the
> gnulib module:
> 
>     http://www.gnu.org/software/gnulib/MODULES.html#module=getaddrinfo

Yep, need it for windows

> Once the few gnulib hooks are in your configure.ac and Makefile.am
> files, there is very little extra work required to use the new
> functions.  In the case of physmem, just use the function and
> include "physmem.h".  For getaddrinfo, merely include "getaddrinfo.h"
> from the two files that use the function.

I've not looked closely, but since we're unconditiaonlly including
it, I assume the getaddrinfo.h file is setup to not override the
existing definition of getaddrinfo if the local OS has it.

> This change brings in a lot of code, but many of the lib/.[ch] files
> are used only on systems that lack some required functionality.  For
> example, the getaddrinfo.c file isn't even compiled when it's not
> needed.
> 
> In the patch below, I've included a new script called bootstrap.
> It is a wrapper around gnulib-tool that pulls into libvirt the
> files selected by the (currently two) modules in use.  Those new
> files go in three places:
> 
>   m4/*.m4
>   lib/*.[ch] and a few template .h.in files
>   gl-tests/ for unit test C programs and Bourne shell scripts

I think the source code should go into  gnulib/*.[ch]  in case we
ever want to have a lib/ dir for code shared by the daemon & library.
There's no need to pollute the top level dir with gl-tests, when we
can have  tests/gnulib/, or  gnulib/tests/.  We've already got an m4/
directory, so we might as well use that (or a m4/gnulib, or gnulib/m4
subdir).

> However, note that gettextize and libtoolize (run by autogen.sh)
> also deposit many *.m4 files in m4.  I compared and found that 8
> of the files that are already pulled in by the *ize programs are
> also pulled in (potentially newer versions) from gnulib.  But currently,
> using gettext-0.16.1 or gettext-0.17, there is no difference in any
> of the overlapping files.

This sounds like the trickiest issue. We currently run both of those
tools using --force, so they overwrite any existing files present in
the local tree. IIRC libtoolize shouldn't touch the m4/ directory,
but gettextize (or rather autopoint) definitely does.

> Here's the patch that shows what existing parts of libvirt have to
> be modified to use these two new modules.  To try it out, just apply
> the patch and then run this:
> 
>     ./autogen.sh && ./bootstrap && make && make check
> 
> Running bootstrap creates the new lib/ and gl-tests/ directories.

I don't want a dependancy on a script pulling in files from another random 
project that we have to grab from the internet.

> Personally, I prefer not to add generated files to version control
> systems, because it can lead to problems with version skew if all
> developers don't use the same releases of the tools that do the
> generating.  Perhaps more importantly, when there are massive diffs in
> the generated files, that can obscure real changes in non-generated parts
> of the code.  That already happens to me whenever the .po files change.

Given a fresh CVS checkout of libvirt & fresh Fedora/RHEL/etc OS 
install one should be able to do a full build without requiring extra
code checkouts from unrelated projects. Since gnulib is only distributed
in source form, and not available from the OS package manager, the
only viable approach is to commit the GNULib files to CVS.

IMHO, this *reduces* the problem of version skew, because it doesn't
rely on each developer getting the same copy of gnulib. We can update
the in-CVS copy of the gnulib bits we care about as & when we need them
and know everyone has the same bits. If gnulib really is as stable as
its said to be, then we won't be seeing frequent massive diffs - we'll
just have minor additions as we find more portability modules we need.

> But if people prefer to add all of these imported files to CVS, just
> say the word and I'll prepare the patch.  If so, do you guys want the
> gettextize- and libtoolize-added files to be version-controlled, now, too?

So the gettextize script copies its stuff into m4/ when you run the autogen
script. We need to make sure the gnulib stuff in CVS takes priority over
stuff added by gettextize.  Does aclocal  have any kind of ordering when
you give it multiple include paths with -I arg. eg would 

   'aclocal -I m4/ -I gnulib/m4'

ensure correct prioritization ?

Another option if we  go with the idea of having gnulib stuff in a subdir
like gnulib/m4/*.m4, would be to change autogen.sh to do

   autopoint --force
   cp -f gnulib/m4/*.m4 m4/

ie, always use the top-level m4/ directory as a scratch area for stuff we
don't commit to CVS, but that is used when creating configure.in.

On the other hand, this means the released tar.gz would have 2 copies
of the m4, so a 3rd option is to just remove the --force arg from autopoint
and let the gnulib stuff naturally take priority by virtue of already existing
on disk when autopoint is run.

Dan.
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=-           Perl modules: http://search.cpan.org/~danberr/              -=|
|=-               Projects: http://freshmeat.net/~danielpb/               -=|
|=-  GnuPG: 7D3B9505   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505  -=|