[Libguestfs] 1.39 proposal: Let's split up the libguestfs git repo and tarballs

Richard W.M. Jones rjones at redhat.com
Mon Jun 10 15:35:52 UTC 2019


Sorry for the late reply to this ...

On Tue, Apr 30, 2019 at 06:28:01PM +0200, Pino Toscano wrote:
> On Friday, 9 February 2018 19:01:53 CEST Richard W.M. Jones wrote:
> > My contention is that the libguestfs git repository is too large and
> > unwieldy.  There are too many separate, unrelated projects and as a
> > result of that the source has too many dependencies and takes too long
> > to build and test.
> > 
> > The project divides (sort of) naturally into layers -- the library,
> > the bindings, the various virt tools -- and could be split along those
> > lines into separate projects which can then be released and evolve at
> > their own pace.
> 
> As other answers to this email also say, splitting the tools and
> bindings may be very complex, so for now it is still too distant a goal.
> 
> However...
> 
> > My suggested split would be something like this:
> > 
> > [...]
> >        virt-v2v and virt-p2v
> 
> I'd rather split virt-p2v in its own repository.  There are various
> reasons for this:
> - it does not use libguestfs (the library), just the tools for testing
>   stuff
> - the communication with virt-v2v is done via network, and its
>   capabilities are dynamically probed (so theoretically virt-p2v, and
>   virt-v2v can be used even when their versions differ)
> - it is written only in C
> 
> However, even if it looks simple, in reality there are a number of common
> things used from the rest of the libguestfs tree:
> 1) gnulib

We hardly use gnulib in virt-p2v.  I think it's only used for
ignore-value.h, getprogname.h, and c-ctype.h, all of which are likely
to be easily worked around.
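
For illustration, here is a minimal sketch of what such workarounds
might look like.  All of the p2v_* names are hypothetical and not taken
from the libguestfs tree, the ignore-value stand-in assumes GCC or Clang
extensions, and the character tests assume ASCII:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Stand-in for gnulib's ignore_value(): evaluate an expression and
 * discard the result, so that warn_unused_result does not fire.
 * Assumes GCC/Clang (__typeof__ and statement expressions). */
#define p2v_ignore_value(x) \
  (__extension__ ({ __typeof__ (x) p2v_ignored_ = (x); (void) p2v_ignored_; }))

/* Locale-independent character tests, covering typical uses of
 * c-ctype.h.  Assumes ASCII, where '\t'..'\r' are contiguous. */
static inline bool
p2v_c_isspace (int c)
{
  return c == ' ' || (c >= '\t' && c <= '\r');
}

static inline bool
p2v_c_isdigit (int c)
{
  return c >= '0' && c <= '9';
}

/* Stand-in for getprogname(): remember the basename of argv[0]. */
static const char *p2v_progname = "unknown";

static inline void
p2v_set_progname (const char *argv0)
{
  assert (argv0 != NULL);
  const char *slash = strrchr (argv0, '/');
  p2v_progname = slash != NULL ? slash + 1 : argv0;
}

static inline const char *
p2v_getprogname (void)
{
  return p2v_progname;
}
```

In this sketch the program would call p2v_set_progname (argv[0]) once
at the top of main, after which p2v_getprogname () can be used wherever
getprogname () was used before.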

> 2) some build system bits (e.g. m4/guestfs-v2v.m4)

Right, although this in itself should be split up, so no bad thing.

> 3) auto-cleanup bits (e.g. CLEANUP_FREE), although only a few are used
>    (CLEANUP_FREE, CLEANUP_FREE_STRING_LIST, CLEANUP_PCLOSE,
>    CLEANUP_FCLOSE, and CLEANUP_XMLFREETEXTWRITER)
> 4) other internal macros, i.e. guestfs-utils.h

Common code is a bit trickier, as is ...

> 5) the list of credits generated by the generator
>    (i.e. generator/authors.ml)
> 6) the p2v configuration generated by the generator
>    (i.e. generator/p2v_config.ml)

... the generator and ...

> 7) test images/data (phony images, and virt-tools)

test data.

> 8) the miniexpect module, right now outside the p2v subdirectory

This is only used by virt-p2v I think, so it could go with virt-p2v or
be made into a separate project.

> Possible solutions may/might be:
> 1) add its own gnulib submodule (using its own set of modules)

I think we should ditch gnulib as much as possible, so see above.

> 2) copy/implement them locally: luckily they are not many, so
>    inlining them in configure.ac will not be a problem; the common
>    bits (e.g. the distro detection from os-release) can be split in
>    its own module in libguestfs, copying it in p2v
> 3/4) have a local version of them; not pretty, although they are not
>      that many
> 5) this list is reflected in two places: the p2v/about-authors.c file,
>    and the AUTHORS file (theoretically mandatory for automake, unless
>    "foreign" is used, which it is); my idea was to go back to a manually
>    written about-authors.c file without the libguestfs credits, leaving
>    the few p2v ones easy to manage; the same for the AUTHORS file
> 6) this is a bit more complex: my idea was to keep it as OCaml script
>    to run at build time, instead of being statically shipped at dist
>    time
> 7) create their own versions at test time using guestfish/virt-builder;
>    maybe use a fedora image, instead of a phony windows one (will avoid
>    hivex for the tests)
> 8) 

So while I'm not a massive fan of git submodules, now that I have used
them a few times with riscv stuff, they do solve a certain problem as
long as they are managed carefully.  I think the common code and the
generator are cases where a submodule or two would work.

Does this mean we need to move immediately to a submodule if just
splitting virt-p2v, or copy code as you suggest?  Maybe not, because
you can imagine for just this project copying the code needed from the
common/ directory, and creating a new "mini-generator" for the project
which handles the little bits that need to be generated in virt-p2v.
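
To give a rough idea of how small a copied version of the auto-cleanup
bits could be, here is a sketch covering two of the macros listed above.
The p2v_cleanup_* helper names and the first_line_length example are
hypothetical, and the code assumes a compiler that supports the GCC
"cleanup" variable attribute (GCC or Clang).  The remaining macros would
follow the same pattern:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Each cleanup handler receives a pointer to the annotated variable. */
static void
p2v_cleanup_free (void *ptr)
{
  free (* (void **) ptr);
}

static void
p2v_cleanup_fclose (void *ptr)
{
  FILE *fp = * (FILE **) ptr;
  if (fp != NULL)
    fclose (fp);
}

#define CLEANUP_FREE __attribute__((cleanup(p2v_cleanup_free)))
#define CLEANUP_FCLOSE __attribute__((cleanup(p2v_cleanup_fclose)))

/* Example use: return the length of the first line of a file, or -1
 * on error.  Both resources are released on every return path. */
static int
first_line_length (const char *path)
{
  assert (path != NULL);
  CLEANUP_FCLOSE FILE *fp = fopen (path, "r");
  CLEANUP_FREE char *buf = malloc (256);

  if (fp == NULL || buf == NULL)
    return -1;
  if (fgets (buf, 256, fp) == NULL)
    return -1;
  return (int) strcspn (buf, "\n");
}
```

The handlers run in reverse order of declaration when the variables go
out of scope, so the early returns above need no explicit free or
fclose calls.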

However, in the long term, if we split up everything, a submodule or
two does seem to make sense, so maybe we should start there?

> The other problem is how to split the repository, as the various bits
> are in different places:
> a) git filter-branch --subdirectory-filter p2v
> + very small repo with the current p2v subdirectory
> + preserves the history of the p2v subdirectory, with branches and tags
> - missing all the other bits, which will have no history
> - not usable to build older releases (e.g. for bisecting)

I'm not exactly sure what this does.  Is this something to do with
preserving the history?  TBH I don't think we need to bother with the
history -- it exists still in libguestfs.git.

> b) create a work branch in libguestfs, then in that branch move/copy all
> the stuff making the p2v subdirectory build standalone there, and then
> import the content of the p2v subdirectory of that branch in a new empty
> repo
> + very small repo with the current p2v subdirectory
> - no history, no tags nor branches
> + using a graft it is possible to "stitch" the history of the new repo
>   with the work branch in libguestfs
> 
> c) git filter-branch to remove all the bits not related to p2v from all
> the commits
> + not that big repo
> + preserves the history of all the content, with branches and tags
> - will take a very long time to create (e.g. iterate over and over to
>   find out what to remove)
> - not usable to build older releases (e.g. for bisecting)

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/