[Libguestfs] Splitting up virt-v2v

Richard W.M. Jones rjones at redhat.com
Wed Nov 25 10:29:45 UTC 2020

For a long time I've wanted to split up virt-v2v into smaller
components to make it easier to consume.  It's never been clear how to
do this, but I think I have a workable plan now, described in this email.


First, the AIMS, which are:

(a) Preserve current functionality, including copying conversion,
    in-place conversion, and the virt-v2v command line.

(b) Allow warm migration to use virt-v2v without requiring the
    "--debug-overlays" hack.

(c) Allow threads, multi-conn, and parallel copying of guest disks, all
    for better copying performance.

(d) Allow an alternate supervisor to convert and copy many guests in
    parallel, given that the supervisor has a global view of the
    system/network (I'm not intending to implement this, only to make
    it possible).

(e) Better progress bars.

(f) Better logging.

(g) Reuse as much existing code as possible.  This is NOT a rewrite!


Here's my PLAN:

/usr/bin/virt-v2v still exists, but it's now a supervisor program
(possibly even a shell script) that runs the steps below:

(1) Set up the input side by running "helper-v2v-input-<type>".  For
    all input types this creates a temporary directory containing:

    /tmp/XXXXXX/in1    NBD endpoints overlaying the source disk(s)
    /tmp/XXXXXX/in2    (these are actually Unix domain sockets)
    /tmp/XXXXXX/metadata.in   Metadata parsed from the source.

    Currently for most inputs we have a running nbdkit process for
    each source disk, and we'd do the same here, except we add
    nbdkit-cow-filter on top so that the source disk is protected from
    being modified.  Another small difference is that for -i disk
    (local input) we would need an active nbdkit process on top of the
    disk, whereas currently we set the disk as a qcow2 backing file.

(2) Perform the conversion by running "helper-v2v-convert".  This does
    the conversion and sparsification.  It writes directly to the NBD
    endpoints (in*) above.  The writes are stored in the COW overlay
    so the source disk is not modified.

    Conversion will also create an output metadata file:

    /tmp/XXXXXX/metadata.out   Target metadata

    Exact format of the metadata files is to be decided, but some kind
    of not-quite-libvirt-XML may be suitable.  It's also not clear if
    the metadata format is an internal detail of virt-v2v, or if we
    document it as a stable API.

(3) Set up the output side by running "helper-v2v-output-<type>
    setup".  This will read the output metadata and do whatever is
    needed to set up the empty output disks (perhaps by creating a
    guest on the target, although this could also be done in step (5)).

    This will create:

    /tmp/XXXXXX/out1    NBD endpoints overlaying the target disk(s)
    /tmp/XXXXXX/out2    (these are actually Unix domain sockets)

(4) Do the copy.  By default this will run either nbdcopy or qemu-img
    convert from in* -> out*.

    Copying could be done in parallel; currently it is done serially.

(5) Finalize the output by running "helper-v2v-output-<type> final".
    This might create the target guest and whatever else is needed.

(6) Kill the NBD servers and clean up the temporary directory.
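
The steps above can be sketched as an ordered list of commands that a supervisor would run.  This is only a sketch: the helper names come from the plan, but the arguments passed to them (the temporary directory and the source) are hypothetical, since the helper command lines are not yet defined.  The copy step assumes nbdcopy and NBD URIs pointing at the Unix domain sockets.

```python
from pathlib import PurePosixPath
from typing import List, NamedTuple


class Step(NamedTuple):
    name: str        # label used in logs and error reports
    argv: List[str]  # command line the supervisor would run


def nbd_unix_uri(socket: PurePosixPath) -> str:
    """NBD URI for a server listening on a Unix domain socket."""
    return f"nbd+unix:///?socket={socket}"


def build_plan(workdir: PurePosixPath, input_type: str, output_type: str,
               source: str, ndisks: int) -> List[Step]:
    """Return steps (1) to (5) in order.  Step (6), cleanup, is left to
    the caller.  Helper arguments are hypothetical placeholders."""
    d = str(workdir)
    plan = [
        Step("input", [f"helper-v2v-input-{input_type}", d, source]),
        Step("convert", ["helper-v2v-convert", d]),
        Step("output-setup", [f"helper-v2v-output-{output_type}", "setup", d]),
    ]
    # One copy per disk: in1 -> out1, in2 -> out2, ...
    for i in range(1, ndisks + 1):
        plan.append(Step(f"copy-disk-{i}", [
            "nbdcopy",
            nbd_unix_uri(workdir / f"in{i}"),
            nbd_unix_uri(workdir / f"out{i}"),
        ]))
    plan.append(
        Step("output-final", [f"helper-v2v-output-{output_type}", "final", d]))
    return plan
```

Keeping the plan as plain data means the same list could be executed serially by /usr/bin/virt-v2v or rearranged by an alternate supervisor.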


Let's see how this plan matches the aims.

Aim (a):

  Copying conversion works as outlined above.  In-place conversion
  works by placing an NBD server on top of the files you want to
  convert and running helper-v2v-convert (virt-v2v --in-place would
  also still work for backwards compat).
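
As a rough illustration of "placing an NBD server on top of the files", here is how a supervisor might build the nbdkit command line for a local disk.  It assumes the nbdkit file plugin and a Unix domain socket.  The function name and the protect_source switch are made up for this sketch: with the cow filter, writes land in an overlay (the copying case); without it, writes go straight to the file (the in-place case).

```python
from typing import List


def nbdkit_argv(disk: str, socket: str, protect_source: bool) -> List[str]:
    """Build an nbdkit command serving `disk` on the Unix socket `socket`."""
    argv = [
        "nbdkit",
        "-f",                  # stay in the foreground
        "--exit-with-parent",  # exit when the supervisor exits
        "-U", socket,          # listen on a Unix domain socket
    ]
    if protect_source:
        # Copy-on-write overlay: the source disk is never modified.
        argv.append("--filter=cow")
    argv += ["file", f"file={disk}"]
    return argv
```

For example, nbdkit_argv("guest.img", "/tmp/XXXXXX/in1", True) gives the protected setup described in step (1), and passing False gives the in-place variant.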

Aim (b):

  Warm migration: It should be fairly clear this can work in the same way
  as in-place conversion, but I'll discuss this further with Martin K
  and Tomas to make sure I'm not missing anything.

Aims (c), (d):

  Threads etc. for performance: Although I don't plan to implement
  this, it's clear that an alternate supervisor program could improve
  performance here, either by copying the disks of a single guest in
  parallel or, even better, by using its global view of the system to
  copy the disks of multiple guests in parallel.

  This is outside the scope of the virt-v2v project, but in scope for
  something like MTV.
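
To make the idea concrete, here is a minimal sketch of what such an alternate supervisor might do for the copy step.  It assumes each copy is an nbdcopy invocation between two NBD URIs; the function names and the max_workers default are illustrative only, and a real supervisor would pick the degree of parallelism from its view of the network and storage.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple

Copy = Tuple[str, str]  # (source NBD URI, destination NBD URI)


def run_nbdcopy(copy: Copy) -> int:
    """Run one nbdcopy process and return its exit status."""
    src, dst = copy
    return subprocess.run(["nbdcopy", src, dst]).returncode


def copy_in_parallel(copies: Sequence[Copy],
                     run: Callable[[Copy], int] = run_nbdcopy,
                     max_workers: int = 4) -> List[int]:
    """Run up to max_workers copies at once.  The returned exit
    statuses are in the same order as `copies`."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run, copies))
```

The copies passed in may belong to one guest or to many; the function does not care, which is the point of leaving scheduling to the supervisor.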

Aim (e):

  Better progress bars: nbdcopy should have support for
  machine-readable progress bars, once I push the changes.  It will
  mean no more need to parse debug logs.
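
For illustration, this is roughly what consuming machine-readable progress could look like on the supervisor side.  The format is not specified in this email; the sketch simply assumes one update per line of the form "N/100", and ignores anything else.

```python
from typing import Iterable, Iterator


def parse_progress(lines: Iterable[str]) -> Iterator[int]:
    """Yield percentages from lines of the assumed form 'N/100'.
    Lines that do not match are skipped."""
    for line in lines:
        num, sep, den = line.strip().partition("/")
        if sep and num.isdigit() and den == "100":
            yield int(num)
```

A supervisor could feed this from a pipe connected to the copying process and update its own progress bar from the yielded values.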

Aim (f):

  Better logging: I hope we can log each step separately.

  A custom supervisor program would also be able to tell which
  particular step failed (e.g. did it fail in conversion?  Did it fail
  copying a disk, and if so which one?).
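
A small sketch of how per-step logging and failure reporting might look in a supervisor.  The step names, the one-log-file-per-step layout, and the StepFailed exception are all assumptions made for the example, not part of the plan.

```python
import subprocess
from pathlib import Path
from typing import List, Sequence, Tuple


class StepFailed(Exception):
    """Raised when a step exits non-zero; records which one and its log."""

    def __init__(self, name: str, returncode: int, log: Path):
        super().__init__(
            f"step {name!r} failed with exit code {returncode}; see {log}")
        self.name = name
        self.returncode = returncode
        self.log = log


def run_steps(steps: Sequence[Tuple[str, List[str]]], logdir: Path) -> None:
    """Run (name, argv) steps in order, each with its own log file.
    Stop at the first failure and report which step it was."""
    for name, argv in steps:
        log = logdir / f"{name}.log"
        with open(log, "wb") as f:
            # stdout and stderr of the step both go to its own log file.
            proc = subprocess.run(argv, stdout=f, stderr=subprocess.STDOUT)
        if proc.returncode != 0:
            raise StepFailed(name, proc.returncode, log)
```

Naming the copy steps per disk (for example "copy-disk-2") is enough for the caller to tell which disk failed.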

Aim (g):

  This works by splitting up the existing v2v code base into separate
  binaries.  It is already broadly structured (internally) like this.
  So it's not a rewrite, it's a big refactoring.

  However I'd probably write a new virt-v2v supervisor binary, because
  the existing command line parsing code is extremely complex.


Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
