[Libguestfs] OpenStack output workflow

Wed Sep 26 14:24:56 UTC 2018

On Wed, Sep 26, 2018 at 02:40:54PM +0200, Fabien Dupont wrote:
> [Adding Tomas Golembiovsky]
>
> Well, that's mainly IMS related challenges. We're working on
> OpenStack output support and migration throttling and this implies
> changes to virt-v2v-wrapper.  This is then the opportunity to think
> about virt-v2v-wrapper maintenance and feature set. It has been
> created in the first place to simplify interaction with virt-v2v
> from ManageIQ.

Stepping back here, the upstream community on this mailing list have
no idea what you mean by the terms "IMS", "virt-v2v-wrapper" and
"ManageIQ".  I'll try to explain briefly:

* ManageIQ (http://manageiq.org/) = a kind of scriptable, universal
  management tool for cloudy things, using Ansible.

* Ansible = automates remote management of machines using ssh.

* IMS = an internal Red Hat project to add virt-v2v support to
  ManageIQ.

* virt-v2v-wrapper = a wrapper around virt-v2v which allows it to be
  called from Ansible.  The reason we need this is because Ansible
  doesn't support managing long-running processes like virt-v2v, so we
  need to have another component which provides an API which can be
  queried remotely, while tending to the long-running virt-v2v behind
  the scenes.  [Tomáš: Got a link to the code?  I can't find it right now]

> The first challenge we faced is the interaction with virt-v2v. It's
> highly versatile and proposes a lot of options for input and
> output. The downside of it is that over time it becomes more and
> more difficult to know them all.

The options are all documented in the manual.  I have thought for a
while that we need to tackle the virt-v2v manual: It's too big, and
unapproachable.  Really I think we need to split it into topic
sections and rewrite parts of it.  Unfortunately I've not had time to
do that so far.

> And all the error messages are made for human beings, not machines,
> so providing feedback through a launcher, such as virt-v2v-wrapper,
> is difficult.

This is indeed an issue.  Pino recently added enhanced support for the
‘--machine-readable’ option which should address some problems:

  https://github.com/libguestfs/libguestfs/commit/afa8111b751ed33e1989e6d9bb03928cefa17917

If this change still doesn't fully address the issues with automating
virt-v2v then please let us know what specifically can be improved
here.

[...]
> For progress, the only way to know what happens is to run virt-v2v
> in debug mode (-v -x) and parse the (very extensive)
> output. Virt-v2v-wrapper does it for us in IMS, but it is merely a
> workaround.

Right, this is indeed another problem which we should address.  I
thought we had an RFE filed for this, but I cannot find it.  At the
moment the workaround you mention is very ugly and clunky, but AFAIK
it does work.
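
For readers who have not seen the workaround, here is a minimal sketch
of scraping progress from the debug output.  It is not the code used
in virt-v2v-wrapper.  The two line shapes it matches ("Copying disk
N/M" and "(NN.NN/100%)") are assumptions about what virt-v2v prints
and should be checked against the version in use; the function names
are made up for illustration.

```python
import re

# Assumed line shapes (not confirmed in this thread; verify locally):
#   "[  12.3] Copying disk 1/2 to /path (raw)"
#   "    (45.00/100%)"
# Percentage updates may be separated by "\r" rather than "\n", so the
# caller is expected to split the stream on both before passing it in.
DISK_RE = re.compile(r"Copying disk (\d+)/(\d+)")
PCT_RE = re.compile(r"\((\d+(?:\.\d+)?)/100%\)")


def parse_progress(lines):
    """Yield (disk, total_disks, percent) each time a percentage is seen."""
    disk = total = None
    for line in lines:
        m = DISK_RE.search(line)
        if m:
            disk, total = int(m.group(1)), int(m.group(2))
            continue
        m = PCT_RE.search(line)
        if m and disk is not None:
            yield disk, total, float(m.group(1))


def overall_percent(disk, total, percent):
    """Rough overall figure, treating every disk as the same size."""
    return ((disk - 1) * 100.0 + percent) / total
```

Treating all disks as equal in size is a simplification; a real
implementation would weight each disk by its size.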

> I'd expect a conversion tool to provide a comprehensive progress,
> such as "I'm converting VM 'my_vm' and more specifically disk X/Y
> (XX%). Total conversion progress is XX%". Of course, I'd also expect
> a machine readable output (JSON, CSV, YAML…). Debug mode ensures we
> have all the data in case of failure, so I don't say remove it, but
> simply add specialized outputs.

We can discuss debug output vs progress output and formats to use
separately when fixing the above, but yes, point taken.

> The third challenge was to clean up in case of virt-v2v failure. For
> example, when it fails converting a disk to RHV, it doesn't clean
> the finished and unfinished disks.

This is a bug (https://bugzilla.redhat.com/show_bug.cgi?id=1616226).
It's been on my to-do list for quite a while, but I haven't got to
it, so patches welcome ...

> Virt-v2v-wrapper was initially written by RHV team (Tomas) for RHV
> migrations, so it sounded fair(ish). But, extending the outputs to
> OpenStack, we'll have to deal with leftovers in OpenStack too. Maybe
> a cleanup on failure option would be a good idea, with a default to
> false to not break existing behaviour.

The issue of cleaning up disks in general is a hard one to solve.

With the OpenStack backend we try our best as long as virt-v2v
exits on a normal failure path:

  https://github.com/libguestfs/libguestfs/blob/e2bafffce24cd8c0436bf887ee166a3ae2257bbb/v2v/output_openstack.ml#L370-L384

However there are always going to be cases where that is not possible
(eg. virt-v2v segfaults or is kill -9'd or whatever), and in that case
I envisaged for OpenStack some sort of external garbage collector.  To
this end, disks which have not been finalized are given a special
description so it should be possible to find them after a full
migration has completed:

  https://github.com/libguestfs/libguestfs/blob/e2bafffce24cd8c0436bf887ee166a3ae2257bbb/v2v/output_openstack.ml#L386-L392

IIRC virt-v2v-wrapper is sending kill -9 to virt-v2v, which it should
not do.
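
As an illustration of the alternative, a wrapper could send SIGTERM
and wait, so that virt-v2v gets a chance to leave through a normal
failure path.  This is a sketch only: it assumes virt-v2v runs its
cleanup when it receives SIGTERM, and the helper name and timeout are
hypothetical.

```python
import subprocess
import sys


def stop_gracefully(proc, timeout=30.0):
    """Ask a child process to exit with SIGTERM and wait for it.

    Returns the exit status, or None if the process is still running
    after `timeout` seconds.  It never escalates to SIGKILL; what to do
    with a stuck process is left to the caller.
    """
    if proc.poll() is None:
        proc.terminate()  # SIGTERM on POSIX
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        return None


if __name__ == "__main__":
    # Stand-in for a long-running virt-v2v process.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"])
    print(stop_gracefully(child))
```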

> The fourth challenge is to limit the resources allocated to virt-v2v
> during conversion, because concurrent conversions may have a huge
> impact on conversion host performance. In the case of an oVirt host,
> this can impact the virtual machines that run on it. This is not
> covered yet by the wrapper, but implementation will likely be based
> on Linux cgroups and tc.

Right, sounds sensible.

> The wrapper also adds an interesting feature: both virt-v2v and
> virt-v2v-wrapper run daemonized and we can asynchronously poll the
> progress. This is really key for IMS (and maybe for others): this
> allows us to start as many conversions in parallel as needed and
> monitor them. Currently, the Python code forks and detaches itself,
> after providing the paths to the state file. In the discussion about
> cgroups, it was mentioned that systemd units could be used, and it
> echoes with the daemonization, as systemd-run allows running
> processes under systemd and in their own slice, on which cgroups
> limits can be set.

"the Python code" meaning the code in virt-v2v-wrapper?

I also agree that using a systemd temporary unit is the way to go
here.  As well as providing a natural way to limit the resources used
by virt-v2v (since systemd unit implies a cgroup), it also solves the
problems around logging and collecting debug logs.

> About the evolution of virt-v2v-wrapper that I'm going to describe,
> let me state that this is my personal view and it endorses only
> myself.
>
> I would like to see the machine-to-machine interaction, logging and
> cleanup in virt-v2v itself because it is valuable to everyone, not
> only IMS.

Here's where I think we are going to disagree.  virt-v2v is a command
line tool, with lots of users outside of the internal IMS Red Hat
project.  Here in the upstream community I'm afraid we make decisions
which are best for all our users, not for one particular user!

But in any case I don't see how adding systemd unit support to
virt-v2v itself helps very much.  It's really easy to run virt-v2v in
a systemd unit -- see the attached email for full details.

This gains all the benefits I mention above and is hardly any effort
at all.  You can even adjust the properties on the fly.
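
The following sketch is not a copy of the attached email.  It only
shows the general shape of the approach, building the command lines a
wrapper might run.  The unit name, the CPUQuota values and the helper
names are hypothetical, and the virt-v2v arguments are left
incomplete on purpose.

```python
import shlex


def systemd_run_argv(unit, command, properties=None):
    """Build argv that starts `command` as a transient systemd service."""
    argv = ["systemd-run", "--unit=" + unit]
    for key, value in sorted((properties or {}).items()):
        argv.append("--property={}={}".format(key, value))
    return argv + list(command)


def set_property_argv(unit, properties):
    """Build argv that changes the limits of an already running unit."""
    argv = ["systemctl", "set-property", "--runtime", unit + ".service"]
    argv += ["{}={}".format(k, v) for k, v in sorted(properties.items())]
    return argv


if __name__ == "__main__":
    # Hypothetical unit name and limits; real virt-v2v options omitted.
    start = systemd_run_argv(
        "v2v-job1", ["virt-v2v", "-v", "-x"], {"CPUQuota": "50%"})
    adjust = set_property_argv("v2v-job1", {"CPUQuota": "25%"})
    print(" ".join(shlex.quote(a) for a in start))
    print(" ".join(shlex.quote(a) for a in adjust))
```

Because the unit is an ordinary service, its output lands in the
journal and can be read with "journalctl -u v2v-job1".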

(I will admit this is really obscure and undocumented, it took me
quite a lot of time last month to work it out.  We should add this to
the virt-v2v documentation, but at least the docs are available on the
mailing list now.)

> I would also like to convert virt-v2v-wrapper to a conversion API
> and Scheduler service. The idea is that it would provide an
> as-a-Service endpoint for conversions, that would allow creation of
> conversion jobs (POST), fetching of the status (GET), cancelation of
> a conversion (DELETE) and changing of the limits (PATCH). In the
> background, a basic scheduler would simply ensure that all the jobs
> are running. Each virt-v2v process would be run as a systemd unit
> (journald could capture the debug output), so that it is independent
> from the API and Scheduler processes.

This sounds like an interesting and useful evolution of the wrapper,
and we should try to add pieces to virt-v2v to make it easier to run
under the wrapper, but at the end of the day virt-v2v is a command
line tool used by many different projects for many different purposes,
so actually adding all this to virt-v2v itself is a non-starter.
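
To make the quoted proposal concrete, here is a minimal in-memory
sketch of the four operations (POST, GET, DELETE, PATCH) as they
might look in the wrapper.  Nothing here exists today: the class, the
"/jobs" path, the job fields and the status codes are all
hypothetical, and no virt-v2v process is started.

```python
import itertools


class ConversionJobs:
    """In-memory model of the proposed job API (all names hypothetical)."""

    def __init__(self):
        self._jobs = {}
        self._ids = itertools.count(1)

    def handle(self, method, path, body=None):
        """Dispatch one request; returns (HTTP status, JSON-able payload)."""
        parts = [p for p in path.split("/") if p]
        if parts[:1] != ["jobs"] or len(parts) > 2:
            return 404, {"error": "not found"}
        if len(parts) == 1:
            if method != "POST":
                return 405, {"error": "method not allowed"}
            job_id = str(next(self._ids))
            # A real service would start a systemd unit for virt-v2v here.
            self._jobs[job_id] = {"id": job_id, "state": "pending",
                                  "spec": body or {}, "limits": {}}
            return 201, self._jobs[job_id]
        job = self._jobs.get(parts[1])
        if job is None:
            return 404, {"error": "no such job"}
        if method == "GET":
            return 200, job
        if method == "PATCH":
            job["limits"].update(body or {})  # would be passed to the unit
            return 200, job
        if method == "DELETE":
            job["state"] = "cancelled"  # would stop the unit
            return 200, job
        return 405, {"error": "method not allowed"}
```

Keeping the dispatch separate from any HTTP server makes the job
model easy to test before deciding how it is served.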

> I know that I can propose patches for changes to virt-v2v, or at
> least file RFEs in Bugzilla (my developer skills and programming
> languages breadth are limited). For the evolved wrapper, my main
> concern is its housing and maintenance. It doesn't work only for
> oVirt, so having its lifecycle tied to oVirt doesn't seem relevant
> in the long term. In fact, it can be for any virt-v2v output, so my
> personal opinion is that it should live in the virt-v2v ecosystem
> and follow its lifecycle. As for its maintenance, we still have to
> figure out who will be responsible for it, i.e. who will be able to
> dedicate time to it.

There's certainly a case for making the wrapper into a standalone
project, with a proper upstream etc.  It could even be shipped under
the libguestfs umbrella.  But that's Tomáš's domain so I leave it up
to him to decide what to do.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW
-------------- next part --------------
An embedded message was scrubbed...
From: "Richard W.M. Jones" <rjones at redhat.com>
Subject: Limiting virt-v2v block I/O using cgroups - network I/O still unknown
Date: Wed, 22 Aug 2018 14:42:51 +0100
Size: 3363
URL: <http://listman.redhat.com/archives/libguestfs/attachments/20180926/b34db6f0/attachment.eml>