[Libguestfs] OpenStack output workflow

Wed Sep 26 14:57:19 UTC 2018

On Wed, Sep 26, 2018 at 4:25 PM Richard W.M. Jones <rjones at redhat.com>
wrote:

> On Wed, Sep 26, 2018 at 02:40:54PM +0200, Fabien Dupont wrote:
> > [Adding Tomas Golembiovsky]
> >
> > Well, those are mainly IMS-related challenges. We're working on
> > OpenStack output support and migration throttling and this implies
> > changes to virt-v2v-wrapper.  This is then the opportunity to think
> > about virt-v2v-wrapper maintenance and feature set. It was created
> > in the first place to simplify interaction with virt-v2v from
> > ManageIQ.
>
> Stepping back here, the upstream community on this mailing list have
> no idea what you mean by the terms "IMS", "virt-v2v-wrapper" and
> "ManageIQ".  I'll try to explain briefly:
>
> * ManageIQ (http://manageiq.org/) = a kind of scriptable, universal
>   management tool for cloudy things, using Ansible.
>
> * Ansible = automates remote management of machines using ssh.
>
> * IMS = an internal Red Hat project to add virt-v2v support to
>   ManageIQ.
>
> * virt-v2v-wrapper = a wrapper around virt-v2v which allows it to be
>   called from Ansible.  The reason we need this is because Ansible
>   doesn't support managing long-running processes like virt-v2v, so we
>   need to have another component which provides an API which can be
>   queried remotely, while tending to the long-running virt-v2v behind
>   the scenes.  [Tomáš: Got a link to the code?  I can't find it right now]
>

Thanks for adding explanation. FWIW, Ansible is not involved in IMS.
There is some rogue code out there
(https://github.com/fdupont-redhat/ims-v2v-engine_ansible)
doing things with Ansible, because I like experimenting.

Virt-v2v-wrapper code is available here:
https://github.com/oVirt/ovirt-ansible-v2v-conversion-host/blob/master/files/virt-v2v-wrapper.py

> > The first challenge we faced is the interaction with virt-v2v. It's
> > highly versatile and offers a lot of options for input and
> > output. The downside of it is that over time it becomes more and
> > more difficult to know them all.
>
> The options are all documented in the manual.  I have thought for a
> while that we need to tackle the virt-v2v manual: It's too big, and
> unapproachable.  Really I think we need to split it into topic
> sections and rewrite parts of it.  Unfortunately I've not had time to
> do that so far.
>
> > And all the error messages are made for human beings, not machines,
> > so providing feedback through a launcher, such as virt-v2v-wrapper,
> > is difficult.
>
> This is indeed an issue.  Pino recently added enhanced support for the
> ‘--machine-readable’ option which should address some problems:
>
>
> https://github.com/libguestfs/libguestfs/commit/afa8111b751ed33e1989e6d9bb03928cefa17917
>
> If this change still doesn't fully address the issues with automating
> virt-v2v then please let us know what specifically can be improved
> here.
>

Thanks for the link. virt-v2v-wrapper is using the standalone call with
--machine-readable to get the capabilities of virt-v2v. It's used to
check for --mac option support.
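
A minimal sketch of that kind of capability check, not the actual
virt-v2v-wrapper code. It assumes the standalone call prints one
capability token per line and that --mac support is advertised as a
token named "mac-option"; both assumptions should be checked against
the output of the installed virt-v2v. The function names are
hypothetical.

```python
import subprocess


def parse_capabilities(output):
    """Return the set of capability tokens, one per non-empty line."""
    return {line.strip() for line in output.splitlines() if line.strip()}


def query_capabilities(binary="virt-v2v"):
    """Run `virt-v2v --machine-readable` on its own and parse the result."""
    result = subprocess.run(
        [binary, "--machine-readable"],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return parse_capabilities(result.stdout)


def supports(capabilities, token):
    """True if the given capability token was advertised."""
    return token in capabilities


# Assumed token name; verify it against the real output before relying on it.
MAC_OPTION_TOKEN = "mac-option"
```

A caller would then do something like
supports(query_capabilities(), MAC_OPTION_TOKEN) before adding --mac
to the command line.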

[...]
> > For progress, the only way to know what happens is to run virt-v2v
> > in debug mode (-v -x) and parse the (very extensive)
> > output. Virt-v2v-wrapper does it for us in IMS, but it is merely a
> > workaround.
>
> Right, this is indeed another problem which we should address.  I
> thought we had an RFE filed for this, but I cannot find it.  At the
> moment the workaround you mention is very ugly and clunky, but AFAIK
> it does work.
>

You're right, it works. We'll test whether the enhanced
--machine-readable option improves the machine experience, and file
RFEs if needed.
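
For illustration, here is a rough sketch of the parsing side of that
workaround. It only handles the "[   0.0] message" stage lines that
virt-v2v prints (the format shown in the forwarded message further
down) and turns them into JSON records. It does not recover per-disk
percentages, and the record field names are made up for the example.

```python
import json
import re

# Matches stage lines such as "[   0.0] Opening the source -i disk ..."
STAGE_RE = re.compile(r"^\[\s*(\d+(?:\.\d+)?)\]\s+(.*)$")


def parse_stage(line):
    """Return {"elapsed": seconds, "message": text}, or None for other lines."""
    match = STAGE_RE.match(line.strip())
    if match is None:
        return None
    return {"elapsed": float(match.group(1)), "message": match.group(2)}


def stages_as_json(lines):
    """Keep only the stage lines and serialize each one as a JSON object."""
    records = (parse_stage(line) for line in lines)
    return [json.dumps(record, sort_keys=True)
            for record in records if record is not None]
```

Everything that does not look like a stage line (i.e. the bulk of the
-v -x output) is simply skipped, so the debug log can still be kept
untouched for troubleshooting.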

> > I'd expect a conversion tool to provide comprehensive progress reporting,
> > such as "I'm converting VM 'my_vm' and more specifically disk X/Y
> > (XX%). Total conversion progress is XX%". Of course, I'd also expect
> > a machine readable output (JSON, CSV, YAML…). Debug mode ensures we
> > have all the data in case of failure, so I don't say remove it, but
> > simply add specialized outputs.
>
> We can discuss debug output vs progress output and formats to use
> separately when fixing the above, but yes, point taken.
>
> > The third challenge was to clean up in case of virt-v2v failure. For
> > example, when it fails converting a disk to RHV, it doesn't clean
> > the finished and unfinished disks.
>
> This is a bug (https://bugzilla.redhat.com/show_bug.cgi?id=1616226).
> It's been on my to-do list for quite a while, but I haven't got to
> it, so patches welcome ...
>

Thanks for the BZ reference. IIUC, this will clean up the disk being
converted at the time virt-v2v is interrupted. What about the already
converted disks of a multi-disk VM? IMO, they should also be removed.

> > Virt-v2v-wrapper was initially written by RHV team (Tomas) for RHV
> > migrations, so it sounded fair(ish). But, extending the outputs to
> > OpenStack, we'll have to deal with leftovers in OpenStack too. Maybe
> > a cleanup on failure option would be a good idea, with a default to
> > false to not break existing behaviour.
>
> The issue of cleaning up disks in general is a hard one to solve.
>
> With the OpenStack backend we try our best as long as virt-v2v
> exits on a normal failure path:
>
>
> https://github.com/libguestfs/libguestfs/blob/e2bafffce24cd8c0436bf887ee166a3ae2257bbb/v2v/output_openstack.ml#L370-L384
>
> However there are always going to be cases where that is not possible
> (eg. virt-v2v segfaults or is kill -9'd or whatever), and in that case
> I envisaged for OpenStack some sort of external garbage collector.  To
> this end, disks which have not been finalized are given a special
> description so it should be possible to find them after a full
> migration has completed:
>
>
> https://github.com/libguestfs/libguestfs/blob/e2bafffce24cd8c0436bf887ee166a3ae2257bbb/v2v/output_openstack.ml#L386-L392
>
> IIRC virt-v2v-wrapper is sending kill -9 to virt-v2v, which it should
> not do.
>

It's not virt-v2v-wrapper that kills virt-v2v, it's ManageIQ. We have the
PID from the virt-v2v-wrapper state file. What would be the preferred way
to interrupt it?

> > The fourth challenge is to limit the resources allocated to virt-v2v
> > during conversion, because concurrent conversions may have a huge
> > impact on conversion host performance. In the case of an oVirt host,
> > this can impact the virtual machines that run on it. This is not
> > covered yet by the wrapper, but implementation will likely be based
> > on Linux cgroups and tc.
>
> Right, sounds sensible.
>
> > The wrapper also adds an interesting feature: both virt-v2v and
> > virt-v2v-wrapper run daemonized and we can asynchronously poll the
> > progress. This is really key for IMS (and maybe for others): this
> > allows us to start as many conversions in parallel as needed and
> > monitor them. Currently, the Python code forks and detaches itself,
> > after providing the paths to the state file. In the discussion about
> > cgroups, it was mentioned that systemd units could be used, which
> > fits well with the daemonization, as systemd-run allows running
> > processes under systemd and in their own slice, on which cgroups
> > limits can be set.
>
> "the Python code" meaning the code in virt-v2v-wrapper?
>

Yes, it is.

> I also agree that using a systemd temporary unit is the way to go
> here.  As well as providing a natural way to limit the resources used
> by virt-v2v (since systemd unit implies a cgroup), it also solves the
> problems around logging and collecting debug logs.
>
> > About the evolution of virt-v2v-wrapper that I'm going to describe,
> > let me state that this is my personal view and that I speak only
> > for myself.
> >
> > I would like to see the machine-to-machine interaction, logging and
> > cleanup in virt-v2v itself because it is valuable to everyone, not
> > only IMS.
>
> Here's where I think we are going to disagree.  virt-v2v is a command
> line tool, with lots of users outside of the internal IMS Red Hat
> project.  Here in the upstream community I'm afraid we make decisions
> which are best for all our users, not for one particular user!
>

I agree with you. That's why I propose that M2M interaction (e.g.
--machine-readable), logging (format and output options, not only
journald capture) and cleanup (handling interruption) are done at
the virt-v2v level. IMO, these would be beneficial to everybody.

As you say, invocation via systemd is useful for IMS, and
documenting it would make it valuable to others. The other three
topics are not linked to systemd and should be addressed on their
own.
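
As a starting point for such documentation, here is a hedged sketch
of how a launcher could build the systemd-run command line from the
forwarded message below. The function names are hypothetical. The
property names are the ones systemd.resource-control(5) documents:
BlockIOWriteBandwidth for the older cgroup hierarchy and
IOWriteBandwidthMax for cgroup v2. The absolute-path check reflects
the note that systemd-run changes the working directory to "/".

```python
import os


def bandwidth_property(cgroup_v2=False):
    """Pick the write-bandwidth property for the cgroup hierarchy in use."""
    return "IOWriteBandwidthMax" if cgroup_v2 else "BlockIOWriteBandwidth"


def systemd_run_command(disk_image, output_dir, device, limit,
                        user=True, cgroup_v2=False):
    """Build (but do not run) a systemd-run command wrapping virt-v2v."""
    # systemd-run starts the command in "/", so relative paths would break.
    for path in (disk_image, output_dir):
        if not os.path.isabs(path):
            raise ValueError("path must be absolute: {!r}".format(path))
    cmd = ["systemd-run"]
    if user:
        cmd.append("--user")
    cmd += ["--pipe",
            "-p", "{}={} {}".format(bandwidth_property(cgroup_v2), device, limit)]
    cmd += ["virt-v2v", "-i", "disk", disk_image, "-o", "local", "-os", output_dir]
    return cmd
```

The returned list can be handed to subprocess.Popen as-is. Keeping
the builder separate from the execution makes it easy to log or unit
test the exact command before anything is started.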

> But in any case I don't see how adding systemd unit support to
> virt-v2v itself helps very much.  It's really easy to run virt-v2v in
> a systemd unit -- see the attached email for full details.
>
> This gains all the benefits I mention above and is hardly any effort
> at all.  You can even adjust the properties on the fly.
>
> (I will admit this is really obscure and undocumented, it took me
> quite a lot of time last month to work it out.  We should add this to
> the virt-v2v documentation, but at least the docs are available on the
> mailing list now.)
>
> > I would also like to convert virt-v2v-wrapper to a conversion API
> > and Scheduler service. The idea is that it would provide an
> > as-a-Service endpoint for conversions, that would allow creation of
> > conversion jobs (POST), fetching of the status (GET), cancelation of
> > a conversion (DELETE) and changing of the limits (PATCH). In the
> > background, a basic scheduler would simply ensure that all the jobs
> > are running. Each virt-v2v process would be run as a systemd unit
> > (journald could capture the debug output), so that it is independent
> > from the API and Scheduler processes.
>
> This sounds like an interesting and useful evolution of the wrapper,
> and we should try to add pieces to virt-v2v to make it easier to run
> under the wrapper, but at the end of the day virt-v2v is a command
> line tool used by many different projects and purposes so actually
> adding all this to virt-v2v itself is a non-starter.
>

Again, agreed. That's why I talk about an evolution of
virt-v2v-wrapper. Some of the clunky workarounds virt-v2v-wrapper
implements should be handled by virt-v2v. Launching and
monitoring virt-v2v is out of scope for the virt-v2v command.
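
To make the idea of the evolved wrapper a bit more concrete, here is
a purely illustrative, in-memory sketch of the job bookkeeping behind
the four operations mentioned above (POST, GET, DELETE, PATCH). All
class, method and state names are invented for the example. There is
no HTTP layer and nothing is launched; a real service would start one
systemd unit per job.

```python
import copy
import itertools


class ConversionJobs:
    """Hypothetical registry of conversion jobs, keyed by integer id."""

    def __init__(self):
        self._jobs = {}
        self._ids = itertools.count(1)

    def create(self, spec):
        """POST: register a new job and return its id."""
        job_id = next(self._ids)
        self._jobs[job_id] = {"spec": dict(spec), "state": "pending", "limits": {}}
        return job_id

    def status(self, job_id):
        """GET: return a copy of the job record."""
        return copy.deepcopy(self._jobs[job_id])

    def update_limits(self, job_id, limits):
        """PATCH: merge new resource limits into a job that is not cancelled."""
        job = self._jobs[job_id]
        if job["state"] == "cancelled":
            raise ValueError("job {} is cancelled".format(job_id))
        job["limits"].update(limits)

    def cancel(self, job_id):
        """DELETE: mark the job as cancelled."""
        self._jobs[job_id]["state"] = "cancelled"

    def pending(self):
        """What a basic scheduler would look at: ids still waiting to run."""
        return sorted(job_id for job_id, job in self._jobs.items()
                      if job["state"] == "pending")
```

The point of the sketch is only the separation of concerns: the
registry knows about jobs and limits, while virt-v2v itself stays a
plain command line tool.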

> > I know that I can propose patches for changes to virt-v2v, or at
> > least file RFEs in Bugzilla (my developer skills and programming
> > languages breadth are limited). For the evolved wrapper, my main
> > concern is its housing and maintenance. It doesn't work only for
> > oVirt, so having its lifecycle tied to oVirt doesn't seem relevant
> > in the long term. In fact, it can be for any virt-v2v output, so my
> > personal opinion is that it should live in the virt-v2v ecosystem
> > and follow its lifecycle. As for its maintenance, we still have to
> > figure out who will be responsible for it, i.e. who will be able to
> > dedicate time to it.
>
> There's certainly a case for making the wrapper into a standalone
> project, with a proper upstream etc.  It could even be shipped under
> the libguestfs umbrella.  But that's Tomáš's domain so I leave it up
> to him to decide what to do.
>

That's why I added him to this thread :)

> Rich.
>
> --
> Richard Jones, Virtualization Group, Red Hat
> http://people.redhat.com/~rjones
> Read my programming and virtualization blog: http://rwmj.wordpress.com
> Fedora Windows cross-compiler. Compile Windows programs, test, and
> build Windows installers. Over 100 libraries supported.
> http://fedoraproject.org/wiki/MinGW
>
>
>
> ---------- Forwarded message ----------
> From: "Richard W.M. Jones" <rjones at redhat.com>
> To: v2v-devel at redhat.com
> Cc:
> Bcc:
> Date: Wed, 22 Aug 2018 14:42:51 +0100
> Subject: Limiting virt-v2v block I/O using cgroups - network I/O still
> unknown
> Turns out this is fairly easy, although quite obscure.
>
> Just use ‘systemd-run --pipe’ to run the virt-v2v command in a cgroup.
> The ‘--pipe’ option ensures it is still connected to stdin/stdout/
> stderr (but see below).
>
>   $ systemd-run --user --pipe \
>       -p BlockIOWriteBandwidth="/dev/sda2 1K" \
>       virt-v2v -i disk /var/tmp/fedora-27.img -o local -os /var/tmp
>
>   Running as unit: run-u4429.service
>   [   0.0] Opening the source -i disk /var/tmp/fedora-27.img
>   [   0.0] Creating an overlay to protect the source from being modified
>   etc.
>
> See systemd.resource-control(5) for a list of controls.  If you are
> experimenting with this then it is easier to start with a command like
> ‘sleep 1000000’ rather than using virt-v2v.
>
> Some notes:
>
> (1) systemd-run changes the directory to ‘/’ so all path parameters to
> virt-v2v must be absolute.
>
> (2) If using something called "cgroup v2" you have to use
> IOWriteBandwidthMax.  However even though I'm using a very recent
> Fedora & Linux kernel, I'm apparently not using cgroup v2.
>
> (3) You can modify the settings on the fly using:
>
>   $ systemctl [--user] set-property --runtime run-uXXX.service \
>       BlockIOWriteBandwidth="/dev/sda2 NEW_SETTING"
>
> where run-uXXX.service is the service name printed by systemd-run
> before it starts virt-v2v.
>
> Also:
>
>   $ systemctl --user show run-u4466.service | grep BlockIO
>   BlockIOAccounting=no
>   BlockIOWeight=[not set]
>   StartupBlockIOWeight=[not set]
>   BlockIOWriteBandwidth=/dev/sda2 10000
>
> Note you have to enable BlockIOAccounting to collect stats.
>
> Also:
>
>   systemctl [--user] status run-uXXX.service
>
> to read information about the status of the service.
>
> Also there are other systemd/cgroup tools like ‘systemd-cgtop’ and
> ‘systemd-cgls’ which may be useful to administrators.
>
> (4) Specifying the block device is a PITA.  My reading of the
> documentation makes me think you can use filesystem paths instead of
> block device names, but it didn't work for me
> (https://github.com/systemd/systemd/issues/9908).
>
>                 - * - * - * -
>
> Network prioritization (what we actually care about) is quite a lot
> more complex.  First of all, documentation everywhere refers to
> net_cls (NetClass), but that is apparently obsolete.  The new thing is
> net_prio, but virtually nothing talks about how to use that.  In any
> case we'd use it in conjunction with ‘tc’.
>
>                 - * - * - * -
>
> While looking into this I wondered if it wouldn't be better to run
> virt-v2v in a proper systemd unit.  We'd use systemd to set up
> logging.  The wrapper would simply become a small script that controls
> the unit through systemctl.
>
> Rich.
>
> --
> Richard Jones, Virtualization Group, Red Hat
> http://people.redhat.com/~rjones
> Read my programming and virtualization blog: http://rwmj.wordpress.com
> virt-top is 'top' for virtual machines.  Tiny program with many
> powerful monitoring features, net stats, disk stats, logging, etc.
> http://people.redhat.com/~rjones/virt-top
>
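
For the "small script that controls the unit through systemctl" idea
in the forwarded message above, a minimal sketch could look like the
following. It only builds the set-property command and parses the
KEY=VALUE output of "systemctl show"; the function names are
hypothetical and nothing here talks to systemd directly.

```python
def set_property_command(unit, prop, value, user=True):
    """Build the systemctl call that changes a unit property at runtime."""
    cmd = ["systemctl"]
    if user:
        cmd.append("--user")
    cmd += ["set-property", "--runtime", unit, "{}={}".format(prop, value)]
    return cmd


def parse_show(output):
    """Turn the KEY=VALUE lines printed by `systemctl show` into a dict."""
    props = {}
    for line in output.splitlines():
        # Split on the first "=" only; values may contain spaces or "=".
        key, sep, value = line.strip().partition("=")
        if sep:
            props[key] = value
    return props
```

With the output shown above, parse_show would give a dictionary
where, for example, BlockIOAccounting maps to "no", which is an easy
way to notice that accounting still has to be enabled before stats
can be collected.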

-- 

*Fabien Dupont*

PRINCIPAL SOFTWARE ENGINEER

Red Hat - Solutions Engineering

fabien at redhat.com     M: +33 (0) 662 784 971 <+33662784971>
