[Libguestfs] [PATCH 2/5] v2v/lib/util.ml: Get disk allocation from input

Laszlo Ersek lersek at redhat.com
Mon Dec 20 10:04:16 UTC 2021


On 12/19/21 00:30, Nir Soffer wrote:

> Laszlo, can you explain why we need the number of allocated bytes
> after the disk was already converted to the target storage?

It's all described in commit a2a4f7a09996 ("lib/create_ovf: populate
"actual size" attributes again", 2021-12-10):

> commit a2a4f7a09996a5e66d027d0d9692e083eb0a8128
> Author: Laszlo Ersek <lersek at redhat.com>
> Date:   Fri Dec 10 12:35:37 2021 +0100
>
>     lib/create_ovf: populate "actual size" attributes again
>
>     Commit 255722cbf39a ("v2v: Modular virt-v2v", 2021-09-07) removed the
>     following attributes from the OVF output:
>
>     - ovf:Envelope/References/File/@ovf:size
>     - ovf:Envelope/Section[@xsi:type='ovf:DiskSection_Type']/Disk/@ovf:actual_size
>
>     Unfortunately, ovirt-engine considers the second one mandatory; without
>     it, ovirt-engine refuses to import the OVF.
>
>     Restore both attributes, using the utility functions added to the Utils
>     module previously.
>
>     (If we do not have the information necessary to fill in @ovf:actual_size,
>     we still have to generate the attribute, only with empty contents.
>     Ovirt-engine does cope with that.)
>
>     Fixes: 255722cbf39afc0b012e2ac00d16fa6ba2f8c21f
>     Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2027598
>     Signed-off-by: Laszlo Ersek <lersek at redhat.com>
>     Message-Id: <20211210113537.10907-4-lersek at redhat.com>
>     Acked-by: Richard W.M. Jones <rjones at redhat.com>

Basically if we place an OVF file in the Export Storage Domain directory
of ovirt-engine such that the OVF lacks the above-mentioned
"actual_size" attribute, then ovirt-engine rejects the OVF file with a
parse error.

This behavior is actually a regression from ovirt-engine commit
1082d9dec289 ("core: undo recent generalization of ovf processing",
2017-08-09; however, we triggered this ovirt-engine bug first in
virt-v2v commit 255722cbf39a ("v2v: Modular virt-v2v", 2021-09-07).
Therefore restoring the previous virt-v2v behavior makes sense. We could
of course just provide an empty actual_size='' attribute (as explained
above), but that would still not restore the original virt-v2v behavior,
and I cannot tell what the consequences for ovirt-engine would be.

Back to your email:

On 12/19/21 00:30, Nir Soffer wrote:

> Looking at the bug:
> https://bugzilla.redhat.com/2027598
>
> This is a fix for the issue in the old rhv output, which is deprecated
> and

I made the exact same argument before my analysis / fix, namely in
bullet (1) of <https://bugzilla.redhat.com/show_bug.cgi?id=2027598#c16>.

Please see Rich's answer to that at
<https://bugzilla.redhat.com/show_bug.cgi?id=2027598#c19>. That was the
reason we continued working on the analysis / patch at all.

> should not be used in new code, and it is not compatible with
> rhv-upload which is the modern way to import into RHV.
>
> Also this cannot work for RHV now, so we need a way to disable this
> when importing to RHV.
>
> We can expose the block status in RHV plugin, but to use this we must
> call block status after the disk is converted, but before the output
> is finalized, since when you finalize RHV disconnect the disk from the
> host, and the RHV plugin cannot access it any more.

The code very much depends on being run during finalization -- please
search commit a2a4f7a09996 for the word "finalization".

The disconnection that you describe -- does that happen in the
"rhv-upload-finalize.py" script?

This is what we've got in "output/output_rhv_upload.ml":

>   (* Finalize all the transfers. *)
>   let json_params =
>     let ids = List.map (fun id -> JSON.String id) transfer_ids in
>     let json_params = ("transfer_ids", JSON.List ids) :: json_params in
>     let ids = List.map (fun uuid -> JSON.String uuid) disk_uuids in
>     let json_params = ("disk_uuids", JSON.List ids) :: json_params in
>     json_params in
>   if Python_script.run_command finalize_script json_params [] <> 0 then         <------- is the output disk detached here?
>     error (f_"failed to finalize the transfers, see earlier errors");
>
>   (* The storage domain UUID. *)
>   let sd_uuid =
>     match rhv_storagedomain_uuid with
>     | None -> assert false
>     | Some uuid -> uuid in
>
>   (* The volume and VM UUIDs are made up. *)
>   let vol_uuids = List.map (fun _ -> uuidgen ()) disk_sizes
>   and vm_uuid = uuidgen () in
>
>   (* Create the metadata. *)
>   let ovf =
>     Create_ovf.create_ovf source inspect target_meta disk_sizes                 <------- output NBD block status collected here
>       Sparse output_format sd_uuid disk_uuids vol_uuids dir vm_uuid
>       OVirt in
>   let ovf = DOM.doc_to_string ovf in

Back to your email:

On 12/19/21 00:30, Nir Soffer wrote:

> The flow for rhv-upload should be:
>
>     run nbdcopy
>     query block status
>     finalize output
>     create ovf

Currently, commit a2a4f7a09996 passes the temp dir, where the NBD
sockets live, to "create_ovf". Then "create_ovf" collects the block
status for each disk.

One of the other approaches I outlined earlier was this: let each output
plugin collect the list of block statuses (-> "int64 option list")
before calling "create_ovf". Then pass that *list* to "create_ovf".
Please see here
<https://listman.redhat.com/archives/libguestfs/2021-December/msg00116.html>:

> Yet another idea was to collect the actual sizes for all the output
> disks in one go, separately, then pass in that list, to create_ovf. Then
> in create_ovf, I would replace:
>
>   (* Iterate over the disks, adding them to the OVF document. *)
>   List.iteri (
>     fun i (size, image_uuid, vol_uuid) ->
> ...
>   ) (List.combine3 sizes image_uuids vol_uuids)
>
> with "combine4", adding a fourth component (the actual size) to the
> tuple parameter of the nameless iterator function.

This complication did not seem justified at that point (see the last
paragraph at
<https://listman.redhat.com/archives/libguestfs/2021-December/msg00119.html>).

I can do this *if*:

- you can please tell me *where exactly* I should collect the block
  statuses inside the "rhv_upload_finalize" function
  [output/output_rhv_upload.ml] -- that is, if you can confirm where the
  disks get detached (because I need to collect the acutal sizes some
  place before that)

- AND we can confirm if the "actual sizes" are actually useful for *all*
  output plugins different from the old rhv output. The whole thing is
  being done for the old rhv output's sake -- if neither "rhv-upload"
  nor "vdsm" (= the other two OVF-creating output plugins) need the
  actual disk sizes, then the actual list construction should be unique
  to "output/output_rhv_upload.ml".

Sorry about the regression, my only excuse is that the comment that
commit a2a4f7a09996 removed:

-      (* virt-v2v 1.4x used to try to collect the actual size of the
-       * sparse disk.  It would be possible to get this information
-       * accurately now by reading the extent map of the output disk
-       * (this function is called during finalization), but we don't
-       * yet do that. XXX
-       *)

implied that fetching the extent map was safe during finalization. The
rhv-upload output plugin does not conform to that invariant however.

Thanks,
Laszlo




More information about the Libguestfs mailing list