[Libguestfs] [PATCH 5/5] v2v: Add -o rhv-upload output mode.

Pino Toscano ptoscano at redhat.com
Fri Feb 23 12:37:32 UTC 2018


On Thursday, 22 February 2018 14:57:25 CET Richard W.M. Jones wrote:
> PROBLEMS:
>  - Spools to a temporary disk
>  - Need to specify direct/indirect upload via flag
>  - Location of ca.pem
>  - Target cluster
>  - Delete disks on failure, or rename disks on success?
>  - Handling of sparseness in raw format disks
> 
> This adds a new output mode to virt-v2v.  virt-v2v -o rhv-upload
> streams images directly to an oVirt or RHV >= 4 Data Domain using the
> oVirt SDK v4.  It is more efficient than -o rhv because it does not
> need to go via the Export Storage Domain, and is possible for humans
> to use unlike -o vdsm.
> 
> The implementation uses the Python SDK by running snippets of Python
> code to interact with the ‘ovirtsdk4’ module.  It requires both Python 3
> and the Python SDK v4 to be installed at run time (these are not,
> however, new dependencies of virt-v2v since most people wouldn't have
> them).
> ---

Since the patch is still RFC, and I cannot comment on the actual
procedure, I'll just leave few misc comments.

> +    | `RHV_Upload ->
> +      let output_conn =
> +        match output_conn with
> +        | None ->
> +           error (f_"-o rhv-upload: output connection was not specified, use ‘-oc’ to point to the oVirt or RHV server REST API URL")

More than "output connection", I'd just mention "REST API URL"
directly.

> +# Wait til the disk is up.  The transfer cannot start if the
> +# disk is locked.
> +disk_service = disks_service.disk_service(disk_id)
> +while True:
> +    time.sleep(5)
> +    disk = disk_service.get()
> +    if disk.status == types.DiskStatus.OK:
> +        break

Is a busy loop the only way to wait for an OK disk?

Regardless, this (and other busy loops in the Python snippets) IMHO
need also a timeout, otherwise any error coming from oVirt will block
v2v indefinitely...

> +# This seems to give the best throughput when uploading from Yaniv's
> +# machine to a server that drops the data. You may need to tune this
> +# on your setup.
> +BUF_SIZE = 128 * 1024
> +
> +with open(image_path, \"rb\") as disk:
> +    pos = 0
> +    while pos < image_size:
> +        # Send the next chunk to the proxy.
> +        to_read = min(image_size - pos, BUF_SIZE)
> +        chunk = disk.read(to_read)
> +        if not chunk:
> +            transfer_service.pause()
> +            raise RuntimeError(\"Unexpected end of file at pos=%%d\" %% pos)
> +
> +        proxy_connection.send(chunk)
> +        pos += len(chunk)

No need for 'pos', just use disk.tell() to get the current position
in the file being read.

Also, I'd flip it to the other side: instead of count the read bytes
until the file size, just keep read & send bytes, and error out when
reading more data than the file size.

Maybe something like the untested:

  with open(image_path, "rb") as disk:
      while True:
          # Send the next chunk to the proxy.
          chunk = disk.read(BUF_SIZE)
          if not chunk:
              transfer_service.pause()
              raise RuntimeError("Unexpected end of file at pos=%d" % file.tell())

          if disk.tell() > image_size:
              transfer_service.pause()
              # maybe even stat() image_path, and show more details in the exception message
              raise RuntimeError("Read more data than expected: pos=%d vs size=%d" % file.tell(), image_size)

          proxy_connection.send(chunk)

> +# Get the response
> +response = proxy_connection.getresponse()
> +if response.status != 200:
> +    transfer_service.pause()
> +    print(\"Upload failed: %%s %%s\" %%
> +          (response.status, response.reason))
> +    sys.exit(1)

sys.exit() vs raise?

> +(* Find the Python 3 binary. *)
> +let find_python3 () =
> +  let rec loop = function
> +    | [] ->
> +       error "could not locate Python 3 binary on the $PATH.  You may have to install Python 3.  If Python 3 is already installed then you may need to create a directory containing a binary called ‘python3’ which runs Python 3."
> +    | python :: rest ->
> +       (* Use shell_command first to check the binary exists. *)
> +       let cmd = sprintf "%s --help >/dev/null 2>&1" (quote python) in

Std_utils.which here fits better IMHO.  If the binary is broken, then
the run_python after this will fail anyway.

> +(* Parse the -oc URI. *)
> +let parse_output_conn oc =
> +  let uri = Xml.parse_uri oc in
> +  if uri.Xml.uri_scheme <> Some "https" then
> +    error (f_"rhv-upload: -oc: URI must start with https://...");
> +  if uri.Xml.uri_server = None then
> +    error (f_"rhv-upload: -oc: no remote server name in the URI");
> +  if uri.Xml.uri_path = None || uri.Xml.uri_path = Some "/" then
> +    error (f_"rhv-upload: -oc: URI path component looks incorrect");
> +  let username =
> +    match uri.Xml.uri_user with
> +    | None ->
> +       warning (f_"rhv-upload: -oc: username was missing from URI, assuming ‘admin at internal’");
> +       "admin at internal"
> +    | Some user -> user in
> +  (* Reconstruct the URI without the username. *)
> +  let url = sprintf "%s://%s%s"
> +                    (Option.default "https" uri.Xml.uri_scheme)
> +                    (Option.default "localhost" uri.Xml.uri_server)
> +                    (Option.default "" uri.Xml.uri_path) in

Since the checks above make sure uri.Xml.uri_server is not None, then
I'd use "invalid-hostname" or something like that instead of localhost,
just as safety measure.

> +               error (f_"rhv-upload: -of %s: Only output format ‘raw’ or ‘qcow2’ is supported.  If the input is in a different format then force one of these output formats by adding either ‘-of raw’ or ‘-of qcow’ on the command line.")

Typo, "qcow" -> "qcow2".

> +    (* Create the metadata. *)
> +    let ovf =
> +      Create_ovf.create_ovf source targets guestcaps inspect
> +                            Sparse
> +                            "domain"
> +                            (List.map (fun t -> "") targets)
> +                            target_disk_ids
> +                            "vm" in

Shouldn't real UUIDs (as in random UUIDs) be used instead of "domain",
and "vm"?

-- 
Pino Toscano
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part.
URL: <http://listman.redhat.com/archives/libguestfs/attachments/20180223/f74fd4c2/attachment.sig>


More information about the Libguestfs mailing list