[Libguestfs] [PATCH v7 6/6] v2v: Add -o rhv-upload output mode (RHBZ#1557273).

Nir Soffer nirsof at gmail.com
Sat Mar 24 22:36:06 UTC 2018


On Thu, Mar 22, 2018 at 5:25 PM Richard W.M. Jones <rjones at redhat.com>
wrote:

> PROBLEMS:
>  - -of qcow2 does not work, with multiple problems
>     * needs to set NBD size to something larger than virtual size
>

How is this related to the actual file size specified when you create a
disk?

In block storage, a qcow2 image is always on a thin provisioned disk, which
in oVirt is a regular logical volume, created at the requested
"initial_size":

From:

https://github.com/oVirt/ovirt-engine-sdk/blob/aba2a83ec94ecac1594adfff62d3cfcfdbdef416/sdk/examples/upload_disk.py#L113


if image_info["format"] == "qcow2":
    disk_format = types.DiskFormat.COW
else:
    disk_format = types.DiskFormat.RAW

disks_service = connection.system_service().disks_service()
disk = disks_service.add(
    disk=types.Disk(
        name=os.path.basename(image_path),
        description='Uploaded disk',
        format=disk_format,
        initial_size=image_size,
        provisioned_size=image_info["virtual-size"],
        sparse=disk_format == types.DiskFormat.COW,
        storage_domains=[
            types.StorageDomain(
                name='mydata'
            )
        ]
    )
)

Internally we allocate 10% more than the requested initial_size
to leave room for qcow2 metadata.

See:

https://github.com/oVirt/vdsm/blob/3de6f326d7e338b064b11d2a2269500a3857098b/lib/vdsm/storage/blockVolume.py#L328

https://github.com/oVirt/vdsm/blob/3de6f326d7e338b064b11d2a2269500a3857098b/lib/vdsm/storage/volume.py#L666

Do we have other problems?


>  - Cannot choose the datacenter.
>

The storage domain belongs to some data center, so I don't think you
need to select a data center.

[snipped]

> diff --git a/v2v/rhv-upload-plugin.py b/v2v/rhv-upload-plugin.py
> new file mode 100644
> index 000000000..979cbd63b
> --- /dev/null
> +++ b/v2v/rhv-upload-plugin.py
> @@ -0,0 +1,414 @@
> +# -*- python -*-
> +# oVirt or RHV upload nbdkit plugin used by ‘virt-v2v -o rhv-upload’
> +# Copyright (C) 2018 Red Hat Inc.
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License along
> +# with this program; if not, write to the Free Software Foundation, Inc.,
> +# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
> +
> +import builtins
> +import json
> +import logging
> +import ovirtsdk4 as sdk
> +import ovirtsdk4.types as types
>

It is good practice to separate stdlib imports from 3rd party imports
like ovirtsdk; it helps to understand the dependencies of a module.
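
For example, keeping the stdlib imports and the SDK imports in separate
groups (using the same imports as in the patch):

    import builtins
    import json
    import logging

    import ovirtsdk4 as sdk
    import ovirtsdk4.types as types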

[snipped]

> +from http.client import HTTPSConnection
> +from urllib.parse import urlparse
>

These will not work in python 2, but you can use six.moves to have
code that works on both 2 and 3.
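
For example, assuming six is available on the host running virt-v2v:

    from six.moves.http_client import HTTPSConnection
    from six.moves.urllib.parse import urlparse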

[snipped]

> +    # Connect to the server.
> +    connection = sdk.Connection(
> +        url = params['output_conn'],
> +        username = username,
> +        password = password,
> +        ca_file = params['rhv_cafile'],
>

Can this be None?


> +        log = logging.getLogger(),
> +        insecure = params['insecure'],
>

If ca_file cannot be None, then insecure is not needed, based
on Ondra's review of an earlier version.

[snipped]

> +    # Create the disk.
> +    disks_service = system_service.disks_service()
> +    if params['disk_format'] == "raw":
> +        disk_format = types.DiskFormat.RAW
> +    else:
> +        disk_format = types.DiskFormat.COW
> +    disk = disks_service.add(
> +        disk = types.Disk(
> +            name = params['disk_name'],
> +            description = "Uploaded by virt-v2v",
> +            format = disk_format,
> +            provisioned_size = params['disk_size'],
>

This must be the virtual size.

You don't specify initial_size - in this case you get 1G, and most
images will fail to upload.
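
Something like this (just a sketch - the params keys for the two sizes are
hypothetical, the point is that both the virtual size and the actual file
size of the image need to be passed in):

    disk = disks_service.add(
        disk = types.Disk(
            name = params['disk_name'],
            description = "Uploaded by virt-v2v",
            format = disk_format,
            # Hypothetical keys: the virtual size of the guest disk, and
            # the actual size of the image file being uploaded.
            provisioned_size = params['disk_virtual_size'],
            initial_size = params['disk_file_size'],
            ...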


> +            sparse = params['output_sparse'],
>

The user cannot configure that. This must be based on the image
format. The current code may create images in unsupported
combinations, e.g. raw/sparse on block storage, or fail validation
in the engine.
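
As in the upload_disk.py example above, sparse can be derived from the
disk format:

    sparse = disk_format == types.DiskFormat.COW,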

[snipped]

> +# Can we issue zero, trim or flush requests?
> +def get_options(h):
> +    if h['got_options']:
> +        return
> +    h['got_options'] = True
> +
> +    http = h['http']
> +    transfer=h['transfer']
> +
> +    http.putrequest("OPTIONS", h['path'])
> +    http.putheader("Authorization", transfer.signed_ticket)
> +    http.endheaders()
> +
> +    r = http.getresponse()
> +    if r.status == 200:
> +        j = json.loads(r.read())
> +        h['can_zero'] = "zero" in j['features']
> +        h['can_trim'] = "trim" in j['features']
> +        h['can_flush'] = "flush" in j['features']
> +
> +        # If can_zero is true then it means we're using the new imageio
> +        # which never needs the Authorization header.
> +        if h['can_zero']:
> +            h['needs_auth'] = False
>

If we got here, we are working with a new daemon or proxy, and neither of
them needs auth, so we can set 'needs_auth' to False whenever OPTIONS
returns 200 OK.
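
Something like this (sketch):

    if r.status == 200:
        # New daemon or proxy - both implement OPTIONS and neither
        # needs the Authorization header.
        h['needs_auth'] = False
        j = json.loads(r.read())
        h['can_zero'] = "zero" in j['features']
        h['can_trim'] = "trim" in j['features']
        h['can_flush'] = "flush" in j['features']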


> +    # Old imageio servers returned 405 Method Not Allowed.  If
> +    # we see that we just say that no operations are allowed and
> +    # we will emulate them.
> +    elif r.status == 405:
> +        h['can_zero'] = False
> +        h['can_trim'] = False
> +        h['can_flush'] = False
>

I would set all the defaults when creating the state dict. This ensures
that we don't forget needs_auth or other keys and crash later with
KeyError, and it makes it easier to understand what state is managed by
the plugin.
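
For example, in open(), something like this (a sketch based on the keys
used elsewhere in the plugin; the local variable names are placeholders):

    return {
        # Objects created during open().
        'connection': connection,
        'disk': disk,
        'disk_service': disk_service,
        'transfer': transfer,
        'transfer_service': transfer_service,
        'http': http,
        'path': path,
        # Plugin state, with safe defaults so later code cannot hit
        # a KeyError.  can_*/needs_auth are updated by get_options().
        'failed': False,
        'highestwrite': 0,
        'got_options': False,
        'can_zero': False,
        'can_trim': False,
        'can_flush': False,
        'needs_auth': True,
    }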

[snipped]

> +def pwrite(h, buf, offset):
> +    http = h['http']
> +    transfer=h['transfer']
> +    transfer_service=h['transfer_service']
> +
> +    count = len(buf)
> +    h['highestwrite'] = max(h['highestwrite'], offset+count)
> +
> +    http.putrequest("PUT", h['path'] + "?flush=n")
>

"?flush=n" has effect only if h["can_flush"] is True. Older daemon/proxy
do not support disabling flushing. The docs mention that this query
string will be added in 1.3; we are now at 1.2.
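
One way to handle it (sketch):

    destination = h['path']
    if h['can_flush']:
        # Only new imageio versions understand the flush query string.
        destination += "?flush=n"
    http.putrequest("PUT", destination)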


> +    if h['needs_auth']:
> +        http.putheader("Authorization", transfer.signed_ticket)

> +    # The oVirt server only uses the first part of the range, and the
> +    # content-length.


[snipped]


> +def zero(h, count, offset, may_trim):
> +    http = h['http']
> +    transfer=h['transfer']
> +    transfer_service=h['transfer_service']
> +
> +    # Unlike the trim and flush calls, there is no 'can_zero' method
> +    # so nbdkit could call this even if the server doesn't support
> +    # zeroing.  If this is the case we must emulate.
> +    if not h['can_zero']:
> +        emulate_zero(h, count, offset)
> +        return
> +
> +    # Construct the JSON request for zeroing.
> +    buf = json.dumps({'op': "zero",
> +                      'offset': offset,
> +                      'size': count,
> +                      'flush': False})
> +
> +    http.putrequest("PATCH", h['path'])
> +    http.putheader("Content-Type", "application/json")
> +    if h['needs_auth']:
> +        http.putheader("Authorization", transfer.signed_ticket)
>

Only GET and PUT on a server that does not implement OPTIONS need auth.

This will work if h['needs_auth'] is set correctly, but I think it is
better to include this check only in pread() and pwrite(), and add a
comment there that it is needed to support legacy versions.
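
In pwrite(), for example (sketch):

    # Authorization is needed only for old daemon/proxy versions that
    # do not implement OPTIONS (and then only for GET and PUT).
    if h['needs_auth']:
        http.putheader("Authorization", transfer.signed_ticket)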


> +    http.putheader("Content-Length", len(buf))
> +    http.endheaders()
> +    http.send(buf)
> +
> +    r = http.getresponse()
> +    if r.status != 200:
> +        transfer_service.pause()
> +        h['failed'] = True
> +        raise RuntimeError("could not zero sector (%d, %d): %d: %s" %
> +                           (offset, count, r.status, r.reason))
> +
> +# qemu-img convert starts by trying to zero/trim the whole device.
> +# Since we've just created a new disk it's safe to ignore these
> +# requests as long as they are smaller than the highest write seen.
> +# After that we must emulate them with writes.
>

I think this comment is not related to this code. Maybe it belongs in
pwrite(), where we compute the highestwrite?


> +def emulate_zero(h, count, offset):
> +    if offset+count < h['highestwrite']:
> +        http.putrequest("PUT", h['path'] + "?flush=n")
>

This is used only on an old daemon/proxy, where the flush query string has no effect.

[snipped]

> +def trim(h, count, offset):
> +    http = h['http']
> +    transfer=h['transfer']
> +    transfer_service=h['transfer_service']
> +
> +    # Construct the JSON request for trimming.
> +    buf = json.dumps({'op': "trim",
> +                      'offset': offset,
> +                      'size': count,
> +                      'flush': False})
> +
> +    http.putrequest("PATCH", h['path'])
> +    http.putheader("Content-Type", "application/json")
> +    if h['needs_auth']:
> +        http.putheader("Authorization", transfer.signed_ticket)
>

Never needed.

[snipped]

> +def flush(h):
> +    http = h['http']
> +    transfer=h['transfer']
> +    transfer_service=h['transfer_service']
> +
> +    # Construct the JSON request for flushing.  Note the offset
> +    # and count must be present but are ignored by imageio.
> +    buf = json.dumps({'op': "flush",
> +                      'offset': 0,
> +                      'size': 0,
> +                      'flush': True})
>

Should be (discussed in v6)

    buf = json.dumps({"op": "flush"})


> +
> +    http.putrequest("PATCH", h['path'])
> +    http.putheader("Content-Type", "application/json")
> +    if h['needs_auth']:
> +        http.putheader("Authorization", transfer.signed_ticket)
>

Never needed.

[snipped]

> +def close(h):
> +    http = h['http']
> +    connection = h['connection']
> +
> +    http.close()
> +
> +    # If we didn't fail, then finalize the transfer.
> +    if not h['failed']:
> +        disk = h['disk']
> +        transfer_service=h['transfer_service']
> +
> +        transfer_service.finalize()
>

If this raises, we never clean up.

[snipped]

> +        # Write the disk ID file.  Only do this on successful completion.
> +        with builtins.open(params['diskid_file'], 'w') as fp:
> +            fp.write(disk.id)
>

If this raises, we will not remove the disk, or close the connection.


> +
> +    # Otherwise if we did fail then we should delete the disk.
> +    else:
> +        disk_service = h['disk_service']
> +        disk_service.remove()
> +
> +    connection.close()
>
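
One way to make close() more robust (just a sketch, the error handling
details need checking against the SDK):

    def close(h):
        connection = h['connection']
        transfer_service = h['transfer_service']
        disk_service = h['disk_service']

        try:
            h['http'].close()
            if h['failed']:
                # The upload failed, remove the disk.
                disk_service.remove()
            else:
                transfer_service.finalize()
                # Write the disk ID file.  Only do this on successful
                # completion.
                with builtins.open(params['diskid_file'], 'w') as fp:
                    fp.write(h['disk'].id)
        except Exception:
            # Finalizing or writing the disk ID failed - try not to
            # leave the disk behind.
            if not h['failed']:
                disk_service.remove()
            raise
        finally:
            # Always close the SDK connection.
            connection.close()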

[snipped]

I did not check all the SDK-related code; I'm not very familiar with the
SDK.

Thanks for creating this, and sorry for the bad documentation on our side
:-)

Nir