[Libguestfs] Supporting sparse disks in nbdkit

Richard W.M. Jones rjones at redhat.com
Tue Mar 12 08:26:46 UTC 2019


On Tue, Mar 12, 2019 at 09:04:59AM +0100, Martin Kletzander wrote:
> On Mon, Mar 11, 2019 at 02:53:55PM -0500, Eric Blake wrote:
> >On 3/11/19 2:44 PM, Richard W.M. Jones wrote:
> >>On Fri, Mar 08, 2019 at 10:50:35AM -0600, Eric Blake wrote:
> >>>>  int extents (void *handle, uint32_t count, uint64_t offset,
> >>>>               uint32_t flags /* always 0? */,
> >>>
> >>>Unless we support the REQ_ONE flag (where the client specifically wants
> >>>only one extent returned).
> >>...
> >>>>nbdkit would need to do some massaging on this to get it into the
> >>>>right format for NBD_CMD_BLOCK_STATUS.  (I'm very confused about what
> >>>>NBD_CMD_FLAG_REQ_ONE is supposed to do.)
> >>>
> >>>REQ_ONE says that a successful answer has to be exactly one extent,
> >>>covering only the head of the region. (That is, force the server to
> >>>behave as if obtaining information beyond the first extent is
> >>>expensive). It also tells the server that it must not report extra
> >>>information beyond the initial request (because at least qemu 3.0 would
> >>>assert that the server supplied too much information).
> >>>
> >>>You're also right that nbdkit can do REQ_ONE massaging, so whether we
> >>>pass REQ_ONE on to the client or not makes it sound like the client
> >>>needs a tri-state opt-in (no block status support, block status but let
> >>>nbdkit handle REQ_ONE, block status and the plugin can handle REQ_ONE
> >>>efficiently itself)
> >>
> >>I had a think about the REQ_ONE flag a little.  What do you think
> >>about this plan?
> >>
> >>If NBD_CMD_FLAG_REQ_ONE is passed by the nbd client, nbdkit does not
> >>need to pass it to the plugin.  Instead nbdkit would turn the request
> >>into:
> >>
> >>extents (count = 1, offset = <offset>)
> >>
> >>The plugin could still respond with multiple extents, but nbdkit would
> >>pick only the one covering the <offset> to send back.
> >>
> >>In this case can_extents would return just a boolean since there is no
> >>special supported needed by the plugin AFAICT.
> >
>
> Is that needed then?  You would know whether the plugin can do
> extents() based on the function pointer being set or not.  I mean
> can_write makes sense as that is per file (volume/anything), but
> can_extents would be set per plugin and it would be set to ".extents
> != NULL".

We have can_* functions for most optional plugin features for a couple
of reasons:

(1) In non-C languages you can't just check the function pointer.

(2) It's possible (in fact, quite likely) that a plugin could support
extents only for certain disk images, but would always have an
‘extents’ function.

As an example of (2), on Linux different filesystems have different
levels of support for sparseness and sparse detection and so the
nbdkit-file-plugin might implement ‘extents’ conditionally.  As
another example, a generalized "archive" plugin would support
sparseness only for (say) tarballs but not for ZIP files.

> >Seems like it would work. Still, I worry that some plugin might be doing
> >some expensive work to compute a list of extents, only for nbdkit to
> >throw that work away, vs. being able to accept a flag to know that
> >nothing more than one extent will be used so the plugin might as well
> >quit work early if it wants. Maybe we just document that we pass the
> >flag through when the client requested it, but also permit the plugin to
> >ignore the flag (plugins that care have the flag to optimize by, and
> >plugins that don't care will not violate client expectations because
> >nbdkit clamps the answer appropriately).
>
> That would be visible by the `@count` parameter.  But if that count
> will only be 0 (give me all you've got) and 1 (basically translated
> REQ_ONE), maybe instead of passing the `count`, it could be just the
> flag, i.e.: extents(offset, flags).  When I think about it, what
> seems more usable might be a `length` parameter so that the plugin
> does not need to check the whole file if someone is asking for a
> part of the disk only.  Of course that thinking might be flawed as I
> don't know what the request looks like in the NBD protocol.

I think I'm going to pass the REQ_ONE flag to the plugin, but the
plugin is free to either ignore it or implement it as an optimization.
If the flag is present and the plugin returns more than one extent
then nbdkit will determine the single extent to return.

As a general approach we want plugins to be easy to write and for
nbdkit to do the hard work when necessary.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html




More information about the Libguestfs mailing list