[Libguestfs] Supporting sparse disks in nbdkit

Martin Kletzander mkletzan at redhat.com
Tue Mar 12 09:41:18 UTC 2019


On Tue, Mar 12, 2019 at 08:26:46AM +0000, Richard W.M. Jones wrote:
>On Tue, Mar 12, 2019 at 09:04:59AM +0100, Martin Kletzander wrote:
>> On Mon, Mar 11, 2019 at 02:53:55PM -0500, Eric Blake wrote:
>> >On 3/11/19 2:44 PM, Richard W.M. Jones wrote:
>> >>On Fri, Mar 08, 2019 at 10:50:35AM -0600, Eric Blake wrote:
>> >>>>  int extents (void *handle, uint32_t count, uint64_t offset,
>> >>>>               uint32_t flags /* always 0? */,
>> >>>
>> >>>Unless we support the REQ_ONE flag (where the client specifically wants
>> >>>only one extent returned).
>> >>...
>> >>>>nbdkit would need to do some massaging on this to get it into the
>> >>>>right format for NBD_CMD_BLOCK_STATUS.  (I'm very confused about what
>> >>>>NBD_CMD_FLAG_REQ_ONE is supposed to do.)
>> >>>
>> >>>REQ_ONE says that a successful answer has to be exactly one extent,
>> >>>covering only the head of the region. (That is, force the server to
>> >>>behave as if obtaining information beyond the first extent is
>> >>>expensive). It also tells the server that it must not report extra
>> >>>information beyond the initial request (because at least qemu 3.0 would
>> >>>assert that the server supplied too much information).
>> >>>
>> >>>You're also right that nbdkit can do REQ_ONE massaging, so whether we
>> >>>pass REQ_ONE on to the client or not makes it sound like the client
>> >>>needs a tri-state opt-in (no block status support, block status but let
>> >>>nbdkit handle REQ_ONE, block status and the plugin can handle REQ_ONE
>> >>>efficiently itself)
>> >>
>> >>I had a think about the REQ_ONE flag a little.  What do you think
>> >>about this plan?
>> >>
>> >>If NBD_CMD_FLAG_REQ_ONE is passed by the nbd client, nbdkit does not
>> >>need to pass it to the plugin.  Instead nbdkit would turn the request
>> >>into:
>> >>
>> >>extents (count = 1, offset = <offset>)
>> >>
>> >>The plugin could still respond with multiple extents, but nbdkit would
>> >>pick only the one covering the <offset> to send back.
>> >>
>> >>In this case can_extents would return just a boolean since there is no
>> >>special supported needed by the plugin AFAICT.
>> >
>>
>> Is that needed then?  You would know whether the plugin can do
>> extents() based on the function pointer being set or not.  I mean
>> can_write makes sense as that is per file (volume/anything), but
>> can_extents would be set per plugin and it would be set to ".extents
>> != NULL".
>
>We have can_* functions for most optional plugin features for a couple
>of reasons:
>
>(1) In non-C languages you can't just check the function pointer.
>

This makes sense, that's what I was missing.

>(2) It's possible (in fact, quite likely) that a plugin could support
>extents only for certain disk images, but would always have an
>‘extents’ function.
>
>As an example of (2), on Linux different filesystems have different
>levels of support for sparseness and sparse detection and so the
>nbdkit-file-plugin might implement ‘extents’ conditionally.  As
>another example, a generalized "archive" plugin would support
>sparseness only for (say) tarballs but not for ZIP files.
>

This does too, but I thought this would be "avoided" by just reporting the whole
zip as dense.

>> >Seems like it would work. Still, I worry that some plugin might be doing
>> >some expensive work to compute a list of extents, only for nbdkit to
>> >throw that work away, vs. being able to accept a flag to know that
>> >nothing more than one extent will be used so the plugin might as well
>> >quit work early if it wants. Maybe we just document that we pass the
>> >flag through when the client requested it, but also permit the plugin to
>> >ignore the flag (plugins that care have the flag to optimize by, and
>> >plugins that don't care will not violate client expectations because
>> >nbdkit clamps the answer appropriately).
>>
>> That would be visible by the `@count` parameter.  But if that count
>> will only be 0 (give me all you've got) and 1 (basically translated
>> REQ_ONE), maybe instead of passing the `count`, it could be just the
>> flag, i.e.: extents(offset, flags).  When I think about it, what
>> seems more usable might be a `length` parameter so that the plugin
>> does not need to check the whole file if someone is asking for a
>> part of the disk only.  Of course that thinking might be flawed as I
>> don't know what the request looks like in the NBD protocol.
>
>I think I'm going to pass the REQ_ONE flag to the plugin, but the
>plugin is free to either ignore it or implement it as an optimization.
>If the flag is present and the plugin returns more than one extent
>then nbdkit will determine the single extent to return.
>

I think that's the nicest approach.

>As a general approach we want plugins to be easy to write and for
>nbdkit to do the hard work when necessary.
>
>Rich.
>
>-- 
>Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
>Read my programming and virtualization blog: http://rwmj.wordpress.com
>virt-builder quickly builds VMs from scratch
>http://libguestfs.org/virt-builder.1.html
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/libguestfs/attachments/20190312/f3dc0883/attachment.sig>


More information about the Libguestfs mailing list