[Libguestfs] Supporting sparse disks in nbdkit

Tue Mar 12 08:04:59 UTC 2019

On Mon, Mar 11, 2019 at 02:53:55PM -0500, Eric Blake wrote:
>On 3/11/19 2:44 PM, Richard W.M. Jones wrote:
>> On Fri, Mar 08, 2019 at 10:50:35AM -0600, Eric Blake wrote:
>>>>   int extents (void *handle, uint32_t count, uint64_t offset,
>>>>                uint32_t flags /* always 0? */,
>>>
>>> Unless we support the REQ_ONE flag (where the client specifically wants
>>> only one extent returned).
>> ...
>>>> nbdkit would need to do some massaging on this to get it into the
>>>> right format for NBD_CMD_BLOCK_STATUS.  (I'm very confused about what
>>>> NBD_CMD_FLAG_REQ_ONE is supposed to do.)
>>>
>>> REQ_ONE says that a successful answer has to be exactly one extent,
>>> covering only the head of the region. (That is, force the server to
>>> behave as if obtaining information beyond the first extent is
>>> expensive). It also tells the server that it must not report extra
>>> information beyond the initial request (because at least qemu 3.0 would
>>> assert that the server supplied too much information).
>>>
>>> You're also right that nbdkit can do REQ_ONE massaging, so whether we
>>> pass REQ_ONE on to the client or not makes it sound like the client
>>> needs a tri-state opt-in (no block status support, block status but let
>>> nbdkit handle REQ_ONE, block status and the plugin can handle REQ_ONE
>>> efficiently itself)
>>
>> I had a think about the REQ_ONE flag a little.  What do you think
>> about this plan?
>>
>> If NBD_CMD_FLAG_REQ_ONE is passed by the nbd client, nbdkit does not
>> need to pass it to the plugin.  Instead nbdkit would turn the request
>> into:
>>
>> extents (count = 1, offset = <offset>)
>>
>> The plugin could still respond with multiple extents, but nbdkit would
>> pick only the one covering the <offset> to send back.
>>
>> In this case can_extents would return just a boolean since there is no
>> special supported needed by the plugin AFAICT.
>

Is that needed then?  You would know whether the plugin can do extents() based
on the function pointer being set or not.  I mean can_write makes sense as that
is per file (volume/anything), but can_extents would be set per plugin and it
would be set to ".extents != NULL".

>Seems like it would work. Still, I worry that some plugin might be doing
>some expensive work to compute a list of extents, only for nbdkit to
>throw that work away, vs. being able to accept a flag to know that
>nothing more than one extent will be used so the plugin might as well
>quit work early if it wants. Maybe we just document that we pass the
>flag through when the client requested it, but also permit the plugin to
>ignore the flag (plugins that care have the flag to optimize by, and
>plugins that don't care will not violate client expectations because
>nbdkit clamps the answer appropriately).
>

That would be visible by the `@count` parameter.  But if that count will only be
0 (give me all you've got) and 1 (basically translated REQ_ONE), maybe instead
of passing the `count`, it could be just the flag, i.e.: extents(offset, flags).
When I think about it, what seems more usable might be a `length` parameter so
that the plugin does not need to check the whole file if someone is asking for a
part of the disk only.  Of course that thinking might be flawed as I don't know
what the request looks like in the NBD protocol.

>--
>Eric Blake, Principal Software Engineer
>Red Hat, Inc.           +1-919-301-3226
>Virtualization:  qemu.org | libvirt.org
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/libguestfs/attachments/20190312/1e1ec6f6/attachment.sig>