[Libguestfs] New extents structure proposal

Thu Mar 21 12:18:57 UTC 2019

On Wed, Mar 20, 2019 at 12:11:57PM -0500, Eric Blake wrote:
>On 3/20/19 11:57 AM, Richard W.M. Jones wrote:
>
>>
>>>> Also an observation: qemu's nbd client only ever issues block status
>>>> requests with the req-one flag set, so perhaps we should optimize for
>>>> that case.
>>>
>>> I hope to get to the point where future qemu doesn't send the req-one
>>> flag. There are several threads on the qemu list about how qemu-img is
>>> slower than it needs to be because it is throwing away useful
>>> information, and where it is aggravated by the kernel's abysmal lseek()
>>> performance on tmpfs.  But until qemu learns useful caching, you're
>>> right that most existing NBD clients that request block status do so one
>>> extent at a time (because I don't know of any other existing NBD clients
>>> that use BLOCK_STATUS yet).
>>
>> Is it ever possible to cache block status results?  What happens in
>> the (admittedly unusual) case where two writers are hitting the same
>> NBD server?  For example if the server is implementing a cluster
>> filesystem.
>
>For a read-only client: caching 'data' regions is okay, caching 'zero'
>or 'hole' regions is bad (because even though you are not modifying the
>image, another writer might be; demoting 'hole' to 'data' is safe - it
>merely pessimizes into a read() instead of skipping work; but caching
>'hole' that is later promoted to 'data' is wrong - it can cause data
>loss if the client doesn't read the actual data).
>
>For a writing client: either you are an exclusive writer (and should
>know what you wrote, so the cache is the fact that you changed the state
>yourself) or you are on a cluster filesystem (at which point, your
>cluster system better have its own rules for how to resolve races
>inherent in multiple writers, where you shouldn't be relying on block
>status but on the cluster protocol in the first place).
>
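The read-only caching rule quoted above can be sketched roughly as follows.
This is only an illustration in Python: ReadOnlyStatusCache, record() and
lookup() are hypothetical names, not part of nbdkit, qemu or the NBD protocol.
The sketch assumes extents are reported as (offset, length, status) and simply
refuses to remember anything that is not 'data'.

```python
from dataclasses import dataclass, field

DATA, HOLE, ZERO = "data", "hole", "zero"


@dataclass
class ReadOnlyStatusCache:
    """Hypothetical cache that only remembers extents reported as 'data'."""

    # List of (start, end) byte ranges, end exclusive.
    data_extents: list = field(default_factory=list)

    def record(self, offset, length, status):
        # Treating a region as 'data' is always safe: at worst we read zeros.
        # 'hole' and 'zero' are deliberately dropped, because another writer
        # may turn them into data after we asked.
        if status == DATA:
            self.data_extents.append((offset, offset + length))

    def lookup(self, offset):
        for start, end in self.data_extents:
            if start <= offset < end:
                return DATA
        return None  # unknown: the caller must ask the server again


cache = ReadOnlyStatusCache()
cache.record(0, 4096, DATA)
cache.record(4096, 4096, HOLE)
```

With this policy a lookup inside the first extent answers 'data' from the
cache, while a lookup inside the second extent returns None and forces a fresh
block status request.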

Even for a reading client, you don't need to access the same place on the
disk twice to hit this.  A single access is already racy, because the data can
change between BLOCK_STATUS and READ.  The same thing happens in the file
plugin and in any other plugin whose backing store someone else can access.
I don't think it is designed for concurrent access.  Or is it?
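
To make the window between BLOCK_STATUS and READ concrete, here is a small
Python sketch.  FakeServer, copy_block() and the "between" hook are
hypothetical names made up for this illustration, and the in-memory dict only
stands in for a real NBD export.  It assumes a client that skips the read
whenever the status says 'hole'.

```python
class FakeServer:
    """Hypothetical in-memory stand-in for an NBD export."""

    def __init__(self):
        self.blocks = {}  # offset -> bytes, only for written blocks

    def block_status(self, offset):
        return "data" if offset in self.blocks else "hole"

    def read(self, offset, length):
        return self.blocks.get(offset, bytes(length))

    def write(self, offset, data):
        self.blocks[offset] = data


def copy_block(server, offset, length, between=None):
    """Ask for the status, then read only if the block is not a hole."""
    status = server.block_status(offset)
    if between is not None:
        between()  # simulates another writer acting inside the window
    if status == "hole":
        return bytes(length)  # read skipped, zeros assumed
    return server.read(offset, length)


server = FakeServer()
# A second writer fills the block after the status query but before the read.
result = copy_block(server, 0, 4, between=lambda: server.write(0, b"abcd"))
```

Here result is four zero bytes even though the server now holds b"abcd" at
offset 0, and no caching was involved at all.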

>--
>Eric Blake, Principal Software Engineer
>Red Hat, Inc.           +1-919-301-3226
>Virtualization:  qemu.org | libvirt.org
>
