[Libguestfs] New extents structure proposal

Thu Mar 21 14:10:57 UTC 2019

On Thu, Mar 21, 2019 at 09:01:26AM -0500, Eric Blake wrote:
>On 3/21/19 7:18 AM, Martin Kletzander wrote:
>
>>> For a read-only client: caching 'data' regions is okay, caching 'zero'
>>> or 'hole' regions is bad (because even though you are not modifying the
>>> image, another writer might be; demoting 'hole' to 'data' is safe - it
>>> merely pessimizes into a read() instead of skipping work; but caching
>>> 'hole' that is later promoted to 'data' is wrong - it can cause data
>>> loss if the client doesn't read the actual data).
>>>
>>> For a writing client: either you are an exclusive writer (and should
>>> know what you wrote, so the cache is the fact that you changed the state
>>> yourself) or you are on a cluster filesystem (at which point, your
>>> cluster system better have its own rules for how to resolve races
>>> inherent in multiple writers, where you shouldn't be relying on block
>>> status but on the cluster protocol in the first place).
>>>
>>
>> Even for the reading client, you don't need to access a place on the
>> disk twice; even one access is racy because there can be a change between
>> BLOCK_STATUS and READ.  And that same thing happens in the plugins for
>> files and everything that someone else can access.  I don't think it is
>> designed for concurrent access.  Or is it?
>
>Indeed, there is always a TOCTTOU race when you rely on block status if
>there is ever a concurrent writer.  But, is it dangerous?  Without a
>block status, we can have either:
>
>reader           writer
>------------------------
>read sector X as A
>                 write sector X as B
>
>or:
>
>reader           writer
>------------------------
>                 write sector X as B
>read sector X as B
>
>With the two steps (block status, then read), we have one of:
>
>reader           writer
>------------------------
>learn sector X has status Y
>read sector X as A
>                 write sector X as B
>
>reader           writer
>------------------------
>learn sector X has status Y
>                 write sector X as B
>read sector X as B
>
>reader           writer
>------------------------
>                 write sector X as B
>learn sector X has status Y
>read sector X as B
>
>where the shortcut is that if the reader sees status 'hole', it skips a
>read. Had it done the read in spite of learning about a hole, it would
>either see all 0s (contents A - but that's no different than read
>winning the race without a status check), or the new content (contents B
>just written by the writer - proving the hole status is out of date, but
>no different from losing the race without a status check).
>
>In general, trying to copy an image while it is being modified is not
>going to work reliably; the main point of block status is to make
>copying more efficient, but when copying, you are assuming no concurrent
>modifications.
>

Yes, basically block status should not be used as a means of communication
between clients.  So we are on the same page; I just wanted to confirm that.
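
To make the three two-step interleavings above concrete, here is a minimal
Python sketch.  Everything in it is hypothetical and for illustration only:
the Disk class, copy_sector() and the writer_at parameter are made-up names,
not part of nbdkit or the NBD protocol, and the "disk" is just a dict where
a missing key stands for a hole.

```python
SECTOR = 512
ZERO = b"\0" * SECTOR   # contents A: a hole reads back as zeros
NEW = b"B" * SECTOR     # contents B: what the concurrent writer stores


class Disk:
    """Toy sparse disk (hypothetical): a missing key means 'hole'."""

    def __init__(self):
        self.sectors = {}

    def block_status(self, n):
        return "data" if n in self.sectors else "hole"

    def read(self, n):
        return self.sectors.get(n, ZERO)

    def write(self, n, buf):
        self.sectors[n] = buf


def copy_sector(disk, n, writer_at):
    """Copy sector n using the 'skip the read on a hole' shortcut.

    writer_at selects where the concurrent write lands:
      0 = before the status query
      1 = between the status query and the read
      2 = after the read
    """
    def maybe_write(step):
        if step == writer_at:
            disk.write(n, NEW)

    maybe_write(0)
    status = disk.block_status(n)
    maybe_write(1)
    # The shortcut: trust the status and do not issue a read for a hole.
    data = ZERO if status == "hole" else disk.read(n)
    maybe_write(2)
    return status, data


if __name__ == "__main__":
    for writer_at in range(3):
        status, data = copy_sector(Disk(), 0, writer_at)
        print(writer_at, status, "B" if data == NEW else "A (zeros)")
```

In this sketch the reader ends up with B only when the write lands before the
status query; in the other two cases it ends up with A.  The case where the
write lands between the status query and the skipped read yields stale zeros,
which is the same result a plain read would give if it had won the race.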

>--
>Eric Blake, Principal Software Engineer
>Red Hat, Inc.           +1-919-301-3226
>Virtualization:  qemu.org | libvirt.org
>
