[Libguestfs] [nbdkit PATCH v2] Introduce cacheextents filter

Eric Blake eblake at redhat.com
Mon May 20 14:08:02 UTC 2019


On 5/20/19 8:23 AM, Martin Kletzander wrote:

>>> +  if (cache_extents &&
>>> +      offset >= cache_start && offset < cache_end) {
>>> +    nbdkit_debug ("cacheextents: returning from cache");
>>> +    return cacheextents_add (extents, err);
>>> +  }
>>> +
>>> +  nbdkit_debug ("cacheextents: cache miss");
>>> +  int r = next_ops->extents (nxdata, count, offset, flags, extents,
>>> +                             err);
>>
>> This is a bit pessimistic.  Observe:
>> request A (offset=0, count=1G) populates extents (0-.5G data, .5G-1.5G
>> hole)
>> request B (offset=.5G, count=1G) serviced from the cache
>> compared to:
>> request A (offset=0, count=1G) populates extents (0-.5G data, .5G-1.0G
>> hole)
>> request B (offset=.5G, count=1G) treated as cache miss
>>
>> It should be possible to note that request B overlaps with the cache,
>> and to therefore do a three-step action: fill 'extents' with the tail of
>> the cached extents (filling .5-1.0G), call next_ops->extents for the
>> remainder (1.0G-wherever), extend the cache to fill out the additional
>> information learned.
>>
> 
> I'm not quite sure what you mean.  Are you suggesting keeping the
> previous information and just updating it with the new results?  If
> yes, then my response would be yes, sure, I can do that, but...

Okay, re-reading the code, you aren't checking whether offset+count lies
within the cache, but are instead truncating your answer back to the
client to the amount that was previously cached, and relying on the
client's next request to likely start at the first byte that was not
cached. So at most, you end up splitting what could have been a
contiguous answer that crossed the cache_end boundary from one extent
reply into two.  Still, I wonder whether clearing the REQ_ONE flag
before calling into next_ops->extents(), or even calling
next_ops->extents() with count widened to a larger value so as to learn
as much information as possible in one pass, could result in fewer
calls into next_ops.
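
For illustration, here is a minimal sketch of the flag-clearing idea
(not from the patch, and it assumes the reply to a client that did
request REQ_ONE is trimmed appropriately before going back out):

/* Sketch only: ask the plugin for as much extent information as it is
 * willing to give in one pass by clearing the client's REQ_ONE flag.
 * Widening 'count' would work similarly, but would also need clamping
 * to the export size (not shown).
 */
uint32_t f = flags & ~NBDKIT_FLAG_REQ_ONE;

if (next_ops->extents (nxdata, count, offset, f, extents, err) == -1)
    return -1;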

> 
> 1) I would have to change the way the cached data are stored

No, you'd still store the cache in the same way.  It is more a matter of
noting that what you have stored has a start and end boundary and that
the incoming request partially overlaps it, so you can use the cached
values for the first half and limit the call into the actual plugin to
the second half.  Something like:

bool append = false;
if (cache_extents &&
    offset >= cache_start && offset < cache_end) {
    /* Partially or completely cached */
    if (cacheextents_add (extents, err) == -1)
        return -1;
    if (offset + count <= cache_end)
        return 0;
    count -= cache_end - offset;
    offset = cache_end;
    append = true;
}

/* Complete miss, or the uncached tail of a partial hit */
if (next_ops->extents (nxdata, count, offset, flags, extents, err) == -1)
    return -1;

return cacheextents_fill (extents, append, err);

and maybe even consider passing in a larger count to next_ops->extents,
or even intentionally clearing the REQ_ONE flag before calling into
next_ops, if our goal is to get as much information as feasible from
next_ops for saving in our cache.
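
I don't remember exactly how your patch fills the cache, and the
cacheextents_fill()/append signature above is just how I imagined it,
but with the nbdkit_extents_count()/nbdkit_get_extent() iteration API a
rough version could look like the following (the flat-array storage and
the omission of locking are purely illustrative):

/* Purely illustrative storage; the real filter may differ. */
static struct nbdkit_extent *cache;  /* cached extents, sorted by offset */
static size_t cache_len;             /* number of cached entries */

static int
cacheextents_fill (struct nbdkit_extents *extents, bool append, int *err)
{
    size_t n = nbdkit_extents_count (extents);
    size_t len = append ? cache_len : 0;
    struct nbdkit_extent *p;

    if (n == 0)
        return 0;                    /* nothing new was learned */

    p = realloc (cache, (len + n) * sizeof *p);
    if (p == NULL) {
        *err = errno;
        nbdkit_error ("realloc: %m");
        return -1;
    }
    cache = p;

    for (size_t i = 0; i < n; i++) {
        struct nbdkit_extent e = nbdkit_get_extent (extents, i);

        /* When appending after a partial hit, skip entries that were
         * already in the cache and were copied out by cacheextents_add.
         */
        if (append && e.offset < cache_end)
            continue;
        cache[len++] = e;
    }
    cache_len = len;

    if (!append)
        cache_start = cache[0].offset;
    cache_end = cache[cache_len - 1].offset + cache[cache_len - 1].length;

    return 0;
}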

> 2) Make sure that when updating the data nothing is rewritten
> 3) When returning data from cache check for continuity
> 
> and when we're at that, we can also:
> 
> 4) Make the responses to write-actions more granular instead of just
>    throwing away the cache.

If the point of the cache is to speed up one-time read passes, we aren't
writing anyway.  I think invalidating the entire cache on writes is just
fine; the focus is on making reads faster when there are no competing
writes.
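
For the record, wiring that up is trivial; something along these lines
(the callback names here are my guesses, not necessarily what your
patch uses):

/* Any write-like operation simply resets the cached range; an empty
 * range means "nothing cached".  The same call would go into trim and
 * zero as well.
 */
static void
cacheextents_invalidate (void)
{
    cache_start = cache_end = 0;
}

static int
cacheextents_pwrite (struct nbdkit_next_ops *next_ops, void *nxdata,
                     void *handle, const void *buf, uint32_t count,
                     uint64_t offset, uint32_t flags, int *err)
{
    cacheextents_invalidate ();
    return next_ops->pwrite (nxdata, buf, count, offset, flags, err);
}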

> 
> Of course, none of the above is impossible or blocking, I just don't
> think it would provide much of an added value, since most of the time
> the requests would be sequential and if there is a quick response (or
> the client is not using req_one), this would not speed things up.  So
> this is quite a niche filter for now.

So maybe it's not worth the effort of trying anything fancier; this was
more a question to at least make us think about it.  I'm fine if we go
with your initial approach as the version we check in first.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
