[Libguestfs] [PATCH nbdkit 0/2] Rewrite xz plugin as a filter.

Wed Nov 21 16:29:55 UTC 2018

On 11/21/18 10:17 AM, Richard W.M. Jones wrote:

> Actually I think we are going to need to retain the block cache.  It
> solves a slightly different problem from placing the cache filter on
> top (in fact both are useful).
> 
> Let's say you have an XZ file with a 100,000 byte block size.  Then
> reading two ranges at 0-1000 and 1000-2000 would result in reading and
> uncompressing the same whole block twice.  The block cache in the xz
> plugin/filter avoids this; the cache on top does not.
> 
> Interesting factoid: www.mirrorsite.org rapidly throttles any
> connection that makes repeated range requests ...  However if you open
> a new connection it is unaffected by the throttling on the existing
> connection (I thought it would throttle based on IP address).  Anyway
> this, combined with the large block size in the Fedora Cloud image,
> makes xz + curl virtually unusable.
> 
> I also think the new filter would be better if it made larger reads.
> The plugin makes 8K reads (BUFSIZ) which is likely reasonable for
> reading from a local file.  But the overhead of reading from the curl
> plugin probably makes much larger reads sensible.  I wonder if the
> filter can intuit a good block size to use somehow?
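
To make the 0-1000 / 1000-2000 example concrete, here is a rough toy
model in Python.  It is not the xz plugin's actual code: BlockReader,
its single-entry cache, and the slice that stands in for the xz decode
are all made up for illustration.  It only counts how many times a
whole block has to be decoded.

```python
BLOCK_SIZE = 100_000  # uncompressed block size from the example above


class BlockReader:
    """Toy model: a read must 'decompress' each whole block it touches."""

    def __init__(self, data, use_cache):
        self.data = data
        self.use_cache = use_cache
        self.cached_index = None   # index of the single cached block
        self.cached_block = b""
        self.decompressions = 0    # number of whole blocks decoded so far

    def _decompress_block(self, index):
        # Stand-in for the expensive fetch + xz decode of one block.
        self.decompressions += 1
        start = index * BLOCK_SIZE
        return self.data[start:start + BLOCK_SIZE]

    def _get_block(self, index):
        if self.use_cache and index == self.cached_index:
            return self.cached_block
        block = self._decompress_block(index)
        if self.use_cache:
            self.cached_index, self.cached_block = index, block
        return block

    def pread(self, count, offset):
        out = bytearray()
        while count > 0 and offset < len(self.data):
            index, within = divmod(offset, BLOCK_SIZE)
            chunk = self._get_block(index)[within:within + count]
            out += chunk
            offset += len(chunk)
            count -= len(chunk)
        return bytes(out)


data = bytes(range(256)) * 1000          # 256,000 bytes, three blocks
plain = BlockReader(data, use_cache=False)
cached = BlockReader(data, use_cache=True)
for reader in (plain, cached):
    reader.pread(1000, 0)
    reader.pread(1000, 1000)
print(plain.decompressions, cached.decompressions)
```

Both reads fall inside block 0, so the uncached reader decodes that
block once per request while the cached reader decodes it once in
total.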

Yes, we need to revisit adding block size support to nbdkit: a filter
could easily optimize based on the preferred block size of the layer
below it, while possibly advertising a different block size up to the
client.  The existing nbdkit-blocksize-filter would then gain some
smarts, making it more useful for controlling sizes between layers
(which again raises the question of whether we should improve nbdkit
filters so that the same filter can be used more than once on a single
plugin).
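
As a rough sketch of what "optimizing based on the preferred block
size of the lower layer" could look like, the toy Python below widens
each client read to whole multiples of a block size before passing it
down.  Everything here is hypothetical: align_request, AligningLayer
and the preferred_blocksize parameter are names invented for the
example, and nbdkit has no API today for a filter to ask the layer
below for such a value.

```python
def align_request(offset, count, blocksize):
    """Widen [offset, offset + count) to whole multiples of blocksize."""
    start = (offset // blocksize) * blocksize
    end = -(-(offset + count) // blocksize) * blocksize  # round up
    return start, end - start


class AligningLayer:
    """Hypothetical layer that sends only block-aligned reads downward."""

    def __init__(self, lower_pread, lower_size, preferred_blocksize):
        self.lower_pread = lower_pread      # callable(count, offset) -> bytes
        self.lower_size = lower_size        # size of the lower layer in bytes
        self.blocksize = preferred_blocksize
        self.requests = []                  # (offset, count) sent downward

    def pread(self, count, offset):
        start, length = align_request(offset, count, self.blocksize)
        length = min(length, self.lower_size - start)  # stay within the device
        self.requests.append((start, length))
        data = self.lower_pread(length, start)
        skip = offset - start
        return data[skip:skip + count]      # trim back to what was asked for


backing = bytes(i % 251 for i in range(10_000))
layer = AligningLayer(lambda count, offset: backing[offset:offset + count],
                      len(backing), 4096)
layer.pread(100, 5000)
print(layer.requests)
```

The client still sees a 100 byte read at offset 5000, but the lower
layer only ever sees one aligned 4096 byte request, which is the kind
of size translation between layers described above.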

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org