[Libguestfs] [PATCH nbdkit v2 4/4] cache: Implement cache-max-size and method of reclaiming space from the cache.

Richard W.M. Jones rjones at redhat.com
Thu Jan 3 11:50:32 UTC 2019


On Wed, Jan 02, 2019 at 09:39:10PM -0600, Eric Blake wrote:
> > +The default thresholds are high 95% and low 80%.  The thresholds are
> > +expressed as percentages of C<cache-max-size>, and may be greater than
> > +100.
> 
> Do the thresholds support fractional percentages, like 90.5, or must
> they be integral?  Should we enforce that low threshold < high
> threshold?  What does it really mean to have a threshold > 100 - that we
> can never reach the high threshold?

We don't support fractional percentages, but we could, although it
seems like overengineering (what's the real difference between a
threshold of 50% and 50.1%?).  We do enforce low < high.  A
percentage > 100% simply means a higher threshold (eg. low = 160%,
high = 190% behaves the same as the defaults if cache-max-size is halved).

In the next version I have clarified and simplified the wording in the
manual.
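
For illustration, the thresholds translate into byte limits roughly
like this (the names below are placeholders, not necessarily what the
patch uses):

#include <stdint.h>

static uint64_t cache_max_size;                 /* cache-max-size=... */
static unsigned hi_thresh = 95, lo_thresh = 80; /* percentages */

/* Reclaiming starts once the allocated size exceeds this limit ... */
static inline uint64_t
high_limit (void)
{
  return cache_max_size * hi_thresh / 100;
}

/* ... and stops once it drops below this one. */
static inline uint64_t
low_limit (void)
{
  return cache_max_size * lo_thresh / 100;
}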

> > +/* Do we support reclaiming cache blocks? */
> > +#ifdef FALLOC_FL_PUNCH_HOLE
> > +#define HAVE_CACHE_RECLAIM 1
> > +#else
> > +#undef HAVE_CACHE_RECLAIM
> > +#endif
> > +
> 
> Should the man page mention that max cache size can only be enforced
> with kernel support?  Do we want to go further and probe whether
> FALLOC_FL_PUNCH_HOLE works on $TMPDIR?  (Even if the kernel supports
> punching holes, there are some filesystems that don't).

I'll add it to the manual, but it seems irksome to actually test it at
run time ...
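
If we did want to probe, something like this might work
(probe_punch_hole is only a sketch, not code from the patch; BLKSIZE
is the filter's block size):

#define _GNU_SOURCE
#include <stdbool.h>
#include <unistd.h>
#include <fcntl.h>          /* fallocate, FALLOC_FL_* */

/* Probe the freshly created, still-empty cache file to see whether
 * the filesystem under $TMPDIR can actually punch holes. */
static bool
probe_punch_hole (int fd)
{
#ifdef FALLOC_FL_PUNCH_HOLE
  /* Grow the file to one block so there is something to punch. */
  if (ftruncate (fd, BLKSIZE) == -1)
    return false;
  /* Failure (typically EOPNOTSUPP) means holes cannot be punched. */
  if (fallocate (fd, FALLOC_FL_PUNCH_HOLE|FALLOC_FL_KEEP_SIZE,
                 0, BLKSIZE) == -1)
    return false;
  return ftruncate (fd, 0) == 0;
#else
  return false;
#endif
}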

> > +  cache_allocated = statbuf.st_blocks * UINT64_C(512);
> 
> Is that calculation always going to be accurate, even when more
> disks/file systems move towards 4k minimum access?

My reading of stat(2) is that st_blocks is always expressed in
512-byte units, independent of the sector or filesystem block size.
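
For reference, the whole calculation is just this (a standalone
sketch; since st_blocks is in 512-byte units, eg. a file occupying
eight 4K filesystem blocks reports st_blocks = 64, ie. 32768 bytes):

#include <stdint.h>
#include <sys/stat.h>

static int64_t
allocated_bytes (int fd)
{
  struct stat statbuf;

  if (fstat (fd, &statbuf) == -1)
    return -1;
  return statbuf.st_blocks * INT64_C(512);
}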

> > +  nbdkit_debug ("cache: reclaiming block %" PRIu64, reclaim_blk);
> 
> Where does this prioritize clean blocks over dirty blocks?  Or is the
> .pod change inaccurate given later changes to implementation?

Right, it doesn't.  I'll fix the manual.

> > +#ifdef FALLOC_FL_PUNCH_HOLE
> > +  if (fallocate (fd, FALLOC_FL_PUNCH_HOLE|FALLOC_FL_KEEP_SIZE,
> > +                 reclaim_blk * BLKSIZE, BLKSIZE) == -1) {
> > +    nbdkit_error ("cache: reclaiming cache blocks: "
> > +                  "fallocate: FALLOC_FL_PUNCH_HOLE: %m");
> > +    return;
> > +  }
> > +#else
> > +#error "no implementation for punching holes"
> 
> This #error should be unreachable, given the definition of
> HAVE_CACHE_RECLAIM based on FALLOC_FL_PUNCH_HOLE.

I left this in deliberately, in case we later change HAVE_CACHE_RECLAIM
to implement hole punching on another platform but forget to update
this code.

> > +static void
> > +reclaim_any (void)
> > +{
> > +  /* Find the next block in the cache. */
> > +  reclaim_blk = bitmap_next (&bm, reclaim_blk+1);
> > +  if (reclaim_blk == -1)        /* wrap around */
> > +    reclaim_blk = bitmap_next (&bm, 0);
> > +
> > +  reclaim_block ();
>
> Even when reclaiming any, should we try to prioritize blocks that are
> found in the LRU cache's bm[1] but not bm[0], as we do have at least
> that bit of knowledge about LRU patterns?  Otherwise, this can end up
> claiming the most recently used block, if that happens to be the next
> block in our cycling through the entire cache.  Or is that too much
> layering violation, where it would unnecessarily tie this code to the
> current lru implementation?

I think we'd end up needing 3 states.  I'll add a to-do comment ...
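
Very roughly it might look like the following, where
lru_recently_accessed() stands in for whatever hook into the lru
bitmaps we end up with (it does not exist today):

/* Sketch only: prefer a block the LRU layer has not seen recently,
 * falling back to any allocated block after one full pass. */
static void
reclaim_any (void)
{
  int64_t blk, first;

  /* Next allocated block after the last reclaim point. */
  blk = bitmap_next (&bm, reclaim_blk+1);
  if (blk == -1)                 /* wrap around */
    blk = bitmap_next (&bm, 0);
  if (blk == -1)
    return;                      /* cache is completely empty */
  first = blk;

  /* Skip recently used blocks, but give up after one full pass. */
  while (lru_recently_accessed (blk)) {
    blk = bitmap_next (&bm, blk+1);
    if (blk == -1)
      blk = bitmap_next (&bm, 0);
    if (blk == first)            /* everything is hot: reclaim anyway */
      break;
  }

  reclaim_blk = blk;
  reclaim_block ();
}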

Thanks,

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org