[Libguestfs] [PATCH nbdkit 8/8] file: Implement extents.

Richard W.M. Jones rjones at redhat.com
Mon Mar 25 17:45:16 UTC 2019


On Sat, Mar 23, 2019 at 02:45:44PM -0500, Eric Blake wrote:
> On 3/20/19 5:11 PM, Richard W.M. Jones wrote:
> > This uses lseek SEEK_DATA/SEEK_HOLE to search for allocated data and
> > holes in the underlying file.
> > ---
> >  plugins/file/file.c | 139 ++++++++++++++++++++++++++++++++++++++++----
> >  1 file changed, 129 insertions(+), 10 deletions(-)
> > 
> 
> > -int file_debug_zero;            /* to enable: -D file.zero=1 */
> > +/* Any callbacks using lseek must be protected by this lock. */
> > +static pthread_mutex_t lseek_lock = PTHREAD_MUTEX_INITIALIZER;
> > +
> > +/* to enable: -D file.zero=1 */
> > +int file_debug_zero;
> >  
> >  static void
> >  file_unload (void)
> > @@ -220,6 +227,21 @@ file_close (void *handle)
> >  
> >  #define THREAD_MODEL NBDKIT_THREAD_MODEL_PARALLEL
> >  
> > +/* For block devices, stat->st_size is not the true size. */
> > +static int64_t
> > +block_device_size (int fd)
> > +{
> > +  off_t size;
> > +
> > +  size = lseek (fd, 0, SEEK_END);
> 
> Calling lseek without the lock? I'm not sure if you can guarantee that
> .size won't be called in parallel with some other .extents.
> /me reads ahead
> Oh, the caller has the lock. I might add a comment to this function that
> it expects the caller to grab the lock.

Yes I'll add a comment as it wasn't even obvious to me when I looked
back at the code.

> > +#ifdef SEEK_HOLE
> > +/* Extents. */
> > +
> > +static int
> > +file_can_extents (void *handle)
> > +{
> > +  struct handle *h = handle;
> > +  off_t r;
> > +
> > +  /* A simple test to see whether SEEK_HOLE etc is likely to work on
> > +   * the current filesystem.
> > +   */
> > +  pthread_mutex_lock (&lseek_lock);
> > +  r = lseek (h->fd, 0, SEEK_HOLE);
> > +  pthread_mutex_unlock (&lseek_lock);
> > +  if (r == -1) {
> > +    nbdkit_debug ("extents disabled: lseek: SEEK_HOLE: %m");
> > +    return 0;
> 
> In fact, if lseek() succeeds but returns the current file size, you know
> there are no holes in the file at all, in which case, it might be more
> efficient to store a variable stating that .extents should avoid lseek()
> altogether and just report the entire file as data.
> 
> Hmm - if we just make .extents return false in this case, then nbdkit
> won't advertise block status to the guest even though we have valid
> status available (knowing there are no holes is nicer than merely
> assuming there are no holes because block status was unavailable). But
> if we return true, then WE have to do that optimization ourselves. Maybe
> .can_extent should be a tri-state return, just like .can_fua, where a
> plugin can choose NBDKIT_EXTENT_NONE to force no block status
> advertisement, NBDKIT_EXTENT_EMULATE (default if .extents is missing) to
> let nbdkit do the work of reporting the entire disk as data without
> calling .extents, and NBDKIT_EXTENT_NATIVE (requires .extents, and lets
> the plugin do all the work). Then, when lseek(SEEK_HOLE) returns EOF, we
> can return NBDKIT_EXTENT_EMULATE instead of having to track our
> optimization of skipping lseek in .extents ourselves.

I'm not really sure I understand this.  However in the latest
iteration of the series can_extents is disconnected from what we
advertise back to clients.  It is solely used to decide if the server
calls .extents or bypasses it (returning fully allocated).

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/




More information about the Libguestfs mailing list