[Libguestfs] [PATCH 3/3] file: Zero for block devices on old kernels

Nir Soffer nsoffer at redhat.com
Thu Aug 2 20:15:41 UTC 2018


On Thu, Aug 2, 2018 at 10:39 PM Eric Blake <eblake at redhat.com> wrote:

> On 08/02/2018 02:05 PM, Nir Soffer wrote:
> > fallocate(FALLOC_FL_ZERO_RANGE) is supportd for block devices with
> > modern kernel, but when it is not, we fall back to manual zeroing.
> >
> > Check if the underlying file is a block device when opening the file,
> > and fall back to ioctl(BLKZEROOUT) for aligned zero requests for a
> > block device.
> >
> > +++ b/plugins/file/file.c
> > @@ -45,6 +45,7 @@
> >
> >   #if defined(__linux__) && !defined(FALLOC_FL_PUNCH_HOLE)
> >   #include <linux/falloc.h>   /* For FALLOC_FL_*, glibc < 2.18 */
> > +#include <linux/fs.h>       /* For BLKZEROOUT */
>
> Will this pick up BLKZEROOUT in all cases where it is needed?  Or do we
> need to relax the !defined(FALLOC_FL_PUNCH_HOLE), and just blindly
> include both of these headers for all Linux compilations?
>

It works on RHEL 7.5, but it should not depend on FALLOC_FL_*.


> > +static bool
> > +is_aligned(struct handle *h, uint64_t n)
> > +{
> > +  return n % h->sector_size == 0;
>
> Since we know (but the compiler doesn't) that sector_size is a power of
> 2, it is slightly faster to use bitwise math:
>   return !(n & (h->sector_size - 1))
>

Right


>
> > +#ifdef BLKSSZGET
> > +  if (h->is_block_device) {
> > +    if (ioctl (h->fd, BLKSSZGET, &h->sector_size)) {
> > +      nbdkit_error ("ioctl(BLKSSZGET): %s: %m", filename);
> > +      free (h);
> > +      return NULL;
>
> If the ioctl() fails, would it be better to just fall back...
>
> > +    }
> > +  }
> > +#else
> > +  h->sector_size = 4096;  /* Safe guess */
>
> ...to the safe guess, instead of giving up entirely?  (Might matter on a
> system with newer headers that have the macro, but where the kernel does
> not support the ioctl).
>

Good idea.


>
> > @@ -329,6 +361,20 @@ file_zero (void *handle, uint32_t count, uint64_t
> offset, int may_trim)
> >     }
> >   #endif
> >
> > +#ifdef BLKZEROOUT
> > +  /* For aligned range and block devices, we can use BLKZEROOUT. */
> > +  if (h->is_block_device && is_aligned (h, offset) && is_aligned (h,
> count)) {
>
> Since alignment is a power of 2, you can compress this as:
>
> if (h->is_block_device && is_aligned (h, offset | count)) {
>

Clever!

Richard, do you like to maintain bitwise tricks like this?


> > +    uint64_t range[2] = {offset, count};
> > +
> > +    r = ioctl (h->fd, BLKZEROOUT, &range);
> > +    if (r == 0)
> > +      return r;
> > +
> > +    nbdkit_error ("zero: %m");
> > +    return r;
>
> Are we sure that treating ALL errors as fatal is worthwhile, or should
> we still attempt to trigger a fall back to writing?
>

I think we can assume that all block devices support BLKZEROOUT.
This is what oVirt does.

Since nbdkit is more general purpose, maybe it is better to fallback to
manual zero.

> +  }
> > +#endif
> > +
> >     /* Trigger a fall back to writing */
> >     errno = EOPNOTSUPP;
> >     return r;
> >
>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.           +1-919-301-3266 <(919)%20301-3266>
> Virtualization:  qemu.org | libvirt.org
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/libguestfs/attachments/20180802/9bcc31a0/attachment.htm>


More information about the Libguestfs mailing list