[Libguestfs] [PATCH] file: Zero support for block devices and NFS 4.2

Mon Jul 30 16:50:48 UTC 2018

On Mon, Jul 30, 2018 at 6:01 PM Eric Blake <eblake at redhat.com> wrote:
...

> > @@ -42,6 +43,8 @@
>
>   #include <sys/stat.h>
> >   #include <errno.h>
> >   #include <linux/falloc.h>   /* For FALLOC_FL_* on RHEL, glibc < 2.18 */
> > +#include <sys/ioctl.h>
> > +#include <linux/fs.h>
>
> Does this need a configure-time probe to see if it exists, since it will
> break compilation on BSD systems?  Same question to linux/falloc.h.
> Actually, linux/falloc.h doesn't see any use in the current nbdkit.git;
> does this email depend on another thread being applied first?
>

Yes, this depends on
https://www.redhat.com/archives/libguestfs/2018-July/msg00083.html

I plan to protect both imports with #if defined(__linux__). Any reason to
use configure instead?

> +
> > +#ifdef FALLOC_FL_PUNCH_HOLE
> > +  /* If we can punch hole but may not trim, we can combine punching
> hole and
> > +     fallocate to zero a range. This is much more efficient than
> writing zeros
> > +     manually. */
>
> s/is/can be/ (it's two syscalls instead of one, and may not be as
> efficient as we'd like - but does indeed stand a chance of being more
> efficient than manual efforts)
>

"can be" is better, but I really mean "is typically", or "is expected to
be".

For example in imageio same change improved upload throughout by
450% with my poor NFS server, see:
https://gerrit.ovirt.org/c/92871/11//COMMIT_MSG

> > +  if (h->can_punch_hole && h->can_fallocate) {
> > +    r = do_fallocate (h->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
> > +                      offset, count);
> > +    if (r == 0) {
> > +      r = do_fallocate(h->fd, 0, offset, count);
> > +      if (r == 0)
> > +        return 0;
> > +
> > +      if (errno != EOPNOTSUPP) {
> > +        nbdkit_error ("zero: %m");
> > +        return r;
> > +      }
> > +
> > +      h->can_fallocate = false;
> > +    } else {
> > +      if (errno != EOPNOTSUPP) {
> > +        nbdkit_error ("zero: %m");
> > +        return r;
> > +      }
> > +
> > +      h->can_punch_hole = false;
> > +    }
> > +  }
> > +#endif
> > +
> > +  /* For block devices, we can use BLKZEROOUT.
> > +     NOTE: count and offset must be aligned to logical block size. */
> > +  if (h->is_block_device) {
> > +    uint64_t range[2] = {offset, count};
>
> Is it worth attempting the ioctl only when you have aligned values?
>

I think it does, but this requires getting the logical sector size
and keeping it in the handle.

Looking in plugin_zero, if we find that the offset and count are not
aligned and return EOPNOTSUPP, we will fallback to manual
zeroing for this call.

But this worries me:

549   int can_zero = 1; /* TODO cache this per-connection? */

Once can_zero is cached per connection, failing once because of
single unaligned call will prevent efficient zero for the rest of the image.

> +
> > +    r = ioctl(h->fd, BLKZEROOUT, &range);
>
> This portion of the code be conditional on whether BLKZEROOUT is defined.
>

Right.
...

Nir
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/libguestfs/attachments/20180730/0abbbf4e/attachment.htm>