[Libguestfs] [PATCH 1/3] file: Avoid unsupported fallocate() calls

Thu Aug 2 20:00:27 UTC 2018

On Thu, Aug 2, 2018 at 10:30 PM Eric Blake <eblake at redhat.com> wrote:

> On 08/02/2018 02:05 PM, Nir Soffer wrote:
> > When using file systems not supporting ZERO_RANGE (e.g. NFS 4.2) or
> > block device on kernel < 4.9, we used to call fallocate() for every
> > zero, fail with EOPNOTSUPP, and fallback to manual zeroing.  When
> > trimming, we used to try unsupported fallocate() on every call.
> >
> > Change file handle to remember if punching holes or zeroing range are
> > supported, and avoid unsupported calls.
> >
> > - can_trim changed to report the actual capability once we tried to
> >    punch a hole. I don't think this is useful yet.
>
> The documentation states that filters (or the nbdkit engine itself) are
> free to cache a per-connection value of .can_trim(), and thus the plugin
> should keep the value stable per connection - in part, because they
> affect what is advertised to the client, but the advertisement happens
> only once as part of the client's initial connection.  Changing it after
> the fact on a given connection won't change the client's behavior
> (turning the feature on is pointless as the client already remembers it
> was off; turning the feature off is detrimental as the client already
> assumes it works).  So the best you can do is make subsequent
> connections more efficient after the initial connection has learned state.
>

Sure disabling trim does not help the client. The only advantage is not
calling
trim when it does nothing.

Do you think it is better to leave the file_can_trim as is, to avoid
confusion?

>
> > - zero changed to:
> >    1. If we can punch hole and may trim, try PUNCH_HOLE
> >    2. If we can zero range, try ZERO_RANGE
> >    3. Fall back to manual writing
> >
> > - trim changed to:
> >    1. If we can punch hole, try PUNCH_HOLE
> >    2. Succeed
>
> Seems reasonable from the description.
>
> > @@ -118,6 +119,8 @@ file_config_complete (void)
> >   /* The per-connection handle. */
> >   struct handle {
> >     int fd;
> > +  bool can_punch_hole;
> > +  bool can_zero_range;
>
> Would it be better to make these tri-state rather than merely bool?
> (Indeterminate, supported, known to fail)
>

What is the advantage of having tri-state?

> >   };
> >
> >   /* Create the per-connection handle. */
> > @@ -146,6 +149,18 @@ file_open (int readonly)
> >       return NULL;
> >     }
> >
> > +#ifdef FALLOC_FL_PUNCH_HOLE
> > +  h->can_punch_hole = true;
> > +#else
> > +  h->can_punch_hole = false;
>
> If you use tri-state, then the existence of the macro results in an
> initialization of indeterminate, whereas the absence results in known to
> fail.
>
> > +#endif
> > +
> > +#ifdef FALLOC_FL_ZERO_RANGE
> > +  h->can_zero_range = true;
> > +#else
> > +  h->can_zero_range = false;
>
> Likewise.
>
> > +#endif
> > +
> >     return h;
> >   }
> >
> > @@ -189,19 +204,15 @@ file_get_size (void *handle)
> >     return statbuf.st_size;
> >   }
> >
> > +/* Trim is advisory, but we prefer to advertise it only when we can
> actually
> > + * (attempt to) punch holes. Before we tried to punch a hole, report
> true if
> > + * FALLOC_FL_PUNCH_HOLE is defined before we did the first call. Once
> we tried
> > + * to punch a hole, report the actual cappability of the underlying
> file. */
>
> s/cappability/capability/
>
> If you use a tri-state, then report true if the variable is
> indeterminate or works, false if known to fail.
>
> >   static int
> >   file_can_trim (void *handle)
> >   {
> > -  /* Trim is advisory, but we prefer to advertise it only when we can
> > -   * actually (attempt to) punch holes.  Since not all filesystems
> > -   * support all fallocate modes, it would be nice if we had a way
> > -   * from fpathconf() to definitively learn what will work on a given
> > -   * fd for a more precise answer; oh well.  */
>
> The comment about fpathconf() would still be nice to keep in some form.
>

But fpathconf does not tell anything abut this, no?

>
> > -#ifdef FALLOC_FL_PUNCH_HOLE
> > -  return 1;
> > -#else
> > -  return 0;
> > -#endif
> > +  struct handle *h = handle;
> > +  return h->can_punch_hole;
>
> I'm a bit worried about whether changing the return value within the
> context of a single connection is wise, or if we need to further
> maintain a per-connection state that is initialized according to the
> overall plugin state.
>

It seems like we should keep the current behavior.

Richard, what do you think?

>
> >   }
>
> Also, it might be nice to add a .can_zero() callback, so that nbdkit
> won't even waste time calling into .zero() if we know we will just be
> deferring right back to a full write (either because the macro is not
> available, or because we encountered a failure trying to use it on a
> previous connection).
>
> > @@ -301,27 +320,30 @@ file_flush (void *handle)
> >   static int
> >   file_trim (void *handle, uint32_t count, uint64_t offset)
> >   {
> > -  int r = -1;
> >   #ifdef FALLOC_FL_PUNCH_HOLE
> >     struct handle *h = handle;
> > +  int r = -1;
> > +
> > +  if (h->can_punch_hole) {
> > +    r = do_fallocate (h->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
> > +                      offset, count);
>
> Should we also use FALLOC_FL_NO_HIDE_STALE here, if it is available?
>

We can add h->can_leave_stale, and use it if available. But I don't think
it will give much with typical request size.

Do you think it worth the effort?

Nir
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/libguestfs/attachments/20180802/30e0de60/attachment.htm>