<div dir="ltr"><div class="gmail_quote"><div dir="ltr">On Thu, Aug 2, 2018 at 10:30 PM Eric Blake <<a href="mailto:eblake@redhat.com">eblake@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 08/02/2018 02:05 PM, Nir Soffer wrote:<br> > When using file systems not supporting ZERO_RANGE (e.g. NFS 4.2) or<br> > block device on kernel < 4.9, we used to call fallocate() for every<br> > zero, fail with EOPNOTSUPP, and fallback to manual zeroing. When<br> > trimming, we used to try unsupported fallocate() on every call.<br> > <br> > Change file handle to remember if punching holes or zeroing range are<br> > supported, and avoid unsupported calls.<br> > <br> > - can_trim changed to report the actual capability once we tried to<br> > punch a hole. I don't think this is useful yet.<br> <br> The documentation states that filters (or the nbdkit engine itself) are <br> free to cache a per-connection value of .can_trim(), and thus the plugin <br> should keep the value stable per connection - in part, because they <br> affect what is advertised to the client, but the advertisement happens <br> only once as part of the client's initial connection. Changing it after <br> the fact on a given connection won't change the client's behavior <br> (turning the feature on is pointless as the client already remembers it <br> was off; turning the feature off is detrimental as the client already <br> assumes it works). So the best you can do is make subsequent <br> connections more efficient after the initial connection has learned state.<br></blockquote><div><br></div><div>Sure disabling trim does not help the client. The only advantage is not calling</div><div>trim when it does nothing.</div><div><br></div><div>Do you think it is better to leave the file_can_trim as is, to avoid confusion?</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> > <br> > - zero changed to:<br> > 1. If we can punch hole and may trim, try PUNCH_HOLE<br> > 2. If we can zero range, try ZERO_RANGE<br> > 3. Fall back to manual writing<br> > <br> > - trim changed to:<br> > 1. If we can punch hole, try PUNCH_HOLE<br> > 2. Succeed<br> <br> Seems reasonable from the description.<br> <br> > @@ -118,6 +119,8 @@ file_config_complete (void)<br> > /* The per-connection handle. */<br> > struct handle {<br> > int fd;<br> > + bool can_punch_hole;<br> > + bool can_zero_range;<br> <br> Would it be better to make these tri-state rather than merely bool? <br> (Indeterminate, supported, known to fail)<br></blockquote><div><br></div><div>What is the advantage of having tri-state?</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> > };<br> > <br> > /* Create the per-connection handle. */<br> > @@ -146,6 +149,18 @@ file_open (int readonly)<br> > return NULL;<br> > }<br> > <br> > +#ifdef FALLOC_FL_PUNCH_HOLE<br> > + h->can_punch_hole = true;<br> > +#else<br> > + h->can_punch_hole = false;<br> <br> If you use tri-state, then the existence of the macro results in an <br> initialization of indeterminate, whereas the absence results in known to <br> fail.<br> <br> > +#endif<br> > +<br> > +#ifdef FALLOC_FL_ZERO_RANGE<br> > + h->can_zero_range = true;<br> > +#else<br> > + h->can_zero_range = false;<br> <br> Likewise.<br> <br> > +#endif<br> > +<br> > return h;<br> > }<br> > <br> > @@ -189,19 +204,15 @@ file_get_size (void *handle)<br> > return statbuf.st_size;<br> > }<br> > <br> > +/* Trim is advisory, but we prefer to advertise it only when we can actually<br> > + * (attempt to) punch holes. Before we tried to punch a hole, report true if<br> > + * FALLOC_FL_PUNCH_HOLE is defined before we did the first call. Once we tried<br> > + * to punch a hole, report the actual cappability of the underlying file. */<br> <br> s/cappability/capability/<br> <br> If you use a tri-state, then report true if the variable is <br> indeterminate or works, false if known to fail.<br> <br> > static int<br> > file_can_trim (void *handle)<br> > {<br> > - /* Trim is advisory, but we prefer to advertise it only when we can<br> > - * actually (attempt to) punch holes. Since not all filesystems<br> > - * support all fallocate modes, it would be nice if we had a way<br> > - * from fpathconf() to definitively learn what will work on a given<br> > - * fd for a more precise answer; oh well. */<br> <br> The comment about fpathconf() would still be nice to keep in some form.<br></blockquote><div><br></div><div>But fpathconf does not tell anything abut this, no?</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> <br> > -#ifdef FALLOC_FL_PUNCH_HOLE<br> > - return 1;<br> > -#else<br> > - return 0;<br> > -#endif<br> > + struct handle *h = handle;<br> > + return h->can_punch_hole;<br> <br> I'm a bit worried about whether changing the return value within the <br> context of a single connection is wise, or if we need to further <br> maintain a per-connection state that is initialized according to the <br> overall plugin state.<br></blockquote><div><br></div><div>It seems like we should keep the current behavior.</div><div><br></div><div>Richard, what do you think?</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> <br> > }<br> <br> Also, it might be nice to add a .can_zero() callback, so that nbdkit <br> won't even waste time calling into .zero() if we know we will just be <br> deferring right back to a full write (either because the macro is not <br> available, or because we encountered a failure trying to use it on a <br> previous connection).<br> <br> > @@ -301,27 +320,30 @@ file_flush (void *handle)<br> > static int<br> > file_trim (void *handle, uint32_t count, uint64_t offset)<br> > {<br> > - int r = -1;<br> > #ifdef FALLOC_FL_PUNCH_HOLE<br> > struct handle *h = handle;<br> > + int r = -1;<br> > +<br> > + if (h->can_punch_hole) {<br> > + r = do_fallocate (h->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,<br> > + offset, count);<br> <br> Should we also use FALLOC_FL_NO_HIDE_STALE here, if it is available?<br></blockquote><div><br></div><div>We can add h->can_leave_stale, and use it if available. But I don't think</div><div>it will give much with typical request size.</div><div><br></div><div>Do you think it worth the effort?</div><div><br></div><div>Nir</div><div><br></div></div></div>