<div dir="ltr"><div class="gmail_quote"><div dir="ltr">On Fri, Aug 3, 2018 at 10:28 PM Nir Soffer <<a href="mailto:nirsof@gmail.com">nirsof@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On file systems that do not yet support FALLOC_FL_ZERO_RANGE, zeroing<br>
falls back to writing zeros manually.<br>
<br>
We can avoid this by combining two fallocate calls:<br>
<br>
fallocate(FALLOC_FL_PUNCH_HOLE)<br>
fallocate(0)<br>
<br>
In my tests this is much more efficient than manual zeroing.<br>
The idea came from this qemu patch:<br>
<a href="https://github.com/qemu/qemu/commit/1cdc3239f1bb" rel="noreferrer" target="_blank">https://github.com/qemu/qemu/commit/1cdc3239f1bb</a><br>
<br>
Here is an example run with NFS 4.2 without this change, converting a<br>
Fedora 27 image created with virt-builder:<br>
<br>
$ export SOCK=/tmp/nbd.sock<br>
$ export FILE=/nfs-mount/fedora-27.img<br>
$ src/nbdkit plugins/file/.libs/nbdkit-file-plugin.so file=$FILE -U $SOCK<br>
<br>
$ time qemu-img convert -n -f raw -O raw /var/tmp/fedora-27.img nbd:unix:/tmp/nbd.sock<br>
<br>
real 0m17.481s<br>
user 0m0.199s<br>
sys 0m0.691s<br>
<br>
$ time qemu-img convert -n -f raw -O raw -W /var/tmp/fedora-27.img nbd:unix:/tmp/nbd.sock<br>
<br>
real 0m17.072s<br>
user 0m0.191s<br>
sys 0m0.738s<br>
<br>
With this change:<br>
<br>
$ time qemu-img convert -n -f raw -O raw /var/tmp/fedora-27.img nbd:unix:/tmp/nbd.sock<br>
<br>
real 0m6.285s<br>
user 0m0.217s<br>
sys 0m0.829s<br>
<br>
$ time qemu-img convert -n -f raw -O raw -W /var/tmp/fedora-27.img nbd:unix:/tmp/nbd.sock<br>
<br>
real 0m3.967s<br>
user 0m0.193s<br>
sys 0m0.702s<br>
<br>
Note: the image is sparse, but nbdkit creates a fully allocated image.<br>
This may be a bug in nbdkit or qemu-img.<br>
---<br>
plugins/file/file.c | 32 ++++++++++++++++++++++++++++++++<br>
1 file changed, 32 insertions(+)<br>
<br>
diff --git a/plugins/file/file.c b/plugins/file/file.c<br>
index 5daab63..a2cea4a 100644<br>
--- a/plugins/file/file.c<br>
+++ b/plugins/file/file.c<br>
@@ -121,6 +121,7 @@ struct handle {<br>
int fd;<br>
bool can_punch_hole;<br>
bool can_zero_range;<br>
+ bool can_fallocate;<br>
};<br>
<br>
/* Create the per-connection handle. */<br>
@@ -161,6 +162,8 @@ file_open (int readonly)<br>
h->can_zero_range = false;<br>
#endif<br>
<br>
+ h->can_fallocate = true;<br>
+<br>
return h;<br>
}<br>
<br>
@@ -301,6 +304,35 @@ file_zero (void *handle, uint32_t count, uint64_t offset, int may_trim)<br>
}<br>
#endif<br>
<br>
+#ifdef FALLOC_FL_PUNCH_HOLE<br>
+ /* If we can punch a hole but may not trim, we can combine punching a hole<br>
+ * with a plain fallocate to zero a range. This is expected to be more<br>
+ * efficient than writing zeros manually. */<br>
+ if (h->can_punch_hole && h->can_fallocate) {<br>
+ r = do_fallocate (h->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,<br>
+ offset, count);<br>
+ if (r == 0) {<br>
+ r = do_fallocate(h->fd, 0, offset, count);<br></blockquote><div><br></div><div>Space after the function name is missing here, fixing it in v3.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+ if (r == 0)<br>
+ return r;<br>
+<br>
+ if (errno != EOPNOTSUPP) {<br>
+ nbdkit_error ("zero: %m");<br>
+ return r;<br>
+ }<br>
+<br>
+ h->can_fallocate = false;<br>
+ } else {<br>
+ if (errno != EOPNOTSUPP) {<br>
+ nbdkit_error ("zero: %m");<br>
+ return r;<br>
+ }<br>
+<br>
+ h->can_punch_hole = false;<br>
+ }<br>
+ }<br>
+#endif<br>
+<br>
/* Trigger a fall back to writing */<br>
errno = EOPNOTSUPP;<br>
return r;<br>
-- <br>
2.17.1<br>
<br>
</blockquote></div></div>
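The two-call approach from the quoted commit message can be sketched outside nbdkit as a small standalone C helper. This is only an illustration under stated assumptions, not the plugin code: the names zero_by_punch_and_allocate, zero_by_writing and zero_range are hypothetical, and the per-handle can_punch_hole / can_fallocate caching from the patch is left out for brevity. As in the patch, EOPNOTSUPP is treated as the signal to fall back to writing zeros.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not nbdkit code): zero [offset, offset+len) and keep
 * it allocated by punching a hole and then allocating the same range again.
 * Returns 0 on success, -1 with errno set on failure. */
static int
zero_by_punch_and_allocate (int fd, off_t offset, off_t len)
{
  assert (len > 0);
#ifdef FALLOC_FL_PUNCH_HOLE
  /* Step 1: deallocate the range; reads of it now return zeros. */
  if (fallocate (fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                 offset, len) == -1)
    return -1;
  /* Step 2: allocate the range again; it stays zeroed. */
  return fallocate (fd, 0, offset, len);
#else
  errno = EOPNOTSUPP;
  return -1;
#endif
}

/* Fallback: write zeros manually in fixed-size chunks. */
static int
zero_by_writing (int fd, off_t offset, off_t len)
{
  char buf[4096];

  memset (buf, 0, sizeof buf);
  while (len > 0) {
    size_t n = len < (off_t) sizeof buf ? (size_t) len : sizeof buf;
    ssize_t w = pwrite (fd, buf, n, offset);
    if (w == -1)
      return -1;
    offset += w;
    len -= w;
  }
  return 0;
}

/* Try the two fallocate calls first; fall back to writing zeros only when
 * the file system reports that the operation is not supported. */
static int
zero_range (int fd, off_t offset, off_t len)
{
  if (zero_by_punch_and_allocate (fd, offset, len) == 0)
    return 0;
  if (errno != EOPNOTSUPP)
    return -1;
  return zero_by_writing (fd, offset, len);
}
```

A caller would use zero_range (fd, offset, count) wherever it currently writes a buffer of zeros. Whether the fallocate path is actually faster depends on the file system, so the timings in the quoted message should not be assumed to carry over.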