[Libguestfs] [PATCH] file: Fix zero/trim with block device

Eric Blake eblake at redhat.com
Fri Jul 27 21:23:10 UTC 2018


On 07/27/2018 04:03 PM, Nir Soffer wrote:
> When using block device on RHEL 7.5, file plugin fails to zero with this
> error (copied from strace):
> 
> [pid 39551] fallocate(8, FALLOC_FL_ZERO_RANGE, 1536, 64000) = -1 ENODEV (No such device)
> 
> This is expected error according to the manual:
> 
> ENODEV fd does not refer to a regular file or a directory.  (If fd is a
> pipe or FIFO, a different error results.)

The man page is out-of-date; newer kernels support FALLOC_FL_ZERO_RANGE 
on block devices, in a manner that leaves the pages marked allocated 
(that is, no discard happens). The kernel also supports 
FALLOC_FL_PUNCH_HOLE to guarantee a read-zeroes result (using discard, 
if that works), and FALLOC_FL_PUNCH_HOLE|FALLOC_FL_NO_HIDE_STALE (which 
only discards, and no longer guarantees a read-zeroes result).  But you 
are also right that in all cases, the errno returned when the operation 
cannot be supported is not always EOPNOSUPP.

> 
> Treat this error as EOPNOSUPP.
> 
> Tested only on Fedora 28; I don't know how to build nbdkit on RHEL, but
> the change is pretty trivial.
> ---
>   plugins/file/file.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/plugins/file/file.c b/plugins/file/file.c
> index b6e33de..a7c07fb 100644
> --- a/plugins/file/file.c
> +++ b/plugins/file/file.c
> @@ -243,7 +243,7 @@ file_zero (void *handle, uint32_t count, uint64_t offset, int may_trim)
>     if (may_trim) {
>       r = fallocate (h->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
>   		   offset, count);
> -    if (r == -1 && errno != EOPNOTSUPP) {
> +    if (r == -1 && errno != EOPNOTSUPP && errno != ENODEV) {
>         nbdkit_error ("zero: %m");
>       }
>       /* PUNCH_HOLE is older; if it is not supported, it is likely that

Given the recent thread on qemu about the different fallocate flags in 
relation to block devices, we may need to revisit the logic in this 
function to more closely follow the flags (use FALLOC_FL_ZERO_RANGE when 
may_trim is false, and FALLOC_FL_PUNCH_HOLE when it is true; as well as 
tweak the .discard() callback to use 
FALLOC_FL_PUNCH_HOLE|FALLOC_FL_NO_HIDE_STALE).

https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg05135.html

> @@ -254,7 +254,7 @@ file_zero (void *handle, uint32_t count, uint64_t offset, int may_trim)
>   
>   #ifdef FALLOC_FL_ZERO_RANGE
>     r = fallocate (h->fd, FALLOC_FL_ZERO_RANGE, offset, count);
> -  if (r == -1 && errno != EOPNOTSUPP) {
> +  if (r == -1 && errno != EOPNOTSUPP && errno != ENODEV) {
>       nbdkit_error ("zero: %m");
>     }
>   #else

Are we sure that the caller still sees the correct EOPNOTSUPP errno 
value that it uses in deciding whether to attempt the manual fallbacks? 
You may need to explicitly set errno.

> @@ -288,11 +288,11 @@ file_trim (void *handle, uint32_t count, uint64_t offset)
>     struct handle *h = handle;
>   
>     /* Trim is advisory; we don't care if it fails for anything other
> -   * than EIO or EPERM. */
> +   * than EIO, EPERM, or ENODEV (kernel 3.10) */
>     r = fallocate (h->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
>                    offset, count);
>     if (r < 0) {
> -    if (errno != EPERM && errno != EIO) {
> +    if (errno != EPERM && errno != EIO && errno != ENODEV) {

This last hunk looks wrong.  We want to ignore ENODEV errors in this 
code path (the same way we are already ignoring EOPNOTSUPP errors).

>         nbdkit_debug ("ignoring failed fallocate during trim: %m");
>         r = 0;
>       }
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




More information about the Libguestfs mailing list