[libvirt] [PATCH 6/8] save: let iohelper work on O_DIRECT fds

Daniel P. Berrange berrange at redhat.com
Tue Jul 19 16:20:33 UTC 2011


On Thu, Jul 14, 2011 at 08:24:33AM -0600, Eric Blake wrote:
> Required for a coming patch where iohelper will operate on O_DIRECT
> fds.  There, the user-space memory must be aligned to file system
> boundaries (at least page-aligned works best, and some file systems
> prefer 64k).  Made tougher by the fact that VIR_ALLOC won't work
> on void *, but posix_memalign won't work on char * and isn't
> available everywhere.
> 
> This patch makes some simplifying assumptions - namely, output
> to an O_DIRECT fd will only be attempted on an empty seekable
> file (hence, no need to worry about preserving existing data
> on a partial block, and ftruncate will work to undo the effects
> of having to round up the size of the last block written).
> 
> * configure.ac (AC_CHECK_FUNCS_ONCE): Check for posix_memalign.
> * src/util/iohelper.c (runIO): Use aligned memory, and handle
> quirks of O_DIRECT on last write.
> ---
> 
> This one took me the longest time to get right, so a careful
> review is appreciated.
> 
>  configure.ac        |    6 +++---
>  src/util/iohelper.c |   45 ++++++++++++++++++++++++++++++++++++++++++---
>  2 files changed, 45 insertions(+), 6 deletions(-)
> 
> diff --git a/configure.ac b/configure.ac
> index e9d5be4..9e39f44 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -121,9 +121,9 @@ AC_CHECK_SIZEOF([long])
> 
>  dnl Availability of various common functions (non-fatal if missing),
>  dnl and various less common threadsafe functions
> -AC_CHECK_FUNCS_ONCE([cfmakeraw regexec sched_getaffinity getuid getgid \
> - geteuid initgroups posix_fallocate mmap kill \
> - getmntent_r getgrnam_r getpwuid_r])
> +AC_CHECK_FUNCS_ONCE([cfmakeraw geteuid getgid getgrnam_r getmntent_r \
> +  getpwuid_r getuid initgroups kill mmap posix_fallocate posix_memalign \
> +  regexec sched_getaffinity])
> 
>  dnl Availability of pthread functions (if missing, win32 threading is
>  dnl assumed).  Because of $LIB_PTHREAD, we cannot use AC_CHECK_FUNCS_ONCE.
> diff --git a/src/util/iohelper.c b/src/util/iohelper.c
> index 1185bde..82f62e1 100644
> --- a/src/util/iohelper.c
> +++ b/src/util/iohelper.c
> @@ -74,17 +74,32 @@ cleanup:
>  static int
>  runIO(const char *path, int fd, int oflags, unsigned long long length)
>  {
> -    char *buf = NULL;
> +    void *base = NULL; /* Location to be freed */
> +    char *buf = NULL; /* Aligned location within base */
>      size_t buflen = 1024*1024;
> +    intptr_t alignMask = 64*1024 - 1;
>      int ret = -1;
>      int fdin, fdout;
>      const char *fdinname, *fdoutname;
>      unsigned long long total = 0;
> +    bool direct = O_DIRECT && ((oflags & O_DIRECT) != 0);
> +    bool shortRead = false; /* true if we hit a short read */
> +    off_t end = 0;
> 
> -    if (VIR_ALLOC_N(buf, buflen) < 0) {
> +#if HAVE_POSIX_MEMALIGN
> +    if (posix_memalign(&base, alignMask + 1, buflen)) {
>          virReportOOMError();
>          goto cleanup;
>      }
> +    buf = base;
> +#else
> +    if (VIR_ALLOC_N(buf, buflen + alignMask) < 0) {
> +        virReportOOMError();
> +        goto cleanup;
> +    }
> +    base = buf;
> +    buf = (char *) (((intptr_t) base + alignMask) & alignMask);
> +#endif
> 
>      switch (oflags & O_ACCMODE) {
>      case O_RDONLY:
> @@ -98,6 +113,13 @@ runIO(const char *path, int fd, int oflags, unsigned long long length)
>          fdinname = "stdin";
>          fdout = fd;
>          fdoutname = path;
> +        /* To make the implementation simpler, we give up on any
> +         * attempt to use O_DIRECT in a non-trivial manner.  */
> +        if (direct && (end = lseek(fd, 0, SEEK_END)) != 0) {
> +            virReportSystemError(end < 0 ? errno : EINVAL, "%s",
> +                                 _("O_DIRECT needs empty seekable file"));
> +            goto cleanup;
> +        }
>          break;
> 
>      case O_RDWR:
> @@ -124,12 +146,29 @@ runIO(const char *path, int fd, int oflags, unsigned long long length)
>          }
>          if (got == 0)
>              break; /* End of file before end of requested data */
> +        if (got < buflen || (buflen & alignMask)) {
> +            /* O_DIRECT can handle at most one short read, at end of file */
> +            if (direct && shortRead) {
> +                virReportSystemError(EINVAL, "%s",
> +                                     _("Too many short reads for O_DIRECT"));
> +            }
> +            shortRead = true;
> +        }
> 
>          total += got;
> +        if (fdout == fd && direct && shortRead) {
> +            end = total;
> +            memset(buf + got, 0, buflen - got);
> +            got = (got + alignMask) & ~alignMask;
> +        }
>          if (safewrite(fdout, buf, got) < 0) {
>              virReportSystemError(errno, _("Unable to write %s"), fdoutname);
>              goto cleanup;
>          }
> +        if (end && ftruncate(fd, end) < 0) {
> +            virReportSystemError(errno, _("Unable to truncate %s"), fdoutname);
> +            goto cleanup;
> +        }
>      }
> 
>      ret = 0;
> @@ -141,7 +180,7 @@ cleanup:
>          ret = -1;
>      }
> 
> -    VIR_FREE(buf);
> +    VIR_FREE(base);
>      return ret;
>  }

ACK,

I'm wondering if we should move the guts of iohelper out into a function
in src/util which we can unit test. Or perhaps its easy enough to just
unit test the fdstream code and thus iohelper


Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|




More information about the libvir-list mailing list