[libvirt RFCv4 01/20] iohelper: introduce new struct to carry copy operation parameters
Claudio Fontana
cfontana at suse.de
Fri Apr 29 09:24:29 UTC 2022
On 4/28/22 3:01 PM, Daniel P. Berrangé wrote:
> On Wed, Apr 27, 2022 at 11:13:20PM +0200, Claudio Fontana wrote:
>> this is in preparation for a minor refactoring of the copy
>> function itself out of runIO().
>>
>> Signed-off-by: Claudio Fontana <cfontana at suse.de>
>> ---
>> src/util/iohelper.c | 95 +++++++++++++++++++++++++--------------------
>> 1 file changed, 53 insertions(+), 42 deletions(-)
>>
>> diff --git a/src/util/iohelper.c b/src/util/iohelper.c
>> index 2c91bf4f93..c13746a547 100644
>> --- a/src/util/iohelper.c
>> +++ b/src/util/iohelper.c
>> @@ -45,6 +45,16 @@
>> # define O_DIRECT 0
>> #endif
>>
>> +struct runIOParams {
>> + bool isBlockDev;
>> + bool isDirect;
>> + bool isWrite;
>> + int fdin;
>> + const char *fdinname;
>> + int fdout;
>> + const char *fdoutname;
>> +};
>> +
>> static int
>> runIO(const char *path, int fd, int oflags)
>> {
>> @@ -53,13 +63,9 @@ runIO(const char *path, int fd, int oflags)
>> size_t buflen = 1024*1024;
>> intptr_t alignMask = 64*1024 - 1;
>> int ret = -1;
>> - int fdin, fdout;
>> - const char *fdinname, *fdoutname;
>> - unsigned long long total = 0;
>> - bool direct = O_DIRECT && ((oflags & O_DIRECT) != 0);
>> - off_t end = 0;
>> + off_t total = 0;
>> struct stat sb;
>> - bool isBlockDev = false;
>> + struct runIOParams p;
>>
>> #if WITH_POSIX_MEMALIGN
>> if (posix_memalign(&base, alignMask + 1, buflen))
>> @@ -77,34 +83,23 @@ runIO(const char *path, int fd, int oflags)
>> fd, path);
>> goto cleanup;
>> }
>> - isBlockDev = S_ISBLK(sb.st_mode);
>> + p.isBlockDev = S_ISBLK(sb.st_mode);
>> + p.isDirect = O_DIRECT && (oflags & O_DIRECT);
>>
>> switch (oflags & O_ACCMODE) {
>> case O_RDONLY:
>> - fdin = fd;
>> - fdinname = path;
>> - fdout = STDOUT_FILENO;
>> - fdoutname = "stdout";
>> - /* To make the implementation simpler, we give up on any
>> - * attempt to use O_DIRECT in a non-trivial manner. */
>> - if (!isBlockDev && direct && ((end = lseek(fd, 0, SEEK_CUR)) != 0)) {
>> - virReportSystemError(end < 0 ? errno : EINVAL, "%s",
>> - _("O_DIRECT read needs entire seekable file"));
>> - goto cleanup;
>> - }
>> + p.isWrite = false;
>> + p.fdin = fd;
>> + p.fdinname = path;
>> + p.fdout = STDOUT_FILENO;
>> + p.fdoutname = "stdout";
>> break;
>> case O_WRONLY:
>> - fdin = STDIN_FILENO;
>> - fdinname = "stdin";
>> - fdout = fd;
>> - fdoutname = path;
>> - /* To make the implementation simpler, we give up on any
>> - * attempt to use O_DIRECT in a non-trivial manner. */
>> - if (!isBlockDev && direct && (end = lseek(fd, 0, SEEK_END)) != 0) {
>> - virReportSystemError(end < 0 ? errno : EINVAL, "%s",
>> - _("O_DIRECT write needs empty seekable file"));
>> - goto cleanup;
>> - }
>> + p.isWrite = true;
>> + p.fdin = STDIN_FILENO;
>> + p.fdinname = "stdin";
>> + p.fdout = fd;
>> + p.fdoutname = path;
>> break;
>>
>> case O_RDWR:
>> @@ -114,6 +109,22 @@ runIO(const char *path, int fd, int oflags)
>> (oflags & O_ACCMODE));
>> goto cleanup;
>> }
>> + /* To make the implementation simpler, we give up on any
>> + * attempt to use O_DIRECT in a non-trivial manner. */
>> + if (!p.isBlockDev && p.isDirect) {
>> + off_t off;
>> + if (p.isWrite) {
>> + if ((off = lseek(fd, 0, SEEK_END)) != 0) {
>> + virReportSystemError(off < 0 ? errno : EINVAL, "%s",
>> + _("O_DIRECT write needs empty seekable file"));
>> + goto cleanup;
>> + }
>> + } else if ((off = lseek(fd, 0, SEEK_CUR)) != 0) {
>> + virReportSystemError(off < 0 ? errno : EINVAL, "%s",
>> + _("O_DIRECT read needs entire seekable file"));
>> + goto cleanup;
>> + }
>
> I'm puzzelled about the current code doing SEEK_END in one case
> and SEEK_CUR in the other case. I think this method should simply
> use SEEK_CUR only as a sanity check for being ablt to use O_DIRECT.
I think the answer is in the original commit:
commit 12291656b135ed788e41dadbd2d15e613ddea9b5
Author: Eric Blake <eblake at redhat.com>
Date: Tue Jul 12 08:35:05 2011 -0600
save: let iohelper work on O_DIRECT fds
Required for a coming patch where iohelper will operate on O_DIRECT
fds. There, the user-space memory must be aligned to file system
boundaries (at least 512, but using page-aligned works better, and
some file systems prefer 64k). Made tougher by the fact that
VIR_ALLOC won't work on void *, but posix_memalign won't work on
char * and isn't available everywhere.
This patch makes some simplifying assumptions - namely, output
to an O_DIRECT fd will only be attempted on an empty seekable
file (hence, no need to worry about preserving existing data
on a partial block, and ftruncate will work to undo the effects
of having to round up the size of the last block written), and
input from an O_DIRECT fd will only be attempted on a complete
seekable file with the only possible short read at EOF.
----
Here the relevant part being "empty seekable file".
The check not only ensures the file is seekable, but also double checks that it is empty.
I can add a comment to make it explicit.
>
> If we actually need to position the file offset, then the caller
> should do a SEEK_END itself.
>
>> + }
>>
>> while (1) {
>> ssize_t got;
>> @@ -124,16 +135,16 @@ runIO(const char *path, int fd, int oflags)
>> * writes will be aligned.
>> * In other cases using saferead reduces number of syscalls.
>> */
>> - if (fdin == fd && direct) {
>> - if ((got = read(fdin, buf, buflen)) < 0 &&
>> + if (!p.isWrite && p.isDirect) {
>> + if ((got = read(p.fdin, buf, buflen)) < 0 &&
>> errno == EINTR)
>> continue;
>> } else {
>> - got = saferead(fdin, buf, buflen);
>> + got = saferead(p.fdin, buf, buflen);
>> }
>
> The distinction for read vs saferead is for correctness with O_DIRECT.
>
> We only want to use read on the plain file / blockdev, and saferead
> on the socket/pipe.
Yes that's clear.
>
> We already call fstat to let us figure out the fd type, so can use
> that decide whether to use read vs saferead.
>
> This ties into my comment on the later patch about simplifying the
> parameters passed in, to remove the distinction of read vs write.
>
As mentioned elsewhere I don't think it can be simplified quite that way,
but I think there is an alternative that could work, will propose in the next spin.
Thanks
Claudio
More information about the libvir-list
mailing list