[libvirt RFCv4] virfile: set pipe size in virFileWrapperFdNew to improve throughput

Claudio Fontana cfontana at suse.de
Mon Mar 28 11:09:46 UTC 2022


On 3/28/22 1:06 PM, Michal Prívozník wrote:
> On 3/25/22 16:10, Claudio Fontana wrote:
>> currently the only user of virFileWrapperFdNew is the qemu driver;
>> virsh save is very slow with a default pipe size.
>> This change improves throughput by ~400% on fast nvme or ramdisk.
>>
>> Best value currently measured is 1MB, which happens to be also
>> the kernel default for the pipe-max-size.
>>
>> Signed-off-by: Claudio Fontana <cfontana at suse.de>
>> ---
>>  src/util/virfile.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 46 insertions(+)
>>
>> see v2 at
>> https://listman.redhat.com/archives/libvir-list/2022-March/229423.html
>>
>> Changes v3 -> v4:
>>
>> * changed INFO and WARN messages to DEBUG (Daniel)
>>
>> Changes v2 -> v3:
>>
>> * removed reading of max-pipe-size from procfs,
>>   instead make multiple attempts on EPERM with smaller sizes.
>>   In the regular case, this should succeed on the first try.
>>   (Daniel)
>>
>> Changes v1 -> v2:
>>
>> * removed VIR_FILE_WRAPPER_BIG_PIPE, made the new pipe resizing
>>   unconditional (Michal)
>>
>> * moved code to separate functions (Michal)
>>
>> * removed ternary op, disliked in libvirt (Michal)
>>
>> * added #ifdef __linux__ (Ani Sinha)
>>
>> * try smallest value between currently best measured value (1MB)
>>   and the pipe-max-size setting. If pipe-max-size cannot be read,
>>   try kernel default max (1MB). (Daniel)
>>
>>
>> diff --git a/src/util/virfile.c b/src/util/virfile.c
>> index a04f888e06..87539be0b9 100644
>> --- a/src/util/virfile.c
>> +++ b/src/util/virfile.c
>> @@ -201,6 +201,50 @@ struct _virFileWrapperFd {
>>  };
>>  
>>  #ifndef WIN32
>> +
>> +#ifdef __linux__
>> +
>> +/**
>> + * virFileWrapperSetPipeSize:
>> + * @fd: the fd of the pipe
>> + *
>> + * Set best pipe size on the passed file descriptor for bulk transfers of data.
>> + *
>> + * default pipe size (usually 64K) is generally not suited for large transfers
>> + * to fast devices. A value of 1MB has been measured to improve virsh save
>> + * by 400% in ideal conditions. We retry multiple times with smaller sizes
>> + * on EPERM to account for possible small values of /proc/sys/fs/pipe-max-size.
>> + *
>> + * OS note: only for linux, on other OS this is a no-op.
>> + */
>> +static void
>> +virFileWrapperSetPipeSize(int fd)
>> +{
>> +    int sz;
>> +
>> +    for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) {
>> +        int rv = fcntl(fd, F_SETPIPE_SZ, sz);
>> +        if (rv < 0 && errno == EPERM) {
>> +            VIR_DEBUG("EPERM trying to set fd %d pipe size to %d", fd, sz);
>> +            continue; /* retry with half the size */
>> +        }
>> +        if (rv < 0) {
>> +            break;
>> +        }
>> +        VIR_DEBUG("fd %d pipe size adjusted to %d", fd, sz);
>> +        return;
>> +    }
>> +    virReportSystemError(errno, "%s", _("unable to set pipe size, data transfer might be slow"));
> 
> This should have been VIR_WARN(). It's weird to report an error when the
> function returns void.
> 
>> +}
>> +
>> +#else /* !__linux__ */
>> +static void virFileWrapperSetPipeSize(int fd)
> The @fd argument is unused and thus has to be marked as such.
> 
>> +{
>> +    return;
>> +}
>> +#endif /* !__linux__ */
>> +
>> +
>>  /**
>>   * virFileWrapperFdNew:
>>   * @fd: pointer to fd to wrap
>> @@ -282,6 +326,8 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned int flags)
>>  
>>      ret->cmd = virCommandNewArgList(iohelper_path, name, NULL);
>>  
>> +    virFileWrapperSetPipeSize(pipefd[!output]);
> 
> This feels weird, because just a few lines below the pipefd[!output]) is
> closed. As I said earlier, it doesn't matter what end of the pipe we set
> the size on, therefore, let's switch over to pipefd[output].
> 
>> +
>>      if (output) {
>>          virCommandSetInputFD(ret->cmd, pipefd[0]);
>>          virCommandSetOutputFD(ret->cmd, fd);
> 
> Reviewed-by: Michal Privoznik <mprivozn at redhat.com>
> 
> and pushed.
> 
> Michal
> 

Thanks a lot for fixing up the remaining few pieces!

Ciao,

Claudio



More information about the libvir-list mailing list