[libvirt RFCv4] virfile: set pipe size in virFileWrapperFdNew to improve throughput
Michal Prívozník
mprivozn at redhat.com
Mon Mar 28 11:25:11 UTC 2022
On 3/28/22 13:10, Daniel P. Berrangé wrote:
> On Mon, Mar 28, 2022 at 01:06:03PM +0200, Michal Prívozník wrote:
>> On 3/25/22 16:10, Claudio Fontana wrote:
>>> currently the only user of virFileWrapperFdNew is the qemu driver;
>>> virsh save is very slow with a default pipe size.
>>> This change improves throughput by ~400% on fast nvme or ramdisk.
>>>
>>> Best value currently measured is 1MB, which happens to be also
>>> the kernel default for the pipe-max-size.
>>>
>>> Signed-off-by: Claudio Fontana <cfontana at suse.de>
>>> ---
>>> src/util/virfile.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
>>> 1 file changed, 46 insertions(+)
>>>
>>> see v2 at
>>> https://listman.redhat.com/archives/libvir-list/2022-March/229423.html
>>>
>>> Changes v3 -> v4:
>>>
>>> * changed INFO and WARN messages to DEBUG (Daniel)
>>>
>>> Changes v2 -> v3:
>>>
>>> * removed reading of max-pipe-size from procfs,
>>> instead make multiple attempts on EPERM with smaller sizes.
>>> In the regular case, this should succeed on the first try.
>>> (Daniel)
>>>
>>> Changes v1 -> v2:
>>>
>>> * removed VIR_FILE_WRAPPER_BIG_PIPE, made the new pipe resizing
>>> unconditional (Michal)
>>>
>>> * moved code to separate functions (Michal)
>>>
>>> * removed ternary op, disliked in libvirt (Michal)
>>>
>>> * added #ifdef __linux__ (Ani Sinha)
>>>
>>> * try smallest value between currently best measured value (1MB)
>>> and the pipe-max-size setting. If pipe-max-size cannot be read,
>>> try kernel default max (1MB). (Daniel)
>>>
>>>
>>> diff --git a/src/util/virfile.c b/src/util/virfile.c
>>> index a04f888e06..87539be0b9 100644
>>> --- a/src/util/virfile.c
>>> +++ b/src/util/virfile.c
>>> @@ -201,6 +201,50 @@ struct _virFileWrapperFd {
>>> };
>>>
>>> #ifndef WIN32
>>> +
>>> +#ifdef __linux__
>>> +
>>> +/**
>>> + * virFileWrapperSetPipeSize:
>>> + * @fd: the fd of the pipe
>>> + *
>>> + * Set best pipe size on the passed file descriptor for bulk transfers of data.
>>> + *
>>> + * default pipe size (usually 64K) is generally not suited for large transfers
>>> + * to fast devices. A value of 1MB has been measured to improve virsh save
>>> + * by 400% in ideal conditions. We retry multiple times with smaller sizes
>>> + * on EPERM to account for possible small values of /proc/sys/fs/pipe-max-size.
>>> + *
>>> + * OS note: only for linux, on other OS this is a no-op.
>>> + */
>>> +static void
>>> +virFileWrapperSetPipeSize(int fd)
>>> +{
>>> + int sz;
>>> +
>>> + for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) {
>>> + int rv = fcntl(fd, F_SETPIPE_SZ, sz);
>>> + if (rv < 0 && errno == EPERM) {
>>> + VIR_DEBUG("EPERM trying to set fd %d pipe size to %d", fd, sz);
>>> + continue; /* retry with half the size */
>>> + }
>>> + if (rv < 0) {
>>> + break;
>>> + }
>>> + VIR_DEBUG("fd %d pipe size adjusted to %d", fd, sz);
>>> + return;
>>> + }
>>> + virReportSystemError(errno, "%s", _("unable to set pipe size, data transfer might be slow"));
>>
>> This should have been VIR_WARN(). It's weird to report an error when the
>> function returns void.
>
> Actually I said to report an error in prvious version, as I figured we
> were handling the expect EPERM, but I guess we could even fail the
> last 64 KB iteration and stick with the default. So we need a slight
> tweak:
>
> static void
> virFileWrapperSetPipeSize(int fd)
> {
> int sz;
>
> for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) {
> int rv = fcntl(fd, F_SETPIPE_SZ, sz);
> if (rv < 0 && errno == EPERM) {
> VIR_DEBUG("EPERM trying to set fd %d pipe size to %d", fd, sz);
> continue; /* retry with half the size */
> }
> if (rv < 0) {
> virReportSystemError(errno, "%s", _("unable to set pipe size"));
> return -1;
> }
> VIR_DEBUG("fd %d pipe size adjusted to %d", fd, sz);
> return 0;
> }
> VIR_WARN("Could set pipe size to 64 KB, leaving on default size");
> return 0;
> }
>
> then the caller can treat -1 as fatal
Yes, in that case we could call virReportSystemError(), but the way the
code is currently written doesn't make much sense. Anyway, let me post a
follow up patch that does report error.
Michal
More information about the libvir-list
mailing list