[libvirt] [PATCH 01/12] conf: introduce domain XML element <polling> for iothread

Stefan Hajnoczi stefanha at gmail.com
Tue Feb 21 14:46:09 UTC 2017


On Tue, Feb 21, 2017 at 1:39 PM, Pavel Hrdina <phrdina at redhat.com> wrote:
> On Tue, Feb 21, 2017 at 01:26:25PM +0000, Daniel P. Berrange wrote:
>> On Tue, Feb 21, 2017 at 02:14:44PM +0100, Pavel Hrdina wrote:
>> > On Tue, Feb 21, 2017 at 12:55:51PM +0000, Daniel P. Berrange wrote:
>> > > On Tue, Feb 21, 2017 at 01:48:15PM +0100, Pavel Hrdina wrote:
>> > > > On Tue, Feb 21, 2017 at 12:26:02PM +0000, Daniel P. Berrange wrote:
>> > > > > On Tue, Feb 21, 2017 at 01:14:57PM +0100, Pavel Hrdina wrote:
>> > > > > > QEMU 2.9.0 will introduce a polling feature for AioContext that,
>> > > > > > instead of making blocking syscalls, polls for events without
>> > > > > > blocking.  This means that polling is in most cases faster, but it
>> > > > > > also increases CPU utilization.
>> > > > > >
>> > > > > > To address this issue QEMU implements a self-tuning algorithm that
>> > > > > > modifies the current polling time to adapt to different workloads,
>> > > > > > and it can also fall back to blocking syscalls.
>> > > > > >
>> > > > > > For each IOThread all of this is controlled by three parameters:
>> > > > > > poll-max-ns, poll-grow and poll-shrink.  Setting poll-max-ns to 0
>> > > > > > disables polling, omitting it uses the default behavior, and any
>> > > > > > value greater than 0 enables polling.  The poll-grow and
>> > > > > > poll-shrink parameters configure how the self-tuning algorithm
>> > > > > > adapts the current polling time.  If they are omitted or set to 0,
>> > > > > > default values are used.
>> > > > >
>> > > > > With my app developer hat on I have to wonder how an app is supposed
>> > > > > to figure out what to set these parameters to?  It has been difficult
>> > > > > enough figuring out existing QEMU block tunables, but at least most
>> > > > > of those can be set dependent on the type of storage used on the host
>> > > > > side.  Tunables whose use depends on the guest workload are harder to
>> > > > > use since it largely involves predicting the unknown.  IOW, is there
>> > > > > a compelling reason to add these low-level parameters that are tightly
>> > > > > coupled to the specific algorithm that QEMU happens to use today?
>> > > >
>> > > > I agree that it's probably way too complicated for management
>> > > > applications, but there is a small issue with QEMU.  Currently, if you
>> > > > don't specify anything, polling is enabled with some reasonable default
>> > > > value, and based on experience with QEMU I'm not counting on the
>> > > > default behavior never changing in the future.  To explicitly enable
>> > > > polling you need to set poll-max-ns to some value greater than 0.  We
>> > > > would have to check the QEMU source code and define the default value
>> > > > in our own code in order to let users explicitly enable polling.
>> > >
>> > > The QEMU commit says polling is now enabled by default without needing
>> > > to set poll-max-ns AFAICT
>> > >
>> > >   commit cdd7abfdba9287a289c404dfdcb02316f9ffee7d
>> > >   Author: Stefan Hajnoczi <stefanha at redhat.com>
>> > >   Date:   Thu Jan 26 17:01:19 2017 +0000
>> > >
>> > >     iothread: enable AioContext polling by default
>> > >
>> > >     IOThread AioContexts are likely to consist only of event sources like
>> > >     virtqueue ioeventfds and LinuxAIO completion eventfds that are pollable
>> > >     from userspace (without system calls).
>> > >
>> > >     We recently merged the AioContext polling feature but didn't enable it
>> > >     by default yet.  I have gone back over the performance data on the
>> > >     mailing list and picked a default polling value that gave good results.
>> > >
>> > >     Let's enable AioContext polling by default so users don't have another
>> > >     switch they need to set manually.  If performance regressions are found
>> > >     we can still disable this for the QEMU 2.9 release.
>> > >
>> > > > > The QEMU commits say the tunables all default to sane values, so
>> > > > > I'm inclined to say we ignore them at the libvirt level entirely.
>> > > >
>> > > > Yes, it would be way better to have only <polling enable='yes|no'> and
>> > > > let QEMU deal with the sane values for all parameters, but that would
>> > > > mean coming up with the sane values ourselves or modifying QEMU to add
>> > > > another property that would simply control whether polling is enabled
>> > > > or not.
>> > >
>> > > I'm saying don't even add that.
>> > >
>> > > Do exactly nothing and just rely on the QEMU defaults here.  This does
>> > > not affect the guest ABI at all, so it doesn't matter if QEMU changes its
>> > > defaults later.  In fact, if QEMU changes its defaults based on newer
>> > > performance measurements, it is a good thing if libvirt hasn't hardcoded
>> > > all its VM configs to the old default.
>> >
>> > What if someone would like to disable it even if QEMU thinks that the
>> > performance is good?  This patch series doesn't hardcode anything into the
>> > VM config.  If you don't set the polling element at all, libvirt will
>> > follow the QEMU defaults and only the live XML will contain the current
>> > state of polling, with the default values loaded from QEMU.
>> >
>> > This patch series adds the possibility to explicitly configure polling if
>> > someone wants to do that for some reason, but it also preserves the
>> > benefit of just not caring about it and using the defaults.
>> >
>> > If you still think that we should not export this feature at all, well, we
>> > don't have to.  The use case that you've described is still possible with
>> > this series; it only adds extra functionality on top of that.
>>
>> I'm very wary of adding config parameters in libvirt just because they
>> exist in QEMU, particularly when the parameters are totally specific to
>> an algorithm that just happens to be the one implemented right now. We've
>> no idea if QEMU will stick with this algorithm or change it entirely, and
>> if the latter, then any config parameters will likely be meaningless for
>> any other algorithm. I can understand why QEMU would expose them on its
>> CLI, but I don't feel they are a good fit for exposing up the stack,
>> particularly given a lack of any guidance as to how people might consider
>> changing the values, other than random uninformed guesswork.
>
> Yes, that's true, and I had the same worry about the parameters being
> specifically tied to the QEMU algorithm, but I thought that it would be nice
> to export them anyway.  Let's ignore this series for now, and if someone asks
> for a feature to disable polling we can revive it.

libvirt doesn't need a dedicated API for <iothread> polling parameters,
but the QEMU command-line passthrough feature must make it possible to
set them using <qemu:arg value='-newarg'/>.  I have tested the following
QEMU command line:

# assume libvirt defines the iothreads
-object iothread,id=iothread0 \
-object iothread,id=iothread1 \
-set object.iothread0.poll-max-ns=0   # override poll-max-ns using -set

This disables polling in iothread0 and leaves the default value in iothread1.
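For comparison, here is a rough sketch of how the three tunables discussed
above (poll-max-ns, poll-grow, poll-shrink) could be set directly as iothread
object properties when launching QEMU by hand rather than through libvirt.
The values below are purely illustrative, not recommendations:

# Illustrative values only: configure the polling tunables inline on the
# iothread objects instead of overriding them afterwards with -set.
-object iothread,id=iothread0,poll-max-ns=32768,poll-grow=2,poll-shrink=0 \
-object iothread,id=iothread1,poll-max-ns=0   # polling disabled for iothread1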

I'm fine if libvirt doesn't add a dedicated API for setting <iothread>
polling parameters.  It's unlikely that users will need to change the
setting.  In an emergency (e.g. disabling it due to a performance
regression) they can use <qemu:arg value='-newarg'/>.
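For completeness, a minimal sketch of what that passthrough could look like in
the domain XML (untested here; the iothread object id has to match whatever id
libvirt actually generates on the QEMU command line):

<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <!-- ... rest of the domain definition, including <iothreads> ... -->
  <qemu:commandline>
    <qemu:arg value='-set'/>
    <qemu:arg value='object.iothread1.poll-max-ns=0'/>
  </qemu:commandline>
</domain>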

Stefan



