[libvirt] [PATCH v2] qemu: Add option to enable/disable IOEventFD feature

Stefan Hajnoczi stefanha at gmail.com
Thu May 19 22:09:54 UTC 2011


On Thu, May 19, 2011 at 11:20 AM, Daniel Veillard <veillard at redhat.com> wrote:
> On Thu, May 19, 2011 at 09:44:35AM +0100, Stefan Hajnoczi wrote:
>> On Thu, May 19, 2011 at 8:26 AM, Daniel Veillard <veillard at redhat.com> wrote:
>> > On Wed, May 18, 2011 at 04:07:30PM +0800, Daniel Veillard wrote:
>> >> On Tue, May 17, 2011 at 03:56:11PM +0100, Daniel P. Berrange wrote:
>> >> > On Tue, May 17, 2011 at 04:49:17PM +0200, Michal Privoznik wrote:
>> >> > > This feature allows QEMU to achieve higher throughput, but is available
>> >> > > only in recent versions. It is exposed via the ioeventfd attribute,
>> >> > > which accepts the values 'on' and 'off'. Only experienced users need to
>> >> > > set this, because QEMU defaults to 'on', meaning higher performance.
>> >> > > Translates into virtio-{blk|net}-pci.ioeventfd option.
>> >> [...]
>> >> > > +          <li>
>> >> > > +           The optional <code>ioeventfd</code> attribute enables or disables
>> >> > > +           IOEventFD feature for virtqueue notify. The value can be either
>> >> > > +           'on' or 'off'.
>> >> > > +            <span class="since">Since 0.9.2 (QEMU and KVM only)</span>
>> >> >
>> >> > This is a qemu specific attribute name & description. IMHO we shouldn't
>> >> > be exposing that directly. Who even knows what effect it actually has
>> >> > on the guests...
>> >>
>> >>   Agreed, what are the semantics of this flag, besides allowing one to
>> >> switch something in qemu?
>> >
>> >  Just to clarify my answer a bit, the problem here is that the patch
>> > does not explain what the ioeventfd qemu flag does in practice and how
>> > it influences the virtualization. To be able to provide a good API and
>> > maintain it long term we need to be able to explain the semantics of
>> > the API (be it a function of the library or part of the XML being used),
>> > only then we can guarantee that there is no misunderstanding about what
>> > it does, and also allow us to reuse it in case the same functionality
>> > is provided by another hypervisor.
>> >  So instead of explaining the option using terms from QEMU, let's
>> > explain what it does in general terms and use those general terms to
>> > model the API,
>>
>> I don't think there is a general API here, ioeventfd is specific to
>> QEMU's architecture.  It allows you to switch between two internal
>> threading models for handling I/O emulation.  It could change in the
>> future if QEMU's architecture changes.  This is not an end-user
>> feature, it's more an internal performance tunable.
>
>  Actually reading about it at
>   https://patchwork.kernel.org/patch/43390/
>
> it seems that it can be described as
>   "domain I/O asynchronous handling",
> it's a shortcut because it's not for the whole I/O only a part of it
> but that in itself is sufficiently generic to be potentially useful
> for something else.
>
>  I would just suggest renaming the attribute to "asyncio" with value
> on or off, documenting that it allows forcing on or off some of
> the asynchronous I/O handling for the device, and that the default is
> left to the discretion of the hypervisor.
>
>  In case we need to refine later, we can still provide a larger set of
> accepted values for that attribute, assuming people really want to
> make more distinctive tuning,

Inventing a different name makes life harder for everyone.  There is a
need for a generic API/notation that covers all virtualization
software, but ioeventfd is a hypervisor-specific performance tunable
that does not benefit from abstraction.

When I ask a user to try disabling ioeventfd I need to first search
through libvirt documentation and/or source code to reverse-engineer
this artificial mapping.  This creates an extra source of errors for
people who are trying to configure or troubleshoot their systems.  The
"I know what the hypervisor-specific setting is but have no idea how
to express it in libvirt domain XML" problem is really common and
creates a gap between the hypervisor and libvirt communities.

The next time an optimization is added to QEMU you'll have to pick a
new name, because "asyncio" (already overloaded terminology today)
won't be available anymore.  We're going to end up with increasingly
contrived or off-base naming.

Regarding semantics:

Ioeventfd decouples vcpu execution from I/O emulation, allowing the VM
to execute guest code while a separate thread handles I/O.  This
results in reduced steal time and lowers spinlock contention inside
the guest.  Typically, guests that are experiencing high system CPU
utilization during I/O will benefit from ioeventfd.  On an
overcommitted host it could increase guest I/O latency, though.  The
ioeventfd option is currently only supported on virtio-blk (default:
on) and virtio-net (default: off) devices.

Please call it ioeventfd.  Also, it can always be toggled using the
<qemu:commandline> tag if you don't want to expose it natively in
domain XML.
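
As a rough sketch of that workaround, something along these lines
should work.  Treat it as an illustration rather than a tested
configuration: the device alias "virtio-disk0" is an assumption (it is
the alias libvirt usually gives the first virtio disk), and the
xmlns:qemu declaration on <domain> is needed for libvirt to accept the
<qemu:commandline> element.

```xml
<!-- Sketch only: "virtio-disk0" is an assumed device alias; check the alias
     libvirt assigned to the device on your guest before using it. -->
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <!-- ... rest of the domain definition unchanged ... -->
  <qemu:commandline>
    <!-- QEMU's -set option: group "device", id "virtio-disk0",
         property ioeventfd -->
    <qemu:arg value='-set'/>
    <qemu:arg value='device.virtio-disk0.ioeventfd=off'/>
  </qemu:commandline>
</domain>
```

The property name is the same ioeventfd that QEMU documents, so there
is nothing to translate when moving between the QEMU command line and
the domain XML.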

Stefan



