[libvirt] [RFC PATCH 2/2] lib: Add note that bulk stats API queries may overrun RPC buffers

Daniel P. Berrange berrange at redhat.com
Tue May 23 09:04:44 UTC 2017


On Tue, May 23, 2017 at 10:51:47AM +0200, Martin Kletzander wrote:
> On Tue, May 23, 2017 at 09:32:03AM +0100, Daniel P. Berrange wrote:
> > On Mon, May 22, 2017 at 06:00:13PM +0200, Peter Krempa wrote:
> > > Hint that the users should limit the number of VMs queried in the bulk
> > > stats API.
> > > ---
> > >  src/libvirt-domain.c | 7 +++++++
> > >  1 file changed, 7 insertions(+)
> > > 
> > > diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c
> > > index 310b91b37..b01f2705c 100644
> > > --- a/src/libvirt-domain.c
> > > +++ b/src/libvirt-domain.c
> > > @@ -11315,6 +11315,10 @@ virConnectGetDomainCapabilities(virConnectPtr conn,
> > >   * VIR_CONNECT_GET_ALL_DOMAINS_STATS_SHUTOFF and/or
> > >   * VIR_CONNECT_GET_ALL_DOMAINS_STATS_OTHER for all other states.
> > >   *
> > > + * Note that this API is prone to exceeding maximum RPC message size on hosts
> > > + * with lots of VMs so it's suggested to use virDomainListGetStats with a
> > > + * reasonable list of VMs as the argument.
> > > + *
> > >   * Returns the count of returned statistics structures on success, -1 on error.
> > >   * The requested data are returned in the @retStats parameter. The returned
> > >   * array should be freed by the caller. See virDomainStatsRecordListFree.
> > > @@ -11386,6 +11390,9 @@ virConnectGetAllDomainStats(virConnectPtr conn,
> > >   * Note that any of the domain list filtering flags in @flags may be rejected
> > >   * by this function.
> > >   *
> > > + * Note that this API is prone to exceeding maximum RPC message size on hosts
> > > + * with lots of VMs so it's suggested to limit the number of VMs queried.
> > 
> > With the way this is worded, applications have no guidance on what is a
> > sensible max number of VMs they can safely use, so I don't think it is
> > particularly useful, except as a way to say "we told you so" when apps
> > report a bug.
> > 
> > I'd be inclined to explicitly put a limit of 100 VMs[1] in the API, and
> > document that hard limit, so apps immediately know to write their code
> > to chunk in 100 VM blocks.
> > 
> 
> Or we can give a hint on what the limit is and let users figure out the
> sensible number.  It is not only based on the number of VMs.  I believe
> you can hit the limit with one or two VMs if you have lot of devices
> that we report statistics for.  Limiting the number of VMs to a
> particular number would not help as much in this case.  But we can
> combine both approaches.

Urgh, that's even worse. Apps can't simply split queries into blocks of
N vms, because any single VM in that list might have huge number of disks.
So to use this at all reliably you have to query the XML config of every
guest to see what devices are present and then split up the queries into
variable number of VMs :-(

This just adds to my feeling that we should consider this API a failed
experiment

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




More information about the libvir-list mailing list