[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [libvirt] [RFC PATCH 2/2] lib: Add note that bulk stats API queries may overrun RPC buffers



On Tue, May 23, 2017 at 10:04:44AM +0100, Daniel P. Berrange wrote:
On Tue, May 23, 2017 at 10:51:47AM +0200, Martin Kletzander wrote:
On Tue, May 23, 2017 at 09:32:03AM +0100, Daniel P. Berrange wrote:
> On Mon, May 22, 2017 at 06:00:13PM +0200, Peter Krempa wrote:
> > Hint that the users should limit the number of VMs queried in the bulk
> > stats API.
> > ---
> >  src/libvirt-domain.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c
> > index 310b91b37..b01f2705c 100644
> > --- a/src/libvirt-domain.c
> > +++ b/src/libvirt-domain.c
> > @@ -11315,6 +11315,10 @@ virConnectGetDomainCapabilities(virConnectPtr conn,
> >   * VIR_CONNECT_GET_ALL_DOMAINS_STATS_SHUTOFF and/or
> >   * VIR_CONNECT_GET_ALL_DOMAINS_STATS_OTHER for all other states.
> >   *
> > + * Note that this API is prone to exceeding maximum RPC message size on hosts
> > + * with lots of VMs so it's suggested to use virDomainListGetStats with a
> > + * reasonable list of VMs as the argument.
> > + *
> >   * Returns the count of returned statistics structures on success, -1 on error.
> >   * The requested data are returned in the @retStats parameter. The returned
> >   * array should be freed by the caller. See virDomainStatsRecordListFree.
> > @@ -11386,6 +11390,9 @@ virConnectGetAllDomainStats(virConnectPtr conn,
> >   * Note that any of the domain list filtering flags in @flags may be rejected
> >   * by this function.
> >   *
> > + * Note that this API is prone to exceeding maximum RPC message size on hosts
> > + * with lots of VMs so it's suggested to limit the number of VMs queried.
>
> With the way this is worded, applications have no guidance on what is a
> sensible max number of VMs they can safely use, so I don't think it is
> particularly useful, except as a way to say "we told you so" when apps
> report a bug.
>
> I'd be inclined to explicitly put a limit of 100 VMs[1] in the API, and
> document that hard limit, so apps immediately know to write their code
> to chunk in 100 VM blocks.
>

Or we can give a hint on what the limit is and let users figure out the
sensible number.  It is not only based on the number of VMs.  I believe
you can hit the limit with one or two VMs if you have lot of devices
that we report statistics for.  Limiting the number of VMs to a
particular number would not help as much in this case.  But we can
combine both approaches.

Urgh, that's even worse. Apps can't simply split queries into blocks of
N vms, because any single VM in that list might have huge number of disks.
So to use this at all reliably you have to query the XML config of every
guest to see what devices are present and then split up the queries into
variable number of VMs :-(


I meant something along the lines of:

 This API might transfer lot of data over the connection which needs to
 be limited.  The limit is currently 32MB with all the headers and the
 output format is described here, so one should be able to make a
 pretty good assumptions based on the average deployment they have.  In
 case you get error message saying "RPC: message to big", you need to
 limit the number of VMs you are requesting statistics for.

And then put a limit in the API that returns additional error if more
than 100 VMs are in the list.

This just adds to my feeling that we should consider this API a failed
experiment


I'm not saying no...  But we should be able to do such things, just in a
different way, let's say.  The idea with shared memory is nice.  Even
remotely the libvirt API could expose it as such and underneath we could
be changing how that update routine works.

Regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Attachment: signature.asc
Description: Digital signature


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]