[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [libvirt] [PATCH v2] lib: Add note that bulk stats API queries may overrun RPC buffers

On Tue, May 23, 2017 at 05:29:28PM +0200, Michal Privoznik wrote:
On 05/23/2017 05:19 PM, Martin Kletzander wrote:
On Tue, May 23, 2017 at 05:07:40PM +0200, Michal Privoznik wrote:
On 05/23/2017 04:35 PM, Martin Kletzander wrote:
On Tue, May 23, 2017 at 04:23:30PM +0200, Peter Krempa wrote:
Hint that the users should limit the number of VMs queried in the bulk
stats API.
- added a suggestion of the number of queried VMs (valid after bump to
32M message)

src/libvirt-domain.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c
index 310b91b37..aef061943 100644
--- a/src/libvirt-domain.c
+++ b/src/libvirt-domain.c
@@ -11315,6 +11315,10 @@ virConnectGetDomainCapabilities(virConnectPtr
+ * Note that this API is prone to exceeding maximum RPC message size
on hosts
+ * with lots of VMs so it's suggested to use virDomainListGetStats
with a
+ * reasonable list of VMs as the argument.
+ *
 * Returns the count of returned statistics structures on success, -1
on error.
 * The requested data are returned in the @retStats parameter. The
 * array should be freed by the caller. See
@@ -11386,6 +11390,10 @@ virConnectGetAllDomainStats(virConnectPtr
 * Note that any of the domain list filtering flags in @flags may be
 * by this function.
+ * Note that this API is prone to exceeding maximum RPC if querying
too many VMs
+ * with lots of statistics. It's suggested to query in batches of
10VMs, which
+ * should be good enough for VMs with 3000 disks + networks.
+ *

Coming to think about it... Why don't we just batch this ourselves under
the hood and just return the merged result?



Not on the RPC level, the API would just be syntax sugar to
virDomainListGetStats() if a flag was passed

We can't guarantee atomicity of such call, so I don't see a reason for

Yes, that's why it would *only* happen if that flag was present.  That's
precisely why I added info about the flag.

us to implement it. On the other hand, it's us who knows the limits for
RPC, not users. So they might have some troubles writing it. But then
again, we would have some troubles too because we might be talking to an
older server with lower limits thus we'd need some guessing too.

We would definitely need guessing.  You cannot predict the size very

Also, the flag would expose what we want to hide - RPC layer. It should
be no concern for users how we pass data around.

No, this has nothing to do with RPC.  Let me "pseudocode" this:

virConnectGetAllDomainStats(virConnectPtr conn,
                           unsigned int flags)


   ret = conn->driver->connectGetAllDomainStats(conn, ..., flags);
   if (ret < 0 && lastError == RPC_SIZE_ERROR) {
       if (legacy || !conn->driver->domainListGetStats)
           goto cleanup;


       ndomains = conn->driver->connectListAllDomains(conn, ..., flags);
       if (ndomains < 0)
           goto cleanup;

       /* halve the number of domains requested until the RPC error is
        * avoided and next batches start from the number that worked
        * and for each batch call domainListGetStats() and then merge
        * the results.
        * I was trying to implement it, but since it looks like nobody
        * else likes the idea, it would be wasting my time. */


Or we can expose it as a new helper function.  I see no RPC exposition

Attachment: signature.asc
Description: Digital signature

[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]