[EnMasse] multitenancy and density (was Re: EnMasse multitenancy roles)
Gordon Sim
gsim at redhat.com
Fri Mar 24 21:12:12 UTC 2017
On 23/03/17 13:34, Ulf Lilleengen wrote:
> Hi,
>
> Resending this as 3 of us were not subscribed to the list.
>
> After our discussion yesterday, I've tried to collect my thoughts on
> multitenancy. In our past discussions there have been sort of 2 views on
> multitenancy: one where multitenancy is handled within the dispatch
> router, and one with multiple isolated router networks. As Rob mentioned
> (and I agree) we should think of supporting both.
>
> I don't think we took into account supporting isolated and
> non-isolated tenants when we discussed this earlier. And I'm not sure
> if we should think of it as just 1 role or 2 roles externally:
>
> * Client - connects to the messaging endpoint
> * Tenant - Manages one address space
> (* Instance - Has 1 or more tenants)
> * Messaging operator - Manages EnMasse instances and tenants
> * OpenShift operator - Manages OpenShift
>
> Instances are isolated into separate OpenShift namespaces, while a
> tenant may share the same instance (routers and possibly brokers) with
> other tenants.
>
> Does it make sense to think of it this way? With this definition we have
> support for multiple instances today, but not multiple tenants within
> the same instance.
I have also been trying to collect my thoughts, specifically on the
motivation for density and how best to address it. (The following is a
bit of a ramble)
The desire for greater density is a desire for more efficient
utilisation of resources: cpu, memory, file handles and disk space.
My assumption is that virtualisation provides efficient *cpu*
utilisation without requiring shared processes.
The way the broker journal works, I doubt there is any gain in the
efficient use of *disk space* from sharing a broker between tenants as
opposed to having each tenant use their own broker.
I'm also assuming the bulk of the file handles used will come from the
applications' own connections into the messaging service, so there would
be no significant gain in efficiency in file handle utilisation from
sharing infrastructure between tenants either.
So I think the issue of density boils down to memory use.
The argument for sharing a broker (or router) between tenants is that
there is some minimum memory overhead for a broker (or router) process,
independent of how much work it is actually doing, and that this
overhead is significant.
Clearly there *is* some overhead, but perhaps then it would be worth
experimenting a little to see if we can determine what it is, whether it
can be reduced or tuned down in any way, and how it compares to the
amount of memory consumed by different workloads.
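To get a rough number for that baseline overhead, a small script along
the following lines could be used. This is only a sketch: it is
Linux-specific (it reads VmRSS from /proc), and the command profiled
here is just an idle interpreter standing in for whatever router or
broker binary you actually want to measure.

```python
import subprocess
import sys
import time

def baseline_rss_kb(cmd, settle_secs=1.0):
    """Start a process, let it settle, and return its resident set
    size in kB as reported by /proc/<pid>/status (Linux only)."""
    proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    try:
        # Give the process time to finish its start-up allocations.
        time.sleep(settle_secs)
        with open("/proc/%d/status" % proc.pid) as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])  # value is in kB
    finally:
        proc.terminate()
        proc.wait()
    return None

if __name__ == "__main__":
    # Stand-in workload: an idle interpreter. Replace with e.g. the
    # qdrouterd or artemis launch command to measure the real thing.
    idle = [sys.executable, "-c", "import time; time.sleep(10)"]
    print("idle RSS (kB):", baseline_rss_kb(idle))
```

Running the same measurement against an idle router versus one serving
a realistic small tenant would give a concrete ratio of fixed overhead
to per-workload memory, which is the number the density argument
hinges on.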
Focusing just on the core messaging components to begin with, the
minimal install of the current architecture would be a single router and
single broker (since a broker can now host multiple queues).
For a single broker, the router only adds value if the application(s)
require the direct semantics it offers, and it is only needed when those
semantics are required over a connection that also accesses brokered
addresses.
If an application/tenant needs more than a single broker/router, it
would seem to me that there would be little benefit from trying to share
with other tenants.
So the most compelling use case for shared infrastructure is where there
are a lot of very small applications that could share a broker. Perhaps
this use case would be better catered for by Rob (Godfrey)'s 'virtual
broker' concept? I.e. maybe we have quite different underlying
infrastructure for different service+plan combinations?