[EnMasse] AddressController performance / ConfigMap lookups

Ulf Lilleengen lulf at redhat.com
Mon Mar 12 14:28:16 UTC 2018


Hi Carsten,

Thanks for starting the discussion, I think you raise some important 
points. Some comments inline.

On 03/12/2018 02:48 PM, Lohmann Carsten (INST/ECS4) wrote:
> Hi Ulf,
> 
> in general, we were wondering whether it wouldn't be better to use a 
> database (e.g. MongoDB) instead of ConfigMaps for address persistence.
> 

Etcd- or ZooKeeper-type stores are the kind of persistence that fits our 
use case quite well, IMO. They provide write and read ordering 
guarantees, atomicity for keys, and quorums for availability. I'm sure 
MongoDB could work as well.

There is, however, an operational impact to deploying another stateful 
component, which is what we have tried to avoid by 'piggybacking' on the 
Kubernetes etcd through its API. We also get authentication and 
authorization of those resources within each address space out of the 
box.
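
To make the 'piggybacking' a bit more concrete, this is roughly what it 
looks like from the controller side. A rough sketch using the fabric8 
client; the ConfigMap name, label, and JSON payload below are invented 
for the example and do not match the exact format we use:

--------------------
import io.fabric8.kubernetes.api.model.ConfigMap;
import io.fabric8.kubernetes.api.model.ConfigMapBuilder;
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;

public class AddressConfigMapSketch {
    public static void main(String[] args) throws Exception {
        try (KubernetesClient client = new DefaultKubernetesClient()) {
            // Illustrative only: name, label and payload do not match the
            // exact EnMasse address ConfigMap format.
            ConfigMap address = new ConfigMapBuilder()
                    .withNewMetadata()
                        .withName("address-myqueue")
                        .addToLabels("type", "address")
                    .endMetadata()
                    .addToData("config.json",
                            "{\"address\":\"myqueue\",\"type\":\"queue\"}")
                    .build();

            // The Kubernetes API server persists this in etcd and enforces
            // RBAC on it, so durability and authorization come with it.
            client.configMaps().inNamespace("enmasse").createOrReplace(address);
        }
    }
}
--------------------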

> When thinking about a scenario with 10000+ addresses, having that many 
> ConfigMaps seems ... odd. It goes somewhat beyond what the ConfigMap 
> mechanism, as Pod configuration data, is probably intended for (although 
> there don't seem to be hard limits in that sense).
> 

I'm not fully convinced that this is a problem at 10k, but it depends, of 
course, on what other components are running on the same cluster. I think 
we should benchmark this before drawing any conclusions (or maybe you 
have some numbers?). I don't know in detail how a ConfigMap maps to etcd, 
but I think the difference would only be in the amount of data that is 
written.
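
If anyone wants to gather numbers, a rough harness along these lines 
would do. This is only a sketch with the fabric8 client against a 
throwaway namespace; the namespace, names, and label are placeholders:

--------------------
import io.fabric8.kubernetes.api.model.ConfigMapBuilder;
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;

public class ConfigMapBenchmark {
    public static void main(String[] args) throws Exception {
        final int count = 10_000;
        try (KubernetesClient client = new DefaultKubernetesClient()) {
            long start = System.nanoTime();
            for (int i = 0; i < count; i++) {
                // One ConfigMap per simulated address
                client.configMaps().inNamespace("enmasse-bench").create(
                        new ConfigMapBuilder()
                                .withNewMetadata()
                                    .withName("bench-address-" + i)
                                    .addToLabels("type", "address")
                                .endMetadata()
                                .addToData("config.json",
                                        "{\"address\":\"queue-" + i + "\"}")
                                .build());
            }
            long afterCreate = System.nanoTime();

            // Time a label-selected list, which is what the controller does
            int listed = client.configMaps().inNamespace("enmasse-bench")
                    .withLabel("type", "address").list().getItems().size();
            long afterList = System.nanoTime();

            System.out.printf("created %d ConfigMaps in %d ms%n",
                    count, (afterCreate - start) / 1_000_000);
            System.out.printf("listed %d ConfigMaps in %d ms%n",
                    listed, (afterList - afterCreate) / 1_000_000);
        }
    }
}
--------------------

Running something like that against the same API server setup you use 
would tell us whether 10k addresses is actually a problem there.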

> Database-persistence could possibly offer better performance and 
> simplify backup-strategies.
> 

My impression is that etcd is quite performant for this kind of use, but 
I don't have any numbers to back that up. Again, this should be benchmarked.

> Also, when updating EnMasse by re-deploying the EnMasse components 
> (having deleted the K8s namespace first), it seems easier to have the 
> addresses remain untouched in the database instead of having to 
> re-create the addresses/ConfigMaps.
> 

One of the goals for EnMasse is to support rolling upgrades of 
components like the router, so deleting and recreating should not be the 
long-term strategy, IMO. I understand that this is a problem for you at 
present, though.

> I think from an architectural point of view, the question is whether 
> the addresses are considered cluster-state information (therefore 
> belonging in the etcd datastore) or application-specific data.
> 
> With the current address ConfigMap content structure modelled after K8s 
> resources, it's obviously handled more as cluster-state information.
> 
> (In that sense, is it planned to switch addresses to be proper K8s
> custom resources instead of ConfigMaps?).
> 

That was the plan, yes, but there are some restrictions, typically on 
OpenShift, that prevent custom resource definitions from being deployed 
without cluster-admin access. Ideally, EnMasse would support being 
deployed without cluster-admin for some use cases.

> But I think the addresses could also be viewed as application-specific 
> data and, in that sense, would be better stored externally.
> 
> WDYT? Have you thought about replacing the ConfigMap persistence? Do you 
> see limitations with the current ConfigMap-based approach thinking about 
> 10000+ addresses?
> 
> Or would you see other persistence options?
>

I remember exploring that in the early phases of EnMasse, just before we 
introduced the ConfigMap-based configuration, but we didn't pursue it 
because we believed Kubernetes would meet our demands and we didn't want 
to introduce another stateful component. Maybe the time has come to 
rethink that.


>  From an implementation standpoint, having a separate persistence 
> implementation seems quite straightforward in the address/standard 
> controller with the AddressApi interface already in place.
> 
> Then the agent component would have to be changed as well (maybe 
> changing it to use the AddressController REST API?).
> 

Funny you should mention that. We just changed the agent so that it does 
not use the REST API, in order to make it more independent of the address 
controller. That is, it would allow you to deploy the standard address 
space without the address controller if you so wish.


I'm not opposed to making the persistence configurable, but I think it 
would be a significant undertaking.
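
To sketch what I mean, the persistence would sit behind a small store 
interface along these lines. The interface below is hypothetical and far 
simpler than the actual AddressApi; the point is only that a 
ConfigMap-backed and a database-backed implementation could be swapped 
behind the same abstraction:

--------------------
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified address store abstraction; the real AddressApi
// has a richer contract (watches, status updates, etc.).
interface AddressStore {
    void createAddress(String name, String json);
    Optional<String> getAddress(String name);
    Set<String> listAddressNames();
    void deleteAddress(String name);
}

// In-memory stand-in for an alternative backend; a database-backed
// implementation would implement the same interface.
class InMemoryAddressStore implements AddressStore {
    private final Map<String, String> addresses = new ConcurrentHashMap<>();

    @Override
    public void createAddress(String name, String json) {
        addresses.put(name, json);
    }

    @Override
    public Optional<String> getAddress(String name) {
        return Optional.ofNullable(addresses.get(name));
    }

    @Override
    public Set<String> listAddressNames() {
        return addresses.keySet();
    }

    @Override
    public void deleteAddress(String name) {
        addresses.remove(name);
    }
}
--------------------

The existing ConfigMap-based classes would then be one implementation 
behind such an abstraction, selected via configuration.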

Thanks,

Ulf


> Best regards
> 
> Carsten
> 
> 
> *From:* Ulf Lilleengen [mailto:lulf at redhat.com]
> *Sent:* Monday, March 5, 2018 15:00
> *To:* Lohmann Carsten (INST/ECS4) <Carsten.Lohmann at bosch-si.com>
> *Cc:* enmasse at redhat.com
> *Subject:* Re: [EnMasse] AddressController performance / ConfigMap lookups
> 
> Hi Carsten,
> 
> Yes, the getSchema() will look up the configmaps every time. I suspected 
> that we might need to cache it, but decided to see how often this would 
> actually be used before optimizing it. Sounds like it should be 
> optimized :). I'm a bit surprised that it takes this long to get this 
> information from the kubernetes master, though.
> 
> Using watchResources instead to cache it sounds like a sensible thing to 
> do to me.
> 
> Best regards,
> 
> Ulf
> 
> On Mon, Mar 5, 2018 at 2:17 PM, Lohmann Carsten (INST/ECS4) 
> <Carsten.Lohmann at bosch-si.com <mailto:Carsten.Lohmann at bosch-si.com>> wrote:
> 
>     Hi,
> 
>     We have noticed some performance issues when using the
>     AddressController REST API.
> 
>     Here is a log excerpt with added debug output concerning the
>     addition of 2 addresses:
> 
>     ---------------------
> 
>     2018-03-02 16:31:43.281 [vert.x-worker-thread-15] DEBUG
>     HttpAddressService:94 - appendAddresses:
>     [telemetry/tst_8432b9a6d2194c3f8c6328706eb455a0,
>     event/tst_8432b9a6d2194c3f8c6328706eb455a0]
> 
>     2018-03-02 16:31:43.290 [vert.x-worker-thread-15] DEBUG
>     ConfigMapAddressSpaceApi:38 - getAddressSpaceWithName: get() took 8ms
> 
>     2018-03-02 16:31:43.291 [vert.x-worker-thread-15] DEBUG
>     AddressApiHelper:47 - verifyAuthorized took 0ms
> 
>     2018-03-02 16:31:43.305 [vert.x-worker-thread-15] DEBUG
>     ConfigMapSchemaApi:55 - listConfigMaps type=address-space-plan,
>     resultList.size=1 took 14ms
> 
>     2018-03-02 16:31:43.321 [vert.x-worker-thread-15] DEBUG
>     ConfigMapSchemaApi:55 - listConfigMaps type=address-plan,
>     resultList.size=6 took 15ms
> 
>     2018-03-02 16:31:43.336 [vert.x-worker-thread-15] DEBUG
>     ConfigMapSchemaApi:55 - listConfigMaps type=resource-definition,
>     resultList.size=4 took 14ms
> 
>     2018-03-02 16:31:43.351 [vert.x-worker-thread-15] DEBUG
>     ConfigMapSchemaApi:55 - listConfigMaps type=address-plan,
>     resultList.size=6 took 14ms
> 
>     2018-03-02 16:31:43.365 [vert.x-worker-thread-15] DEBUG
>     ConfigMapSchemaApi:55 - listConfigMaps type=resource-definition,
>     resultList.size=4 took 14ms
> 
>     2018-03-02 16:31:43.380 [vert.x-worker-thread-15] DEBUG
>     ConfigMapSchemaApi:55 - listConfigMaps type=resource-definition,
>     resultList.size=4 took 14ms
> 
>     2018-03-02 16:31:43.397 [vert.x-worker-thread-15] DEBUG
>     ConfigMapSchemaApi:55 - listConfigMaps type=resource-definition,
>     resultList.size=4 took 16ms
> 
>     2018-03-02 16:31:43.412 [vert.x-worker-thread-15] DEBUG
>     ConfigMapSchemaApi:55 - listConfigMaps type=resource-definition,
>     resultList.size=4 took 15ms
> 
>     2018-03-02 16:31:43.427 [vert.x-worker-thread-15] DEBUG
>     ConfigMapSchemaApi:55 - listConfigMaps type=resource-definition,
>     resultList.size=4 took 14ms
> 
>     2018-03-02 16:31:43.442 [vert.x-worker-thread-15] DEBUG
>     ConfigMapSchemaApi:55 - listConfigMaps type=resource-definition,
>     resultList.size=4 took 14ms
> 
>     2018-03-02 16:31:43.456 [vert.x-worker-thread-15] DEBUG
>     ConfigMapSchemaApi:55 - listConfigMaps type=resource-definition,
>     resultList.size=4 took 14ms
> 
>     2018-03-02 16:31:43.456 [vert.x-worker-thread-15] DEBUG
>     ConfigMapSchemaApi:166 - getSchema took 165ms
> 
>     2018-03-02 16:31:43.473 [vert.x-worker-thread-15] DEBUG
>     ConfigMapAddressApi:89 - listAddresses: list() took 16ms
> 
>     2018-03-02 16:31:43.494 [vert.x-worker-thread-15] DEBUG
>     ConfigMapAddressApi:103 - createAddress: create() took 20ms
> 
>     2018-03-02 16:31:43.506 [vert.x-worker-thread-15] DEBUG
>     ConfigMapAddressApi:103 - createAddress: create() took 12ms
> 
>     2018-03-02 16:31:43.524 [vert.x-worker-thread-15] DEBUG
>     ConfigMapAddressApi:89 - listAddresses: list() took 17ms
> 
>     2018-03-02 16:31:43.524 [vert.x-worker-thread-15] DEBUG
>     HttpAddressService:48 - appendAddresses
>     [telemetry/tst_8432b9a6d2194c3f8c6328706eb455a0,
>     event/tst_8432b9a6d2194c3f8c6328706eb455a0] end (result: 56 items) -
>     requestProcessing took 242ms
> 
>     --------------------
> 
>     What becomes obvious here is that the 
>     "ConfigMapSchemaApi.getSchema" invocation is quite expensive with 
>     its "listConfigMaps" calls.
> 
>     The duration of 160ms is quite typical in our environment.
> 
>     We even had times where the API server took longer for the requests
>     and where the output looked like this:
> 
>     --------------------
> 
>     2018-03-02 17:20:07.197 [vert.x-worker-thread-13] DEBUG
>     HttpAddressService:94 - appendAddresses:
>     [telemetry/tst_a77831ab936849c4b8a1ba8d15d2e018,
>     event/tst_a77831ab936849c4b8a1ba8d15d2e018]
> 
>     2018-03-02 17:20:07.345 [vert.x-worker-thread-13] DEBUG
>     ConfigMapAddressSpaceApi:38 - getAddressSpaceWithName: get() took 147ms
> 
>     2018-03-02 17:20:07.345 [vert.x-worker-thread-13] DEBUG
>     AddressApiHelper:47 - verifyAuthorized took 0ms
> 
>     2018-03-02 17:20:07.444 [vert.x-worker-thread-13] DEBUG
>     ConfigMapSchemaApi:55 - listConfigMaps type=address-space-plan,
>     resultList.size=1 took 98ms
> 
>     2018-03-02 17:20:07.591 [vert.x-worker-thread-13] DEBUG
>     ConfigMapSchemaApi:55 - listConfigMaps type=address-plan,
>     resultList.size=6 took 147ms
> 
>     2018-03-02 17:20:07.714 [vert.x-worker-thread-13] DEBUG
>     ConfigMapSchemaApi:55 - listConfigMaps type=resource-definition,
>     resultList.size=4 took 122ms
> 
>     2018-03-02 17:20:07.834 [vert.x-worker-thread-13] DEBUG
>     ConfigMapSchemaApi:55 - listConfigMaps type=address-plan,
>     resultList.size=6 took 120ms
> 
>     2018-03-02 17:20:07.981 [vert.x-worker-thread-13] DEBUG
>     ConfigMapSchemaApi:55 - listConfigMaps type=resource-definition,
>     resultList.size=4 took 146ms
> 
>     2018-03-02 17:20:08.131 [vert.x-worker-thread-13] DEBUG
>     ConfigMapSchemaApi:55 - listConfigMaps type=resource-definition,
>     resultList.size=4 took 149ms
> 
>     2018-03-02 17:20:08.254 [vert.x-worker-thread-13] DEBUG
>     ConfigMapSchemaApi:55 - listConfigMaps type=resource-definition,
>     resultList.size=4 took 122ms
> 
>     2018-03-02 17:20:08.374 [vert.x-worker-thread-13] DEBUG
>     ConfigMapSchemaApi:55 - listConfigMaps type=resource-definition,
>     resultList.size=4 took 120ms
> 
>     2018-03-02 17:20:08.494 [vert.x-worker-thread-13] DEBUG
>     ConfigMapSchemaApi:55 - listConfigMaps type=resource-definition,
>     resultList.size=4 took 119ms
> 
>     2018-03-02 17:20:08.641 [vert.x-worker-thread-13] DEBUG
>     ConfigMapSchemaApi:55 - listConfigMaps type=resource-definition,
>     resultList.size=4 took 146ms
> 
>     2018-03-02 17:20:08.761 [vert.x-worker-thread-13] DEBUG
>     ConfigMapSchemaApi:55 - listConfigMaps type=resource-definition,
>     resultList.size=4 took 120ms
> 
>     2018-03-02 17:20:08.761 [vert.x-worker-thread-13] DEBUG
>     ConfigMapSchemaApi:166 - getSchema took 1416ms
> 
>     2018-03-02 17:20:08.913 [vert.x-worker-thread-13] DEBUG
>     ConfigMapAddressApi:89 - listAddresses: list() took 151ms
> 
>     2018-03-02 17:20:09.145 [vert.x-worker-thread-13] DEBUG
>     ConfigMapAddressApi:103 - createAddress: create() took 231ms
> 
>     2018-03-02 17:20:09.265 [vert.x-worker-thread-13] DEBUG
>     ConfigMapAddressApi:103 - createAddress: create() took 119ms
> 
>     2018-03-02 17:20:09.426 [vert.x-worker-thread-13] DEBUG
>     ConfigMapAddressApi:89 - listAddresses: list() took 160ms
> 
>     2018-03-02 17:20:09.426 [vert.x-worker-thread-13] DEBUG
>     HttpAddressService:48 - appendAddresses
>     [telemetry/tst_a77831ab936849c4b8a1ba8d15d2e018,
>     event/tst_a77831ab936849c4b8a1ba8d15d2e018] end (result: 66 items) -
>     requestProcessing took 2229ms
> 
>     --------------------
> 
>     Possible performance improvements concerning the API server aside,
>     there is the question of whether there is room to make "getSchema" faster.
> 
>     To me it looks like the K8s resources requested there (address space
>     plans, address plans, resource definitions) are fairly static and
>     could therefore be cached/kept in memory.
> 
>     I guess updates on these K8s resources could be handled via
>     ConfigMapAddressAPI.watchResources (?).
> 
>     WDYT? Would that be feasible?
> 
>     Best regards
> 
>     *Carsten Lohmann* (INST/ECS4)
>     Bosch Software Innovations GmbH | Ullsteinstr. 128 | 12109 Berlin |
>     GERMANY | www.bosch-si.com <http://www.bosch-si.com>
> 
>     Registered office: Berlin, Register court: Amtsgericht Charlottenburg; HRB 148411 B
>     Chairman of the Supervisory Board: Dr.-Ing. Thorsten Lücke;
>     Management: Dr. Stefan Ferber, Michael Hahn
> 
> 
> 
>     _______________________________________________
>     enmasse mailing list
>     enmasse at redhat.com <mailto:enmasse at redhat.com>
>     https://www.redhat.com/mailman/listinfo/enmasse
> 



