<div dir="ltr">I'm cross-posting this to the AeroGear (<a href="https://groups.google.com/forum/#!forum/aerogear">https://groups.google.com/forum/#!forum/aerogear</a>) and enmasse (<a href="https://www.redhat.com/mailman/listinfo/enmasse">https://www.redhat.com/mailman/listinfo/enmasse</a>) mailing lists. <br><br>Over the past few months we've been adding the ability to use an external broker with the Unified Push Server [UPS] (<a href="https://github.com/aerogear/aerogear-unifiedpush-server">https://github.com/aerogear/aerogear-unifiedpush-server</a>). We've added support for external AMQP connections to the UPS container image, and support for connecting to enmasse through the UPS operator (<a href="https://github.com/aerogear/unifiedpush-operator">https://github.com/aerogear/unifiedpush-operator</a>). What follows are our experiences with enmasse, along with some problems we encountered.<div><br></div><div>1: Creating enmasse resources from the UPS operator (based on client-go <a href="https://github.com/kubernetes/client-go">https://github.com/kubernetes/client-go</a>) worked pretty much as expected. The only downside was that the golang bindings from enmasse were out of date, but the team accepted our PRs and released a new version with updated bindings. Much appreciated!</div><div><br></div><div>2: The enmasse controllers for addresses, addressspaces, and messagingusers do not implement k8s watches, nor do they throw an exception if you try to watch a resource. This causes a lot of errors to be logged in the UPS operator, because we need to watch and maintain those resources to keep our service running. There is an open enmasse issue that describes the problem: <a href="https://github.com/EnMasseProject/enmasse/issues/1280" target="_blank">https://github.com/EnMasseProject/enmasse/issues/1280</a></div><div><br></div><div>3: Enmasse neither blocks owner deletion nor implements a finalizer to handle being orphaned. 
This means that when our operator deletes a UPS server, the address and addressspace resources we created don't get deleted, and we get no notification that they weren't deleted. </div><div><br></div><div>4: Enmasse has no way to configure how delivery failures are handled. Usually with a message broker we want failed messages to be retried, then retried with a backoff, and eventually retired to a dead letter queue. In enmasse the default behavior is to retry immediately and infinitely, and the broker offers very little we can use to get alerted when this is happening. In the case of a production error this means that enmasse will death star our service with an infinite stream of doomed messages. We are researching workarounds, but per this issue (<a href="https://github.com/EnMasseProject/enmasse/issues/2927">https://github.com/EnMasseProject/enmasse/issues/2927</a>) it seems like there is little we can do at the level of address resource configuration.</div><div><br></div><div>5: Creating resources (addresses, addressspaces, messagingusers) and getting their respective information into our deployment using our operator was really straightforward. Like, amazingly simple and direct.<br><br>Feel free to reply inline to the points. Now for my actual question: <br><br>We're trying to find a workaround for #4 and are actively soliciting ideas. Right now we're looking at implementing more aggressive message handling, so that messages will always be consumed even in cases where we would prefer them to be retried (for instance if the network connection to our push service was unavailable for a moment, a credential expired, etc.). <br><br>This doesn't help with unexpected errors in the UPS code (who doesn't love a good NPE). For those we might want the UPS operator to keep an eye out for extreme message redeliveries and automatically delete the offending messages, but that would be better handled by a DLQ mechanism.<br><br>Thoughts on the potential workarounds?</div></div>
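<div><br></div><div>To make the first workaround concrete, here is a minimal sketch of the consumer-side logic, written in Python purely for brevity rather than in the languages UPS and the operator use. Everything named here (<code>GuardedConsumer</code>, <code>TransientError</code>, <code>Outcome</code>, <code>max_deliveries</code>, the in-memory <code>parked</code> list) is hypothetical and not part of UPS or enmasse. It also assumes the broker exposes a per-message delivery count, as the AMQP 1.0 <code>delivery-count</code> header is meant to, which we have not verified against enmasse.</div>

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class Outcome(Enum):
    ACCEPT = "accept"  # settle the message: the broker will not redeliver it
    RETRY = "retry"    # hand it back; in AMQP 1.0 terms this would likely be
                       # "modified" with delivery-failed=true, so the count grows


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (network blip, expired credential)."""


@dataclass
class Message:
    body: str
    delivery_count: int = 0  # assumption: number of prior failed deliveries


@dataclass
class GuardedConsumer:
    handler: Callable[[Message], None]
    max_deliveries: int = 5
    parked: List[Message] = field(default_factory=list)  # stand-in for a DLQ

    def on_message(self, msg: Message) -> Outcome:
        try:
            self.handler(msg)
            return Outcome.ACCEPT
        except TransientError:
            # Retry only while this delivery is not the last one we allow.
            if msg.delivery_count + 1 < self.max_deliveries:
                return Outcome.RETRY
        except Exception:
            # Unexpected failure (the NPE case): retrying would loop forever.
            pass
        # Either retries are exhausted or the failure is unexpected:
        # consume the message anyway and keep a copy for later inspection.
        self.parked.append(msg)
        return Outcome.ACCEPT
```

<div>The point of the sketch is that the consumer, not the broker, bounds the retries: a message is only handed back while its delivery count is below the limit, and anything else is accepted and parked. In a real deployment the <code>parked</code> list would have to be durable storage or a separate address acting as a poor man's DLQ, which is exactly the part we would rather the broker did for us.</div>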