[EnMasse] Enmasse + UnifiedPush + Operator Feedback and request for help

Summers Pittman supittma at redhat.com
Wed Jul 31 15:10:04 UTC 2019


On Wed, Jul 31, 2019 at 8:41 AM Jens Reimann <jreimann at redhat.com> wrote:

>
> On Wed, Jul 31, 2019 at 1:27 PM Camila Macedo <cmacedo at redhat.com> wrote:
>
>> Hi @Summers,
>>
>> Regarding items 2 and 3, could we not implement these actions in the
>> UPS operator instead of expecting them to be implemented in enmasse?
>> Regarding item 4, the issue[1] shows that a new release (0.29.0[2]) with
>> this fix was made 10 hours ago. So, could it not be solved by using this
>> version?
>>
>
> I would hope we would not do that. Simply because that means a bit (or a
> lot) of effort, but only solves the problem for one consumer of enmasse.
> However others (even the IoT components of enmasse) face the same issue.
> And we would need to invest some time there as well.
>
> So I would prefer some solution, which fixes the root cause, rather than
> working around the issue in various other locations.
>

For the watch issue there is probably some low-hanging fruit to add to
enmasse.  The easiest, and probably a pretty important one, I would say is
to make the get operations for the various controllers return an error if
a watch is requested.  Currently the endpoints return a plain list of
their resources instead.

In k8s itself, this is the error returned if a watch isn't supported:
1:
https://github.com/kubernetes/kubernetes/blob/1ff857e1269fc100c8dda64287180cbfb5b2e657/staging/src/k8s.io/apiserver/pkg/endpoints/handlers/get.go#L239
2:
https://github.com/kubernetes/apimachinery/blob/master/pkg/api/errors/errors.go#L283

I'm not sure how that will interact with client-go, but an explicit error
will make it easier for developers to know this is not a problem in their
code.
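
To make the idea concrete, here is a rough sketch (in Python only for
brevity; enmasse itself is not written in Python) of a list endpoint that
rejects watch requests.  The function name and the "List" envelope are
made up for illustration.  The Status fields are my reading of what the
apimachinery helper linked above produces, so treat them as an assumption
rather than a spec.

```python
import json


def list_or_reject_watch(query, group, resource, items):
    """Return (http_code, body) for a list request on a collection endpoint.

    If the caller asks for a watch, answer with a Kubernetes-style Status
    object (405) instead of silently returning a plain list.
    """
    # Assumption: "true" and "1" are the values a client sends for watch.
    wants_watch = query.get("watch", "").lower() in ("true", "1")
    if wants_watch:
        qualified = f"{resource}.{group}" if group else resource
        status = {
            "kind": "Status",
            "apiVersion": "v1",
            "status": "Failure",
            "message": f'watch is not supported on resources of kind "{qualified}"',
            "reason": "MethodNotAllowed",
            "details": {"group": group, "kind": resource},
            "code": 405,
        }
        return 405, json.dumps(status)
    # Hypothetical list envelope; a real endpoint would use its own list kind.
    return 200, json.dumps({"kind": "List", "items": items})
```

Calling list_or_reject_watch({"watch": "true"}, "enmasse.io", "addresses", [])
gives a 405 with a Status body, while the same call without the watch
parameter still returns the list with a 200.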



For configuring redelivery and DLQs, that should probably be some
properties in Address custom resources.  But I'm not really a messaging
expert so I don't have a lot of strong ideas or opinions.


>
>
>>
>> [1] - https://github.com/EnMasseProject/enmasse/issues/2927
>> [2] - https://github.com/EnMasseProject/enmasse/releases/tag/0.29.0
>>
>>
>> On Mon, Jul 29, 2019 at 4:07 PM Summers Pittman <supittma at redhat.com>
>> wrote:
>>
>>> I'm cross posting this on the AeroGear(
>>> https://groups.google.com/forum/#!forum/aerogear) and the enmasse (
>>> https://www.redhat.com/mailman/listinfo/enmasse) mailing lists.
>>>
>>> In the past few months we've been adding the ability to use an external
>>> broker with the Unified Push Server[UPS] (
>>> https://github.com/aerogear/aerogear-unifiedpush-server).  We've added
>>> support for external AMQP connections to the UPS container image, as well
>>> as added support for connecting to enmasse using the UPS operator (
>>> https://github.com/aerogear/unifiedpush-operator).  Following are our
>>> experiences with enmasse as well as some problems we encountered.
>>>
>>> 1: The creation of enmasse resources using the UPS operator (based on
>>> client-go https://github.com/kubernetes/client-go) worked pretty much
>>> as expected.  The only downside was that the golang bindings from enmasse
>>> were out of date, but the team accepted our PRs and released a new version
>>> with updated bindings.  Much appreciated!
>>>
>>> 2: The enmasse controllers for addresses, addressspaces, and messagingusers
>>> do not implement k8s watches, nor do they throw an exception if you try to
>>> watch a resource.  This causes a lot of errors to be logged in the UPS
>>> operator as we need to watch and maintain those resources to keep our
>>> service running.  There is an open enmasse issue here that describes the
>>> issue : https://github.com/EnMasseProject/enmasse/issues/1280
>>>
>>> 3: Enmasse doesn't block owner deletion nor implement a finalizer to
>>> handle being orphaned.  This means that when our operator deletes a UPS
>>> server, the addresses and addressspace resources we create don't get
>>> deleted, nor do we get a notification that they aren't being deleted.
>>>
>>> 4: Enmasse has no way to configure delivery failures.  Usually with a
>>> message broker we want failed messages to be retried, then retried with a
>>> backoff, and eventually retired to a dead letter queue.  In enmasse the
>>> default behavior is to retry immediately and infinitely.  There is very
>>> little we can use the broker for to get an alert that this is happening.
>>> In the case of a production error this means that enmasse will death star
>>> our service with an infinite stream of doomed messages.  We are researching
>>> workarounds, but per this issue (
>>> https://github.com/EnMasseProject/enmasse/issues/2927) it seems like
>>> there is little we can do at the level of address resource configuration.
>>>
>>> 5: Creation of resources (addresses, addressspaces, messaging users) and
>>> getting their respective information into our deployment using our operator
>>> was really straightforward.  Like amazingly simple and directly
>>> straightforward.
>>>
>>> Feel free to reply inline to the points.  Now for my actual question :
>>>
>>> We're trying to find a workaround for #4 and are actively soliciting
>>> ideas.  Right now we're looking at implementing more aggressive message
>>> handling so messages will always be consumed even if we would prefer them
>>> to be retried (for instance if the network connection to our push service
>>> was unavailable for a moment, a credential expired, etc.).
>>>
>>> This doesn't help for unexpected errors in the UPS code (who doesn't
>>> love a good NPE), and for that we might want to have the UPS operator keep
>>> an eye out for extreme message redeliveries and auto-manually delete those
>>> messages, but that would be better handled by a DLQ mechanism.
>>>
>>> Thoughts on the potential workarounds?
>>>
>>
>
>