[EnMasse] Dispatch router metrics to hawkular

Tue Mar 28 09:07:32 UTC 2017

On 27 March 2017 16:15, Gordon Sim wrote:
> On 27/03/17 10:33, Ulf Lilleengen wrote:
>> I think the general approach with hawkular (and similar tools I've used
>> in the past) is that components report non-aggregated values that are
>> tagged so that the various dashboards can aggregate as they wish based
>> on the tags. For instance, for the broker, I report the messageCount
>> metric from all queues, and it is tagged with broker name, queue name
>> and address. Then you can create dashboards that display the metric
>> aggregated across multiple brokers, multiple queues or not depending on
>> what you want.
>>
>> So with the router I would collect per-link metrics and tag them with
>> e.g. connection id, link id, target address, source address.
>
> Makes sense. We would also need role to be tagged. The direction could
> be either a tag or imply a different metric (i.e. incoming-deliveries,
> outgoing-deliveries). The logic for aggregating also depends on the type
> of the address but I guess that can be retrieved separately.
>
> What is the cost of the number and/or size of these tags, do you know?
> For connections, the router doesn't really track any metric, but there
> *is* lots of potentially useful information: the peer hostname/ip
> address, container id, whether it's ssl/encrypted (certain properties
> might be as well). Would it be sensible to record information like this
> as a tag? (Or would logging be a better option for that?)
>

I don't know and haven't found any limitations in the docs. If I were to 
guess, depending on how it is stored, the cost would either be an extra 
forward index for each tag (to make querying and filtering fast) or 
expensive querying. We can reach out to the hawkular project to get more 
info on this.
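
To get a rough feel for how tags could drive cost, here is a small sketch. It assumes (and this is only my assumption, not something confirmed for hawkular) that each distinct combination of tag values ends up as its own stored series. The tag names and sample values are made up for illustration.

```python
# Hypothetical per-link samples; tag names and values are invented for
# illustration and are not taken from the router or from hawkular.
samples = [
    {"connection": "c1", "link": "l1", "address": "queue.a", "peer_host": "10.0.0.1"},
    {"connection": "c1", "link": "l2", "address": "queue.b", "peer_host": "10.0.0.1"},
    {"connection": "c2", "link": "l1", "address": "queue.a", "peer_host": "10.0.0.2"},
    {"connection": "c3", "link": "l1", "address": "queue.a", "peer_host": "10.0.0.3"},
]


def distinct_series(samples, tag_names):
    """Count the distinct tag-value combinations over the chosen tags."""
    return len({tuple(sample[name] for name in tag_names) for sample in samples})


# Tagging by address alone gives few series; adding a per-peer tag
# multiplies the number of combinations that would need to be stored.
by_address = distinct_series(samples, ["address"])
by_address_and_peer = distinct_series(samples, ["address", "peer_host"])
print(by_address, by_address_and_peer)
```

If that assumption holds, tags with many distinct values (peer hostname, container id) are the ones to be careful with.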

I think the type of information suitable for metrics is information 
that is useful to the operations team monitoring the availability and 
performance of the system. IMO logging is a better alternative for 
data that is mainly used for debugging, and I think (though I've never 
operated messaging infrastructure..) that what you refer to as 
potentially useful information is mainly interesting for debugging. I 
also think the set of metrics to report should be expanded gradually 
rather than starting with too many, because there is a cost for metric 
storage.

I'm not sure of the cost model of hawkular on OpenShift Online, but it 
could be that each tenant is only allowed a certain number of metrics 
with N dimensions/tags.
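
To make the tagging approach quoted above a bit more concrete, here is a minimal sketch of non-aggregated, tagged values being aggregated by whichever tags a dashboard picks. It does not use the hawkular API; the `aggregate` helper and the sample values are hypothetical, and only the tag names (broker, queue, address) and the messageCount metric come from the thread.

```python
from collections import defaultdict

# Hypothetical messageCount samples, one per queue, each carrying its tags.
samples = [
    ({"broker": "broker-0", "queue": "q1", "address": "orders"}, 5),
    ({"broker": "broker-0", "queue": "q2", "address": "events"}, 3),
    ({"broker": "broker-1", "queue": "q1", "address": "orders"}, 7),
]


def aggregate(samples, group_by):
    """Sum values over all samples that share the same values for group_by."""
    totals = defaultdict(int)
    for tags, value in samples:
        key = tuple(tags[name] for name in group_by)
        totals[key] += value
    return dict(totals)


# The same raw data supports different views, chosen at query time.
per_address = aggregate(samples, ["address"])
per_broker = aggregate(samples, ["broker"])
total = aggregate(samples, [])
print(per_address, per_broker, total)
```

The reporting component stays simple because it never decides how to aggregate; that choice is left to whoever builds the dashboard.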

-- 
Ulf