[EnMasse] Dispatch router metrics to hawkular

Tue Mar 28 10:05:56 UTC 2017

On 28 March 2017 11:32, Gordon Sim wrote:
> On 28/03/17 10:07, Ulf Lilleengen wrote:
>> I think the type of information suitable for metrics is information
>> that will be useful for the operations team that is monitoring the
>> availability and performance of the system. IMO logging is a better
>> alternative for metrics that are mainly used for debugging, and I think
>> (though I've never operated messaging infrastructure...) what you refer
>> to as potentially useful information is mainly interesting for
>> debugging.
>
> As an example, imagine if the statistics show bursts in the frequency of
> message rejections on occasion and this is coming largely from a single
> client. Being able to drill into the data and get the user/ip of the
> client in question would be useful.
>
> So, yes, on one level it is 'debugging' of a sort. I think of it more as
> understanding how the system is being used. The monitoring console as I
> see it is for this sort of general purpose troubleshooting, observation.
> It does cover performance, but isn't limited to that.
>

If you compare this to a typical HTTP server, details of a request would
typically end up in an access log, which would also be stored in a central
logging facility like Logstash for later debugging or post-mortem analysis.

From an HTTP server perspective, I think a nice way to distinguish
metrics from logs is how many values a dimension/tag may have. For
instance, the value range of the request type is small (GET, PUT etc.),
while the value range of client IPs can be very big. Graphs with many
values per tag don't look nice at all, and creating one graph per
value is tedious.
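
To make the rule of thumb above a bit more concrete, here is a rough
Python sketch that counts distinct values per field and splits the
fields into "metric tag" and "log field" candidates. The cutoff of 10,
the field names and the sample events are all made up for illustration;
nothing here comes from EnMasse, the router or Hawkular.

```python
from collections import defaultdict

# Hypothetical cutoff: a field with more distinct values than this is
# treated as a log field rather than a metric tag.
MAX_TAG_VALUES = 10


def split_by_cardinality(events, max_tag_values=MAX_TAG_VALUES):
    """Return (metric_tags, log_fields) based on distinct values per field."""
    distinct = defaultdict(set)
    for event in events:
        for field, value in event.items():
            distinct[field].add(value)
    metric_tags = sorted(
        f for f, values in distinct.items() if len(values) <= max_tag_values
    )
    log_fields = sorted(
        f for f, values in distinct.items() if len(values) > max_tag_values
    )
    return metric_tags, log_fields


# Made-up sample: 4 request methods, 200 distinct client IPs.
events = [
    {
        "method": ["GET", "PUT", "POST", "DELETE"][i % 4],
        "client_ip": "10.0.{}.{}".format(i // 100, i % 100),
    }
    for i in range(200)
]

metric_tags, log_fields = split_by_cardinality(events)
print("metric tags:", metric_tags)
print("log fields:", log_fields)
```

With this sample, the request method ends up as a tag candidate and the
client IP as something better left to the logs. The right cutoff would
depend on the metric store and on how the graphs are built.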

On the other hand, I guess in most messaging use cases, connections are
long-lived and you may only have a few known clients, so maybe having
host/container ID as a tag would work just fine. Let's try it out!

In any case, this post somewhat sums up how I think about this: 
https://grafana.com/blog/2016/01/05/logs-and-metrics-and-graphs-oh-my/

>> I also think that the set of metrics to report is something
>> to be expanded gradually rather than putting too much there, because
>> there is a cost for the metric storage.
>
> Makes sense. So we should define what we actually want to show in the
> first instance and then figure out what we need to record.

Agreed, and at least for brokers, it can be adjusted and configured in 
the template. So if we do this right, the set of metrics to report can 
always be expanded if required.

-- 
Ulf