When to use gauge or histogram in prometheus in recording request duration?

Issue

I’m new to metric monitoring.

If we want to record the duration of the requests, I think we should use gauge, but in practise, someone would use histogram.

for example, in grpc-ecosystem/go-grpc-prometheus, they prefer to use histogram to record duration. Are there agreed best practices for the use of metric types? Or it is just their own preference.

// ServerMetrics represents a collection of metrics to be registered on a
// Prometheus metrics registry for a gRPC server.
type ServerMetrics struct {
    serverStartedCounter          *prom.CounterVec
    serverHandledCounter          *prom.CounterVec
    serverStreamMsgReceived       *prom.CounterVec
    serverStreamMsgSent           *prom.CounterVec
    serverHandledHistogramEnabled bool
    serverHandledHistogramOpts    prom.HistogramOpts
    serverHandledHistogram        *prom.HistogramVec
}

Thanks~

Solution

I am new to this but le me try to answer your answer your question. So take my answer with a grain of salt or maybe someone with experience in using metrics to observe their systems jumps in.

as stated in https://prometheus.io/docs/concepts/metric_types/

A gauge is a metric that represents a single numerical value that can arbitrarily go up and down. So if your goal would be to display the current value (duration time of requests) you could use a gauge. But I think the goal of using metrics is to find problems within your system or generate alerts if and when certain vaules aren’t in a predefined range or getting a performance value (like the Apdex score) for your system.

From https://prometheus.io/docs/concepts/metric_types/#histogram

Use the histogram_quantile() function to calculate quantiles from histograms or even aggregations of histograms. A histogram is also suitable to calculate an Apdex score.

From https://en.wikipedia.org/wiki/Apdex

Apdex (Application Performance Index) is an open standard developed by an alliance of companies for measuring performance of software applications in computing. Its purpose is to convert measurements into insights about user satisfaction, by specifying a uniform way to analyze and report on the degree to which measured performance meets user expectations.

Read up on Quantiles and the calculations in histograms and summaries https://prometheus.io/docs/practices/histograms/#quantiles

Two rules of thumb:

  1. If you need to aggregate, choose histograms.
  2. Otherwise, choose a histogram if you have an idea of the range and distribution of values that will be observed. Choose a summary if you need an accurate quantile, no matter what the range and distribution of the values is.

Or like Adam Woodbeck in his book "Network programming with Go" said:

The general advice is to use summaries when you don’t know the range of expected values, but I’d advise you to use histograms whenever possible
so that you can aggregate histograms on the metrics server.

Answered By – IamSoClueless

Answer Checked By – Clifford M. (GoLangFix Volunteer)

Leave a Reply

Your email address will not be published.