MetricsQL
VictoriaMetrics implements MetricsQL, a query language inspired by PromQL. It is backwards compatible with PromQL, so Grafana dashboards backed by a Prometheus datasource should work the same after switching from Prometheus to VictoriaMetrics. The standalone MetricsQL package can be used for parsing MetricsQL in external apps.
The following functionality is implemented differently in MetricsQL compared to PromQL in order to improve user experience:
- MetricsQL takes into account the previous point before the window in square brackets for range functions such as `rate` and `increase`. It also doesn't extrapolate range function results. This addresses this issue from Prometheus.
- MetricsQL returns the expected non-empty responses for requests with `step` values smaller than scrape interval. This addresses this issue from Grafana.
- MetricsQL treats `scalar` type the same as `instant vector` without labels, since subtle difference between these types usually confuses users. See the corresponding Prometheus docs for details.
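The first difference can be illustrated with a toy model. The sketch below is not VictoriaMetrics code: it assumes a counter without resets and samples sorted by timestamp, and the function names `increase_window_only` and `increase_with_previous_point` are hypothetical. It compares an increase computed only from the samples inside the window with one that also uses the last sample before the window, with no extrapolation in either case.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def increase_window_only(samples: List[Sample], start: float, end: float) -> Optional[float]:
    """Increase between the first and the last sample inside (start, end]."""
    inside = [v for t, v in samples if start < t <= end]
    if len(inside) < 2:
        return None  # fewer than two points inside the window
    return inside[-1] - inside[0]


def increase_with_previous_point(samples: List[Sample], start: float, end: float) -> Optional[float]:
    """Increase between the last sample at or before `start` and the last sample inside (start, end]."""
    inside = [v for t, v in samples if start < t <= end]
    before = [v for t, v in samples if t <= start]
    if not inside:
        return None  # nothing to report for an empty window
    if not before:
        # Assumption: with no earlier sample, fall back to the points inside the window.
        return inside[-1] - inside[0]
    return inside[-1] - before[-1]


# A counter scraped every 10 seconds that grows by 1 per second.
samples = [(0, 0), (10, 10), (20, 20), (30, 30), (40, 40)]

print(increase_window_only(samples, 20, 40))
print(increase_with_previous_point(samples, 20, 40))
```

For the 20-second window `(20, 40]`, the window-only variant sees just the samples at 30 and 40, while the variant that uses the previous point covers the whole window.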
Other PromQL functionality should work the same in MetricsQL. File an issue if you notice discrepancies between PromQL and MetricsQL results other than mentioned above.
MetricsQL provides additional functionality mentioned below, which is aimed towards solving practical cases. Feel free to file a feature request if you think MetricsQL lacks certain useful functionality.
Note that the functionality mentioned below doesn't work in PromQL, so it is impossible to switch back to Prometheus once you start using it.
This functionality can be tried at an editable Grafana dashboard.
- `WITH` templates. This feature simplifies writing and managing complex queries. Go to `WITH` templates playground and try it.
- Metric names and metric labels may contain escaped chars. For instance, `foo\-bar{baz\=aa="b"}` is a valid expression. It returns time series with name `foo-bar` containing label `baz=aa` with value `b`. Additionally, `\xXX` escape sequence is supported, where `XX` is the hexadecimal representation of the escaped char.
- `offset`, range duration and step value for range vector may refer to the current step aka `$__interval` value from Grafana. For instance, `rate(metric[10i] offset 5i)` would return per-second rate over a range covering 10 previous steps with the offset of 5 steps.
- `offset` may be put anywhere in the query. For instance, `sum(foo) offset 24h`.
- `offset` may be negative. For example, `q offset -1h`.
- `default` binary operator. `q1 default q2` substitutes `NaN` values from `q1` with the corresponding values from `q2`.
- `histogram_quantile` accepts optional third arg - `boundsLabel`. In this case it returns `lower` and `upper` bounds for the estimated percentile. See this issue for details.
- `if` binary operator. `q1 if q2` removes values from `q1` for `NaN` values from `q2`.
- `ifnot` binary operator. `q1 ifnot q2` removes values from `q1` for non-`NaN` values from `q2`.
- Trailing commas on all the lists are allowed - label filters, function args and with expressions. For instance, the following queries are valid: `m{foo="bar",}`, `f(a, b,)`, `WITH (x=y,) x`. This simplifies maintenance of multi-line queries.
- String literals may be concatenated. This is useful with `WITH` templates: `WITH (commonPrefix="long_metric_prefix_") {__name__=commonPrefix+"suffix1"} / {__name__=commonPrefix+"suffix2"}`.
- Range duration in functions such as `rate` may be omitted. VictoriaMetrics automatically selects range duration depending on the current step used for building the graph. For instance, the following query is valid in VictoriaMetrics: `rate(node_network_receive_bytes_total)`.
- Range duration and offset may be fractional. For instance, `rate(node_network_receive_bytes_total[1.5m] offset 0.5d)`.
- Comments starting with `#` and ending with newline. For instance, `up # this is a comment for 'up' metric`.
- Rollup functions - `rollup(m[d])`, `rollup_rate(m[d])`, `rollup_deriv(m[d])`, `rollup_increase(m[d])`, `rollup_delta(m[d])` - return `min`, `max` and `avg` values for all the `m` data points over `d` duration.
- `rollup_candlestick(m[d])` - returns `open`, `close`, `low` and `high` values (OHLC) for all the `m` data points over `d` duration. This function is useful for financial applications.
- `union(q1, ... qN)` function for building multiple graphs for `q1`, ... `qN` subqueries with a single query. The `union` function name may be skipped - the following queries are equivalent: `union(q1, q2)` and `(q1, q2)`.
- `ru(freeResources, maxResources)` function for returning resource utilization percentage in the range `0% - 100%`. For instance, `ru(node_memory_MemFree_bytes, node_memory_MemTotal_bytes)` returns memory utilization over node_exporter metrics.
- `ttf(slowlyChangingFreeResources)` function for returning the time in seconds when the given `slowlyChangingFreeResources` expression reaches zero. For instance, `ttf(node_filesystem_avail_bytes)` returns the time to storage space exhaustion. This function may be useful for capacity planning.
- Functions for label manipulation:
  - `alias(q, name)` for setting metric name across all the time series `q`.
  - `label_set(q, label1, value1, ... labelN, valueN)` for setting the given values for the given labels on `q`.
  - `label_del(q, label1, ... labelN)` for deleting the given labels from `q`.
  - `label_keep(q, label1, ... labelN)` for deleting all the labels except the given labels from `q`.
  - `label_copy(q, src_label1, dst_label1, ... src_labelN, dst_labelN)` for copying label values from `src_*` to `dst_*`.
  - `label_move(q, src_label1, dst_label1, ... src_labelN, dst_labelN)` for moving label values from `src_*` to `dst_*`.
  - `label_transform(q, label, regexp, replacement)` for replacing all the `regexp` occurrences with `replacement` in the `label` values from `q`.
  - `label_value(q, label)` - returns numeric values for the given `label` from `q`.
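The label manipulation functions above can be pictured as operations on the label set of a single time series. The following is a rough model only, with that label set represented as a Python dict. The helper names mirror the MetricsQL functions for readability, but the code and its edge-case behaviour (for example, silently skipping a missing source label) are assumptions and not the VictoriaMetrics implementation.

```python
from typing import Dict

Labels = Dict[str, str]


def label_set(labels: Labels, *pairs: str) -> Labels:
    """Set the given label/value pairs: label_set(l, name1, value1, name2, value2, ...)."""
    out = dict(labels)
    for name, value in zip(pairs[0::2], pairs[1::2]):
        out[name] = value
    return out


def label_del(labels: Labels, *names: str) -> Labels:
    """Delete the given labels."""
    return {k: v for k, v in labels.items() if k not in names}


def label_keep(labels: Labels, *names: str) -> Labels:
    """Delete all labels except the given ones."""
    return {k: v for k, v in labels.items() if k in names}


def label_copy(labels: Labels, *pairs: str) -> Labels:
    """Copy values from src labels to dst labels: label_copy(l, src1, dst1, ...)."""
    out = dict(labels)
    for src, dst in zip(pairs[0::2], pairs[1::2]):
        if src in out:  # assumption: a missing source label is skipped
            out[dst] = out[src]
    return out


def label_move(labels: Labels, *pairs: str) -> Labels:
    """Move values from src labels to dst labels, removing the src labels."""
    out = dict(labels)
    for src, dst in zip(pairs[0::2], pairs[1::2]):
        if src in out:  # assumption: a missing source label is skipped
            out[dst] = out.pop(src)
    return out


labels = {"instance": "host1:9100", "job": "node", "env": "prod"}
print(label_move(labels, "instance", "host"))
print(label_keep(labels, "job"))
```

Every helper returns a new dict and leaves its input untouched, which keeps the model easy to reason about when the calls are chained.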
- `step()` function for returning the step in seconds used in the query.
- `start()` and `end()` functions for returning the start and end timestamps of the `[start ... end]` range used in the query.
- `integrate(m[d])` for returning integral over the given duration `d` for the given metric `m`.
- `ideriv(m)` - for calculating `instant` derivative for `m`.
- `deriv_fast(m[d])` - for calculating `fast` derivative for `m` based on the first and the last points from duration `d`.
- `running_` functions - `running_sum`, `running_min`, `running_max`, `running_avg` - for calculating running values on the selected time range.
- `range_` functions - `range_sum`, `range_min`, `range_max`, `range_avg`, `range_first`, `range_last`, `range_median`, `range_quantile` - for calculating global value over the selected time range.
- `smooth_exponential(q, sf)` - smooths `q` using exponential moving average with the given smooth factor `sf`.
- `remove_resets(q)` - removes counter resets from `q`.
- `lag(q[d])` - returns lag between the current timestamp and the timestamp from the previous data point in `q` over `d`.
- `lifetime(q[d])` - returns lifetime of `q` over `d` in seconds. It is expected that `d` exceeds the lifetime of `q`.
- `scrape_interval(q[d])` - returns the average interval in seconds between data points of `q` over `d` aka `scrape interval`.
- Trigonometric functions - `sin(q)`, `cos(q)`, `asin(q)`, `acos(q)` and `pi()`.
- `median_over_time(m[d])` - calculates median values for `m` over `d` time window. Shorthand to `quantile_over_time(0.5, m[d])`.
- `median(q)` - median aggregate. Shorthand to `quantile(0.5, q)`.
- `limitk(k, q)` - limits the number of time series returned from `q` to `k`.
- `keep_last_value(q)` - fills missing data (gaps) in `q` with the previous value.
- `distinct_over_time(m[d])` - returns the number of distinct values for `m` data points over `d` duration.
- `distinct(q)` - returns a time series with the number of unique values for each timestamp in `q`.
- `sum2_over_time(m[d])` - returns sum of squares for all the `m` values over `d` duration.
- `sum2(q)` - returns a time series with sum of square values for each timestamp in `q`.
- `geomean_over_time(m[d])` - returns geomean value for all the `m` values over `d` duration.
- `geomean(q)` - returns a time series with geomean value for each timestamp in `q`.
- `rand()`, `rand_normal()` and `rand_exponential()` functions - for generating pseudo-random series with uniform, normal and exponential distribution.
- `increases_over_time(m[d])` and `decreases_over_time(m[d])` - return the number of `m` increases or decreases over the given duration `d`.
- `prometheus_buckets(q)` - converts VictoriaMetrics histogram buckets to Prometheus buckets with `le` labels.
- `histogram(q)` - calculates aggregate histogram over `q` time series for each point on the graph. See this article for more details.
- `histogram_over_time(m[d])` - calculates VictoriaMetrics histogram for `m` over `d`. For example, the following query calculates median temperature by country over the last 24 hours: `histogram_quantile(0.5, sum(histogram_over_time(temperature[24h])) by (vmbucket, country))`.
- `histogram_share(le, buckets)` - returns share (in the range 0..1) for `buckets`. Useful for calculating SLI and SLO. For instance, the following query returns the share of requests which are performed under 1.5 seconds: `histogram_share(1.5, sum(request_duration_seconds_bucket) by (le))`.
- `topk_*` and `bottomk_*` aggregate functions, which return up to K time series. Note that the standard `topk` function may return more than K time series - see this article for details.
  - `topk_min(k, q)` - returns top K time series with the max minimums on the given time range
  - `topk_max(k, q)` - returns top K time series with the max maximums on the given time range
  - `topk_avg(k, q)` - returns top K time series with the max averages on the given time range
  - `topk_median(k, q)` - returns top K time series with the max medians on the given time range
  - `bottomk_min(k, q)` - returns bottom K time series with the min minimums on the given time range
  - `bottomk_max(k, q)` - returns bottom K time series with the min maximums on the given time range
  - `bottomk_avg(k, q)` - returns bottom K time series with the min averages on the given time range
  - `bottomk_median(k, q)` - returns bottom K time series with the min medians on the given time range
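The difference between the standard `topk` and the `topk_*` / `bottomk_*` family can be sketched with a small model. The code below is an illustration under simplifying assumptions (all series have the same number of points and contain no gaps). The helper names `topk_per_timestamp`, `topk_by` and `bottomk_by` are hypothetical and are not part of MetricsQL.

```python
from statistics import mean, median
from typing import Callable, Dict, List, Set

Series = Dict[str, List[float]]  # series name -> values on the graph


def topk_per_timestamp(k: int, series: Series) -> Set[str]:
    """Model of the standard topk: pick the top k series at every timestamp independently."""
    names = list(series)
    n_points = len(next(iter(series.values())))
    selected: Set[str] = set()
    for i in range(n_points):
        ranked = sorted(names, key=lambda name: series[name][i], reverse=True)
        selected.update(ranked[:k])
    return selected


def topk_by(k: int, series: Series, rank: Callable[[List[float]], float]) -> List[str]:
    """Model of topk_min/max/avg/median: rank whole series by one number, keep the k highest."""
    return sorted(series, key=lambda name: rank(series[name]), reverse=True)[:k]


def bottomk_by(k: int, series: Series, rank: Callable[[List[float]], float]) -> List[str]:
    """Model of bottomk_min/max/avg/median: rank whole series by one number, keep the k lowest."""
    return sorted(series, key=lambda name: rank(series[name]))[:k]


series = {"a": [1, 9, 1], "b": [5, 5, 5], "c": [2, 2, 2]}

print(sorted(topk_per_timestamp(1, series)))  # more than one series can be selected
print(topk_by(1, series, mean))               # model of topk_avg(1, q)
print(topk_by(1, series, max))                # model of topk_max(1, q)
print(bottomk_by(1, series, median))          # model of bottomk_median(1, q)
```

In this toy data set, series `a` wins at one timestamp and `b` at the others, so the per-timestamp selection yields two series for `k = 1`, while ranking by a whole-range aggregate always yields at most `k`.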
- `share_le_over_time(m[d], le)` - returns share (in the range 0..1) of values in `m` over `d`, which are smaller or equal to `le`. Useful for calculating SLI and SLO. Example: `share_le_over_time(memory_usage_bytes[24h], 100*1024*1024)` returns the share of time series values for the last 24 hours when memory usage was below or equal to 100MB.
- `share_gt_over_time(m[d], gt)` - returns share (in the range 0..1) of values in `m` over `d`, which are bigger than `gt`. Useful for calculating SLI and SLO. Example: `share_gt_over_time(up[24h], 0)` - returns service availability for the last 24 hours.
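The arithmetic behind `share_le_over_time` and `share_gt_over_time` is a simple ratio of matching samples to all samples in the window. A minimal sketch, assuming the window is already given as a plain list of values; the helper names `share_le` and `share_gt` and the sample data are made up for illustration:

```python
import math
from typing import List


def share_le(values: List[float], le: float) -> float:
    """Share (0..1) of values that are smaller than or equal to `le`."""
    if not values:
        return math.nan  # assumption: an empty window has no defined share
    return sum(1 for v in values if v <= le) / len(values)


def share_gt(values: List[float], gt: float) -> float:
    """Share (0..1) of values that are bigger than `gt`."""
    if not values:
        return math.nan  # assumption: an empty window has no defined share
    return sum(1 for v in values if v > gt) / len(values)


MIB = 1024 * 1024

# Hypothetical samples of memory_usage_bytes and up inside the window.
memory_usage_bytes = [50 * MIB, 150 * MIB, 100 * MIB, 80 * MIB, 120 * MIB]
up = [1, 1, 0, 1]

print(share_le(memory_usage_bytes, 100 * MIB))  # share of samples at or below 100 MiB
print(share_gt(up, 0))                          # share of samples where the target was up
```

Three of the five memory samples are at or below the threshold, and three of the four `up` samples are above zero, so the two shares are 0.6 and 0.75.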