From 3507e1e27b546bc22ab6b64ec8469f63cb95cd48 Mon Sep 17 00:00:00 2001
From: Hui Wang
Date: Fri, 1 Dec 2023 12:17:24 +0100
Subject: [PATCH] vmalert-tool: fix alert_rule_test case when eval_time is not multiple of evaluation_interval (#5387)

Co-authored-by: hagen1778
(cherry picked from commit 1911320c8687e4a36e7f33be16b51ab119ad8fab)
Signed-off-by: hagen1778
---
 docs/CHANGELOG.md    |  3 ++-
 docs/vmalert-tool.md | 25 +++++++++++++++++++++++++
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/docs/CHANGELOG.md b/docs/CHANGELOG.md
index 6be9b31aa2..244e7ae681 100644
--- a/docs/CHANGELOG.md
+++ b/docs/CHANGELOG.md
@@ -45,7 +45,8 @@ The sandbox cluster installation is running under the constant load generated by

 * BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): prevent from `FATAL: cannot flush metainfo` panic when [`-remoteWrite.multitenantURL`](https://docs.victoriametrics.com/vmagent.html#multitenancy) command-line flag is set. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5357).
 * BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): properly decode zstd-encoded data blocks received via [VictoriaMetrics remote_write protocol](https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol). See [this issue comment](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5301#issuecomment-1815871992).
-* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): properly add new labels at `output_relabel_configs` during [stream aggregation](https://docs.victoriametrics.com/stream-aggregation.html). Previously this could lead to corrupted labels in output samples. Thanks to @ChengChung for providing [detailed report for this bug](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5402).
+* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): properly add new labels at `output_relabel_configs` during [stream aggregation](https://docs.victoriametrics.com/stream-aggregation.html). Previously this could lead to corrupted labels in output samples. Thanks to @ChengChung for providing [detailed report for this bug](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5402).
+* BUGFIX: [vmalert-tool](https://docs.victoriametrics.com/#vmalert-tool): allow using arbitrary `eval_time` in [alert_rule_test](https://docs.victoriametrics.com/vmalert-tool.html#alert_test_case) case. Previously, test cases with `eval_time` not being a multiple of `evaluation_interval` would fail.

 ## [v1.95.1](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.95.1)

diff --git a/docs/vmalert-tool.md b/docs/vmalert-tool.md
index 83b681cc39..ef832cdfc9 100644
--- a/docs/vmalert-tool.md
+++ b/docs/vmalert-tool.md
@@ -33,6 +33,31 @@ except `promql_expr_test` field. Use `metricsql_expr_test` field name instead. T
 validates and executes [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) expressions,
 which aren't always backward compatible with [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/).

+### Limitations
+
+* vmalert-tool evaluates all the groups defined in `rule_files` using `evaluation_interval` (default `1m`) instead of the `interval` set under each rule group.
+* vmalert-tool shares the same limitation with [vmalert](https://docs.victoriametrics.com/vmalert.html#limitations) on chaining rules under one group:
+
+>by default, rules execution is sequential within one group, but persistence of execution results to remote storage is asynchronous. Hence, user shouldn’t rely on chaining of recording rules when result of previous recording rule is reused in the next one;
+
+For example, suppose recording rule A and alerting rule B are in the same group, and rule B's expression depends on A's results.
+Rule B won't get A's latest data, since A's results haven't been persisted to remote storage yet.
+The workaround is to split them into two groups and put groupA before groupB (or use `group_eval_order` to define the evaluation order).
+This way, vmalert-tool makes sure that the results of groupA are written to storage before groupB is evaluated:
+
+```yaml
+groups:
+- name: groupA
+  rules:
+  - record: A
+    expr: sum(xxx)
+- name: groupB
+  rules:
+  - alert: B
+    expr: A >= 0.75
+    for: 1m
+```
+
 ### Test file format

 The configuration format for files specified in `--files` cmd-line flag is the following: