docs/vmalert.md: follow-up after 6d5a8c28cd

Aliaksandr Valialkin 2021-06-14 11:37:26 +03:00
parent c5f493db8e
commit 52efd5a05c

@ -4,9 +4,10 @@ sort: 4
# vmalert # vmalert
`vmalert` executes a list of given [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) `vmalert` executes a list of the given [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/)
or [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/) or [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/)
rules against configured address. rules against configured address. It is heavily inspired by [Prometheus](https://prometheus.io/docs/alerting/latest/overview/)
implementation and aims to be compatible with its syntax.
## Features ## Features
* Integration with [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics) TSDB; * Integration with [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics) TSDB;
@ -44,21 +45,23 @@ To start using `vmalert` you will need the following things:
* datasource address - reachable VictoriaMetrics instance for rules execution; * datasource address - reachable VictoriaMetrics instance for rules execution;
* notifier address - reachable [Alert Manager](https://github.com/prometheus/alertmanager) instance for processing, * notifier address - reachable [Alert Manager](https://github.com/prometheus/alertmanager) instance for processing,
aggregating alerts and sending notifications. aggregating alerts and sending notifications.
* remote write address - [remote write](https://prometheus.io/docs/prometheus/latest/storage/#remote-storage-integrations) * remote write address [optional] - [remote write](https://prometheus.io/docs/prometheus/latest/storage/#remote-storage-integrations)
compatible storage address for storing recording rules results and alerts state in for of timeseries. This is optional. compatible storage address for storing recording rules results and alerts state in the form of time series.
Then configure `vmalert` accordingly: Then configure `vmalert` accordingly:
``` ```
./bin/vmalert -rule=alert.rules \ ./bin/vmalert -rule=alert.rules \ # Path to the file with rules configuration. Supports wildcard
-datasource.url=http://localhost:8428 \ # PromQL compatible datasource -datasource.url=http://localhost:8428 \ # PromQL compatible datasource
-notifier.url=http://localhost:9093 \ # AlertManager URL -notifier.url=http://localhost:9093 \ # AlertManager URL
-notifier.url=http://127.0.0.1:9093 \ # AlertManager replica URL -notifier.url=http://127.0.0.1:9093 \ # AlertManager replica URL
-remoteWrite.url=http://localhost:8428 \ # remote write compatible storage to persist rules -remoteWrite.url=http://localhost:8428 \ # Remote write compatible storage to persist rules
-remoteRead.url=http://localhost:8428 \ # PromQL compatible datasource to restore alerts state from -remoteRead.url=http://localhost:8428 \ # MetricsQL compatible datasource to restore alerts state from
-external.label=cluster=east-1 \ # External label to be applied for each rule -external.label=cluster=east-1 \ # External label to be applied for each rule
-external.label=replica=a # Multiple external labels may be set -external.label=replica=a # Multiple external labels may be set
``` ```
See the full list of configuration flags in the [configuration](#configuration) section.
If you run multiple `vmalert` services for the same datastore or AlertManager - do not forget If you run multiple `vmalert` services for the same datastore or AlertManager - do not forget
to specify different `external.label` flags in order to define which `vmalert` generated rules or alerts. to specify different `external.label` flags in order to define which `vmalert` generated rules or alerts.
@ -66,7 +69,7 @@ Configuration for [recording](https://prometheus.io/docs/prometheus/latest/confi
and [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) rules is very and [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) rules is very
similar to Prometheus rules and configured using YAML. Configuration examples may be found similar to Prometheus rules and configured using YAML. Configuration examples may be found
in [testdata](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/config/testdata) folder. in [testdata](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/config/testdata) folder.
Every `rule` belongs to `group` and every configuration file may contain arbitrary number of groups: Every `rule` belongs to a `group` and every configuration file may contain arbitrary number of groups:
```yaml ```yaml
groups: groups:
[ - <rule_group> ] [ - <rule_group> ]
@ -74,15 +77,15 @@ groups:
### Groups ### Groups
Each group has following attributes: Each group has the following attributes:
```yaml ```yaml
# The name of the group. Must be unique within a file. # The name of the group. Must be unique within a file.
name: <string> name: <string>
# How often rules in the group are evaluated. # How often rules in the group are evaluated.
[ interval: <duration> | default = global.evaluation_interval ] [ interval: <duration> | default = -evaluationInterval flag ]
# How many rules execute at once. Increasing concurrency may speed # How many rules execute at once within a group. Increasing concurrency may speed
# up round execution speed. # up round execution speed.
[ concurrency: <integer> | default = 1 ] [ concurrency: <integer> | default = 1 ]
@ -102,20 +105,25 @@ rules:
### Rules ### Rules
Every rule contains an `expr` field with a [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/)
or [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) expression. `vmalert` executes the configured
expression and then acts according to the rule type.
There are two types of Rules: There are two types of Rules:
* [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) - * [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) -
Alerting rules allows to define alert conditions via [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) Alerting rules allow defining alert conditions via the `expr` field and sending notifications to
and to send notifications about firing alerts to [Alertmanager](https://github.com/prometheus/alertmanager). [Alertmanager](https://github.com/prometheus/alertmanager) if the execution result is not empty.
* [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/) - * [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/) -
Recording rules allow you to precompute frequently needed or computationally expensive expressions Recording rules allow defining an `expr` whose result is then backfilled to the configured
and save their result as a new set of time series. `-remoteWrite.url`. Recording rules are used to precompute frequently needed or computationally
expensive expressions and save their result as a new set of time series.
`vmalert` forbids to define duplicates - rules with the same combination of name, expression and labels `vmalert` forbids to define duplicates - rules with the same combination of name, expression and labels
within one group. within one group.
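
As a minimal sketch of how the two rule types can sit together in one group, the file below defines one recording rule and one alerting rule. The group name, metric names, threshold and label values are illustrative assumptions and are not taken from the vmalert repository; the `record` field follows the Prometheus-compatible recording rule syntax.

```yaml
groups:
  - name: example-group                  # hypothetical group name
    interval: 30s
    rules:
      # Recording rule: the result of `expr` is saved as a new time series
      # and sent to the address configured via -remoteWrite.url.
      - record: job:http_requests:rate5m # hypothetical series name
        expr: sum(rate(http_requests_total[5m])) by (job)
      # Alerting rule: a notification is sent to Alertmanager when `expr`
      # keeps returning a non-empty result for the `for` duration.
      - alert: HighErrorRate             # hypothetical alert name
        expr: sum(rate(http_requests_total{code=~"5.."}[5m])) by (job) > 10
        for: 5m
        labels:
          severity: warning
```

The two rules differ in name and expression, so they do not count as duplicates within the group.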
#### Alerting rules #### Alerting rules
The syntax for alerting rule is following: The syntax for alerting rule is the following:
```yaml ```yaml
# The name of the alert. Must be a valid metric name. # The name of the alert. Must be a valid metric name.
alert: <string> alert: <string>
@ -125,12 +133,14 @@ alert: <string>
[ type: <string> ] [ type: <string> ]
# The expression to evaluate. The expression language depends on the type value. # The expression to evaluate. The expression language depends on the type value.
# By default MetricsQL expression is used. If type="graphite", then the expression # By default PromQL/MetricsQL expression is used. If type="graphite", then the expression
# must contain valid Graphite expression. # must contain valid Graphite expression.
expr: <string> expr: <string>
# Alerts are considered firing once they have been returned for this long. # Alerts are considered firing once they have been returned for this long.
# Alerts which have not yet fired for long enough are considered pending. # Alerts which have not yet fired for long enough are considered pending.
# If param is omitted or set to 0 then alerts will be immediately considered
# as firing once they return.
[ for: <duration> | default = 0s ] [ for: <duration> | default = 0s ]
# Labels to add or overwrite for each alert. # Labels to add or overwrite for each alert.
@ -168,12 +178,12 @@ labels:
[ <labelname>: <labelvalue> ] [ <labelname>: <labelvalue> ]
``` ```
For recording rules to work `-remoteWrite.url` must specified. For recording rules to work `-remoteWrite.url` must be specified.
### Alerts state on restarts ### Alerts state on restarts
`vmalert` has no local storage, so alerts state is stored in the process memory. Hence, after reloading of `vmalert` `vmalert` has no local storage, so alerts state is stored in the process memory. Hence, after restart of `vmalert`
the process alerts state will be lost. To avoid this situation, `vmalert` should be configured via the following flags: the process alerts state will be lost. To avoid this situation, `vmalert` should be configured via the following flags:
* `-remoteWrite.url` - URL to VictoriaMetrics (Single) or vminsert (Cluster). `vmalert` will persist alerts state * `-remoteWrite.url` - URL to VictoriaMetrics (Single) or vminsert (Cluster). `vmalert` will persist alerts state
into the configured address in the form of time series named `ALERTS` and `ALERTS_FOR_STATE` via remote-write protocol. into the configured address in the form of time series named `ALERTS` and `ALERTS_FOR_STATE` via remote-write protocol.
@ -183,17 +193,27 @@ The state stored to the configured address on every rule evaluation.
from configured address by querying time series with name `ALERTS_FOR_STATE`. from configured address by querying time series with name `ALERTS_FOR_STATE`.
Both flags are required for the proper state restoring. Restore process may fail if time series are missing Both flags are required for the proper state restoring. Restore process may fail if time series are missing
in configured `-remoteRead.url`, weren't updated in the last `1h` or received state doesn't match current `vmalert` in configured `-remoteRead.url`, weren't updated in the last `1h` (controlled by `-remoteRead.lookback`)
rules configuration. or received state doesn't match current `vmalert` rules configuration.
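
A minimal sketch of a single-node setup that both persists and restores alerts state, assuming VictoriaMetrics listens on `localhost:8428` and Alertmanager on `localhost:9093`, as in the startup example earlier in this document. The `1h` value only restates the lookback window mentioned above; it is not a tuning recommendation.

```
./bin/vmalert -rule=alert.rules \
    -datasource.url=http://localhost:8428 \
    -notifier.url=http://localhost:9093 \
    -remoteWrite.url=http://localhost:8428 \
    -remoteRead.url=http://localhost:8428 \
    -remoteRead.lookback=1h
```

With this configuration `vmalert` writes `ALERTS` and `ALERTS_FOR_STATE` series on every evaluation and queries `ALERTS_FOR_STATE` from the same instance on startup.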
### Multitenancy ### Multitenancy
There are the following approaches for alerting and recording rules across [multiple tenants](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy) exist: There are the following approaches for alerting and recording rules across
[multiple tenants](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy):
* To run a separate `vmalert` instance per each tenant. The corresponding tenant must be specified in `-datasource.url` command-line flag according to [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format). For example, `/path/to/vmalert -datasource.url=http://vmselect:8481/select/123/prometheus` would run alerts against `AccountID=123`. For recording rules the `-remoteWrite.url` command-line flag must contain the url for the specific tenant as well. For example, `-remoteWrite.url=http://vminsert:8480/insert/123/prometheus` would write recording rules to `AccountID=123`. * To run a separate `vmalert` instance per each tenant.
The corresponding tenant must be specified in `-datasource.url` command-line flag
according to [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format).
For example, `/path/to/vmalert -datasource.url=http://vmselect:8481/select/123/prometheus`
would run alerts against `AccountID=123`. For recording rules the `-remoteWrite.url` command-line
flag must contain the url for the specific tenant as well.
For example, `-remoteWrite.url=http://vminsert:8480/insert/123/prometheus` would write recording
rules to `AccountID=123`.
* To specify `tenant` parameter per each alerting and recording group if [enterprise version of vmalert](https://victoriametrics.com/enterprise.html) is used with `-clusterMode` command-line flag. For example: * To specify `tenant` parameter per each alerting and recording group if
[enterprise version of vmalert](https://victoriametrics.com/enterprise.html) is used
with `-clusterMode` command-line flag. For example:
```yaml ```yaml
groups: groups:
@ -208,9 +228,13 @@ groups:
# Rules for accountID=456, projectID=789 # Rules for accountID=456, projectID=789
``` ```
If `-clusterMode` is enabled, then `-datasource.url`, `-remoteRead.url` and `-remoteWrite.url` must contain only the hostname without tenant id. For example: `-datasource.url=http://vmselect:8481` . `vmselect` automatically adds the specified tenant to urls per each recording rule in this case. If `-clusterMode` is enabled, then `-datasource.url`, `-remoteRead.url` and `-remoteWrite.url` must
contain only the hostname without tenant id. For example: `-datasource.url=http://vmselect:8481`.
`vmselect` automatically adds the specified tenant to urls per each recording rule in this case.
The enterprise version of vmalert is available in `vmutils-*-enterprise.tar.gz` files at [release page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) and in `*-enterprise` tags at [Docker Hub](https://hub.docker.com/r/victoriametrics/vmalert/tags). The enterprise version of vmalert is available in `vmutils-*-enterprise.tar.gz` files
at [release page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) and in `*-enterprise`
tags at [Docker Hub](https://hub.docker.com/r/victoriametrics/vmalert/tags).
### WEB ### WEB
@ -322,6 +346,9 @@ See full description for these flags in `./vmalert --help`.
## Configuration ## Configuration
Pass `-help` to `vmalert` in order to see the full list of supported
command-line flags with their descriptions.
The shortlist of configuration flags is the following: The shortlist of configuration flags is the following:
``` ```
-datasource.appendTypePrefix -datasource.appendTypePrefix
@ -514,9 +541,6 @@ The shortlist of configuration flags is the following:
Show VictoriaMetrics version Show VictoriaMetrics version
``` ```
Pass `-help` to `vmalert` in order to see the full list of supported
command-line flags with their descriptions.
`vmalert` supports "hot" config reload via the following methods: `vmalert` supports "hot" config reload via the following methods:
* send SIGHUP signal to `vmalert` process; * send SIGHUP signal to `vmalert` process;
* send GET request to `/-/reload` endpoint; * send GET request to `/-/reload` endpoint;