mirror of
https://github.com/VictoriaMetrics/VictoriaMetrics.git
synced 2025-01-05 22:32:20 +01:00
docs/vmalert.md: follow-up after 6d5a8c28cd
parent
c5f493db8e
commit
52efd5a05c
@ -4,9 +4,10 @@ sort: 4
|
|||||||
|
|
||||||
# vmalert
|
# vmalert
|
||||||
|
|
||||||
`vmalert` executes a list of given [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/)
|
`vmalert` executes a list of the given [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/)
|
||||||
or [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/)
|
or [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/)
|
||||||
rules against configured address.
|
rules against configured address. It is heavily inspired by [Prometheus](https://prometheus.io/docs/alerting/latest/overview/)
|
||||||
|
implementation and aims to be compatible with its syntax.
|
||||||
|
|
||||||
## Features
|
## Features
|
||||||
* Integration with [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics) TSDB;
|
* Integration with [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics) TSDB;
|
||||||
@ -44,21 +45,23 @@ To start using `vmalert` you will need the following things:
|
|||||||
* datasource address - reachable VictoriaMetrics instance for rules execution;
|
* datasource address - reachable VictoriaMetrics instance for rules execution;
|
||||||
* notifier address - reachable [Alert Manager](https://github.com/prometheus/alertmanager) instance for processing,
|
* notifier address - reachable [Alert Manager](https://github.com/prometheus/alertmanager) instance for processing,
|
||||||
aggregating alerts and sending notifications.
|
aggregating alerts and sending notifications.
|
||||||
* remote write address - [remote write](https://prometheus.io/docs/prometheus/latest/storage/#remote-storage-integrations)
|
* remote write address [optional] - [remote write](https://prometheus.io/docs/prometheus/latest/storage/#remote-storage-integrations)
|
||||||
compatible storage address for storing recording rules results and alerts state in for of timeseries. This is optional.
|
compatible storage address for storing recording rules results and alerts state in the form of time series.
|
||||||
|
|
||||||
Then configure `vmalert` accordingly:
|
Then configure `vmalert` accordingly:
|
||||||
```
|
```
|
||||||
./bin/vmalert -rule=alert.rules \
|
./bin/vmalert -rule=alert.rules \ # Path to the file with rules configuration. Supports wildcard
|
||||||
-datasource.url=http://localhost:8428 \ # PromQL compatible datasource
|
-datasource.url=http://localhost:8428 \ # PromQL compatible datasource
|
||||||
-notifier.url=http://localhost:9093 \ # AlertManager URL
|
-notifier.url=http://localhost:9093 \ # AlertManager URL
|
||||||
-notifier.url=http://127.0.0.1:9093 \ # AlertManager replica URL
|
-notifier.url=http://127.0.0.1:9093 \ # AlertManager replica URL
|
||||||
-remoteWrite.url=http://localhost:8428 \ # remote write compatible storage to persist rules
|
-remoteWrite.url=http://localhost:8428 \ # Remote write compatible storage to persist rules
|
||||||
-remoteRead.url=http://localhost:8428 \ # PromQL compatible datasource to restore alerts state from
|
-remoteRead.url=http://localhost:8428 \ # MetricsQL compatible datasource to restore alerts state from
|
||||||
-external.label=cluster=east-1 \ # External label to be applied for each rule
|
-external.label=cluster=east-1 \ # External label to be applied for each rule
|
||||||
-external.label=replica=a # Multiple external labels may be set
|
-external.label=replica=a # Multiple external labels may be set
|
||||||
```
|
```
|
||||||
|
|
||||||
|
See the full list of configuration flags in the [configuration](#configuration) section.
|
||||||
|
|
||||||
If you run multiple `vmalert` services for the same datastore or AlertManager - do not forget
|
If you run multiple `vmalert` services for the same datastore or AlertManager - do not forget
|
||||||
to specify different `external.label` flags in order to define which `vmalert` generated rules or alerts.
|
to specify different `external.label` flags in order to define which `vmalert` generated rules or alerts.
|
||||||
|
|
||||||
@ -66,7 +69,7 @@ Configuration for [recording](https://prometheus.io/docs/prometheus/latest/confi
|
|||||||
and [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) rules is very
|
and [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) rules is very
|
||||||
similar to Prometheus rules and configured using YAML. Configuration examples may be found
|
similar to Prometheus rules and configured using YAML. Configuration examples may be found
|
||||||
in [testdata](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/config/testdata) folder.
|
in [testdata](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/config/testdata) folder.
|
||||||
Every `rule` belongs to `group` and every configuration file may contain arbitrary number of groups:
|
Every `rule` belongs to a `group` and every configuration file may contain an arbitrary number of groups:
|
||||||
```yaml
|
```yaml
|
||||||
groups:
|
groups:
|
||||||
[ - <rule_group> ]
|
[ - <rule_group> ]
|
||||||
@ -74,15 +77,15 @@ groups:
|
|||||||
|
|
||||||
### Groups
|
### Groups
|
||||||
|
|
||||||
Each group has following attributes:
|
Each group has the following attributes:
|
||||||
```yaml
|
```yaml
|
||||||
# The name of the group. Must be unique within a file.
|
# The name of the group. Must be unique within a file.
|
||||||
name: <string>
|
name: <string>
|
||||||
|
|
||||||
# How often rules in the group are evaluated.
|
# How often rules in the group are evaluated.
|
||||||
[ interval: <duration> | default = global.evaluation_interval ]
|
[ interval: <duration> | default = -evaluationInterval flag ]
|
||||||
|
|
||||||
# How many rules execute at once. Increasing concurrency may speed
|
# How many rules execute at once within a group. Increasing concurrency may speed
|
||||||
# up round execution speed.
|
# up round execution speed.
|
||||||
[ concurrency: <integer> | default = 1 ]
|
[ concurrency: <integer> | default = 1 ]
|
||||||
|
|
||||||
@ -102,20 +105,25 @@ rules:
|
|||||||
|
|
||||||
### Rules
|
### Rules
|
||||||
|
|
||||||
|
Every rule contains an `expr` field for a [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/)
|
||||||
|
or [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) expression. `vmalert` will execute the configured
|
||||||
|
expression and then act according to the Rule type.
|
||||||
|
|
||||||
There are two types of Rules:
|
There are two types of Rules:
|
||||||
* [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) -
|
* [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) -
|
||||||
Alerting rules allows to define alert conditions via [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html)
|
Alerting rules allow defining alert conditions via the `expr` field and sending notifications to
|
||||||
and to send notifications about firing alerts to [Alertmanager](https://github.com/prometheus/alertmanager).
|
[Alertmanager](https://github.com/prometheus/alertmanager) if the execution result is not empty.
|
||||||
* [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/) -
|
* [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/) -
|
||||||
Recording rules allow you to precompute frequently needed or computationally expensive expressions
|
Recording rules allow defining an `expr` whose result will then be backfilled to the configured
|
||||||
and save their result as a new set of time series.
|
`-remoteWrite.url`. Recording rules are used to precompute frequently needed or computationally
|
||||||
|
expensive expressions and save their result as a new set of time series.
|
||||||
|
|
||||||
`vmalert` forbids to define duplicates - rules with the same combination of name, expression and labels
|
`vmalert` forbids to define duplicates - rules with the same combination of name, expression and labels
|
||||||
within one group.
|
within one group.
|
||||||
|
|
||||||
#### Alerting rules
|
#### Alerting rules
|
||||||
|
|
||||||
The syntax for alerting rule is following:
|
The syntax for alerting rule is the following:
|
||||||
```yaml
|
```yaml
|
||||||
# The name of the alert. Must be a valid metric name.
|
# The name of the alert. Must be a valid metric name.
|
||||||
alert: <string>
|
alert: <string>
|
||||||
@ -125,12 +133,14 @@ alert: <string>
|
|||||||
[ type: <string> ]
|
[ type: <string> ]
|
||||||
|
|
||||||
# The expression to evaluate. The expression language depends on the type value.
|
# The expression to evaluate. The expression language depends on the type value.
|
||||||
# By default MetricsQL expression is used. If type="graphite", then the expression
|
# By default PromQL/MetricsQL expression is used. If type="graphite", then the expression
|
||||||
# must contain valid Graphite expression.
|
# must contain valid Graphite expression.
|
||||||
expr: <string>
|
expr: <string>
|
||||||
|
|
||||||
# Alerts are considered firing once they have been returned for this long.
|
# Alerts are considered firing once they have been returned for this long.
|
||||||
# Alerts which have not yet fired for long enough are considered pending.
|
# Alerts which have not yet fired for long enough are considered pending.
|
||||||
|
# If param is omitted or set to 0 then alerts will be immediately considered
|
||||||
|
# as firing once they return.
|
||||||
[ for: <duration> | default = 0s ]
|
[ for: <duration> | default = 0s ]
|
||||||
|
|
||||||
# Labels to add or overwrite for each alert.
|
# Labels to add or overwrite for each alert.
|
||||||
@ -168,12 +178,12 @@ labels:
|
|||||||
[ <labelname>: <labelvalue> ]
|
[ <labelname>: <labelvalue> ]
|
||||||
```
|
```
|
||||||
|
|
||||||
For recording rules to work `-remoteWrite.url` must specified.
|
For recording rules to work `-remoteWrite.url` must be specified.
|
||||||
|
|
||||||
|
|
||||||
### Alerts state on restarts
|
### Alerts state on restarts
|
||||||
|
|
||||||
`vmalert` has no local storage, so alerts state is stored in the process memory. Hence, after reloading of `vmalert`
|
`vmalert` has no local storage, so alerts state is stored in the process memory. Hence, after restart of `vmalert`
|
||||||
the process alerts state will be lost. To avoid this situation, `vmalert` should be configured via the following flags:
|
the process alerts state will be lost. To avoid this situation, `vmalert` should be configured via the following flags:
|
||||||
* `-remoteWrite.url` - URL to VictoriaMetrics (Single) or vminsert (Cluster). `vmalert` will persist alerts state
|
* `-remoteWrite.url` - URL to VictoriaMetrics (Single) or vminsert (Cluster). `vmalert` will persist alerts state
|
||||||
into the configured address in the form of time series named `ALERTS` and `ALERTS_FOR_STATE` via remote-write protocol.
|
into the configured address in the form of time series named `ALERTS` and `ALERTS_FOR_STATE` via remote-write protocol.
|
||||||
@ -183,17 +193,27 @@ The state stored to the configured address on every rule evaluation.
|
|||||||
from configured address by querying time series with name `ALERTS_FOR_STATE`.
|
from configured address by querying time series with name `ALERTS_FOR_STATE`.
|
||||||
|
|
||||||
Both flags are required for the proper state restoring. Restore process may fail if time series are missing
|
Both flags are required for the proper state restoring. Restore process may fail if time series are missing
|
||||||
in configured `-remoteRead.url`, weren't updated in the last `1h` or received state doesn't match current `vmalert`
|
in configured `-remoteRead.url`, weren't updated in the last `1h` (controlled by `-remoteRead.lookback`)
|
||||||
rules configuration.
|
or received state doesn't match current `vmalert` rules configuration.
|
||||||
|
|
||||||
|
|
||||||
### Multitenancy
|
### Multitenancy
|
||||||
|
|
||||||
There are the following approaches for alerting and recording rules across [multiple tenants](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy) exist:
|
There are the following approaches for alerting and recording rules across
|
||||||
|
[multiple tenants](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy):
|
||||||
|
|
||||||
* To run a separate `vmalert` instance per each tenant. The corresponding tenant must be specified in `-datasource.url` command-line flag according to [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format). For example, `/path/to/vmalert -datasource.url=http://vmselect:8481/select/123/prometheus` would run alerts against `AccountID=123`. For recording rules the `-remoteWrite.url` command-line flag must contain the url for the specific tenant as well. For example, `-remoteWrite.url=http://vminsert:8480/insert/123/prometheus` would write recording rules to `AccountID=123`.
|
* To run a separate `vmalert` instance per each tenant.
|
||||||
|
The corresponding tenant must be specified in `-datasource.url` command-line flag
|
||||||
|
according to [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format).
|
||||||
|
For example, `/path/to/vmalert -datasource.url=http://vmselect:8481/select/123/prometheus`
|
||||||
|
would run alerts against `AccountID=123`. For recording rules the `-remoteWrite.url` command-line
|
||||||
|
flag must contain the url for the specific tenant as well.
|
||||||
|
For example, `-remoteWrite.url=http://vminsert:8480/insert/123/prometheus` would write recording
|
||||||
|
rules to `AccountID=123`.
|
||||||
|
|
||||||
* To specify `tenant` parameter per each alerting and recording group if [enterprise version of vmalert](https://victoriametrics.com/enterprise.html) is used with `-clusterMode` command-line flag. For example:
|
* To specify `tenant` parameter per each alerting and recording group if
|
||||||
|
[enterprise version of vmalert](https://victoriametrics.com/enterprise.html) is used
|
||||||
|
with `-clusterMode` command-line flag. For example:
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
groups:
|
groups:
|
||||||
@ -208,9 +228,13 @@ groups:
|
|||||||
# Rules for accountID=456, projectID=789
|
# Rules for accountID=456, projectID=789
|
||||||
```
|
```
|
||||||
|
|
||||||
If `-clusterMode` is enabled, then `-datasource.url`, `-remoteRead.url` and `-remoteWrite.url` must contain only the hostname without tenant id. For example: `-datasource.url=http://vmselect:8481` . `vmselect` automatically adds the specified tenant to urls per each recording rule in this case.
|
If `-clusterMode` is enabled, then `-datasource.url`, `-remoteRead.url` and `-remoteWrite.url` must
|
||||||
|
contain only the hostname without tenant id. For example: `-datasource.url=http://vmselect:8481`.
|
||||||
|
`vmalert` automatically adds the specified tenant to urls per each recording rule in this case.
|
||||||
|
|
||||||
The enterprise version of vmalert is available in `vmutils-*-enterprise.tar.gz` files at [release page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) and in `*-enterprise` tags at [Docker Hub](https://hub.docker.com/r/victoriametrics/vmalert/tags).
|
The enterprise version of vmalert is available in `vmutils-*-enterprise.tar.gz` files
|
||||||
|
at [release page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) and in `*-enterprise`
|
||||||
|
tags at [Docker Hub](https://hub.docker.com/r/victoriametrics/vmalert/tags).
|
||||||
|
|
||||||
|
|
||||||
### WEB
|
### WEB
|
||||||
@ -322,6 +346,9 @@ See full description for these flags in `./vmalert --help`.
|
|||||||
|
|
||||||
## Configuration
|
## Configuration
|
||||||
|
|
||||||
|
Pass `-help` to `vmalert` in order to see the full list of supported
|
||||||
|
command-line flags with their descriptions.
|
||||||
|
|
||||||
The shortlist of configuration flags is the following:
|
The shortlist of configuration flags is the following:
|
||||||
```
|
```
|
||||||
-datasource.appendTypePrefix
|
-datasource.appendTypePrefix
|
||||||
@ -514,9 +541,6 @@ The shortlist of configuration flags is the following:
|
|||||||
Show VictoriaMetrics version
|
Show VictoriaMetrics version
|
||||||
```
|
```
|
||||||
|
|
||||||
Pass `-help` to `vmalert` in order to see the full list of supported
|
|
||||||
command-line flags with their descriptions.
|
|
||||||
|
|
||||||
`vmalert` supports "hot" config reload via the following methods:
|
`vmalert` supports "hot" config reload via the following methods:
|
||||||
* send SIGHUP signal to `vmalert` process;
|
* send SIGHUP signal to `vmalert` process;
|
||||||
* send GET request to `/-/reload` endpoint;
|
* send GET request to `/-/reload` endpoint;
|
||||||
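For orientation, the group and rule attributes touched by this diff can be combined into a single rules file. The following is a minimal sketch, not taken from the repository: the group name, alert name, metric names, thresholds and label values are all hypothetical, and only the `name`, `interval`, `concurrency`, `rules`, `alert`, `record`, `expr`, `for`, `labels` and `annotations` fields described in the documentation are used.

```yaml
groups:
  # Hypothetical group; the name must be unique within the file.
  - name: example-group
    # Overrides the -evaluationInterval flag for this group only.
    interval: 30s
    # Evaluate up to two rules of this group at once.
    concurrency: 2
    rules:
      # Alerting rule: a notification is sent if expr returns a non-empty result.
      # The metric name and threshold are assumptions made for illustration.
      - alert: HighErrorRate
        expr: sum(rate(http_requests_total{status=~"5.."}[5m])) by (job) > 10
        # The alert stays pending for 5m before it is considered firing.
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High 5xx rate for job {{ $labels.job }}"
      # Recording rule: the result is stored as a new time series,
      # which requires -remoteWrite.url to be set.
      - record: job:http_requests:rate5m
        expr: sum(rate(http_requests_total[5m])) by (job)
```

Such a file would be passed to `vmalert` via the `-rule` flag shown in the quick-start command. The two rules do not count as duplicates, since they differ in name and expression.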