Commit Graph

58 Commits

Author SHA1 Message Date
Haleygo
3c297e0253
vmalert: add keep_firing_for field for alerting rule (#4669)
vmalert: support `keep_firing_for` field for alerting rule

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4529

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2023-07-27 13:00:45 -07:00
Roman Khavronenko
fc623e2b84
Vmalert UI updates (#4276)
* vmalert: expand rule groups on anchor click

before, anchor click was only updating the URL.
To expand the group, user had to click on rule's block.
Now, group will toggle automatically.

* vmalert: allow filtering group in web UI

The new filter allows to filter groups and rules within
groups by: errors only or noMatch only.

The filtering supposed to help navigating big numbers of groups/rules.
Filtering is reflected in URL, so can be shared as a link.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-10 13:05:05 -07:00
Roman Khavronenko
54bfacdfde
vmalert: follow-up after cae87da (#4269)
* vmalert: follow-up after cae87da

cae87da4bb
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: update struct comments

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: rm typo

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-09 22:07:26 -07:00
Haleygo
d1c68888bd
vmalert: support reading rule from http url (#4212)
vmalert: support reading rule's config from HTTP URL
2023-05-09 21:59:21 -07:00
Roman Khavronenko
1c3bf0d0d8
app/vmalert: follow-up after 6c322b4a00 (#4214)
6c322b4a00

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-08 17:20:49 -07:00
Haleygo
4b0db17bec
vmalert: allow configuring custom notifier headers per group (#4088)
vmalert: allow configuring custom notifier headers per group
2023-05-08 17:07:44 -07:00
Zakhar Bessarab
19eaf17e11
app/vmalert: add support of recursive path globs for rules and templates (#4148)
Supports using `**` for `-rule` and `-rule.templates`: `dir/**/*.tpl` loads contents of dir and all subdirectories recursively.

See: #4041

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Artem Navoiev <tenmozes@gmail.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
2023-05-08 16:22:30 -07:00
Roman Khavronenko
6a7de761f4
vmalert: support logs suppressing during config reloads (#3973)
* vmalert: support logs suppressing during config reloads

The change is mostly required for ENT version of vmalert,
since it supports object-storage for config files.
Reading data from object storage could be time-consuming,
so vmalert emits logs to track the progress.

However, these logs are mostly needed on start or on
manual config reload. Printing these logs each time
`rule.configCheckInterval` is triggered would too verbose.
So the change allows to control logs emitting during
config reloads.

Now, logs are emitted during start up or when SIGHUP is receieved.
For periodicall config checks logs emitted by config pkg are suppressed.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: review fixes

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-20 14:25:26 -07:00
Roman Khavronenko
310b380a03
app/vmalert: log number of configration files found for each specified -rule (#3936)
The change also introduces `List` method to `FS` interface.
The `List` method can be used for wildcard support in object storage FS.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
2023-03-11 23:40:40 -08:00
Roman Khavronenko
2eb9ca1889
vmalert: support object storage for rules (#519)
* vmalert: support object storage for rules

Support loading of alerting and recording rules from object
storages `gcs://`, `gs://`, `s3://`.

* review fixes
2023-02-09 19:10:34 -08:00
Roman Khavronenko
5cf2998af8
vmalert: allow configuring the default number of stored rule's update states (#3556)
Allow configuring the default number of stored rule's update states in memory
 via global `-rule.updateEntriesLimit` command-line flag or per-rule via rule's
 `update_entries_limit` configuration param.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-29 10:41:51 -08:00
Aliaksandr Valialkin
450a32970a
lib/envtemplate: allow referring env vars from other env vars via %{ENV_VAR} syntax
This is a follow-up for 02096e06d0
2022-10-26 14:51:02 +03:00
Aliaksandr Valialkin
d0288ea417
all: log error when environment variables referred from -promscrape.config are missing
This should prevent from using incorrect config files
2022-10-18 10:29:59 +03:00
Roman Khavronenko
a2ded58600
vmalert: allow using extra labels in annotations (#3181)
According to Ruler specification, only labels returned within time series
should be available for use in annotations.

For long time, vmalert didn't respect this rule. And in PR
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2403
this was fixed for the sake of compatibility. However, this resulted
into users confusion, as they expected all configured and extra labels
to be available - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3013

This fix allows to use extra labels in Annotations. But in the case of conflicts
the original labels (extracted from time series) are preferred.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-30 07:48:59 +03:00
Roman Khavronenko
74e81d31a7
vmalert: prodvide more details on duplicates (#3136)
Now vmalert will print the following messages on dupliсates:
```
"recording rule \"record\"; expr: \"up == 1\"; labels: summary={{ value|query }}" is a duplicate within the group "test"
"alerting rule \"alert\"; expr: \"up == 1\"; labels: description={{ value|query }}" is a duplicate within the group "test"
```

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3127
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-21 11:11:40 +03:00
Roman Khavronenko
a887c1bc07
vmalert: add debug mode for alerting rules (#3055)
* vmalert: add `debug` mode for alerting rules

Debug information includes alerts state changes and requests
sent to the datasource. Debug can be enabled only on rule's
level. It might be useful for debugging unexpected
behaviour of alerting rule.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3025

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: review fixes

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Update app/vmalert/alerting.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>

* vmalert: go fmt

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-13 16:36:30 +03:00
Aliaksandr Valialkin
06f6de6d47
all: use os.{Read|Write}File instead of ioutil.{Read|Write}File
The ioutil.{Read|Write}File is deprecated since Go1.16 -
see https://tip.golang.org/doc/go1.16#ioutil

VictoriaMetrics needs at least Go1.18, so it is safe to remove ioutil usage
from source code.

This is a follow-up for 02ca2342ab
2022-08-21 23:55:20 +03:00
Aliaksandr Valialkin
5d62f5a324
app/vmalert/config: add missing docs for ValidateTplFn 2022-07-25 09:22:28 +03:00
Roman Khavronenko
970f36de17
vmalert: remove notifier dependency from config (#2906)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-25 09:22:28 +03:00
Roman Khavronenko
01755fac38
vmalert: remove dependency on datasource pkg from config (#2905)
* vmalert: remove dependency on datasource pkg from config

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-22 13:38:25 +03:00
Roman Khavronenko
d0abdc2b5b
vmalert: allow configuring custom headers per group (#2901)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2860

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-21 20:48:05 +03:00
Roman Khavronenko
80de6face8
vmalert: drop support of deprecated extra_filter_labels param (#2870)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-14 16:10:46 +03:00
Roman Khavronenko
3c09d25039
vmalert: followup for 76f05f8670 (#2706)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-06-09 13:15:35 +03:00
Howie
4afd7aa695
feat: rule limit (#2676)
vmalert: support `limit` param in groups definition

`limit` param limits number of time series samples produced by a single rule
during execution.
On reaching the limit rule will return an err.

Signed-off-by: lihaowei <haoweili35@gmail.com>
2022-06-09 13:15:33 +03:00
Roman Khavronenko
2aeb00f98f
vmalert: support scalar type in response (#2610)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2607

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-20 14:08:19 +03:00
Andrii Chubatiuk
7789c47e41
added reusable templates support (#2532)
Signed-off-by: Andrii Chubatiuk <andrew.chubatiuk@gmail.com>
2022-05-20 12:25:11 +03:00
Aliaksandr Valialkin
c50e48a74c
lib/promscrape: follow-up after baa1c24b36 2022-04-16 14:26:38 +03:00
Dmytro Kozlov
2f350c200a
Added resendDelay for alerts (#2296)
* vmalert: add support of `resendDelay` flag for alerts

Co-authored-by: dmitryk-dk <dmitry.kozlov@brightlocal.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2022-03-17 20:09:18 +02:00
hagen1778
11345c3fd1
vmalert: support $externalLabels and $externalURL in templates
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2193
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-02-15 21:13:32 +02:00
Roman Khavronenko
791cad8c2e
lib/promscrape: support prometheus-like duration in scrape configs (#2169)
* lib/promscrape: support prometheus-like duration in scrape configs

The change allows to specify duration values like `1d`, `1w`
for fields `scrape_interval`, `scrape_timeout`, etc.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/817#issuecomment-1033384766
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/blockcache: make linter happy

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/promscrape: support prometheus-like duration in scrape configs

* add support for extra fields `scrape_align_interval` and `scrape_offset`;
* support Prometheus duration parsing for `__scrape_interval__`
and `__scrape_duration__` labels;

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* wip

* wip

* docs/CHANGELOG.md: document the feature

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-11 16:17:51 +02:00
Aliaksandr Valialkin
7d8c481960
app/vmalert/config: sort extra_filter labels before passing them to query args in order to get consistent order of query args across runs
This fixes TestGroupParams test - see https://github.com/VictoriaMetrics/VictoriaMetrics/runs/4432510244?check_suite_focus=true#step:5:288
2021-12-08 13:04:08 +02:00
Roman Khavronenko
582c063698
vmalert: introduce additional HTTP URL params per-group configuration (#1892)
* vmalert: introduce additional HTTP URL params per-group configuration

The new group field `params` allows to configure custom HTTP URL params
per each group. These params will be applied to every request before
executing rule's expression. Hot config reload is also supported.

Field `extra_filter_labels` was deprecated in favour of `params` field.
vmalert will print deprecation log message if config file contains
the deprecated field.

`params` fields are supported by both Prometheus and Graphite datasource types.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: provide more examples for `params` field

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: set higher priority for `params` setting

If there would be a conflict between URL params set in `datasource.url` flag
and params in group definition the latter will have higher priority.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-12-02 14:51:54 +02:00
Aliaksandr Valialkin
31ac507bee
app/vmalert: remove rule.type config, since it doesnt play well with the upcoming default tenants for -clusterMode
It is better from the consistency point of view to set up rule types at group level where tenant config is set up.
2021-11-05 19:52:41 +02:00
Roman Khavronenko
5321127add
vmalert: allow groups with empty rules for compatibility reasons (#1742)
Prometheus allows to have groups with no rules, so we should support
it in vmalert as well for compatibility reasons.
It is also allowed to hot-reload empty groups by adding or removing rules.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-25 14:46:51 +03:00
Roman Khavronenko
1bdccfc2bf
app/vmalert: remove unnecessary omitempty tag for interval param (#1649)
`omitempty` tag resulted into skipping this param on marshaling,
which was used as a checksum for groups configuration. Since on
config reload checksums are compared before applying changes,
any change to `interval` only didn't trigger config reload.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1641
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-09-23 20:47:56 +03:00
Nikolay
210c76b51e adds external_labels per group for vmalert (#1485)
* adds external_label per group for vmalert
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1471
2021-09-01 12:19:49 +03:00
Aliaksandr Valialkin
d98e22fe50 app/vmalert: accept Prometheus-like durations in interval config option inside group section 2021-07-12 12:36:22 +03:00
Roman Khavronenko
5aa7846900 vmalert: support rules backfilling (aka replay) (#1358)
* vmalert: support rules backfilling (aka `replay`)

vmalert can `replay` configured rules in the past
and backfill results via remote write protocol.
It supports MetricsQL/PromQL storage as data source,
and can backfill data to remote write compatible
storage.

Supports recording and alerting rules `replay`. See more
details in README.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/836

* vmalert: review fixes

* vmalert: readme fixes
2021-06-09 12:30:54 +03:00
Roman Khavronenko
beee24ecee vmalert: support extra_filter_labels setting per-group (#1319)
The new setting `extra_filter_labels` may be assigned to group.
If it is, then all rules within a group will automatically filter
for configured labels. The feature is well-described here
https://docs.victoriametrics.com#prometheus-querying-api-enhancements

New setting is compatible only with VM datasource.
2021-05-23 14:15:49 +03:00
Nikolay
23a6c9c016 changes vmalert query function (#1307)
* changes vmalert query function
for prometheus rules compatibility its better to use labels as map.
it simplifies template evaluation and allow to ignore can't evaluate field error
because map will return default value.
fixes https://github.com/VictoriaMetrics/operator/issues/243
2021-05-21 16:38:20 +03:00
Nikolay
b8bc1c2e0f Graphite vmalert wip (#112)
* init implementation for graphite alerts

* adds graphite support for vmalert

* small fix

* changes vmalert graphite api with type

* updates tests

* small fix

* fixes graphite parse

* Fixes graphite from time
2021-02-01 15:28:30 +02:00
Roman Khavronenko
304512b668 vmalert-989: return non-empty result in template func query stub to pass validation (#1002)
On templates validation stage vmalert does not acutally send queries, so for complex
chained expression validation may fail. To avoid this, we add a blank sample in response
so validation can pass successfully. Later, during the rule execution, stub will be replaced
with real `query` function.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/989
2021-01-11 12:59:33 +02:00
Roman Khavronenko
9ce8b36d2a vmalert-974: fix order for labels templating (#975)
The change fixes bug caused by 3adf8c5a6f.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/974
2020-12-19 14:21:27 +02:00
Roman Khavronenko
9f578e389c vmalert: add function "query", "first" and "value" to alert templates functions (#960)
The commit adds a support for template function `query`,
`first` and `value`. The function `query` executes
a MetricsQL query for active alerts. In vmalert we
update templates on every evaluation for active alerts
to keep them up to date. With `query` func it may become
a perf issue since it will fire a query on every execution.
We should keep it in mind for now.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/539
2020-12-14 20:12:16 +02:00
Nikolay
e4e33cb757 fixes checksum calculation (#928)
* fixes checksum calculation,
'for' rule param wasnt marshal properly during checksum calculation

* fixes error
2020-11-29 09:50:57 +02:00
kreedom
4526cf92d3 vmalert - add dryRun (#842)
vmalert: add `dryRun` flag for rules validation without running the service
2020-10-20 10:49:22 +03:00
Aliaksandr Valialkin
4b1c401790 app/vmalert: accept days, weeks and years in for: part of config like Prometheus does
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/817
2020-10-08 20:13:20 +03:00
Aliaksandr Valialkin
543f3aea97 all: consistently use "%w" formatting in fmt.Errorf for wrapped errors 2020-09-23 22:48:21 +03:00
Roman Khavronenko
16e0bb496e vmalert: update groups on config reload only if changes detected (#759)
On config reload event `vmalert` reloads configuration for every group. While
it works for simple configurations, the more complex and heavy installations may
suffer from frequent config reloads.
The change introduces the `checksum` field for every group and is set to md5 hash
of yaml configuration. The checksum will change if on any change to group
definition like rules order or annotation change. Comparing the `checksum` field
on config reload event helps to detect if group should be updated.
The groups update is now done concurrently, so reload duration will be limited by
the slowest group now.

Partially solves #691 by improving config reload speed.
2020-09-11 23:41:12 +03:00
Nikolay Khramchikhin
80a9dc79fe changed vmalert behaviour (#738)
* VMAlert start with empty rules dir

There are some applications (operator for instance), that generates alerts configuration at runtime
and vmalert must start correctly without rules to support this behaviour.
Later application will add rules files and send SIGHUP to vmalert,
which will trigger reading rules files and start rules exectuion.

Removing rules files with SIGHUP signal must stop rules execution and
vmalert will wait for new rules.

* imports sorted

* added test cases for empty rules, removed blank line

* fixed imports conflict

* updated tests
2020-09-03 11:07:40 +03:00