Commit Graph

86 Commits

Author SHA1 Message Date
Haleygo
ae0e4a8c90
vmalert: add keep_firing_for field for alerting rule (#4669)
vmalert: support `keep_firing_for` field for alerting rule

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4529

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2023-07-27 15:13:13 +02:00
Roman Khavronenko
25317b4e70
vmalert: follow-up after d4ac4b7813 (#4659)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-07-18 15:53:37 +02:00
venkatbvc
d4ac4b7813
vmalert: allow to blackhole alerting notifications (#4639)
vmalert: support option to blackhole alerting notifications

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4122

---------

Co-authored-by: Rao, B V Chalapathi <b_v_chalapathi.rao@nokia.com>
2023-07-18 15:06:19 +02:00
Alexander Marshalov
2e494e2375
fixed typos in documentation and commandline flags descriptions (#4275) 2023-05-10 09:50:41 +02:00
Roman Khavronenko
29e059e49c
app/vmalert: follow-up after 6c322b4a00 (#4214)
6c322b4a00

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-04-27 13:02:21 +02:00
Haleygo
6c322b4a00
vmalert: allow configuring custom notifier headers per group (#4088)
vmalert: allow configuring custom notifier headers per group
2023-04-27 12:17:26 +02:00
Roman Khavronenko
4a49577028
vmalert: use missingkey=zero for templating (#4040)
Replace empty labels with "" instead of "<no value>"
during templating, as Prometheus does.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4012

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-30 16:57:00 +04:00
Roman Khavronenko
8fdd613f25
Vmalert tests (#3975)
* vmalert: add tests for notifier pkg

* vmalert: add tests for remotewrite pkg

* vmalert: add tests for template functions

* vmalert: add tests for web pages

* vmalert: fix int overflow in tests

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-17 15:57:24 +01:00
my-git9
9dec3c8f80
chore: Use http constants to replace numbers (#3846)
Signed-off-by: xin.li <xin.li@daocloud.io>
2023-02-22 18:53:05 -08:00
Oleksandr Redko
9fff48c3e3
app,lib: fix typos in comments (#3804) 2023-02-13 13:27:13 +01:00
Aliaksandr Valialkin
f2b40dbe9a
app/vmalert: use consistent randomizer in tests
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3683
2023-01-23 19:25:10 -08:00
Roman Khavronenko
4b3d8eb573
vmalert: mention specifics of Alertmanager HA mode (#3573)
Stress the importance of specifying of all Alertmanager
URLs in vmalert's `-notifier.url` or `notifier.config`
if it runs in cluster mode.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3547

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-30 09:54:03 +01:00
Roman Khavronenko
8286f9608f
vmalert: support $for or .For template variables (#3474)
support `$for` or `.For` template variables  in alert's annotations.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3246

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-12 22:16:10 +03:00
Aliaksandr Valialkin
a8b8e23d68
lib/promscrape: implement target-level and metric-level relabel debugging
Target-level debugging is performed by clicking the 'debug' link at the corresponding target
on either http://vmagent:8429/targets page or on http://vmagent:8428/service-discovery page.

Metric-level debugging is perfromed at http://vmagent:8429/metric-relabel-debug page.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3407

See https://docs.victoriametrics.com/vmagent.html#relabel-debug
2022-12-10 02:09:44 -08:00
Aliaksandr Valialkin
f325410c26
lib/promscrape: optimize service discovery speed
- Return meta-labels for the discovered targets via promutils.Labels
  instead of map[string]string. This improves the speed of generating
  meta-labels for discovered targets by up to 5x.

- Remove memory allocations in hot paths during ScrapeWork generation.
  The ScrapeWork contains scrape settings for a single discovered target.
  This improves the service discovery speed by up to 2x.
2022-11-29 21:26:00 -08:00
Aliaksandr Valialkin
a018b1d75e
app/vmalert/templates: properly escape all the special chars in quotesEscape function
Previously the `quotesEscape` function was escaping only double quotes.
This wasn't enough, since the input string could contain other special chars,
which must be escaped when put inside JSON string. For example, carriage return and line feed chars (\n\r),
backslash char, etc. This led to the following issues, which were improperly fixed:

- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/890 - this issue
  was "fixed" by introducing the `crlfEscape` function, which led to unnecessary
  complications in user templates, while not fixing various corner cases
  such as backslash chars in the input string.
  See 1de15ad490

- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3139 - this issue
  was "fixed" by urlencoding the whole string passed to -external.alert.source
  command-line flag. This led to invalid urls, which couldn't be parsed by Grafana.
  See 00c838353d
  and 4bd0244599

This commit properly encodes the input string passed to `quotesEscape`, so it can be safely embedded inside JSON strings.

This commit deprecates crlfEscape template function and adds the following new template functions:

- strvalue and stripDomain - these functions are supported by Prometheus, so they were added
  for compatibility purposes.
- jsonEscape and htmlEscape for converting the input string to valid quoted JSON string
  and for html-escaping the input string, so it could be safely embedded as a plaintext
  into html.

This commit also documents all supported template functions at https://docs.victoriametrics.com/vmalert.html#template-functions
The deprecated crlfEscape function isn't documented on purpose, since its usefulness is negative in general case.
2022-10-28 00:01:16 +03:00
Roman Khavronenko
a4975ace86
vmalert: revert unexpected fileds rename during refactoring (#3222)
Due to auto-refactoring, the filed `state` was automatically
renamed to `ruleState` when the entity with the same name
was renamed in other file. Reverting the change.

https://github.com/VictoriaMetrics/helm-charts/issues/391
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-10-11 12:37:47 +02:00
Howie
e384d88abf
fix issue#3053 (#3182)
vmalert: prevent duplicating label `alertname` for notifications

The issue has no impact on alerting procedure. But still needs to be fixed
for clarity. 

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3053

Signed-off-by: lihaowei <haoweili35@gmail.com>
2022-10-10 09:44:58 +02:00
Aliaksandr Valialkin
50f5eae0e0
lib/promrelabel: remove unconditional sorting of the labels in ParsedConfigs.Apply(), since the sorting isnt needed in many places
Sort labels explicitly after calling the ParsedConfigs.Apply() when needed.

This reduces CPU usage when performing metric-level relabeling, where labels' sorting isn't needed.
2022-10-09 14:51:16 +03:00
Aliaksandr Valialkin
c1fa9828b3
lib/flagutil: rename Array to ArrayString
This makes the ArrayString more consistent with other Array* types.

While at it, add ArrayBytes type, which will be used for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3071
2022-10-01 18:26:36 +03:00
Roman Khavronenko
740bb2cc00
vmalert: support auth configs per static_target (#3188)
Allow configuring authorization params per list of targets
in vmalert's notifier config for `static_configs`.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2690

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-30 17:10:17 +02:00
laixintao
88425bb285
vmalert: add $activeAt into template variables. (#3000)
vmalert: add `$activeAt` template variable for annotations

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2999
2022-08-22 13:32:36 +02:00
Aliaksandr Valialkin
343241680b
all: remove the remaining bits of io/ioutil
The io/ioutil package is deprecated since Go1.16 - see https://tip.golang.org/doc/go1.16#ioutil

VictoriaMetrics requires at least Go1.18, so it is time to remove the io/ioutil from source code

This is a follow-up for 02ca2342ab
2022-08-22 00:20:58 +03:00
Aliaksandr Valialkin
1f89278d88
all: subsitute ioutil.ReadAll with io.ReadAll
ioutil.ReadAll is deprecated since Go1.16 - see https://tip.golang.org/doc/go1.16#ioutil
VictoriaMetrics requires at least Go1.18, so it is OK to switch from ioutil.ReadAll to io.ReadAll.

This is a follow-up for 02ca2342ab
2022-08-22 00:16:37 +03:00
Aliaksandr Valialkin
9f94c295ab
all: use os.{Read|Write}File instead of ioutil.{Read|Write}File
The ioutil.{Read|Write}File is deprecated since Go1.16 -
see https://tip.golang.org/doc/go1.16#ioutil

VictoriaMetrics needs at least Go1.18, so it is safe to remove ioutil usage
from source code.

This is a follow-up for 02ca2342ab
2022-08-21 23:52:35 +03:00
Roman Khavronenko
f1b2273d13
vmalert: support $alertID and $groupID in template variables (#2983)
Support of these two variables allows building custom URLs with
alert's ID and group ID params.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/517#issuecomment-1207141432

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-16 08:08:27 +02:00
Aliaksandr Valialkin
5c8eee26bf
all: make fmt via the upcoming Go1.19 2022-07-11 19:22:15 +03:00
Aliaksandr Valialkin
1c4f67c5d2
lib/promauth: add ability to send additional http headers in requests to scrape targets
This solves https://stackoverflow.com/questions/66032498/prometheus-scrape-metric-with-custom-header
2022-06-22 20:39:43 +03:00
Roman Khavronenko
ef7f52e0e6
Vmalert notifiers (#2744)
* vmalert: remove head of line blocking for sending alerts

This change makes sending alerts to notifiers concurrent instead
of sequential. This eliminates head of line blocking, where first
faulty notifier address prevents the rest of notifiers from
receiving notifications.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: make default timeout for sending alerts 10s

Previous value of 1m was too high and was inconsistent
with default timeout defined for notifiers via
configuration file.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: linter checks fix

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-06-18 09:11:37 +02:00
Howie
d12614c0a0
chore: remove duplicated code (#2657)
Signed-off-by: lihaowei <haoweili35@gmail.com>
2022-05-30 08:17:40 +02:00
spectvtor
9e343faa41
fix alert relabeling (#2633) 2022-05-25 09:36:04 +02:00
Andrii Chubatiuk
a531a96193
added reusable templates support (#2532)
Signed-off-by: Andrii Chubatiuk <andrew.chubatiuk@gmail.com>
2022-05-14 11:38:44 +02:00
Roman Khavronenko
331a5d9a17
Code check (#2558)
* vmstorage: make gofmt happy

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: make linter happy

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-09 10:11:56 +02:00
Aliaksandr Valialkin
381e2de59c
app/vmalert: run make quicktemplate-gen from the root directory after the commit f6dcfbcdd6 2022-05-04 20:27:36 +03:00
Dmytro Kozlov
f6dcfbcdd6
vmalert/tpl: fixed truncating alerts expression in table (#2494)
vmalert: improve `/groups` UI visual 

The change also fixes truncated rules expressions in UI
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2484
2022-05-04 18:02:18 +02:00
Aliaksandr Valialkin
58390192c1
app/vmalert: run make quicktemplate-gen from the repository root
This is a follow-up after b2294d1cf1
2022-05-02 15:17:03 +03:00
Roman Khavronenko
3616337812
vmalert: do not execute templates during validation (#2528)
Function `ValidateTemplates`, used on the vmalert startup,
is supposed to check whether used templates and functions
in loaded rules are correct. The function was parsing
and executing loaded templates.
However, rules may contain functions which can't be executed
without values (label values or query results), like `slice`.
Because of this, validation for completely valid expression
`{{ slice $labels.job 9 }}` will fail since `$labels.job`
is empty during validation.

This PR updates `ValidateTemplates` function to only parse
templates without executing them.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2514
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-02 10:16:16 +02:00
Dmytro Kozlov
b2294d1cf1
vmctl/vm: added datapoints collection bar (#2486)
add progress bars to the VM importer

The new progress bars supposed to display the processing speed per each
VM importer worker. This info should help to identify if there is a bottleneck
on the VM side during the import process, without waiting for its finish.
The new progress bars can be disabled by passing `vm-disable-progress-bar` flag.

Plotting multiple progress bars requires using experimental progress bar pool
from github.com/cheggaaa/pb/v3. Switch to progress bar pool required changes
in all import modes.

The openTSDB mode wasn't changed due to its implementation, which implies individual progress
bars per each series. Because of this, using the pool wasn't possible.

Signed-off-by: dmitryk-dk <kozlovdmitriyy@gmail.com>

Co-authored-by: hagen1778 <roman@victoriametrics.com>
2022-05-02 09:06:34 +02:00
Aliaksandr Valialkin
ebaa1c7ad5
lib/promscrape: follow-up after baa1c24b36 2022-04-16 14:25:54 +03:00
Roman Khavronenko
45fcaa33e8
vmalert: add DNS service discovery (#2465)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2460
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-04-13 11:50:26 +03:00
hagen1778
ed364a42e3 vmalert: support relabeling for alert labels sent via notifier
Before, relabeling for notifier configured via file was supported
only for target labels discovered via SD.
With this change, new config field `alert_relabel_configs` is introduced
for applying relabeling to labels of sent alerts.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-04-11 11:09:14 +03:00
Roman Khavronenko
2b59fff526
vmalert: fix labels and annotations processing for alerts (#2403)
To improve compatibility with Prometheus alerting the order of
templates processing has changed.
Before, vmalert did all labels processing beforehand. It meant
all extra labels (such as `alertname`, `alertgroup` or rule labels)
were available in templating. All collisions were resolved in favour
of extra labels.
In Prometheus, only labels from the received metric are available in
templating, so no collisions are possible.
This change makes vmalert's behaviour similar to Prometheus.

For example, consider alerting rule which is triggered by time series
with `alertname` label. In vmalert, this label would be overriden
by alerting rule's name everywhere: for alert labels, for annotations, etc.
In Prometheus, it would be overriden for alert's labels only, but in annotations
the original label value would be available.

See more details here https://github.com/prometheus/compliance/issues/80

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-04-06 20:24:45 +02:00
Roman Khavronenko
0989649ad0
Vmalert compliance 2 (#2340)
* vmalert: split alert's `Start` field into `ActiveAt` and `Start`

The `ActiveAt` field identifies when alert becomes active for rules
with `for > 0`. Previously, this value was stored in field `Start`.

The field `Start` now identifies the moment alert became `FIRING`.

The split is needed in order to distinguish these two moments
in the API responses for alerts.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: support specific moment of time for rules evaluation

The Querier interface was extended to accept a new argument
used as a timestamp at which evaluation should be made.

It is needed to align rules execution time within the group.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: mark disappeared series as stale

Series generated by alerting rules, which were sent to remote write
now will be marked as stale if they will disappear on the next
evaluation. This would make ALERTS and ALERTS_FOR_TIME series
more precise.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* wip

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: evaluate rules at fixed timestamp

Before, time at which rules were evaluated was calculated
right before rule execution. The change makes sure
that timestamp is calculated only once per evalution round
and all rules are using the same timestamp.

It also updates the logic of resending of already resolved
alert notification.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: allow overridin `alertname` label value if it is present in response

Previously, `alertname` was always equal to the Alerting Rule name. Now,
its value can be overriden if series in response containt the different value
for this label.

The change is needed for improving compatibility with Prometheus.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: align rules evaluation in time

Now, evaluation timestamp for rules evaluates as if
there was no delay in rules evaluation. It means, that
rules will be evaluated at fixed timestamps+group_interval.
This way provides more consistent evaluation results and
improves compatibility with Prometheus,

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: add metric for missed iterations

New metric `vmalert_iteration_missed_total` will show
whether rules evaluation round was missed.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: reduce delay before the initial rule evaluation in group

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: rollback alertname override

According to the spec:
```
The alert name from the alerting rule (HighRequestLatency from the example above) MUST be added to the labels of the alert with the label name as alertname. It MUST override any existing alertname label.
```

https://github.com/prometheus/compliance/blob/main/alert_generator/specification.md#step-3
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: throw err immediately on dedup detection

```
The execution of an alerting rule MUST error out immediately and MUST NOT send any alerts
or add samples to samples receiver if there is more than one alert with the same labels
```

https://github.com/prometheus/compliance/blob/main/alert_generator/specification.md#step-4
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: cleanup

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: use strings builder to reduce allocs

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-03-29 15:09:07 +02:00
Dmytro Kozlov
11ae1ae924
Added resendDelay for alerts (#2296)
* vmalert: add support of `resendDelay` flag for alerts

Co-authored-by: dmitryk-dk <dmitry.kozlov@brightlocal.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2022-03-16 15:26:33 +00:00
Roman Khavronenko
fb6eab03a2
Vmalert compliance improvements (#2320)
* vmalert: add support for `sortByLabel` template function

* vmalert: update API according to Prometheus conformance program

The changes to the API, field names and URL path has been made
according to the Prometheus specification for `alert_generator`
https://github.com/prometheus/compliance/blob/main/alert_generator/specification.md

* vmalert: fix the timestamp of the evaluated rules

The timestamp used for alert's `EndsAt` was calculated
before sending the notification. While the correct way
is to use the timestamp taken right before rules evaluation.

* vmalert: add `-datasource.queryTimeAlignment` flag

The flag is supposed to provide ability to disable `time`
param alignment when executing rules. By default, this flag
is enabled, so it remains backward compatible.

The flag was introduced to achieve better compatibility
with Prometheus behaviour according to https://github.com/prometheus/compliance/blob/main/alert_generator/specification.md

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-03-15 11:54:53 +00:00
Dmytro Kozlov
565bd08c43
Issue-1824: added flags and different auth types support (#2287)
* vmalert/notifier: added flags and different auth types support

Co-authored-by: hagen1778 <roman@victoriametrics.com>
2022-03-10 13:09:12 +02:00
Roman Khavronenko
69d1893f4c
Consul SD - update services on the watcher's start (#2202)
* lib/discovery/consul: update services on the watcher's start

Previously, watcher's start was only initing goroutines for discovery
but not waiting for the first iteration to end. It means first Consul
discovery wasn't returning discovered targets until the next iteration.

The change makes the watcher's start blocking until we get first discovery
iteration done and all registries updated.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: remove workarounds for consul SD

Now when consul SD lib properly updates services
on the first start, we don't need workarounds in vmalert.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/discovery/consul: update after review

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-21 15:32:45 +02:00
hagen1778
2efa46a11c vmalert: support $externalLabels and $externalURL in templates
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2193
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-02-15 17:33:52 +03:00
Roman Khavronenko
e3adcbec6e
lib/promscrape: support prometheus-like duration in scrape configs (#2169)
* lib/promscrape: support prometheus-like duration in scrape configs

The change allows to specify duration values like `1d`, `1w`
for fields `scrape_interval`, `scrape_timeout`, etc.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/817#issuecomment-1033384766
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/blockcache: make linter happy

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/promscrape: support prometheus-like duration in scrape configs

* add support for extra fields `scrape_align_interval` and `scrape_offset`;
* support Prometheus duration parsing for `__scrape_interval__`
and `__scrape_duration__` labels;

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* wip

* wip

* docs/CHANGELOG.md: document the feature

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-11 16:17:00 +02:00
hagen1778
55e3bbd4cc vmalert: add support of -notifier.basicAuth.passwordFile flag for notifiers
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1567

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-02-02 18:58:54 +03:00