478 Commits

Author SHA1 Message Date
Roman Khavronenko
5950a94e63
vmalert: fix the typo in popup (#4331)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 76f7e66d8e)
2023-06-02 13:19:32 +02:00
Roman Khavronenko
42b90e5e9a
vmalert: follow-up after 669becd011 (#4318)
* vmalert: follow-up after 669becd011

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: follow-up after 669becd011

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: follow-up after 669becd011

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-16 10:14:15 -07:00
Michael Hoffmann
a99918085d
vmalert: improve retry logic for remote write (#4134)
vmalert should not retry on 4xx status codes
according to https://prometheus.io/docs/concepts/remote_write_spec/
2023-05-16 10:10:39 -07:00
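The retry rule described in a99918085d can be sketched roughly as follows. `isRetryable` is a hypothetical helper, not vmalert's actual function, and the real code may treat individual status codes (for example 429) differently.

```go
package main

import (
	"fmt"
	"net/http"
)

// isRetryable is a hypothetical helper: it reports whether a remote write
// request that returned the given status code is worth sending again.
func isRetryable(statusCode int) bool {
	switch {
	case statusCode >= 200 && statusCode < 300:
		return false // success, nothing to retry
	case statusCode >= 400 && statusCode < 500:
		return false // client error: resending the same payload will not help
	default:
		return true // 5xx and anything unexpected: assume a transient failure
	}
}

func main() {
	codes := []int{http.StatusNoContent, http.StatusBadRequest, http.StatusServiceUnavailable}
	for _, code := range codes {
		fmt.Printf("%d retryable=%v\n", code, isRetryable(code))
	}
}
```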
Roman Khavronenko
d6691e7a03
vmalert: add hints to filter buttons (#4296)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-11 13:35:27 -07:00
Roman Khavronenko
fc623e2b84
Vmalert UI updates (#4276)
* vmalert: expand rule groups on anchor click

before, anchor click was only updating the URL.
To expand the group, user had to click on rule's block.
Now, group will toggle automatically.

* vmalert: allow filtering group in web UI

The new filter allows filtering groups and rules within
groups by: errors only or noMatch only.

The filtering is supposed to help navigate large numbers of groups/rules.
Filtering is reflected in URL, so can be shared as a link.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-10 13:05:05 -07:00
Roman Khavronenko
6365d97aee
vmalert: correctly update seriesFetched metric for const exprs (#4287)
Previously, metric `vmalert_alerting_rules_last_evaluation_series_fetched`
would be set to 0 for const expressions, because const expressions do not match
any series. This may result in confusion: no series were matched but the response isn't empty.
The change updates the logic behind the metric: if no series were matched but there are samples
in the response, the number of samples is used as the number of series.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-10 13:01:03 -07:00
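A minimal sketch of the fallback described in 6365d97aee, under stated assumptions: `seriesFetchedForMetric` and its parameters are hypothetical names, and the `-1` value for a missing stat is taken from the description in 4edb97f4da.

```go
package main

import "fmt"

// seriesFetchedForMetric is a hypothetical helper. seriesFetched is the value
// reported by the datasource (nil when it is not reported at all) and samples
// is the number of samples in the response.
func seriesFetchedForMetric(seriesFetched *int, samples int) int {
	if seriesFetched == nil {
		return -1 // the datasource does not report the stat
	}
	if *seriesFetched == 0 && samples > 0 {
		// Const expressions match no series but still return samples.
		return samples
	}
	return *seriesFetched
}

func main() {
	zero := 0
	fmt.Println(seriesFetchedForMetric(&zero, 1)) // const expression with one sample
	fmt.Println(seriesFetchedForMetric(&zero, 0)) // nothing matched, empty response
	fmt.Println(seriesFetchedForMetric(nil, 3))   // stat not reported
}
```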
Alexander Marshalov
d321ea91f2
fixed typos in documentation and commandline flags descriptions (#4275) 2023-05-10 02:22:06 -07:00
Aliaksandr Valialkin
493d8ec7e0
docs/vmalert.md: clarify docs regarding the support of recursive globs
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4041
2023-05-09 22:49:37 -07:00
Roman Khavronenko
54bfacdfde
vmalert: follow-up after cae87da (#4269)
* vmalert: follow-up after cae87da

cae87da4bb
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: update struct comments

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: rm typo

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-09 22:07:26 -07:00
Haleygo
d1c68888bd
vmalert: support reading rule from http url (#4212)
vmalert: support reading rule's config from HTTP URL
2023-05-09 21:59:21 -07:00
Roman Khavronenko
4edb97f4da
app/vmalert: detect alerting rules which don't match any series at all (#4198)
app/vmalert: detect alerting rules which don't match any series at all

vmalert starts to understand /query responses which contain object:
```
"stats":{"seriesFetched": "42"}
```
If object is present, vmalert parses it and populates a new field
`SeriesFetched`. This field is then used to populate the new metric
`vmalert_alerting_rules_last_evaluation_series_fetched` and to
display warnings in the vmalert's UI.

If response doesn't contain the new object (Prometheus or
VictoriaMetrics earlier than v1.90), then `SeriesFetched=nil`.
In this case, UI will contain no additional warnings.
And `vmalert_alerting_rules_last_evaluation_series_fetched` will
be set to `-1`. Negative value of the metric will help to compile
correct alerting rule in follow-up.

Thanks for the initial implementation to @Haleygo
See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4056

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4039

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-09 21:48:59 -07:00
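To illustrate the parsing described in 4edb97f4da, here is a sketch that reads the optional `stats` object from a `/query` response. The type and function names are assumptions made for this example; only the JSON shape (`"stats":{"seriesFetched": "42"}`) comes from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// queryResponse models only the part of a /query response used in this sketch.
type queryResponse struct {
	Stats struct {
		SeriesFetched *string `json:"seriesFetched,omitempty"`
	} `json:"stats"`
}

// parseSeriesFetched returns nil when the response carries no stats object,
// which is the case for datasources that do not report it.
func parseSeriesFetched(body []byte) (*int, error) {
	var r queryResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return nil, err
	}
	if r.Stats.SeriesFetched == nil {
		return nil, nil
	}
	n, err := strconv.Atoi(*r.Stats.SeriesFetched)
	if err != nil {
		return nil, fmt.Errorf("cannot parse seriesFetched: %w", err)
	}
	return &n, nil
}

func main() {
	n, err := parseSeriesFetched([]byte(`{"stats":{"seriesFetched": "42"}}`))
	fmt.Println(*n, err)
	n, err = parseSeriesFetched([]byte(`{"status":"success"}`))
	fmt.Println(n == nil, err)
}
```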
Roman Khavronenko
8f1372bd43
vmalert: fix API to return non-nil values (#4222)
Properly return empty slices instead of nil for `/api/v1/rules` and `/api/v1/alerts` API handlers.
This improves compatibility with Grafana.

 https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4221

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-08 21:47:51 -07:00
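The nil-versus-empty distinction behind 8f1372bd43 is easy to reproduce with `encoding/json`: a nil slice marshals to `null`, while an empty slice marshals to `[]`. The types below are illustrative only and are not vmalert's API structs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type alert struct {
	Name string `json:"name"`
}

type response struct {
	Alerts []alert `json:"alerts"`
}

// marshalAlerts replaces a nil slice with an empty one before marshaling,
// so clients always receive a JSON array.
func marshalAlerts(alerts []alert) string {
	if alerts == nil {
		alerts = []alert{}
	}
	b, err := json.Marshal(response{Alerts: alerts})
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	raw, _ := json.Marshal(response{})
	fmt.Println(string(raw))        // {"alerts":null}
	fmt.Println(marshalAlerts(nil)) // {"alerts":[]}
}
```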
Roman Khavronenko
c6511bc2d0
Revert "http server: limit max concurrent requests (#4185)" (#4215)
This reverts commit 77f76371

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-08 17:22:27 -07:00
Roman Khavronenko
1c3bf0d0d8
app/vmalert: follow-up after 6c322b4a00 (#4214)
6c322b4a00

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-08 17:20:49 -07:00
Haleygo
4b0db17bec
vmalert: allow configuring custom notifier headers per group (#4088)
vmalert: allow configuring custom notifier headers per group
2023-05-08 17:07:44 -07:00
Zakhar Bessarab
19eaf17e11
app/vmalert: add support of recursive path globs for rules and templates (#4148)
Supports using `**` for `-rule` and `-rule.templates`: `dir/**/*.tpl` loads contents of dir and all subdirectories recursively.

See: #4041

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Artem Navoiev <tenmozes@gmail.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
2023-05-08 16:22:30 -07:00
Zakhar Bessarab
55d772ab39
app/vmalert: return an error when using query function in -external.alert.source flag (#4191)
Templating of `-external.alert.source` is not expected to have access to the query function. Using it caused a runtime error, because the query function was passed as nil.
See: #4181

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-05-08 15:48:16 -07:00
Roman Khavronenko
20b025dc88
http server: limit max concurrent requests (#4185)
* lib/httpserver: introduce `-http.maxConcurrentRequests` command-line flag

Introduce `-http.maxConcurrentRequests` command-line flag to protect
VM components from resource exhaustion during unexpected spikes of HTTP requests.
By default, the new flag's value is set to 0 which means no limits are applied.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/httpserver: mention http.maxConcurrentRequests in docs

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-08 13:13:58 -07:00
Roman Khavronenko
e9ce67adb8
vmalert: retry datasource requests with EOF or unexpected EOF errors (#4146)
* vmalert: retry datasource requests with EOF or unexpected EOF errors

Retry failed read request on the closed connection one more time.
This may improve rules execution reliability when connection
between vmalert and datasource closes unexpectedly.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: fix old tests

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-08 09:49:49 -07:00
Zakhar Bessarab
54edd6992a
app/vmalert: update Grafana URLs to match latest format (#4061)
See: #4019

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-04-05 13:31:06 -07:00
Roman Khavronenko
0bde9722ed
vmalert: use missingkey=zero for templating (#4040)
Replace empty labels with "" instead of "<no value>"
during templating, as Prometheus does.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4012

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-31 22:43:39 -07:00
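The effect of `missingkey=zero` from 0bde9722ed can be seen with plain `text/template` and a `map[string]string` of labels. The `render` helper and the template text are assumptions made for this example.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render executes text against labels, optionally with missingkey=zero.
func render(text string, labels map[string]string, zero bool) (string, error) {
	t := template.New("annotation")
	if zero {
		t = t.Option("missingkey=zero")
	}
	t, err := t.Parse(text)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, labels); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	labels := map[string]string{"job": "node"} // no "instance" label
	for _, zero := range []bool{false, true} {
		out, err := render("instance={{.instance}}", labels, zero)
		fmt.Printf("%q %v\n", out, err)
	}
}
```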
Aliaksandr Valialkin
9387793f47
app/vmselect: follow-up for 10ab086366
- Expose stats.seriesFetched at `/api/v1/query_range` responses too
  for the sake of consistency.

- Initialize QueryStats when it is needed and pass it to EvalConfig then.
  This guarantees that the QueryStats is properly collected when the query
  contains some subqueries.
2023-03-27 15:11:42 -07:00
Roman Khavronenko
a09dabc78f
vmalert: add anchor char to Group's link (#4006)
This should help users to see that Group's name is clickable
and used for anchoring.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-24 17:56:04 -07:00
Roman Khavronenko
ec6a20880c
vmalert: mention VMUI example for alert's source (#4005)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-24 17:55:30 -07:00
Roman Khavronenko
6a7de761f4
vmalert: support logs suppressing during config reloads (#3973)
* vmalert: support logs suppressing during config reloads

The change is mostly required for ENT version of vmalert,
since it supports object-storage for config files.
Reading data from object storage could be time-consuming,
so vmalert emits logs to track the progress.

However, these logs are mostly needed on start or on
manual config reload. Printing these logs each time
`rule.configCheckInterval` is triggered would be too verbose.
So the change allows controlling log emission during
config reloads.

Now, logs are emitted during startup or when SIGHUP is received.
For periodic config checks, logs emitted by the config pkg are suppressed.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: review fixes

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-20 14:25:26 -07:00
Roman Khavronenko
0ac57ef5b9
Vmalert tests (#3975)
* vmalert: add tests for notifier pkg

* vmalert: add tests for remotewrite pkg

* vmalert: add tests for template functions

* vmalert: add tests for web pages

* vmalert: fix int overflow in tests

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-17 16:16:13 -07:00
Zakhar Bessarab
3b7152b1d8
docs: add a note about cache reset for vmalert backfilling docs (#3940)
docs: add a note about cache reset for vmalert backfilling docs
2023-03-12 00:13:00 -08:00
Roman Khavronenko
310b380a03
app/vmalert: log number of configuration files found for each specified -rule (#3936)
The change also introduces `List` method to `FS` interface.
The `List` method can be used for wildcard support in object storage FS.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
2023-03-11 23:40:40 -08:00
Aliaksandr Valialkin
54fe207cc0
all: follow-up for 7a3e16e774
- Sync the description for -httpListenAddr.useProxyProtocol command-line flag at vmagent and vmauth,
  so it is consistent with the description at vmauth and victoria-metrics
- Add a sample of panic text to docs/CHANGELOG.md, so it could be googled
- Mention the -httpListenAddr.useProxyProtocol command-line flag in the description for the bugfix

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3335
2023-03-08 01:42:58 -08:00
Roman Khavronenko
fa3b2bd205
app/vmalert: do not wait for group start on removal (#3891)
Each group in vmalert starts with an artificial delay to avoid
the thundering herd problem. For some groups with high evaluation
intervals, the delay could be significant.
If the user removes the group from the config during this delay
and hot-reloads it, vmalert has to wait until the delay
ends. This results in slow config reloading and a UI hang.

The change moves the start-delay logic back to the group's
`start` method. Now, group can immediately exit from the
delay when `group.close()` method is called.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-08 01:10:41 -08:00
Roman Khavronenko
b176247e16
vmalert: cancel in-flight requests on group's update or close (#3886)
When group's update() or close() method is called, the group
still needs to wait for its current evaluation to finish.
Sometimes, evaluation could take a significant amount of time
which slows configuration update or vmalert's graceful shutdown.

The change interrupts current evaluation in order to speed up
the graceful shutdown or config update procedures.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-08 00:10:11 -08:00
Aliaksandr Valialkin
bbd5914eb1
all: add makefile rules for GOARCH=s390x for all the VictoriaMetrics components
This is a follow-up for 007530f882
2023-02-26 12:38:48 -08:00
Aliaksandr Valialkin
0c60e4a30a
all: consistently use http.Method{Get,Post,Put} across the codebase
This is a follow-up after 9dec3c8f80
2023-02-22 19:01:09 -08:00
my-git9
7d86c5c94a
chore: Use http constants to replace numbers (#3846)
Signed-off-by: xin.li <xin.li@daocloud.io>
2023-02-22 18:59:32 -08:00
Roman Khavronenko
a29c1d3a02
docs: mention rules replay blogpost in vmalert docs (#3851)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-02-21 17:44:32 -08:00
Roman Khavronenko
fd139b463b
docs: update vmalert docs (#3843)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-02-20 19:13:00 -08:00
Aliaksandr Valialkin
58779363b4
app/vmalert/README.md: sync with docs/vmalert.md after 6ef6f3a771 2023-02-18 15:21:31 -08:00
Haleygo
9a274567f1
vmalert: fix maxResolveDuration flag note (#3827)
Signed-off-by: Haleygo <hui.wang@daocloud.io>
2023-02-18 15:20:30 -08:00
Roman Khavronenko
c6251ec8aa
docs: improve troubleshooting docs for vmalert (#3812)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-02-13 09:42:18 -08:00
Oleksandr Redko
0e1c395609
app,lib: fix typos in comments (#3804) 2023-02-13 09:32:35 -08:00
Aliaksandr Valialkin
ca61c276ca
app/vmalert: follow-up after d3c64aae8768d58781ee7e358bd7f3d8e0eb836d
- Document the change at docs/CHANGELOG.md
- Add `Reading rules from object storage` section to docs/vmalert.md
- Add `s3` prefix to command-line flags related to the configuration of s3 and gcs clients
- Explicitly mention that reading rules from object storage is supported only in enterprise version
2023-02-09 19:10:36 -08:00
Roman Khavronenko
2eb9ca1889
vmalert: support object storage for rules (#519)
* vmalert: support object storage for rules

Support loading of alerting and recording rules from object
storages `gcs://`, `gs://`, `s3://`.

* review fixes
2023-02-09 19:10:34 -08:00
Aliaksandr Valialkin
34379d4cf1
all: run apk update && apk upgrade in base Alpine Docker image in order to get all the recent security fixes 2023-02-09 14:03:02 -08:00
Roman Khavronenko
4e922eb93b
Vmalert fixes (#3788)
* vmalert: use group's ID in UI to avoid collisions

Identical group names are allowed. So we should use IDs
for various groupings and aggregations in UI.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: prevent disabling state updates tracking

The minimum number of update states to track is now set to 1.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: properly update `debug` and `update_entries_limit` params on hot-reload

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: display `debug` field for rule in UI

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: exclude `updates` field from json marshaling

This field isn't correctly marshaled right now.
And implementing the correct marshaling for it doesn't
seem right, since json representation is mostly used
by systems like Grafana. And Grafana doesn't expect this
field to be present.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* fix test for disabled state

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* fix test for disabled state

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-02-08 08:45:25 -08:00
Roman Khavronenko
80bf0bcf8c
vmalert: update docs (#3770)
vmalert: update flags description

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-02-07 09:28:59 -08:00
Roman Khavronenko
96db7ac52c
vmalert: speed up state restore procedure on start (#3758)
* vmalert: speed up state restore procedure on start

Alerts state restore procedure has been changed to become asynchronous.
It doesn't block groups start anymore which significantly improves vmalert's startup time.
Instead, state restore is called by each group in their goroutines after the first rules
evaluation.

While previously state restore attempt was made for all loaded alerting rules,
now it is called only for alerts which became active after the first evaluation.
This reduces the amount of API calls to the configured remote read URL.

This also means that `remoteRead.ignoreRestoreErrors` command-line flag becomes deprecated now
and will have no effect if configured.

See relevant issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2608

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* make lint happy

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Apply suggestions from code review

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-02-03 19:46:41 -08:00
Roman Khavronenko
d93ac2b1ea
docs: mention -vmalert.proxyURL in vmalert docs (#3730)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-01-31 10:49:49 -08:00
Aliaksandr Valialkin
4cf4c307ea
docs: update command-line descriptions after 73256fe438 2023-01-27 00:01:14 -08:00
Nikolay
ebebaecd94
lib/netutil: init implementation of proxy protocol (#3687)
* lib/netutil: init implementation of proxy protocol
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3335

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-01-26 23:25:22 -08:00
Aliaksandr Valialkin
bd809db4d9
docs: update the list of command-line flags according to the latest changes 2023-01-25 09:22:23 -08:00
Aliaksandr Valialkin
ef7683f2e0
app/vmalert: use consistent randomizer in tests
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3683
2023-01-23 19:25:32 -08:00
Aliaksandr Valialkin
ac890b3081
docs: update -help outputs for vm* tools 2023-01-03 23:27:31 -08:00
Aliaksandr Valialkin
3369371636
app/{vmagent,vminsert}: add support for streaming aggregation
See https://docs.victoriametrics.com/stream-aggregation.html

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3460
2023-01-03 22:22:07 -08:00
Roman Khavronenko
dde750e7c1
vmalert: mention specifics of Alertmanager HA mode (#3573)
Stress the importance of specifying all Alertmanager
URLs in vmalert's `-notifier.url` or `notifier.config`
if it runs in cluster mode.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3547

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-01-03 21:46:00 -08:00
Roman Khavronenko
5cf2998af8
vmalert: allow configuring the default number of stored rule's update states (#3556)
Allow configuring the default number of stored rule's update states in memory
 via global `-rule.updateEntriesLimit` command-line flag or per-rule via rule's
 `update_entries_limit` configuration param.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-29 10:41:51 -08:00
Artem Navoiev
393f4ab86f
update links to grafana dashboards (#3534)
docs: update links to grafana dashboards

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2022-12-28 11:22:02 -08:00
Zakhar Bessarab
decf46d72b
app/vmbackupmanager: add metrics for better observability (#488)
* app/vmbackupmanager: add metrics for better observability, include more information to `/api/v1/backups` API call response

* app/vmbackupmanager: drop old metrics before creating new ones

* app/vmbackupmanager: use `_total` postfix for counter metrics

* app/vmbackupmanager: remove `_total` postfix for gauge-like metrics

* app/vmbackupmanager: add `_last_run_failed` metrics for backups and retention

* app/vmbackupmanager: address review feedback

* app/vmbackupmanager: fix metric name

* app/vmbackupmanager: address review feedback, remove background updates of metrics, add restoring state of `_last_run_failed` metric from remote storage

* app/vmbackupmanager: improve performance for backup size calculation

* app/vmbackupmanager: refactor backup and retention runs to deduplicate each run logic

* {app/vmbackupmanager,lib/formatutil}: move HumanizeBytes into lib package

* app/vmbackupmanager: fix creating new metrics instead of reusing existing ones

* lib/formatutil: add comment to make linter happy

* app/vmbackupmanager: address review feedback
2022-12-20 14:18:43 -08:00
Aliaksandr Valialkin
2a229a319e
docs/vmalert.md: mention latency_offset query arg, which has been added in 86dae56bd0
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3481
2022-12-16 17:20:51 -08:00
Aliaksandr Valialkin
3a28a52667
lib/flagutil: support for TB and TiB suffixes for command-line flags, which accept byte sizes 2022-12-14 17:53:18 -08:00
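As an illustration of the difference between the `TB` and `TiB` suffixes mentioned in 3a28a52667, here is a small parser. It is a sketch under assumptions: `parseBytes` is a hypothetical function, it is case-sensitive, and it does not reproduce the actual parsing rules in `lib/flagutil`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseBytes is a hypothetical parser for values such as "512MiB" or "2TB".
// Decimal suffixes use powers of 1000, binary suffixes use powers of 1024.
func parseBytes(s string) (int64, error) {
	units := []struct {
		suffix     string
		multiplier float64
	}{
		{"KiB", 1 << 10}, {"MiB", 1 << 20}, {"GiB", 1 << 30}, {"TiB", 1 << 40},
		{"KB", 1e3}, {"MB", 1e6}, {"GB", 1e9}, {"TB", 1e12},
	}
	for _, u := range units {
		if strings.HasSuffix(s, u.suffix) {
			n, err := strconv.ParseFloat(strings.TrimSuffix(s, u.suffix), 64)
			if err != nil {
				return 0, fmt.Errorf("cannot parse %q: %w", s, err)
			}
			return int64(n * u.multiplier), nil
		}
	}
	return strconv.ParseInt(s, 10, 64) // plain number of bytes
}

func main() {
	for _, v := range []string{"1TB", "1TiB", "4096"} {
		n, err := parseBytes(v)
		fmt.Println(v, n, err)
	}
}
```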
Roman Khavronenko
a44af871d3
vmalert: support $for or .For template variables (#3474)
support `$for` or `.For` template variables  in alert's annotations.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3246

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-12 14:42:16 -08:00
Aliaksandr Valialkin
97b41e727c
lib/promscrape: implement target-level and metric-level relabel debugging
Target-level debugging is performed by clicking the 'debug' link at the corresponding target
on either http://vmagent:8429/targets page or on http://vmagent:8428/service-discovery page.

Metric-level debugging is performed at http://vmagent:8429/metric-relabel-debug page.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3407

See https://docs.victoriametrics.com/vmagent.html#relabel-debug
2022-12-10 02:25:56 -08:00
Aliaksandr Valialkin
8fce069c7b
app/vmalert: properly handle nil req passed to requestToCurl()
This fixes a panic in the TestAlertingRule_Exec_Negative test.
The panic has been introduced in the commit b97bd01605
2022-12-10 02:05:20 -08:00
Aliaksandr Valialkin
650d1d1ae5
app/vmalert: do not show system links at http://vmalert:8880/ page when it is requested via proxy
The system links are absolute, e.g. they start from `/`, so there are high chances
they won't work as expected when requested via proxy such as vmselect with -vmalert.proxyURL
command-line flag.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3424
2022-12-09 11:49:53 -08:00
Roman Khavronenko
385d082bca
vmalert: do not hold pointer to http.Request (#3467)
http.Request was used as a part of state struct
for generating the curl command when viewing the rule's
state changes.
It appears that holding a reference is far more expensive
than generating the curl command immediately.
On the test with 40k rules, this change reduces memory
and CPU usage by 50%.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-09 11:49:53 -08:00
Aliaksandr Valialkin
676de127aa
all: update Go builder from v1.19.3 to v1.19.4
See https://github.com/golang/go/issues?q=milestone%3AGo1.19.4+label%3ACherryPickApproved
2022-12-08 17:04:41 -08:00
Roman Khavronenko
5bbb88902e
vmalert: correctly return error for RW failures (#3452)
* vmalert: correctly return error for RW failures

By mistake, in 0989649ad0 the error
for remote write failures wasn't returned to the user.
This change fixes it.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-06 16:31:12 -08:00
Roman Khavronenko
a922308438
vmalert: reduce allocations for Prometheus resp parse (#3435)
Method `metrics()` now pre-allocates slices for labels
and results from query responses. This reduces the number 
of allocations on the hot path for instant requests.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-05 00:18:11 -08:00
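The pre-allocation mentioned in a922308438 follows a common Go pattern: size the slice once when the final length is known. The `label` type and `toLabels` function are illustrative and do not reproduce the `metrics()` method.

```go
package main

import (
	"fmt"
	"sort"
)

type label struct {
	Name, Value string
}

// toLabels converts a label map into a sorted slice. The slice is allocated
// once with its final capacity, so append never has to grow it.
func toLabels(m map[string]string) []label {
	labels := make([]label, 0, len(m))
	for name, value := range m {
		labels = append(labels, label{Name: name, Value: value})
	}
	sort.Slice(labels, func(i, j int) bool { return labels[i].Name < labels[j].Name })
	return labels
}

func main() {
	fmt.Println(toLabels(map[string]string{"job": "vmalert", "__name__": "up"}))
}
```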
Roman Khavronenko
31ca22109e
vmalert: fix replay step param (#3428)
The recent change to the default value
of the `datasource.queryStep` flag resulted in a situation
where replay mode was always running queries with
step=`datasource.queryStep`, when it should always
use the rule's evaluation interval.

The fix applies not only to replay mode but
to all Range requests. Now the step param is set
individually for each mode.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-02 19:09:30 -08:00
Zakhar Bessarab
59f889cd3f
app/vmalert: add remoteWrite.sendTimeout command-line flag to configure timeout for sending data to remoteWrite.url (#3423)
* app/vmalert: add `remoteWrite.sendTimeout` command-line flag to configure timeout for sending data to `remoteWrite.url`

* vmalert: remove WriteTimeout from clients Cfg
No need to have it as a part of configuration struct:
* the client isn't used by other packages;
* there are no internal tests to check the WriteTimeout.

* vmalert: remove DisablePathAppend from clients Cfg
No need to have it as a part of configuration struct:
* the client isn't used by other packages;
* there are no internal tests to check the DisablePathAppend.

Co-authored-by: hagen1778 <roman@victoriametrics.com>
2022-12-02 19:03:34 -08:00
Roman Khavronenko
435f6f3add
vmalert: properly pass headers during the restore procedure (#3420)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3418

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-02 18:53:44 -08:00
Aliaksandr Valialkin
be6da5053f
lib/promscrape: optimize service discovery speed
- Return meta-labels for the discovered targets via promutils.Labels
  instead of map[string]string. This improves the speed of generating
  meta-labels for discovered targets by up to 5x.

- Remove memory allocations in hot paths during ScrapeWork generation.
  The ScrapeWork contains scrape settings for a single discovered target.
  This improves the service discovery speed by up to 2x.
2022-11-29 21:26:23 -08:00
Aliaksandr Valialkin
2a107cc8a7
app/vmalert: substitute -datasource.disablePathAppend with -remoteRead.disablePathAppend in the description for -datasource.url command-line flag
This is a follow-up for 959f06d175
2022-11-29 21:11:18 -08:00
Max Golionko
d272a8270b
vmalert: flag reference update (#3415)
* flag reference update

there is no flag `-datasource.disablePathAppend` and datasource actually checking for `-remoteRead.disablePathAppend`

* update source for doc as well
2022-11-29 20:38:02 -08:00
Roman Khavronenko
0475f8a38e
vmalert: add default list of alerting rules (#3373)
The default list of alerting rules contains the basic
rules for checking vmalert's health state and is recommended
to use for monitoring vmalert deployments.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-21 16:09:47 +02:00
Aliaksandr Valialkin
6fe8eec745
all: add a link to https://docs.victoriametrics.com/enterprise.html into description for enterprise flags 2022-11-21 15:44:54 +02:00
Roman Khavronenko
8ee464b22b
bump go version to 1.19.3 (#3327)
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-09 11:56:38 +02:00
Aliaksandr Valialkin
7ae038766c
app/vmalert/templates: properly escape all the special chars in quotesEscape function
Previously the `quotesEscape` function was escaping only double quotes.
This wasn't enough, since the input string could contain other special chars,
which must be escaped when put inside JSON string. For example, carriage return and line feed chars (\n\r),
backslash char, etc. This led to the following issues, which were improperly fixed:

- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/890 - this issue
  was "fixed" by introducing the `crlfEscape` function, which led to unnecessary
  complications in user templates, while not fixing various corner cases
  such as backslash chars in the input string.
  See 1de15ad490

- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3139 - this issue
  was "fixed" by urlencoding the whole string passed to -external.alert.source
  command-line flag. This led to invalid urls, which couldn't be parsed by Grafana.
  See 00c838353d
  and 4bd0244599

This commit properly encodes the input string passed to `quotesEscape`, so it can be safely embedded inside JSON strings.

This commit deprecates crlfEscape template function and adds the following new template functions:

- strvalue and stripDomain - these functions are supported by Prometheus, so they were added
  for compatibility purposes.
- jsonEscape and htmlEscape for converting the input string to valid quoted JSON string
  and for html-escaping the input string, so it could be safely embedded as a plaintext
  into html.

This commit also documents all supported template functions at https://docs.victoriametrics.com/vmalert.html#template-functions
The deprecated crlfEscape function isn't documented on purpose, since its usefulness is negative in general case.
2022-10-28 00:08:50 +03:00
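Escaping a string so that it can sit inside a JSON string literal, as 7ae038766c describes for `quotesEscape`, can be sketched with `encoding/json`. `escapeForJSON` is a hypothetical stand-in and is not the function from `app/vmalert/templates`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// escapeForJSON returns s escaped so that it can be placed between
// double quotes in a JSON document.
func escapeForJSON(s string) string {
	b, err := json.Marshal(s)
	if err != nil {
		return "" // not expected for plain strings
	}
	return string(b[1 : len(b)-1]) // drop the surrounding quotes added by Marshal
}

func main() {
	// Quotes, CR/LF and backslashes are all escaped in one pass.
	fmt.Println(escapeForJSON("say \"hi\"\r\nC:\\tmp"))
}
```

Note that `json.Marshal` also escapes `<`, `>` and `&` by default, which may or may not be wanted in a template function.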
Aliaksandr Valialkin
8a6898b625
Revert "vmalert: escape query params if external alert source defined (#3267)"
This reverts commit 00c838353d.

Reason for revert: it incorrectly fixes the issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3139 .
Now `-external.alert.source=explore?orgId=1&left=...` is converted to the following invalid url, which cannot be handled by Grafana:

https://grafana.example.com/explore%3ForgId%3D1%26left%3D...

The next commit will contain the correct fix of the issue - the `quotesEscape` function must
properly escape the string, so it could be embedded into JSON string. This function must
properly escape \n\r chars too. In this case the `crlfEscape` function becomes unnecessary.
Actually, the next commit makes the `crlfEscape` function deprecated.
2022-10-28 00:08:50 +03:00
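The invalid URL quoted in the revert message above can be reproduced by escaping the whole flag value. This sketch uses `url.QueryEscape` as an assumption about how the value was escaped; the reverted commit's exact call is not shown here, and `escapedSourceURL` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/url"
)

// escapedSourceURL escapes the whole source value before appending it to base.
// This also escapes "?", "=" and "&", which give the URL its structure.
func escapedSourceURL(base, source string) string {
	return base + "/" + url.QueryEscape(source)
}

func main() {
	fmt.Println(escapedSourceURL("https://grafana.example.com", "explore?orgId=1&left=..."))
}
```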
Dmytro Kozlov
3123059407
vmalert: escape query params if external alert source defined (#3267)
vmalert: escape query args if external alert source defined
2022-10-28 00:08:50 +03:00
Aliaksandr Valialkin
450a32970a
lib/envtemplate: allow referring env vars from other env vars via %{ENV_VAR} syntax
This is a follow-up for 02096e06d0
2022-10-26 14:51:02 +03:00
Aliaksandr Valialkin
8ea84432ef
docs/enterprise.md: describe all the enterprise features in a short doc at https://docs.victoriametrics.com/enterprise.html 2022-10-24 18:03:22 +03:00
Roman Khavronenko
f7d69c1735
vmalert: lower severity level for RW retries (#3237)
The message about dropped data still remains at `error` level.
The change is supposed to make the log message clearer about how
serious it is.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-10-18 20:40:37 +03:00
Aliaksandr Valialkin
d0288ea417
all: log error when environment variables referred from -promscrape.config are missing
This should prevent from using incorrect config files
2022-10-18 10:29:59 +03:00
Roman Khavronenko
895cb3e7c6
vmalert: update troubleshooting docs (#3228)
The default value of `-datasource.queryStep` has changed, so we update
the troubleshooting docs accordingly.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-10-13 10:14:40 +03:00
Roman Khavronenko
7fc812d3c4
vmalert: revert unexpected fields rename during refactoring (#3222)
Due to auto-refactoring, the field `state` was automatically
renamed to `ruleState` when the entity with the same name
was renamed in another file. Reverting the change.

https://github.com/VictoriaMetrics/helm-charts/issues/391
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-10-12 09:33:16 +03:00
Howie
d9abdc57d4
fix issue#3053 (#3182)
vmalert: prevent duplicating label `alertname` for notifications

The issue has no impact on alerting procedure. But still needs to be fixed
for clarity. 

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3053

Signed-off-by: lihaowei <haoweili35@gmail.com>
2022-10-10 21:54:18 +03:00
Aliaksandr Valialkin
087393bcef
lib/promrelabel: remove unconditional sorting of the labels in ParsedConfigs.Apply(), since the sorting isn't needed in many places
Sort labels explicitly after calling the ParsedConfigs.Apply() when needed.

This reduces CPU usage when performing metric-level relabeling, where labels' sorting isn't needed.
2022-10-09 14:53:35 +03:00
Aliaksandr Valialkin
98a4ab796c
all: update the minimum required Go version from 1.19.1 to 1.19.2
This is needed because of security vulnerabilities found in Go 1.19.1
See https://go.dev/doc/devel/release#go1.19.2
2022-10-07 22:46:44 +03:00
Roman Khavronenko
de92a8375c
vmalert: fix misleading line regarding multitenancy (#3206)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-10-06 15:10:52 +03:00
Aliaksandr Valialkin
9b1443bde5
app/vmalert: follow-up after f8ac55d70ada9ef8490b322abefb05f28f75e2e9
* Use vm_account_id and vm_project_id labels to be consistent with https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy-via-labels
* Document the feature that vmalert now exposes vm_account_id and vm_project_id
  labels if -clusterMode is set.
* Use literal strings instead of string constants for vm_account_id and vm_project_id.
  This improves code readability.
2022-10-06 00:06:06 +03:00
Aliaksandr Valialkin
98d58fdb57
app/vmalert: update -external.alert.source command-line flag description after 61544e13ad 2022-10-05 22:54:23 +03:00
Roman Khavronenko
6f6f6afae0
vmalert: allow using {{$labels}} for templating in -external.alert.source (#3194)
The change is supposed to provide additional flexibility for generating alert's
source link based on label values.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-10-05 22:53:02 +03:00
Aliaksandr Valialkin
6f9ce3f6d6
lib/flagutil: rename Array to ArrayString
This makes the ArrayString more consistent with other Array* types.

While at it, add ArrayBytes type, which will be used for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3071
2022-10-01 18:28:19 +03:00
Aliaksandr Valialkin
93e84a1c57
lib/httpserver: use 302 redirects instead of 301 redirects
Incorrect 301 redirects can be cached by user agents such as web browsers.
This can complicate recovery procedure after the incorrect redirect is fixed,
e.g. web browser cache must be reset.

The related issue - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1752
2022-10-01 16:56:43 +03:00
Roman Khavronenko
408d7043a1
vmalert: support auth configs per static_target (#3188)
Allow configuring authorization params per list of targets
in vmalert's notifier config for `static_configs`.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2690

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-30 18:38:11 +03:00
Roman Khavronenko
a2ded58600
vmalert: allow using extra labels in annotations (#3181)
According to Ruler specification, only labels returned within time series
should be available for use in annotations.

For a long time, vmalert didn't respect this rule. In PR
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2403
this was fixed for the sake of compatibility. However, this resulted
in user confusion, as users expected all configured and extra labels
to be available - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3013

This fix allows using extra labels in annotations. In case of conflicts,
the original labels (extracted from time series) are preferred.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-30 07:48:59 +03:00
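The precedence rule described in a2ded58600 (extra labels are available in annotations, but labels from the time series win on conflict) can be sketched as follows. This is an illustrative assumption about the behavior, not vmalert's actual implementation; `mergeLabels` and the sample label names are hypothetical.

```go
package main

import "fmt"

// mergeLabels is a hypothetical helper, not vmalert's real code.
// It returns the label set visible to annotation templates: extra labels
// are copied first, then labels extracted from the time series overwrite
// them, so the original labels are preferred on conflict.
func mergeLabels(seriesLabels, extraLabels map[string]string) map[string]string {
	merged := make(map[string]string, len(seriesLabels)+len(extraLabels))
	for k, v := range extraLabels {
		merged[k] = v
	}
	for k, v := range seriesLabels {
		merged[k] = v
	}
	return merged
}

func main() {
	series := map[string]string{"job": "node", "instance": "host-1"}
	extra := map[string]string{"job": "override", "team": "infra"}

	m := mergeLabels(series, extra)
	// "job" comes from the series, "team" comes from the extra labels.
	fmt.Println(m["job"], m["instance"], m["team"])
}
```

Copying the extra labels first and the series labels second keeps the conflict rule in one place: whichever map is applied last wins.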
panguicai
156e7035c7
docs: fix typo for vmalert docs (#3173)
Signed-off-by: panguicai008 <1121906548@qq.com>
2022-09-28 10:42:13 +03:00
Dmytro Kozlov
28dcff5791
lib/{httpserver,netutil}: allow to define min and max TLS version of the http server (#3109)
* lib/{httpserver,netutil}: allow to define min and max TLS version of the http server

* lib/httpserver: added descriptions about tls supported versions

* lib/netutil: check minimal tls version, added supported tls versions to error

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-26 17:38:43 +03:00
Aliaksandr Valialkin
594a4ab345
docs/vmalert.md: follow-up for 0c95f928ae
- Clarify the description for -datasource.queryStep command-line flag
- Consistently use a single dash in front of -datasource.queryStep command-line flag
- Update -help output at docs/vmalert.md
2022-09-26 08:49:47 +03:00
Aliaksandr Valialkin
a6a869c365
docs/vmalert.md: follow-up after 7748a9d629
- Consistently use single dash in front of command-line flags instead of double dashes.
- Add a warning that too small -search.latencyOffset may lead to incomplete query results.
2022-09-26 08:49:47 +03:00