VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-22 16:36:27 +01:00

Author	SHA1	Message	Date
Roman Khavronenko	35237fe1f5	vmalert: fix error when rule didn't start if restore failed (#1279 ) Previously, `startGroup` could exit on restore errors despite the `remoteRead.ignoreRestoreErrors` flag value. Now vmalert checks the flag value before deciding whether to return error or just log it.	2021-05-10 11:10:32 +03:00
Aliaksandr Valialkin	07bc021f58	app/vmalert: add missing comment for ErrStateRestore	2021-05-08 19:53:45 +03:00
Roman Khavronenko	bb7e113dd4	vmalert: add flag to control behaviour on startup for state restore errors (#1265 ) Alerting rules now can return specific error type ErrStateRestore to indicate whether restore state procedure failed. Such errors were returned and logged before as well. But now user can specify whether to just log these errors (remoteRead.ignoreRestoreErrors=true) or to stop the process (remoteRead.ignoreRestoreErrors=false). The latter is important when VM isn't ready yet to serve queries from vmalert and it needs to wait. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1252	2021-05-05 12:24:32 +03:00
Aliaksandr Valialkin	0a2e746175	docs/vmalert.md: update docs after `afca7b430c`	2021-04-30 11:49:40 +03:00
Roman Khavronenko	7394967841	vmalert: fix the typo in ApplyParams func (#1259 )	2021-04-30 11:47:11 +03:00
Roman Khavronenko	6fbedd62b8	vmalert: use rule's `evaluationInterval` as `step` param by default (#1258 ) User still can override param by specifying `datasource.queryStep` flag.	2021-04-30 10:03:50 +03:00
Aliaksandr Valialkin	daf2778025	docs/CHANGELOG.md: document the change from `f3a048288e` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1232	2021-04-30 09:56:47 +03:00
Roman Khavronenko	b55677e93d	Vmalert: adjust `time` param for datasource queries according to `evaluationInterval` (#1257 ) * Simplify arguments list for fn `queryDataSource` to improve readability * vmalert: adjust `time` param according to rule evaluation interval With this change, vmalert will start to use rule's evaluation interval for truncating the `time` param. This is mostly needed to produce consistent time series with timestamps unaffected by vmalert start time. Now, timestamp becomes predictable. Additionally, adjustment is similar to what Grafana does for plotting range graphs. Hence, recording rule series and recording rule expression plotted in Grafana are supposed to become similar in most cases.	2021-04-30 09:56:46 +03:00
Aliaksandr Valialkin	8be1cb297b	app/vmagent: list user-visible endpoints at `http://vmagent:8429/` While at it, use common WriteAPIHelp function for the listing in vmagent, vmalert and victoria-metrics	2021-04-30 09:38:23 +03:00
Nikolay	2eb8ef7b2b	changes vmalert Querier with per rule querier (#1249 ) * changes vmalert Querier with per rule querier it allows changing some parameters based on rule settings, for instance alert type, tenant for cluster version or even endpoint url.	2021-04-29 11:31:07 +03:00
Roman Khavronenko	0ceb4f7565	vmalert: keep the returned timestamp when persisting recording rule (#1245 ) Previously, vmalert used `lastExecTime` timestamp when writing recording rules to the remote storage. This may be incorrect, if vmalert uses `datasource.lookback` flag, which means rule's expression will be executed at some moment in the past. To avoid such situations, vmalert now will use returned timestamp instead of `lastExecTime`.	2021-04-27 00:16:45 +03:00
Aliaksandr Valialkin	6dc5d3b357	all: rename https://victoriametrics.github.io to https://docs.victoriametrics.com	2021-04-20 20:20:01 +03:00
Aliaksandr Valialkin	64f1ddefe5	all: consistency renaming `Victoria Metrics` -> `VictoriaMetrics` VMInsert -> vminsert VMSelect -> vmselect VMStorage -> vmstorage	2021-04-20 11:45:02 +03:00
Aliaksandr Valialkin	c872ba45b9	docs: update `-help` output after the commit `77be3e3a82`	2021-04-12 12:35:39 +03:00
Artem Navoiev	c3dcfdef8c	improve docs for cli flags (#1202 ) * improve docs for cli flags * improve docs for cli flags.2	2021-04-12 12:28:36 +03:00
Roman Khavronenko	712725b4a5	vmalert: document template functions and mention them in README (#1197 )	2021-04-08 18:20:57 +03:00
Roman Khavronenko	ff3711eea2	docs: update docs ordering and formatting (#1192 ) The major change is adding `sort` directive to docs. For those docs which are copied from internal packages `sort` is added via makefile command. For the rest it is added manually since they're updated manually as well. The rest of the changes are connected with markdown formatting. For example, changing headers in some files (`##` => `#`) makes navigation on .github.io look better. This is especially useful for `changelog` docs. Table of contents for `vmctl` is dropped, since we already have it autogenerated on .github.io. No link changes expected. The corresponding PR to `cluster` branch will be made in follow-up PR.	2021-04-07 13:43:01 +03:00
Aliaksandr Valialkin	4028d692f5	app: do not process non-GET requests at `/` handler	2021-04-02 22:56:38 +03:00
Aliaksandr Valialkin	0a8f0a4e2f	all: increase minimum supported Go version for building VictoriaMetrics components from v1.14 to v1.15 This is needed after the commit `c0ac740f93`, which uses URL.Redacted() method, which has been added in v1.15. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1147	2021-03-29 23:06:36 +03:00
Aliaksandr Valialkin	82047be90b	docs: add a link to the repository from build instruction for all the VictoriaMetrics components	2021-03-25 17:16:55 +02:00
Aliaksandr Valialkin	450e23533d	docs/vmalert.md: remove misleading `-evaluationInterval=3s` from example config args 3s evaluation interval is too small for practical setups. It can result in increased load on datasource. So it is better to remove it from example config args, which are usually copy-pasted by novice users.	2021-03-25 15:31:10 +02:00
Aliaksandr Valialkin	3caac5edd4	Makefile: prepare vmutils-windows-.zip archive on `make release-vmutils` command The archive contains the following executables for Windows: * vmagent * vmalert * vmauth * vmctl Other components - vmbackup, vmrestore, victoria-metrics - aren't supported for Windows yet	2021-03-16 20:54:10 +02:00
Aliaksandr Valialkin	e2717d84c0	all: various fixes in command-line flag descriptions	2021-03-15 22:03:49 +02:00
Ihor Borodin	933de6b9b1	Fixing examples of external.alert.source in documentation (#1120 ) * Fixing examples of external.alert.source in documentation	2021-03-10 12:08:22 +02:00
Aliaksandr Valialkin	d109e17f46	all: bump minimum supported Go version from 1.13 to 1.14	2021-03-03 15:58:17 +02:00
Aliaksandr Valialkin	2ecee0515a	app/vmalert/README.md: sync with docs/vmalert.md	2021-03-03 10:42:54 +02:00
Aliaksandr Valialkin	d9e8af0e8f	docs: actualize `-help` output	2021-03-01 17:02:05 +02:00
Nikolay	b52d1e4f19	adds query params for vmalert (#1094 ) remoteWrite.url now accepts query params at provided url https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1087	2021-02-28 14:12:34 +02:00
Aliaksandr Valialkin	901710b9e2	app/vmalert: add missing multiarch Dockerfile	2021-02-18 15:23:57 +02:00
Roman Khavronenko	2aa37b0450	vmalert: mention `-datasource.appendTypePrefix` in README (#1052 )	2021-02-03 23:48:45 +02:00
Dmitry Shevchuk	007fd6ce9c	Adds ability to query right vmselect endpoint based on the query type (#1050 ) * Adds ability to query right vmselect endpoint based on the query type Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>	2021-02-03 23:48:44 +02:00
Aliaksandr Valialkin	88ee836d0c	docs/vmalert.md: mention that `type` option can be set at group level additionally to rule level	2021-02-03 21:12:39 +02:00
Aliaksandr Valialkin	6811445b64	docs: document ability to query Graphite datasource from vmalert	2021-02-01 15:28:31 +02:00
Nikolay	b8bc1c2e0f	Graphite vmalert wip (#112 ) * init implementation for graphite alerts * adds graphite support for vmalert * small fix * changes vmalert graphite api with type * updates tests * small fix * fixes graphite parse * Fixes graphite from time	2021-02-01 15:28:30 +02:00
weng zhao	2a8a34ea05	vmalert: add option datasource.queryStep to allow user to address the inconsistency between grafana dashboards(query_range with step 15s usually) and ALERTS (#1027 ) Co-authored-by: zhao.weng <zhao.weng@shopee.com>	2021-01-26 16:38:20 +02:00
Roman Khavronenko	304512b668	vmalert-989: return non-empty result in template func `query` stub to pass validation (#1002 ) On templates validation stage vmalert does not actually send queries, so for complex chained expression validation may fail. To avoid this, we add a blank sample in response so validation can pass successfully. Later, during the rule execution, stub will be replaced with real `query` function. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/989	2021-01-11 12:59:33 +02:00
Nikolay	14915071d6	adds escape for CRLF (#984 ) at external.alert.source - \n and \r symbols were url encoded instead of being used directly. Replacing "\n" with `\n` allows skipping url encoding. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/890	2020-12-25 11:06:47 +02:00
Aliaksandr Valialkin	b480585905	app/vmalert: typo fix in descriptions for notifier.basicAuth.username and notifier.basicAuth.password command-line flags	2020-12-24 12:49:40 +02:00
Aliaksandr Valialkin	d8511b6651	docs: mention that it is possible to set multiple `-notifier.tlsInsecureSkipVerify` command-line flags for vmalert See c3a92968343c2b3619f1ab935702d0e9b3a46733	2020-12-22 22:32:56 +02:00
Nikolay	67e470e598	changes vmalert notifier flag, (#978 ) fixes issue with notifier insecure setting, now it's possible to use notifier.tlsInsecureSkipVerify multiple times.	2020-12-22 22:27:03 +02:00
Roman Khavronenko	9ce8b36d2a	vmalert-974: fix order for labels templating (#975 ) The change fixes bug caused by `3adf8c5a6f`. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/974	2020-12-19 14:21:27 +02:00
Roman Khavronenko	9f578e389c	vmalert: add function "query", "first" and "value" to alert templates functions (#960 ) The commit adds a support for template function `query`, `first` and `value`. The function `query` executes a MetricsQL query for active alerts. In vmalert we update templates on every evaluation for active alerts to keep them up to date. With `query` func it may become a perf issue since it will fire a query on every execution. We should keep it in mind for now. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/539	2020-12-14 20:12:16 +02:00
Aliaksandr Valialkin	fc82c22e50	docs: consistently use links to https://victoriametrics.github.io for documentation references	2020-12-11 21:09:17 +02:00
Aliaksandr Valialkin	1a237c6903	all: properly handle CPU limits set on the host system/container This can reduce memory usage on systems with enabled CPU limits. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/946	2020-12-08 21:07:03 +02:00
Aliaksandr Valialkin	7bdf07883b	app/{vmalert,vmagent}: skip empty values in `-remoteWrite.label` and `-label` lists	2020-12-08 14:54:02 +02:00
Aliaksandr Valialkin	bdac2171f1	all: do not print usage info for all the flags when incorrect command-line flag is passed This should improve usability for VictoriaMetrics apps that have big number of command-line flags, i.e. all the apps.	2020-12-03 21:46:19 +02:00
Nikolay	e4e33cb757	fixes checksum calculation (#928 ) * fixes checksum calculation, 'for' rule param wasn't marshaled properly during checksum calculation * fixes error	2020-11-29 09:50:57 +02:00
Aliaksandr Valialkin	7ceaf4ba8f	all: consistently return text-based HTTP responses with charset=utf-8 This is a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/897	2020-11-13 10:30:21 +02:00
Roman Khavronenko	4fd2b6cd16	vmalert: explicitly set extra labels to alert entities (#886 ) The previous implementation treated extra labels (global and rule labels) as separate label set to returned time series labels. Hence, time series always contained only original labels and alert ID was generated from sorted labels key-values. Extra labels didn't affect the generated ID and were applied on the following actions: - templating for Summary and Annotations; - persisting state via remote write; - restoring state via remote read. Such behaviour caused difficulties on restore procedure because extra labels had to be dropped before checking the alert ID, but that did not always work. Consider the case when expression returns the following time series `up{job="foo"}` and rule has extra label `job=bar`. This would mean that restored alert ID will always be different from the real time series because of collision. To solve the situation extra labels are now always applied beforehand and `vmalert` doesn't store original labels anymore. However, this could result in a new error situation. Consider the case when expression returns two time series `up{job="foo"}` and `up{job="baz"}`, while rule has extra label `job=bar`. In such case, applying extra labels will result in two identical time series and `vmalert` will return error: `result contains metrics with the same labelset after applying rule labels` https://github.com/VictoriaMetrics/VictoriaMetrics/issues/870	2020-11-10 00:27:56 +02:00
Roman Khavronenko	333675875f	vmalert: skip automatically added labels on alerts restore (#871 ) Label `alertgroup` was introduced in #611 and automatically added to generated time series. By mistake, this new label wasn't correctly purged on restore event and affected alert's ID uniqueness. This commit removes `alertgroup` label in restore function. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/870	2020-11-01 23:26:00 +02:00
kreedom	4526cf92d3	vmalert - add dryRun (#842 ) vmalert: add `dryRun` flag for rules validation without running the service	2020-10-20 10:49:22 +03:00
Roman Khavronenko	d6155a3f33	vmalert: update docs to highlight the state restore requirements; (#833 ) Address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/830	2020-10-13 18:34:00 +03:00
Aliaksandr Valialkin	4b1c401790	app/vmalert: accept days, weeks and years in `for:` part of config like Prometheus does Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/817	2020-10-08 20:13:20 +03:00
Aliaksandr Valialkin	f9f8e4a39c	app/vmalert: do not print description for all the flags on config errors The description is too big for a human to consume and it just distracts humans.	2020-10-08 13:35:46 +03:00
Dmitry Shihovtsev	aec863e70b	Fix typos in the vmalert datasource (#814 ) * Fix typos in the vmalert datasource * Fix typo in the vmalert datasource test	2020-10-07 18:00:29 +03:00
Roman Khavronenko	368b890e11	vmalert: make maxIdleConnections configurable for datasource HTTP client (#797 ) Address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/795	2020-09-30 09:51:14 +03:00
Aliaksandr Valialkin	543f3aea97	all: consistently use "%w" formatting in fmt.Errorf for wrapped errors	2020-09-23 22:48:21 +03:00
Aliaksandr Valialkin	8cd89cb847	app/vmalert: remove unneeded UTC() call UTC() doesn't change the underlying timestamp, so the call isn't needed here	2020-09-21 15:56:48 +03:00
Roman Khavronenko	d111969d39	vmalert: add support for `datasource.lookback` flag (#779 ) New datasource flag `datasource.lookback` defines how far to look into past when evaluating queries. Address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/668	2020-09-21 15:56:47 +03:00
Roman Khavronenko	0042b0f307	vmalert: fix the typo in error message (#782 ) The error will be always nil so no sense in printing it.	2020-09-21 11:36:09 +03:00
Roman Khavronenko	e2b31590e6	vmalert: add Group name as label to generated alerts and timeseries (#761 ) Solves #611	2020-09-11 23:41:12 +03:00
Roman Khavronenko	16e0bb496e	vmalert: update groups on config reload only if changes detected (#759 ) On config reload event `vmalert` reloads configuration for every group. While it works for simple configurations, the more complex and heavy installations may suffer from frequent config reloads. The change introduces the `checksum` field for every group, which is set to the md5 hash of its yaml configuration. The checksum will change on any change to the group definition, like rules order or annotation change. Comparing the `checksum` field on config reload event helps to detect if group should be updated. The groups update is now done concurrently, so reload duration will be limited by the slowest group now. Partially solves #691 by improving config reload speed.	2020-09-11 23:41:12 +03:00
Aliaksandr Valialkin	475698d2ad	docs: sync docs for vmalert, vmauth, vmbackup and vmrestore	2020-09-09 21:10:48 +03:00
Nikolay Khramchikhin	80a9dc79fe	changed vmalert behaviour (#738 ) * VMAlert start with empty rules dir There are some applications (operator for instance), that generate alerts configuration at runtime and vmalert must start correctly without rules to support this behaviour. Later application will add rules files and send SIGHUP to vmalert, which will trigger reading rules files and start rules execution. Removing rules files with SIGHUP signal must stop rules execution and vmalert will wait for new rules. * imports sorted * added test cases for empty rules, removed blank line * fixed imports conflict * updated tests	2020-09-03 11:07:40 +03:00
Aliaksandr Valialkin	7ac10ee978	app/vmalert: improvements over `3f932c2db1`	2020-09-03 01:14:30 +03:00
DexterZhang	85f49ad439	feat: spread load of rule evaluation by group when starting new groups (#724 ) * feat: spread load of rule evaluation by group when starting new groups * review: reduce the resulting diff. * Update app/vmalert/group.go Co-authored-by: Roman Khavronenko <hagen1778@gmail.com> Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com> Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>	2020-09-03 01:14:26 +03:00
Roman Khavronenko	08b76cb26f	vmalert: update `-rule` flag description to enforce quotes using (#709 ) Description for `-rule` flag uses as example specific chars like asterisks which could be interpreted wrong by different shells. To avoid this, description now contains quoted flag values. See also #708	2020-08-28 09:46:35 +03:00
Aliaksandr Valialkin	e7c0b2ca56	docs: update docs	2020-08-14 19:14:46 +03:00
Aliaksandr Valialkin	60c7397be5	all: support `%{ENV_VAR}` placeholders in yaml configs in all the vm* components Such placeholders are substituted by the corresponding environment variable values. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/583	2020-08-13 17:17:06 +03:00
Aliaksandr Valialkin	6721e47ae9	app: respect CPU limits set via cgroups Update GOMAXPROCS to limits set via cgroups. This should reduce CPU thrashing and reduce memory usage for cases when VictoriaMetrics components run in containers with CPU limits. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685	2020-08-11 23:01:03 +03:00
Roman Khavronenko	78afc61896	app/vmalert: extend metrics set exported by `vmalert` #573 (#654 ) * app/vmalert: extend metrics set exported by `vmalert` #573 New metrics were added to improve observability: + vmalert_alerts_pending{alertname, group} - number of pending alerts per group per alert; + vmalert_alerts_acitve{alertname, group} - number of active alerts per group per alert; + vmalert_alerts_error{alertname, group} - is 1 if alertname ended up with error during prev execution, is 0 if no errors happened; + vmalert_recording_rules_error{recording, group} - is 1 if recording rule ended up with error during prev execution, is 0 if no errors happened; * vmalert_iteration_total{group, file} - now contains group and file name labels. This should improve control over specific groups; * vmalert_iteration_duration_seconds{group, file} - now contains group and file name labels. This should improve control over specific groups; Some collisions for alerts and recording rules are possible, because neither group name nor alert/recording rule name are unique for compatibility reasons. Commit contains list of TODOs for Unregistering metrics since groups and rules are ephemeral and could be removed without application restart. In order to unlock Unregistering feature corresponding PR was filed - https://github.com/VictoriaMetrics/metrics/pull/13 * app/vmalert: extend metrics set exported by `vmalert` #573 The changes are following: * add an ID label to rules metrics, since `name` collisions within one group is a common case - see the k8s example alerts; * supports metrics unregistering on rule updates. Consider the case when one rule was added or removed from the group, or the whole group was added or removed. The change depends on https://github.com/VictoriaMetrics/metrics/pull/16 where race condition for Unregister method was fixed.	2020-08-09 09:42:05 +03:00
Aliaksandr Valialkin	67cacb22ac	lib/httpserver: add `-tls`, `-tlsCertFile` and `-tlsKeyFile` command-line flags in every vm binary This makes such binaries compatible with binaries from `master` branch (aka single-node version) See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/677	2020-08-07 10:57:32 +03:00
Aliaksandr Valialkin	106e302d7a	all: add missing APP_NAME to vm*-GOARCH builds	2020-07-31 13:45:32 +03:00
Aliaksandr Valialkin	945645f38f	docs/{vmagent,vmalert}: add instruction on how to build for ARM	2020-07-31 09:25:41 +03:00
Roman Khavronenko	ec6ed467c6	app/vmalert: support `external.label` to specify global labelset for all rules #622 (#652 ) `external.label` flag is supposed to help distinguish the source of alerts or recording rules in situations when more than one `vmalert` runs for the same datasource or AlertManager.	2020-07-28 14:23:04 +03:00
Aliaksandr Valialkin	31ef39e8da	lib/httpserver: log remote address in error message from `httpserver.Errorf` This should improve detection of the root cause of errors. Thanks to Anant for the idea.	2020-07-20 14:06:29 +03:00
Aliaksandr Valialkin	ce381b3868	app/vmalert: consistently use "%w" instead of "%s" in `fmt.Errorf` when wrapping errors	2020-07-15 13:55:13 +03:00
Roman Khavronenko	9afd19d375	app/vmalert: add retries to remotewrite (#605 ) * app/vmalert: add retries to remotewrite Remotewrite pkg now does limited number of retries if write request failed. This is supposed to make vmalert state persisting more reliable. New metrics were added to remotewrite in order to track rows/bytes sent/dropped. defaultFlushInterval was increased from 1s to 5s for sanity reasons. * fix * wip * wip * wip * fix bits alignment bug for 32-bit systems * fix mistakenly dropped field	2020-07-05 18:47:38 +03:00
Ween	d28fb0baf9	[VMAlert] Fix error log when remoteWrite queue size is full (#602 ) * Fix Auto metrics relabeled errors * Finalize auto-generated Labels * Fix Test Errors * fix error logs when queue is full Co-authored-by: xinyulong <xinyulong@kuaishou.com>	2020-07-03 16:50:43 +03:00
Aliaksandr Valialkin	a45856570b	all: typo fix: exptected -> expected	2020-07-02 18:06:21 +03:00
BigFish	aa26b94f33	fix: spelling mistakes (#594 ) Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>	2020-07-01 01:36:40 +03:00
Aliaksandr Valialkin	d962568e93	all: use %w instead of %s for wrapping errors in `fmt.Errorf` This will simplify examining the returned errors such as httpserver.ErrorWithStatusCode . See https://blog.golang.org/go1.13-errors for details.	2020-06-30 23:33:46 +03:00
Roman Khavronenko	156c83d112	app/vmalert: support multiple notifier urls (#584 ) (#590 ) * app/vmalert: support multiple notifier urls (#584) User now can set multiple notifier URLs in the same fashion as for other vmutils (e.g. vmagent). The same is correct for TLS setting for every configured URL. Alerts sending is done in sequential way for respecting the specified URLs order. * app/vmalert: add basicAuth support for notifier client (#585) The change adds possibility to set basicAuth creds for notifier client in the same fashion as for remote write/read and datasource.	2020-06-29 22:21:56 +03:00
Roman Khavronenko	bbeab70de6	app/vmalert: move flags description and initialization into subpackages The change adds no new functionality and aims to move flags definitions to subpackages that are using them. This should improve readability of the main function.	2020-06-29 22:18:29 +03:00
kreedom	63c36e2e69	app/vmalert: properly set transport for HTTP clients Fixes issue #586	2020-06-29 22:18:25 +03:00
nicbaz	ea2ed4b7e8	vmalert: add support for TLS configuration (#578 ) app/vmalert: add support for TLS configuration Add support for TLS optional configuration in a similar fashion to what is currently supported in other vmutils such as vmagent. TLS configuration options are distinct for datasource, remoteRead, remoteWrite as well as notifier.	2020-06-23 22:47:23 +03:00
kreedom	f227799c87	Support of custom URL path for alert (#560 ) app/vmalert: Support custom URL for alerts source Add flag `external.alert.source` for configuring custom URL for alert's source. This may be handy to re-point default source URL to other systems like Grafana. Updates #517	2020-06-21 16:33:58 +03:00
Roman Khavronenko	1a01fe2cf2	vmalert-537: allow name duplication for rules within one group. (#559 ) Uniqueness of rule is now defined by combination of its name, expression and labels. The hash of the combination is now used as rule ID and identifies rule within the group. Set of rules from coreos/kube-prometheus was added for testing purposes to verify compatibility. The check also showed that `vmalert` doesn't support `query` template function that was mentioned as limitation in README.	2020-06-18 18:54:35 +03:00
Clémence Saussez	0b53e380cf	app/vmalert: fix link to testdata (#547 ) Fix broken link to vmalert test data Signed-off-by: Clemence Saussez <clemence@zen.ly>	2020-06-10 19:37:21 +03:00
Roman Khavronenko	d71b6e6584	vmalert-491: allow to configure concurrent rules execution per group. (#542 ) The feature allows to speed up group rules execution by executing them concurrently. Change also contains README changes to reflect configuration details.	2020-06-09 15:22:11 +03:00
Roman Khavronenko	5c049bf4dd	vmalert-521: allow to disable rules expression validation. (#536 ) This feature may be useful for using `vmalert` with PromQL compatible datasources like Loki.	2020-06-09 15:19:25 +03:00
Aliaksandr Valialkin	58069f5a6a	app/vmalert: print brief usage info for `vmalert -help`	2020-06-05 10:43:24 +03:00
Aliaksandr Valialkin	045b87c662	app/vmalert: fix comment for UpdateWith exported methods	2020-06-01 14:35:03 +03:00
Roman Khavronenko	44c51c627f	vmalert: Add recording rules support. (#519 ) * vmalert: Add recording rules support. Recording rules support required additional service refactoring since it wasn't planned to support them from the very beginning. The list of changes is following: * new entity RecordingRule was added for writing results of MetricsQL expressions into remote storage; * interface Rule now unites both recording and alerting rules; * configuration parser was moved to separate package and now performs more strict validation; * new endpoint for listing all groups and rules in json format was added; * evaluation interval may be set to every particular group; * vmalert: uncomment tests * vmalert: rm outdated TODO * vmalert: fix typos in README	2020-06-01 13:53:46 +03:00
kreedom	2752d6cb26	vmalert add quotes escape function (#510 ) * vmalert add quotes escape function Co-authored-by: kreedom	2020-05-21 12:10:35 +03:00
Aliaksandr Valialkin	9ca781b8f0	app/vmalert/notifier: go fmt	2020-05-19 13:00:18 +03:00
kreedom	27911ae179	vmalert - add expr to variables, add escape functions (#495 ) * vmalert - add expr to variables, add escape functions Co-authored-by: kreedom	2020-05-19 11:55:03 +03:00
Roman Khavronenko	c7f3e58032	vmalert: avoid sending resolves for pending alerts (#498 ) Before the change we were sending notifications to notifier if following conditions are met: * alert is in Fire state * alert is in Inactive state We were sending Inactive notifications to resolve alert ASAP. Unfortunately, we were sending resolves for Pending alerts that become Inactive, which is wrong. In this change we delete alert from the active list if it was Pending and become Inactive. In this way we now have Inactive alerts only if they were in state Fire before. See test change for example.	2020-05-19 11:55:00 +03:00
Roman Khavronenko	e5f5342e18	vmalert: fix potential race during configuration reloads (#497 ) Configuration reload and rules evaluation can't be executed in same time now. This may make reload time longer but prevents from potential races.	2020-05-19 11:54:55 +03:00
Aliaksandr Valialkin	b99d03a956	app/vmalert: run `make quicktemplate-gen` from the root dir of the repository	2020-05-16 22:45:45 +03:00
Aliaksandr Valialkin	2784015a4d	all: print `--help` output to stdout instead of stderr This is easier to grep and pipe	2020-05-16 12:03:06 +03:00
Roman Khavronenko	e850bf0eff	vmalert: fix the access to rules slice element by wrong index (#486 ) During group's update rules deletion was causing slice mutations while slice index was assumed to be unchanged. This caused "slice bounds out of range" errors when multiple rules were deleted sequentially.	2020-05-15 13:26:06 +03:00
hagen1778	d369450f27	vmalert: update README	2020-05-15 13:26:04 +03:00
Aliaksandr Valialkin	3845420a8f	lib: extract common code for returning fast unix timestamp into lib/fasttime	2020-05-14 23:06:50 +03:00
Roman Khavronenko	e208e76222	vmalert: check if remoteRead object was initialized before calling Restore (#473 ) The check for non-nil remoteRead was mistakenly dropped during refactoring which caused panics when `vmalert` wasn't configured with `remoteRead` flag.	2020-05-13 22:57:26 +03:00
Roman Khavronenko	1523890742	vmalert: fix flag names and description in README (#475 ) Change also adds the recommendation for `remotewrite` queue error.	2020-05-13 22:57:20 +03:00
肖贝贝	8c3e9adf7f	Feat/vmalert add max queue size (#472 ) * feat: add remoteWrite.maxQueueSize to reduce queue full * rename remote(write|read) flags to remote(Write|Read) for the sake of consistency Co-authored-by: xiaobeibei <xiaobeibei@bigo.sg>	2020-05-13 22:57:16 +03:00
Roman Khavronenko	0157566fdb	vmalert: cleanup and restructure of code to improve maintainability (#471 ) The change introduces new entity `manager` which replaces `watchdog`, decouples requestHandler and groups. Manager is supposed to control life cycle of groups, rules and config reloads. Groups export an ID method which returns a hash from filename and group name. ID is supposed to be a unique identifier across all loaded groups. Some tests were added to improve coverage. Fixed the bug with wrong annotation value if $value is used in templates after metrics were restored. Notifier interface was extended to accept context. New set of metrics was introduced for config reload.	2020-05-11 14:35:55 +03:00
Nikolay Khramchikhin	0e8c345ffb	vmalert config reload added config hot reload for vmalert with sighup and api call	2020-05-11 14:35:50 +03:00
Roman Khavronenko	abce2b092f	app/vmalert: restore alerts state from datasource metrics (#461 ) * app/vmalert: restore alerts state from datasource metrics Vmalert will restore alerts state for rules that have `rule.For` > 0 from previously written timeseries via `remotewrite.url` flag. * app/vmalert: mention remotewrite and remoteread configuration in README	2020-05-05 00:52:19 +03:00
Artem Navoiev	121f7e1d56	Update README.md	2020-04-29 17:41:04 +03:00
Aliaksandr Valialkin	9ed4951ec8	lib/metricsql: move it to a separate repository - github.com/VictoriaMetrics/metrics	2020-04-28 15:30:06 +03:00
Aliaksandr Valialkin	a858b7e393	app/vmalert: added missing comments for public entities	2020-04-28 11:19:48 +03:00
Aliaksandr Valialkin	50af16baf2	app/vmalert: fix build	2020-04-28 00:34:01 +03:00
Aliaksandr Valialkin	e3db2c73a6	app/vmalert: sync with master branch	2020-04-28 00:19:42 +03:00
Aliaksandr Valialkin	7644f40763	app/vmalert: include it into the next release	2020-04-28 00:11:41 +03:00
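
Note on commit b55677e93d (truncating the `time` param by the rule's evaluation interval): the following is a minimal sketch of what such truncation could look like, not vmalert's actual code. It assumes "truncating" means rounding the timestamp down to the nearest multiple of the interval counted from the Unix epoch; the function name `truncate_to_interval` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: timestamps are aligned to multiples of the interval since the Unix epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def truncate_to_interval(ts: datetime, interval: timedelta) -> datetime:
    """Round a timezone-aware timestamp down to a multiple of `interval`."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    elapsed = ts - EPOCH
    # timedelta % timedelta yields the remainder past the last aligned boundary.
    return ts - (elapsed % interval)


if __name__ == "__main__":
    interval = timedelta(seconds=30)
    # Two evaluations started at different moments within the same 30s window...
    a = datetime(2021, 4, 30, 9, 56, 47, tzinfo=timezone.utc)
    b = datetime(2021, 4, 30, 9, 56, 59, tzinfo=timezone.utc)
    # ...map to the same aligned timestamp, independent of the start time.
    print(truncate_to_interval(a, interval).isoformat())
    print(truncate_to_interval(b, interval).isoformat())
```

Under this assumption, evaluations that begin anywhere inside the same interval window query the datasource at the same aligned timestamp, which is why the resulting series would no longer depend on when the process started.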


316 Commits