VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-11-23 20:37:12 +01:00

Author	SHA1	Message	Date
Howie	d12614c0a0	chore: remove duplicated code (#2657) Signed-off-by: lihaowei <haoweili35@gmail.com>	2022-05-30 08:17:40 +02:00
spectvtor	9e343faa41	fix alert relabeling (#2633)	2022-05-25 09:36:04 +02:00
Andrii Chubatiuk	a531a96193	added reusable templates support (#2532) Signed-off-by: Andrii Chubatiuk <andrew.chubatiuk@gmail.com>	2022-05-14 11:38:44 +02:00
Roman Khavronenko	331a5d9a17	Code check (#2558) * vmstorage: make gofmt happy Signed-off-by: hagen1778 <roman@victoriametrics.com> * vmalert: make linter happy Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-05-09 10:11:56 +02:00
Aliaksandr Valialkin	381e2de59c	app/vmalert: run `make quicktemplate-gen` from the root directory after the commit `f6dcfbcdd6`	2022-05-04 20:27:36 +03:00
Dmytro Kozlov	f6dcfbcdd6	vmalert/tpl: fixed truncating alerts expression in table (#2494 ) vmalert: improve `/groups` UI visual The change also fixes truncated rules expressions in UI https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2484	2022-05-04 18:02:18 +02:00
Aliaksandr Valialkin	58390192c1	app/vmalert: run `make quicktemplate-gen` from the repository root This is a follow-up after `b2294d1cf1`	2022-05-02 15:17:03 +03:00
Roman Khavronenko	3616337812	vmalert: do not execute templates during validation (#2528 ) Function `ValidateTemplates`, used on the vmalert startup, is supposed to check whether used templates and functions in loaded rules are correct. The function was parsing and executing loaded templates. However, rules may contain functions which can't be executed without values (label values or query results), like `slice`. Because of this, validation for completely valid expression `{{ slice $labels.job 9 }}` will fail since `$labels.job` is empty during validation. This PR updates `ValidateTemplates` function to only parse templates without executing them. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2514 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-05-02 10:16:16 +02:00
Dmytro Kozlov	b2294d1cf1	vmctl/vm: added datapoints collection bar (#2486) add progress bars to the VM importer The new progress bars are supposed to display the processing speed of each VM importer worker. This info should help identify a bottleneck on the VM side during the import process, without waiting for it to finish. The new progress bars can be disabled by passing the `vm-disable-progress-bar` flag. Plotting multiple progress bars requires the experimental progress bar pool from github.com/cheggaaa/pb/v3. Switching to the progress bar pool required changes in all import modes. The openTSDB mode wasn't changed due to its implementation, which implies an individual progress bar per series. Because of this, using the pool wasn't possible. Signed-off-by: dmitryk-dk <kozlovdmitriyy@gmail.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2022-05-02 09:06:34 +02:00
Aliaksandr Valialkin	ebaa1c7ad5	lib/promscrape: follow-up after `baa1c24b36`	2022-04-16 14:25:54 +03:00
Roman Khavronenko	45fcaa33e8	vmalert: add DNS service discovery (#2465 ) https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2460 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-04-13 11:50:26 +03:00
hagen1778	ed364a42e3	vmalert: support relabeling for alert labels sent via notifier Before, relabeling for notifier configured via file was supported only for target labels discovered via SD. With this change, new config field `alert_relabel_configs` is introduced for applying relabeling to labels of sent alerts. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-04-11 11:09:14 +03:00
Roman Khavronenko	2b59fff526	vmalert: fix labels and annotations processing for alerts (#2403) To improve compatibility with Prometheus alerting the order of templates processing has changed. Before, vmalert did all labels processing beforehand. It meant all extra labels (such as `alertname`, `alertgroup` or rule labels) were available in templating. All collisions were resolved in favour of extra labels. In Prometheus, only labels from the received metric are available in templating, so no collisions are possible. This change makes vmalert's behaviour similar to Prometheus. For example, consider an alerting rule which is triggered by a time series with an `alertname` label. In vmalert, this label would be overridden by the alerting rule's name everywhere: for alert labels, for annotations, etc. In Prometheus, it would be overridden for the alert's labels only, but in annotations the original label value would be available. See more details here https://github.com/prometheus/compliance/issues/80 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-04-06 20:24:45 +02:00
Roman Khavronenko	0989649ad0	Vmalert compliance 2 (#2340) * vmalert: split alert's `Start` field into `ActiveAt` and `Start` The `ActiveAt` field identifies when alert becomes active for rules with `for > 0`. Previously, this value was stored in field `Start`. The field `Start` now identifies the moment alert became `FIRING`. The split is needed in order to distinguish these two moments in the API responses for alerts. Signed-off-by: hagen1778 <roman@victoriametrics.com> * vmalert: support specific moment of time for rules evaluation The Querier interface was extended to accept a new argument used as a timestamp at which evaluation should be made. It is needed to align rules execution time within the group. Signed-off-by: hagen1778 <roman@victoriametrics.com> * vmalert: mark disappeared series as stale Series generated by alerting rules, which were sent to remote write now will be marked as stale if they will disappear on the next evaluation. This would make ALERTS and ALERTS_FOR_TIME series more precise. Signed-off-by: hagen1778 <roman@victoriametrics.com> * wip Signed-off-by: hagen1778 <roman@victoriametrics.com> * vmalert: evaluate rules at fixed timestamp Before, time at which rules were evaluated was calculated right before rule execution. The change makes sure that timestamp is calculated only once per evaluation round and all rules are using the same timestamp. It also updates the logic of resending of already resolved alert notification. Signed-off-by: hagen1778 <roman@victoriametrics.com> * vmalert: allow overriding `alertname` label value if it is present in response Previously, `alertname` was always equal to the Alerting Rule name. Now, its value can be overridden if series in response contain a different value for this label. The change is needed for improving compatibility with Prometheus.
Signed-off-by: hagen1778 <roman@victoriametrics.com> * vmalert: align rules evaluation in time Now, evaluation timestamp for rules evaluates as if there was no delay in rules evaluation. It means, that rules will be evaluated at fixed timestamps+group_interval. This way provides more consistent evaluation results and improves compatibility with Prometheus, Signed-off-by: hagen1778 <roman@victoriametrics.com> * vmalert: add metric for missed iterations New metric `vmalert_iteration_missed_total` will show whether rules evaluation round was missed. Signed-off-by: hagen1778 <roman@victoriametrics.com> * vmalert: reduce delay before the initial rule evaluation in group Signed-off-by: hagen1778 <roman@victoriametrics.com> * vmalert: rollback alertname override According to the spec: ``` The alert name from the alerting rule (HighRequestLatency from the example above) MUST be added to the labels of the alert with the label name as alertname. It MUST override any existing alertname label. ``` https://github.com/prometheus/compliance/blob/main/alert_generator/specification.md#step-3 Signed-off-by: hagen1778 <roman@victoriametrics.com> * vmalert: throw err immediately on dedup detection ``` The execution of an alerting rule MUST error out immediately and MUST NOT send any alerts or add samples to samples receiver if there is more than one alert with the same labels ``` https://github.com/prometheus/compliance/blob/main/alert_generator/specification.md#step-4 Signed-off-by: hagen1778 <roman@victoriametrics.com> * vmalert: cleanup Signed-off-by: hagen1778 <roman@victoriametrics.com> * vmalert: use strings builder to reduce allocs Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-03-29 15:09:07 +02:00
Dmytro Kozlov	11ae1ae924	Added resendDelay for alerts (#2296 ) * vmalert: add support of `resendDelay` flag for alerts Co-authored-by: dmitryk-dk <dmitry.kozlov@brightlocal.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2022-03-16 15:26:33 +00:00
Roman Khavronenko	fb6eab03a2	Vmalert compliance improvements (#2320 ) * vmalert: add support for `sortByLabel` template function * vmalert: update API according to Prometheus conformance program The changes to the API, field names and URL path has been made according to the Prometheus specification for `alert_generator` https://github.com/prometheus/compliance/blob/main/alert_generator/specification.md * vmalert: fix the timestamp of the evaluated rules The timestamp used for alert's `EndsAt` was calculated before sending the notification. While the correct way is to use the timestamp taken right before rules evaluation. * vmalert: add `-datasource.queryTimeAlignment` flag The flag is supposed to provide ability to disable `time` param alignment when executing rules. By default, this flag is enabled, so it remains backward compatible. The flag was introduced to achieve better compatibility with Prometheus behaviour according to https://github.com/prometheus/compliance/blob/main/alert_generator/specification.md Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-03-15 11:54:53 +00:00
Dmytro Kozlov	565bd08c43	Issue-1824: added flags and different auth types support (#2287 ) * vmalert/notifier: added flags and different auth types support Co-authored-by: hagen1778 <roman@victoriametrics.com>	2022-03-10 13:09:12 +02:00
Roman Khavronenko	69d1893f4c	Consul SD - update services on the watcher's start (#2202 ) * lib/discovery/consul: update services on the watcher's start Previously, watcher's start was only initing goroutines for discovery but not waiting for the first iteration to end. It means first Consul discovery wasn't returning discovered targets until the next iteration. The change makes the watcher's start blocking until we get first discovery iteration done and all registries updated. Signed-off-by: hagen1778 <roman@victoriametrics.com> * vmalert: remove workarounds for consul SD Now when consul SD lib properly updates services on the first start, we don't need workarounds in vmalert. Signed-off-by: hagen1778 <roman@victoriametrics.com> * lib/discovery/consul: update after review Signed-off-by: hagen1778 <roman@victoriametrics.com> * wip Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-02-21 15:32:45 +02:00
hagen1778	2efa46a11c	vmalert: support `$externalLabels` and `$externalURL` in templates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2193 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-02-15 17:33:52 +03:00
Roman Khavronenko	e3adcbec6e	lib/promscrape: support prometheus-like duration in scrape configs (#2169 ) * lib/promscrape: support prometheus-like duration in scrape configs The change allows to specify duration values like `1d`, `1w` for fields `scrape_interval`, `scrape_timeout`, etc. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/817#issuecomment-1033384766 Signed-off-by: hagen1778 <roman@victoriametrics.com> * lib/blockcache: make linter happy Signed-off-by: hagen1778 <roman@victoriametrics.com> * lib/promscrape: support prometheus-like duration in scrape configs * add support for extra fields `scrape_align_interval` and `scrape_offset`; * support Prometheus duration parsing for `__scrape_interval__` and `__scrape_duration__` labels; Signed-off-by: hagen1778 <roman@victoriametrics.com> * wip * wip * docs/CHANGELOG.md: document the feature Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-02-11 16:17:00 +02:00
hagen1778	55e3bbd4cc	vmalert: add support of `-notifier.basicAuth.passwordFile` flag for notifiers https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1567 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-02-02 18:58:54 +03:00
hagen1778	f57982eddc	vmalert: remove trailing slash for static notifier addresses This would make addresses `http://localhost:9093` and `http://localhost:9093/` both to result into `http://localhost:9093/api/v2/alerts`. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-02-02 18:58:17 +03:00
Roman Khavronenko	5da71eb685	vmalert: support configuration file for notifiers (#2127 ) vmalert: support configuration file for notifiers * vmalert notifiers now can be configured via file see https://docs.victoriametrics.com/vmalert.html#notifier-configuration-file * add support of Consul service discovery for notifiers config see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1947 * add UI section for currently loaded/discovered notifiers * deprecate `-rule.configCheckInterval` in favour of `-configCheckInterval` * add ability to suppress logs for duplicated targets for notifiers discovery * change behaviour of `vmalert_alerts_send_errors_total` - it now accounts for failed alerts, not HTTP calls.	2022-02-02 14:11:41 +02:00
Aliaksandr Valialkin	831b93a755	app/vmalert: add `parseDuration` function in the same way as Prometheus does See https://github.com/prometheus/prometheus/pull/8817	2022-01-13 23:30:41 +02:00
Aliaksandr Valialkin	80f966b80c	app/vmalert: add `stripPort` template function in the same way as Prometheus does See https://github.com/prometheus/prometheus/pull/10002	2022-01-13 22:53:42 +02:00
Roman Khavronenko	852a895b70	vmalert: make notifier.Addr optional (#1870) For a long time notifier.Addr flag was required. The assumption was that vmalert will be always used for alerting. However, practice shows that some users need only recording rules. In this case, requirement of notifier.Addr is ambiguous. The change verifies if loaded config contains recording or alerting rules and if there are corresponding flags set. This is true for initial config load and hot reload. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2021-11-30 01:18:48 +02:00
Aliaksandr Valialkin	e5ac9d8e57	all: consistently return `application/json` content-type without `charset=utf-8` The `application/json` content-type has utf-8 encoding by default. See https://stackoverflow.com/questions/9254891/what-does-content-type-application-json-charset-utf-8-really-mean Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/897	2021-11-09 18:04:44 +02:00
Roman Khavronenko	43a7984cd8	vmalert: correctly calculate alert ID including extra labels (#1734 ) Previously, ID for alert entity was generated without alertname or groupname. This led to collision, when multiple alerting rules within the same group producing same labelsets. E.g. expr: `sum(metric1) by (job) > 0` and expr: `sum(metric2) by (job) > 0` could result into same labelset `job: "job"`. The issue affects only UI and Web API parts of vmalert, because alert ID is used only for displaying and finding active alerts. It does not affect state restore procedure, since this label was added right before pushing to remote storage. The change now adds all extra labels right after receiving response from the datasource. And removes adding extra labels before pushing to remote storage. Additionally, change introduces a new flag `Restored` which will be displayed in UI for alerts which have been restored from remote storage on restart.	2021-10-22 12:30:38 +03:00
Roman Khavronenko	5494bc02a6	vmalert: add flag to limit the max value for auto-resolve duration for alerts (#1609) * vmalert: add flag to limit the max value for auto-resolve duration for alerts The new flag `rule.maxResolveDuration` is supposed to limit the max value for alert.End param, which is used by notifiers like Alertmanager for alerts auto resolve. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1586	2021-09-13 15:48:18 +03:00
Roman Khavronenko	eff940aa76	Vmalert metrics update (#1580) * vmalert: remove `vmalert_execution_duration_seconds` metric The summary for `vmalert_execution_duration_seconds` metric gives no additional value compared to `vmalert_iteration_duration_seconds` metric. * vmalert: update config reload success metric properly Previously, if there was an unsuccessful attempt to reload config and then a rollback to the previous version, the metric remained set to 0. * vmalert: add Grafana dashboard to overview application metrics * docker: include vmalert target into list for scraping * vmalert: extend notifier metrics with addr label The change adds an `addr` label to metrics for alerts_sent and alerts_send_errors to identify which exact address is having issues. The corresponding change was made to vmalert dashboard. * vmalert: update documentation and docker environment for vmalert's dashboard Mention Grafana's dashboard in vmalert's README in a new section #Monitoring. Update docker-compose env to automatically add vmalert's dashboard. Update docker-compose README with additional info about services.	2021-08-31 12:28:02 +03:00
assassins	a483044557	Performance optimization (#1481 ) There are redundant steps	2021-07-28 19:26:20 +03:00
Roman Khavronenko	2a259ef5e7	vmalert: support rules backfilling (aka `replay`) (#1358 ) * vmalert: support rules backfilling (aka `replay`) vmalert can `replay` configured rules in the past and backfill results via remote write protocol. It supports MetricsQL/PromQL storage as data source, and can backfill data to remote write compatible storage. Supports recording and alerting rules `replay`. See more details in README. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/836 * vmalert: review fixes * vmalert: readme fixes	2021-06-09 12:20:38 +03:00
Nikolay	d626c5c2a9	changes vmalert query function (#1307 ) * changes vmalert query function for prometheus rules compatibility its better to use labels as map. it simplifies template evaluation and allow to ignore can't evaluate field error because map will return default value. fixes https://github.com/VictoriaMetrics/operator/issues/243	2021-05-21 13:55:43 +03:00
Roman Khavronenko	4ed8de62ac	vmalert: document template functions and mention them in README (#1197 )	2021-04-08 18:19:08 +03:00
Roman Khavronenko	2e2e4f7e21	vmalert-989: return non-empty result in template func `query` stub to pass validation (#1002) On templates validation stage vmalert does not actually send queries, so for complex chained expression validation may fail. To avoid this, we add a blank sample in response so validation can pass successfully. Later, during the rule execution, stub will be replaced with real `query` function. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/989	2021-01-10 02:56:11 +03:00
Nikolay	1de15ad490	adds escape for CRLF (#984) at external.alert.source - \n and \r symbols were URL-encoded instead of being used directly. Replacing "\n" with `\n` allows skipping URL encoding. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/890	2020-12-25 11:03:13 +02:00
Aliaksandr Valialkin	0326638c90	app/vmalert: typo fix in descriptions for notifier.basicAuth.username and notifier.basicAuth.password command-line flags	2020-12-24 12:48:59 +02:00
Nikolay	c270f8f3e6	changes vmalert notifier flag (#978) fixes issue with notifier insecure setting; it is now possible to specify notifier.tlsInsecureSkipVerify multiple times.	2020-12-22 23:23:04 +03:00
Roman Khavronenko	404cbd1522	vmalert-974: fix order for labels templating (#975 ) The change fixes bug caused by `3adf8c5a6f`. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/974	2020-12-19 14:10:59 +02:00
Roman Khavronenko	6247884057	vmalert: add function "query", "first" and "value" to alert templates functions (#960 ) The commit adds a support for template function `query`, `first` and `value`. The function `query` executes a MetricsQL query for active alerts. In vmalert we update templates on every evaluation for active alerts to keep them up to date. With `query` func it may become a perf issue since it will fire a query on every execution. We should keep it in mind for now. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/539	2020-12-14 20:11:45 +02:00
Aliaksandr Valialkin	47a038401b	all: consistently return text-based HTTP responses with charset=utf-8 This is a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/897	2020-11-13 10:35:41 +02:00
Aliaksandr Valialkin	d423d73251	app/vmalert: do not print description for all the flags on config errors The description is too big for a human to consume and is just distracting.	2020-10-08 13:35:57 +03:00
Aliaksandr Valialkin	e5500bfcf2	all: typo fix: exptected -> expected	2020-07-02 18:05:52 +03:00
Aliaksandr Valialkin	d5dddb0953	all: use %w instead of %s for wrapping errors in `fmt.Errorf` This will simplify examining the returned errors such as httpserver.ErrorWithStatusCode . See https://blog.golang.org/go1.13-errors for details.	2020-06-30 23:05:11 +03:00
Roman Khavronenko	88538df267	app/vmalert: support multiple notifier urls (#584) (#590) * app/vmalert: support multiple notifier urls (#584) User now can set multiple notifier URLs in the same fashion as for other vmutils (e.g. vmagent). The same is correct for TLS setting for every configured URL. Alerts sending is done in sequential way for respecting the specified URLs order. * app/vmalert: add basicAuth support for notifier client (#585) The change adds possibility to set basicAuth creds for notifier client in the same fashion as for remote write/read and datasource.	2020-06-29 22:21:03 +03:00
Roman Khavronenko	82ecfa3b32	app/vmalert: move flags description and initialization into subpackages The change adds no new functionality and aims to move flags definitions to subpackages that are using them. This should improve readability of the main function.	2020-06-28 12:26:22 +01:00
kreedom	7ec6711f06	Support of custom URL path for alert (#560 ) app/vmalert: Support custom URL for alerts source Add flag `external.alert.source` for configuring custom URL for alert's source. This may be handy to re-point default source URL to other systems like Grafana. Updates #517	2020-06-21 11:32:46 +01:00
Roman Khavronenko	270552fde4	vmalert: Add recording rules support. (#519 ) * vmalert: Add recording rules support. Recording rules support required additional service refactoring since it wasn't planned to support them from the very beginning. The list of changes is following: * new entity RecordingRule was added for writing results of MetricsQL expressions into remote storage; * interface Rule now unites both recording and alerting rules; * configuration parser was moved to separate package and now performs more strict validation; * new endpoint for listing all groups and rules in json format was added; * evaluation interval may be set to every particular group; * vmalert: uncomment tests * vmalert: rm outdated TODO * vmalert: fix typos in README	2020-06-01 13:46:37 +03:00
kreedom	6b23df2bec	vmalert add quotes escape function (#510 ) * vmalert add quotes escape function Co-authored-by: kreedom	2020-05-20 22:20:31 +03:00
Aliaksandr Valialkin	d0f08b4a58	app/vmalert/notifier: go fmt	2020-05-19 12:59:46 +03:00
kreedom	7e173655ba	vmalert - add expr to variables, add escape functions (#495 ) * vmalert - add expr to variables, add escape functions Co-authored-by: kreedom	2020-05-18 11:55:16 +03:00
Aliaksandr Valialkin	eac3da478e	app/vmalert: run `make quicktemplate-gen` from the root dir of the repository	2020-05-16 22:46:02 +03:00
Roman Khavronenko	8c8ff5d0cb	vmalert: cleanup and restructure of code to improve maintainability (#471 ) The change introduces new entity `manager` which replaces `watchdog`, decouples requestHandler and groups. Manager supposed to control life cycle of groups, rules and config reloads. Groups export an ID method which returns a hash from filename and group name. ID supposed to be unique identifier across all loaded groups. Some tests were added to improve coverage. Bug with wrong annotation value if $value is used in templates after metrics being restored fixed. Notifier interface was extended to accept context. New set of metrics was introduced for config reload.	2020-05-10 17:58:17 +01:00
Roman Khavronenko	3bfa41a95c	app/vmalert: initial remote-write support for alerts state persistence. (#442 ) * app/vmalert: initial remote-write support for alerts state persistence. If `remotewrite.url` flag is set, vmalert will send alerts state via remote-write protocol to remote storage. The sending is asynchronous to avoid blocking calls in rules evaluation loop. * app/vmalert: merge with master * app/vmalert: write both `instant` and `for` alerts timeseries states in remote storage.	2020-04-28 00:18:02 +03:00
kreedom	2c18548e08	alert - rename validate function and flags (#440 ) * alert - rename validate function and flags	2020-04-26 14:15:04 +03:00
kreedom	90de3086b3	[vmalert] add webserver (#410 ) * [vmalert] add webserver	2020-04-11 12:40:24 +03:00
Roman Khavronenko	b099d84271	Vmalert/rules eval (#400) * Initial rules evaluation support. Rules now store alerts state in private field `alerts`. Every evaluation updates the alerts and state. Every unique metric received from datastore represents a unique alert, uniqueness is guaranteed by hashing ordered labelset. * merge with master * cleanup * support endAt parameter as 3evaluationInterval for active alerts make golint happy	2020-04-06 14:44:03 +03:00


107 Commits