VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-15 08:23:34 +01:00

Author	SHA1	Message	Date
Roman Khavronenko	16e0bb496e	vmalert: update groups on config reload only if changes detected (#759 ) On config reload event `vmalert` reloads configuration for every group. While it works for simple configurations, the more complex and heavy installations may suffer from frequent config reloads. The change introduces the `checksum` field for every group, which is set to the md5 hash of its yaml configuration. The checksum changes on any change to the group definition, such as a change in rules order or annotations. Comparing the `checksum` field on config reload event helps to detect whether a group should be updated. The groups update is now done concurrently, so reload duration is limited by the slowest group. Partially solves #691 by improving config reload speed.	2020-09-11 23:41:12 +03:00
Aliaksandr Valialkin	b776a93608	app/vmselect/promql: support composite durations like Prometheus 2.21 does The following durations are supported now: `1h5m35s` or `1s543ms` See https://github.com/prometheus/prometheus/releases/tag/v2.21.0 and https://github.com/prometheus/prometheus/pull/7713	2020-09-11 22:17:24 +03:00
Aliaksandr Valialkin	6382e8081a	app/vmagent: allow setting multiple identical `-remoteWrite.url` values This may be useful when each url is authenticated via different `-remoteWrite.basicAuth.username`. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/755	2020-09-11 15:17:29 +03:00
Aliaksandr Valialkin	d3ad0d365e	app/vmselect: move Deadline from netstorage to searchutils This removes dependency on netstorage from searchutils.	2020-09-11 13:39:13 +03:00
Aliaksandr Valialkin	58d3b82ae5	app/{vminsert,vmagent}: allow passing timestamp via `timestamp` query arg when ingesting data to `/api/v1/import/prometheus` See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/750	2020-09-11 13:28:31 +03:00
Aliaksandr Valialkin	579c20756a	app/vmselect: substitute inf values at smooth_exponential with the previous values Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/757	2020-09-11 12:23:56 +03:00
Aliaksandr Valialkin	d67e6d3d2e	app/vmselect: skip infinite values when calculating smooth_exponential Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/757	2020-09-11 11:57:53 +03:00
Aliaksandr Valialkin	06427a184f	app/vmselect/graphite: typo fix in label name for vm_request_duration_seconds metric	2020-09-11 01:59:52 +03:00
Aliaksandr Valialkin	f307e6f432	app/vmselect: initial implementation of Graphite Metrics API See https://graphite-api.readthedocs.io/en/latest/api.html#the-metrics-api	2020-09-11 00:30:20 +03:00
Aliaksandr Valialkin	f5cb213ef9	lib/storage: reuse timestamp blocks for adjacent metric blocks with identical timestamps This should reduce disk space usage when scraping targets containing metrics with identical names such as `node_cpu_seconds_total`, histograms, quantiles, etc. Expose `vm_timestamps_blocks_merged_total` and `vm_timestamps_bytes_saved_total` metrics for monitoring the effectiveness of timestamp blocks merging.	2020-09-09 23:59:21 +03:00
Aliaksandr Valialkin	475698d2ad	docs: sync docs for vmalert, vmauth, vmbackup and vmrestore	2020-09-09 21:10:48 +03:00
Aliaksandr Valialkin	5ab57f916b	docs/vmagent.md: clarified the case when `-remoteWrite.queues` must be tuned Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/745	2020-09-08 20:15:49 +03:00
Aliaksandr Valialkin	f3a79abfb4	app/vmselect/promql: go fmt	2020-09-08 15:18:57 +03:00
Aliaksandr Valialkin	4f06eed1c1	app/vmselect/promql: adjust `integrate()` calculations to be more similar to calculations from InfluxDB: attempt #2 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/701	2020-09-08 14:36:23 +03:00
Aliaksandr Valialkin	0d0b606455	app/vmselect/promql: adjust `integrate()` calculations to be more similar to calculations from InfluxDB Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/701	2020-09-08 14:24:02 +03:00
Aliaksandr Valialkin	db91045348	app/vmselect/promql: increase floating point calculations accuracy by dividing by `1e3` instead of multiplying by `1e-3`	2020-09-08 14:01:02 +03:00
Aliaksandr Valialkin	804304c365	app/vmselect: add missing deletion for temporary files on partial responses when `-search.denyPartialResponse=true`	2020-09-04 02:23:12 +03:00
Aliaksandr Valialkin	478d8f8393	app/vmselect/promql: add `count_le_over_time(m[d], le)` and `count_gt_over_time(m[d], gt)` functions These functions return the number of raw samples that don't exceed `le` or are bigger than `gt`. These functions complement the already existing `share_le_over_time(m[d], le)` and `share_gt_over_time(m[d], gt)`.	2020-09-03 15:28:58 +03:00
Aliaksandr Valialkin	3490160fd0	app/vmselect: unconditionally align time range boundaries to step for subqueries as Prometheus does	2020-09-03 13:22:06 +03:00
Aliaksandr Valialkin	a3cdef6b06	app/vmagent: properly flush big blocks of data Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/741 Thanks to @IceRain00 for the investigation and initial attempt to fix the issue at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/742	2020-09-03 12:12:12 +03:00
Aliaksandr Valialkin	de216bab41	app/vmagent: fix data race when accessing writeRequest.lastFlushTime	2020-09-03 12:12:09 +03:00
Nikolay Khramchikhin	80a9dc79fe	changed vmalert behaviour (#738 ) * VMAlert start with empty rules dir There are some applications (operator for instance), that generate alerts configuration at runtime and vmalert must start correctly without rules to support this behaviour. Later application will add rules files and send SIGHUP to vmalert, which will trigger reading rules files and start rules execution. Removing rules files with SIGHUP signal must stop rules execution and vmalert will wait for new rules. * imports sorted * added test cases for empty rules, removed blank line * fixed imports conflict * updated tests	2020-09-03 11:07:40 +03:00
Aliaksandr Valialkin	7ac10ee978	app/vmalert: improvements over `3f932c2db1`	2020-09-03 01:14:30 +03:00
DexterZhang	85f49ad439	feat: spread load of rule evaluation by group when starting new groups (#724 ) * feat: spread load of rule evaluation by group when starting new groups * review: reduce the resulting diff. * Update app/vmalert/group.go Co-authored-by: Roman Khavronenko <hagen1778@gmail.com> Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com> Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>	2020-09-03 01:14:26 +03:00
Aliaksandr Valialkin	4fa97430d7	app/{vminsert,vmagent}: allow adding extra labels when importing data via Prometheus, CSV and JSON line formats Extra labels may be added to the imported data by passing `extra_label=name=value` query args. Multiple query args may be passed in order to add multiple extra labels. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/719	2020-09-02 19:47:02 +03:00
Aliaksandr Valialkin	fe08b1eb26	app/vminsert: improve error message when the data cannot be sent to vmstorage - log reroutedBR buffer size This should improve debuggability for improperly configured cluster	2020-08-31 17:51:44 +03:00
Aliaksandr Valialkin	6f9c1bc078	app/vmagent: log unsuccessful attempt number when sending data to -remoteWrite.url	2020-08-30 21:40:15 +03:00
Aliaksandr Valialkin	3b1ecac04b	app/vmagent: apply sane limits to `-remoteWrite.queues` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/707	2020-08-30 21:25:51 +03:00
Roman Khavronenko	08b76cb26f	vmalert: update `-rule` flag description to enforce quotes using (#709 ) Description for `-rule` flag uses as example specific chars like asterisks which could be interpreted wrong by different shells. To avoid this, description now contains quoted flag values. See also #708	2020-08-28 09:46:35 +03:00
Roman Khavronenko	4b89da9463	lib/decimal: rename `significant decimal digits` to `significant figures` (#698 ) The previous notion was inconsistent with what `decimal.Round` does. According to [wiki](https://en.wikipedia.org/wiki/Significant_figures) rounding is applied to all significant figures, not just decimal ones.	2020-08-16 17:22:40 +03:00
Aliaksandr Valialkin	6aab2f4989	all: allow using `KB`, `MB`, `GB`, `KiB`, `MiB` and `GiB` suffixes in command-line flag values related to byte sizes or byte rates	2020-08-16 17:08:28 +03:00
Aliaksandr Valialkin	285665e93b	app/vmselect/promql: allow passing multiple args to aggregate functions such as `avg(q1, q2, q3)`	2020-08-15 01:15:16 +03:00
Aliaksandr Valialkin	a2021d0dde	docs/vmagent.md: mention that gaps in remote storage may appear if vmagent cannot keep up with data ingestion	2020-08-14 20:48:17 +03:00
Aliaksandr Valialkin	e7c0b2ca56	docs: update docs	2020-08-14 19:14:46 +03:00
Aliaksandr Valialkin	b996280c65	app/{vminsert,vmagent}: improve documentation for `-influxListenAddr` command-line flag	2020-08-14 18:03:08 +03:00
Aliaksandr Valialkin	60c7397be5	all: support `%{ENV_VAR}` placeholders in yaml configs in all the vm* components Such placeholders are substituted by the corresponding environment variable values. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/583	2020-08-13 17:17:06 +03:00
Aliaksandr Valialkin	6721e47ae9	app: respect CPU limits set via cgroups Update GOMAXPROCS to limits set via cgroups. This should reduce CPU thrashing and reduce memory usage for cases when VictoriaMetrics components run in containers with CPU limits. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685	2020-08-11 23:01:03 +03:00
Aliaksandr Valialkin	62b6e54622	app/vmselect: reduce memory usage when exporting time series with big number of samples via `/api/v1/export` if `max_rows_per_line` is set to non-zero value Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685	2020-08-10 20:57:43 +03:00
Aliaksandr Valialkin	c9f5c5623f	app/vmselect/netstorage: vary batch size for data unpacking depending on the available CPU cores This should reduce contention on the channel with unpack work for systems with high number of CPU cores	2020-08-10 15:16:48 +03:00
Aliaksandr Valialkin	b3d4ff7ee2	app/vmstorage: improve error logging when the request times out	2020-08-10 13:17:24 +03:00
Roman Khavronenko	78afc61896	app/vmalert: extend metrics set exported by `vmalert` #573 (#654 ) * app/vmalert: extend metrics set exported by `vmalert` #573 New metrics were added to improve observability: + vmalert_alerts_pending{alertname, group} - number of pending alerts per group per alert; + vmalert_alerts_acitve{alertname, group} - number of active alerts per group per alert; + vmalert_alerts_error{alertname, group} - is 1 if alertname ended up with error during prev execution, is 0 if no errors happened; + vmalert_recording_rules_error{recording, group} - is 1 if recording rule ended up with error during prev execution, is 0 if no errors happened; * vmalert_iteration_total{group, file} - now contains group and file name labels. This should improve control over specific groups; * vmalert_iteration_duration_seconds{group, file} - now contains group and file name labels. This should improve control over specific groups; Some collisions for alerts and recording rules are possible, because neither group name nor alert/recording rule name are unique for compatibility reasons. Commit contains list of TODOs for Unregistering metrics since groups and rules are ephemeral and could be removed without application restart. In order to unlock Unregistering feature corresponding PR was filed - https://github.com/VictoriaMetrics/metrics/pull/13 * app/vmalert: extend metrics set exported by `vmalert` #573 The changes are following: * add an ID label to rules metrics, since `name` collisions within one group is a common case - see the k8s example alerts; * supports metrics unregistering on rule updates. Consider the case when one rule was added or removed from the group, or the whole group was added or removed. The change depends on https://github.com/VictoriaMetrics/metrics/pull/16 where race condition for Unregister method was fixed.	2020-08-09 09:42:05 +03:00
ofen	3fea7c39be	401 Unauthorized HTTP error added (#681 ) 401 Unauthorized HTTP error added to trigger browser credentials pop-up prompt [RFC 7235 https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication]	2020-08-09 09:39:37 +03:00
Aliaksandr Valialkin	67cacb22ac	lib/httpserver: add `-tls`, `-tlsCertFile` and `-tlsKeyFile` command-line flags in every vm binary This makes such binaries compatible with binaries from `master` branch (aka single-node version) See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/677	2020-08-07 10:57:32 +03:00
Aliaksandr Valialkin	95a8c492ef	app/vmselect/promql: properly handle `-n^m` like Prometheus does `-n^m` must be handled as `-(n^m)` instead of `(-n)^m`. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/675	2020-08-07 07:42:42 +03:00
Aliaksandr Valialkin	7f93d61a56	app/vmselect/promql: remove metric name after applying `clamp_min` and `clamp_max` functions in order to be consistent with Prometheus This improves VictoriaMetrics score at https://promlabs.com/promql-compliance-test-results-victoriametrics/	2020-08-06 23:42:55 +03:00
Aliaksandr Valialkin	01000505a0	app/vmselect/promql: remove metric name after applying `ceil`, `floor` and `round` functions in order to be more consistent with Prometheus This improves VictoriaMetrics score at https://promlabs.com/promql-compliance-test-results-victoriametrics/	2020-08-06 23:34:03 +03:00
Aliaksandr Valialkin	75bff1a567	app/vmselect/promql: remove metric name from results of certain rollup functions in order to be consistent with Prometheus Rollup functions: - avg_over_time - min_over_time - max_over_time - quantile_over_time This improves VictoriaMetrics results at https://promlabs.com/promql-compliance-test-results-victoriametrics/	2020-08-06 23:29:18 +03:00
Aliaksandr Valialkin	8835004a4c	app/vmselect: properly handle PromQL queries like `scalar1 < metric < scalar2` like Prometheus does This fixes some cases from https://promlabs.com/promql-compliance-test-results-victoriametrics/	2020-08-06 23:21:14 +03:00
Aliaksandr Valialkin	14ddb8a34e	app/vmselect/netstorage: reduce CPU contention when unpacking time series blocks by unpacking batches of such blocks instead of a single block This should improve query performance on systems with big number of CPU cores (16 and more)	2020-08-06 17:50:13 +03:00
Aliaksandr Valialkin	46c98cd97a	app/vmselect/netstorage: reduce contention on unpackworkCh and timeseriesWorkCh for multi-CPU system by providing more capacity for these chans	2020-08-06 17:22:39 +03:00
Aliaksandr Valialkin	a455930ab4	app/vmstorage: rename `vm_cache_size_entries{type="storage/prefetchedMetricIDs"}` to `vm_cache_entries{type="storage/prefetchedMetricIDs"}` to be consistent with other `vm_cache_entries` metrics	2020-08-06 16:34:18 +03:00
Aliaksandr Valialkin	a3e91c593b	lib/storage: limit the number of concurrent calls to storage.searchTSIDs to GOMAXPROCS*2 This should limit the maximum memory usage and reduce CPU thrashing on vmstorage when multiple heavy queries are executed. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648	2020-08-05 18:27:21 +03:00
Aliaksandr Valialkin	a930460236	app/vmagent: tune http client for sending data to remote storage in order to disable closing keep-alive connections Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/663	2020-08-04 21:01:40 +03:00
Aliaksandr Valialkin	a04f4a3d9a	app/vmselect: use warning level instead of info level for logging slow queries that take longer than `-search.logSlowQueryDuration`	2020-08-04 20:24:38 +03:00
Aliaksandr Valialkin	bdb881c43b	app/vmselect/promql: add zscore-related functions: `zscore_over_time(m[d])` and `zscore(q) by (...)`	2020-08-03 21:52:15 +03:00
Aliaksandr Valialkin	94471a1273	app: remove duplicate *-pure makefile rules	2020-07-31 20:01:30 +03:00
Aliaksandr Valialkin	a2aa3a60eb	app/vmselect: show `X-Forwarded-For` contents on `/api/v1/status/active_queries` page Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/659	2020-07-31 20:01:09 +03:00
Aliaksandr Valialkin	106e302d7a	all: add missing APP_NAME to vm*-GOARCH builds	2020-07-31 13:45:32 +03:00
Aliaksandr Valialkin	945645f38f	docs/{vmagent,vmalert}: add instruction on how to build for ARM	2020-07-31 09:25:41 +03:00
Aliaksandr Valialkin	0c00fe70cf	app/vmselect: do not adjust `start` and `end` query args passed to `/api/v1/query_range` when `-search.disableCache` command-line flag is set Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/563	2020-07-30 23:14:56 +03:00
Aliaksandr Valialkin	29bbab0ec9	lib/storage: remove prioritizing of merging small parts over merging big parts, since it doesn't work as expected The prioritizing could lead to big merge starvation, which could end up in too big number of parts that must be merged into big parts. Multiple big merges may be initiated after the migration from v1.39.0 or v1.39.1. It is OK - these merges should be finished soon, which should return CPU and disk IO usage to normal levels. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/618	2020-07-30 20:02:22 +03:00
Aliaksandr Valialkin	338ee47d60	app/vmselect/promql: return non-empty value from `rate_over_sum(m[d])` even if a single data point is located in the given `[d]` window Just divide the data point value by the window duration in this case.	2020-07-29 12:37:34 +03:00
Aliaksandr Valialkin	717c554fb0	app/vmselect/promql: remove rollupFuncArg.realPrevValue handling, since the corner case in `increase()` is handled in another way now See `e00cfc854d` for the approach used now.	2020-07-29 12:37:34 +03:00
Aliaksandr Valialkin	d9037b3970	app/vmselect/promql: fill gaps with 0 in `rate_over_sum` response when the last value before the selected time window isn't empty	2020-07-29 12:37:34 +03:00
Aliaksandr Valialkin	f6d4275087	app/{vmagent,vminsert}: properly preserve `db` tag from query string passed to Influx line protocol query Previously `db` tag from the query string wasn't added to metrics after encountering `db` tag in the Influx line Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/653	2020-07-28 21:25:49 +03:00
Aliaksandr Valialkin	baebe86844	app/vmagent/remotewrite: add missing `resp.Body.Close()` after pushing data to remote storage Missing body close could disable HTTP keep-alive connections. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/653	2020-07-28 21:00:25 +03:00
Aliaksandr Valialkin	0f6f0d30d3	app/vmselect: show query origin (aka remote_addr or client address) on the `/api/v1/status/active_queries` page for every query	2020-07-28 15:14:40 +03:00
Roman Khavronenko	ec6ed467c6	app/vmalert: support `external.label` to specify global labelset for all rules #622 (#652 ) The `external.label` flag is supposed to help distinguish the source of alerting or recording rules in situations when more than one `vmalert` runs for the same datasource or AlertManager.	2020-07-28 14:23:04 +03:00
Aliaksandr Valialkin	9dccedc599	app/vmselect/promql: return empty values from `group()` if all the time series have no values at the given timestamp This aligns `group()` behaviour to Prometheus	2020-07-28 13:41:04 +03:00
Aliaksandr Valialkin	d5057f6d04	app/vmagent/remotewrite: create new request on failure to send a block of data to remote storage Previously the request body was already consumed before the retry, so this led to the following error: http: ContentLength=... with Body length 0	2020-07-27 17:33:05 +03:00
Aliaksandr Valialkin	b191e425b3	app/vmselect/promql: improve further the accuracy of buckets_limit() function The accuracy is increased by merging the smallest bucket with the smallest adjacent bucket.	2020-07-26 12:10:56 +03:00
Aliaksandr Valialkin	43871e79c6	app/vmselect/promql: avoid dropping `inf` bucket in `buckets_limit` The `le="inf"` bucket must be preserved in order to maintain the maximum level of accuracy.	2020-07-25 17:00:25 +03:00
Aliaksandr Valialkin	978c1e930e	app/vmselect/promql: optimize `buckets_limit(k, buckets)` for big number of buckets	2020-07-25 13:24:33 +03:00
Aliaksandr Valialkin	51cbf27077	app/vmselect/promql: improve the accuracy of `buckets_limit(k, buckets)` function Now it properly merges the bucket with the previous bucket after deletion.	2020-07-24 17:07:30 +03:00
Aliaksandr Valialkin	cf69b1ea6f	app/vmselect/promql: add `buckets_limit(k, buckets)` function, which limits the number of buckets per time series to `k` This function works with both Prometheus-style and VictoriaMetrics-style buckets. The function removes buckets with the lowest values in order to preserve the highest precision. The function is useful for building heatmaps in Grafana from too big number of buckets.	2020-07-24 16:14:12 +03:00
Aliaksandr Valialkin	45334f61de	app/vmselect: fix tests for rate_over_sum	2020-07-24 02:35:09 +03:00
Aliaksandr Valialkin	3526e8768a	app/vmselect/promql: typo fix after `3e557c9861`	2020-07-24 02:15:23 +03:00
Aliaksandr Valialkin	8d1721d128	app/vmselect/promql: add `rate_over_sum(m[d])` function to MetricsQL, which returns rate over sum of `m` values over `d` duration Something like `sum_over_time(m[d]) / d`, but more accurate.	2020-07-24 01:17:15 +03:00
Aliaksandr Valialkin	88e8bed0c9	app/vmselect/promql: allow setting `[d]` window smaller than the interval between raw points for `avg_over_time` This makes `avg_over_time` behavior consistent with `sum_over_time` and `count_over_time` behaviors. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/636	2020-07-23 22:25:33 +03:00
Aliaksandr Valialkin	fb3d1380ac	lib/storage: respect `-search.maxQueryDuration` when searching for time series in inverted index Previously the time spent on inverted index search could exceed the configured `-search.maxQueryDuration`. This commit stops searching in inverted index on query timeout.	2020-07-23 21:22:05 +03:00
Aliaksandr Valialkin	dbf3038637	lib/storage: add more fine-grained pace limiting for search	2020-07-23 19:21:49 +03:00
Aliaksandr Valialkin	16a4b1b20c	app/vmselect/netstorage: protect from too smart compiler, which may break memory usage optimization in tmpBlocksFileWrapper.WriteBlocks	2020-07-23 17:57:24 +03:00
Aliaksandr Valialkin	0750d2cec1	app/vminsert: export `vm_relabel_metrics_dropped_total` metric that shows the number of metrics dropped due to relabeling	2020-07-23 14:58:02 +03:00
Aliaksandr Valialkin	55ed07add1	app/vmselect: typo fix after 0168e21fe32776e2f7f003f88e0e6e490eb2dcb0g	2020-07-23 14:11:15 +03:00
Aliaksandr Valialkin	7aa5b48508	app/vmselect: reduce memory usage when querying big number of time series with long labels Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/646	2020-07-23 13:48:58 +03:00
Aliaksandr Valialkin	49a0011837	app/vminsert: do not call ApplyRelabeling function if relabeling is disabled This should reduce CPU usage a bit when `-relabelConfig` isn't set	2020-07-23 13:35:36 +03:00
Aliaksandr Valialkin	c91ccce50c	app/vminsert: fix relabeling for metrics ingested via Influx line protocol Previously the enabled relabeling with `-relabelConfig` command-line flag could result in missing labels if a single Influx line protocol message contains multiple field values. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/638	2020-07-23 13:25:37 +03:00
Aliaksandr Valialkin	b8303afcd8	lib/storage: improve prioritizing of data ingestion over querying Prioritize also small merges over big merges. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648	2020-07-23 01:40:38 +03:00
Aliaksandr Valialkin	20d0c41ac5	app/vmselect/prometheus: support `d`, `w` and `y` suffixes for durations passed to `step` in `/api/v1/query_range` like Prometheus does Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/641	2020-07-22 16:27:27 +03:00
Aliaksandr Valialkin	bd4299fafe	app/vmselect/netstorage: reduce memory allocations when unpacking time series data by using a pool for unpackWork entries This should slightly reduce load on GC when processing queries that touch big number of time series. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/646 according to the provided memory profile	2020-07-22 15:04:42 +03:00
Aliaksandr Valialkin	a3f48e395e	app/vmagent: add `-remoteWrite.decimalPlaces` command-line flag, which may be used for reducing disk space usage on the remote storage	2020-07-21 21:55:42 +03:00
Aliaksandr Valialkin	5bb4fe1ba4	app/vmselect: take into account the time spent in wait queue before query execution as time spent on the query	2020-07-21 19:00:00 +03:00
Aliaksandr Valialkin	0755cb3b50	app/vmselect/promql: skip the first value in time series passed to `increase()` if it exceeds by more than 10x the delta between the next value and the first value This should prevent inflated `increase()` results for time series that start from big initial values. Such cases may occur when a label value changes in a metric without counter reset.	2020-07-21 17:24:28 +03:00
Aliaksandr Valialkin	71eba8dcf5	app/vmselect: log the total available memory for concurrent requests on `not enough memory` errors This should simplify root cause analysis	2020-07-20 19:51:58 +03:00
Aliaksandr Valialkin	3b246aa569	app/vmagent: add `-remoteWrite.proxyURL` command-line option This option allows writing data to `-remoteWrite.url` via http, https or socks5 proxy. This is similar to `proxy_url` option in `remote_write` section of Prometheus. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write	2020-07-20 19:31:08 +03:00
Aliaksandr Valialkin	8bee3ef91b	docs/vmagent.md: sync with app/vmagent/README.md	2020-07-20 17:09:30 +03:00
Roman Khavronenko	8949ec961d	app/vmagent: mention grafana dashboard in README (#639 )	2020-07-20 17:09:27 +03:00
Aliaksandr Valialkin	86b54f3768	app/vmagent/remotewrite: allow passing empty `-remoteWrite.urlRelabelConfig` entries	2020-07-20 15:49:13 +03:00
Aliaksandr Valialkin	141e84b5a4	app/vmselect/prometheus: do not return time series with empty list of datapoints from /api/v1/query_range This matches Prometheus behaviour. This should fix https://github.com/jacksontj/promxy/issues/329	2020-07-20 15:30:13 +03:00
Aliaksandr Valialkin	4d2011a87d	app/vmselect/promql: add `mode()` aggregate function	2020-07-20 15:30:11 +03:00
Aliaksandr Valialkin	31ef39e8da	lib/httpserver: log remote address in error message from `httpserver.Errorf` This should improve detection of the root cause of errors. Thanks to Anant for the idea.	2020-07-20 14:06:29 +03:00
Aliaksandr Valialkin	427fa43ce2	app/vmselect/promql: add `mode_over_time(m[d])` function See https://en.wikipedia.org/wiki/Mode_(statistics) and https://stackoverflow.com/questions/61134078/promql-query-to-return-the-value-from-a-range-vector-which-occurs-maximum-no-of	2020-07-17 18:29:10 +03:00
Aliaksandr Valialkin	eb402a17bd	app/vmselect/promql: optimize `group(rollup(m))` calculations	2020-07-17 16:47:30 +03:00
Aliaksandr Valialkin	ea8dc85ba8	app/vmselect/promql: check that `any()` doesn't touch metric name	2020-07-17 16:23:11 +03:00
Aliaksandr Valialkin	fc8fe38a82	app/vmselect/promql: add `group()` aggregate function to MetricsQL This function has been added in Prometheus 2.20. See https://github.com/prometheus/prometheus/pull/7480	2020-07-17 15:17:38 +03:00
Aliaksandr Valialkin	c64914a7e4	app/vmselect/promql: keep all labels for time series from `any()` call	2020-07-17 15:17:37 +03:00
Aliaksandr Valialkin	f9b38f7f2d	app/vminsert/influx: properly handle the case when certain labels with empty values are removed by ApplyRelabeling() call Previously this could lead to `out of range` panic	2020-07-17 00:05:24 +03:00
Aliaksandr Valialkin	14dc426b45	app/vmselect: fix `nil pointer dereference` panic when unsuccessfully querying `vmstorage`	2020-07-16 19:15:18 +03:00
Aliaksandr Valialkin	ce381b3868	app/vmalert: consistently use "%w" instead of "%s" in `fmt.Errorf` when wrapping errors	2020-07-15 13:55:13 +03:00
Aliaksandr Valialkin	e6d96bb0bd	docs/vmagent.md: make filtering rules for init container pods less confusing	2020-07-14 20:33:19 +03:00
Aliaksandr Valialkin	c2b4b9138d	app/vmagent/remotewrite: return proper value from `tssRelabelPool.New` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/599	2020-07-14 14:28:14 +03:00
Aliaksandr Valialkin	86044f6561	app/{vminsert,vmagent}: add `-influxSkipMeasurement` command-line flag for using field name as metric name See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/626	2020-07-14 14:18:40 +03:00
Aliaksandr Valialkin	0e7b2008b2	app/vmselect/prometheus: do not adjust last points in time series with timestamps exceeding the current time Such timestamps usually mean that the query contains `offset`. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/625	2020-07-14 12:56:21 +03:00
Aliaksandr Valialkin	3898cc0285	app/vmselect/prometheus: minimize the diff for the change `1033dc7e2a` over `619b0a25c9`	2020-07-13 21:41:17 +03:00
faceair	bf39e67ade	fix empty response template (#617 )	2020-07-13 21:41:15 +03:00
Aliaksandr Valialkin	b6a5c29549	docs/vmagent.md: sync with app/vmagent/README.md	2020-07-13 21:26:00 +03:00
ofen	9ffa688846	Update README.md (#621 ) Troubleshooting section updated to help out with duplicate targets detection	2020-07-13 21:25:59 +03:00
Aliaksandr Valialkin	4353ff7ef1	app/vmagent: fix data race when multiple `-remoteWrite.urlRelabelConfig` options are set Previously multiple goroutines could access remoteWriteCtx.tss concurrently, which could lead to data race and improper relabeling. Now each goroutine has its own copy of tss during relabeling. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/599	2020-07-10 15:17:23 +03:00
Aliaksandr Valialkin	805a90f642	app/vmagent/remotewrite: typo fix in `-remoteWrite.showURL` help message	2020-07-10 14:07:14 +03:00
Aliaksandr Valialkin	6373d377ef	app/{vminsert,vmagent}: add ability to import data in Prometheus exposition format via `/api/v1/import/prometheus`	2020-07-10 12:13:28 +03:00
Aliaksandr Valialkin	d449d0a0e1	app/vmselect/promql: add missing tests for `ifnot` binary operation	2020-07-09 13:24:12 +03:00
Aliaksandr Valialkin	7e706eea13	app/vmselect/promql: refactor implementations for `and` and `unless` binary operations, so they are closer to `or` implementation	2020-07-09 13:06:01 +03:00
Aliaksandr Valialkin	6c1a47b5e0	app/vmselect/promql/active_queries.go: simplify code a bit by inlining getNextActiveQueryID function	2020-07-09 11:18:53 +03:00
Aliaksandr Valialkin	fb86071552	app/vmselect: add `/api/v1/status/active_queries` page with the list of currently running queries This is a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/575 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/528	2020-07-08 19:09:31 +03:00
DexterZhang	9930ce1fa9	Feat/query list vmselect (#575 ) * feat(vmselect): add support for listing current running queries and canceling specific query * fix(vmselect): change current queries' pid from int64 counter to uuid * feat(vmselect): add auth to internal operations like `/resetRollupResultCache`, `/query/list` and `/query/kill`. add flag `internalAuthKey` for these auth * fix(vmselect): add more info to current queries * review: delete some unnecessary code and use function instead of init * review: returen queriesMap in newQueriesMap review: delete unused var in struct queriesMap, add comments to exported functions * review: add return if error occurs * feat(vmselect): truncate query string in current running query list API since the size of query string might be large; use query string's pointer in struct `query` for the same reason; add query info API to get full access of query's info;	2020-07-08 19:04:29 +03:00
Aliaksandr Valialkin	0bff96fe4b	lib/storage: prioritize data ingestion over heavy queries Heavy queries could result in the lack of CPU resources for processing the current data ingestion stream. Prevent this by delaying queries' execution until free resources are available for data ingestion. Expose `vm_search_delays_total` metric, which may be used for alerting when there are not enough CPU resources for data ingestion and/or for executing heavy queries. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291	2020-07-05 19:44:04 +03:00
Roman Khavronenko	9afd19d375	app/vmalert: add retries to remotewrite (#605) * app/vmalert: add retries to remotewrite Remotewrite pkg now does limited number of retries if write request failed. This is supposed to make vmalert state persisting more reliable. New metrics were added to remotewrite in order to track rows/bytes sent/dropped. defaultFlushInterval was increased from 1s to 5s for sanity reasons. * fix * wip * wip * wip * fix bits alignment bug for 32-bit systems * fix mistakenly dropped field	2020-07-05 18:47:38 +03:00
Aliaksandr Valialkin	82871fb7a5	app/vmselect/prometheus: small fixes on top of `8bb762124a`	2020-07-05 18:17:53 +03:00
faceair	17f175ff5a	fix adjust last points avoid influence earlier value (#606)	2020-07-05 18:17:52 +03:00
Ween	d28fb0baf9	[VMAlert] Fix error log when remoteWrite queue size is full (#602) * Fix Auto metrics relabeled errors * Finalize auto-generated Labels * Fix Test Errors * fix error logs when queue is full Co-authored-by: xinyulong <xinyulong@kuaishou.com>	2020-07-03 16:50:43 +03:00
Aliaksandr Valialkin	8bb3622e9d	app/vminsert: prevent from adding and/or selecting labels with empty values Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/600	2020-07-02 23:17:12 +03:00
Aliaksandr Valialkin	6ebac3ab63	app/vminsert: add ability to apply relabeling to all the incoming metrics if `-relabelConfig` command-line arg points to a file with a list of `relabel_config` entries See https://victoriametrics.github.io/#relabeling	2020-07-02 20:36:33 +03:00
Aliaksandr Valialkin	a45856570b	all: typo fix: exptected -> expected	2020-07-02 18:06:21 +03:00
Aliaksandr Valialkin	f10e8809c0	app/vmselect: add `interpolate` function for filling gaps with linearly interpolated values See https://stackoverflow.com/q/62565021/274937 for details	2020-07-02 14:54:46 +03:00
Aliaksandr Valialkin	2361ad8ab4	lib/promscrape: add ability to set `disable_compression` and `disable_keepalive` options in `scrape_config` section of the config passed to `-promscrape.config` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/580	2020-07-02 14:19:34 +03:00
BigFish	aa26b94f33	fix: spelling mistakes (#594) Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>	2020-07-01 01:36:40 +03:00
Aliaksandr Valialkin	4cb3e7595c	app/vmstorage: add `-denyQueriesOutsideRetention` command-line flag for denying queries outside the configured retention	2020-07-01 00:58:42 +03:00
Aliaksandr Valialkin	0c4e8aeb2b	all: use `errors.As` for inspecting errors that implement httpserver.ErrorWithStatusCode	2020-07-01 00:03:11 +03:00
Aliaksandr Valialkin	d962568e93	all: use %w instead of %s for wrapping errors in `fmt.Errorf` This will simplify examining the returned errors such as httpserver.ErrorWithStatusCode . See https://blog.golang.org/go1.13-errors for details.	2020-06-30 23:33:46 +03:00
Roman Khavronenko	156c83d112	app/vmalert: support multiple notifier urls (#584) (#590) * app/vmalert: support multiple notifier urls (#584) User now can set multiple notifier URLs in the same fashion as for other vmutils (e.g. vmagent). The same applies to TLS settings for every configured URL. Alerts sending is done in sequential way for respecting the specified URLs order. * app/vmalert: add basicAuth support for notifier client (#585) The change adds possibility to set basicAuth creds for notifier client in the same fashion as for remote write/read and datasource.	2020-06-29 22:21:56 +03:00
Roman Khavronenko	bbeab70de6	app/vmalert: move flags description and initialization into subpackages The change adds no new functionality and aims to move flags definitions to subpackages that are using them. This should improve readability of the main function.	2020-06-29 22:18:29 +03:00
kreedom	63c36e2e69	app/vmalert: properly set transport for HTTP clients Fixes issue #586	2020-06-29 22:18:25 +03:00
Aliaksandr Valialkin	2b504f17de	docs: update the info that docker images are built on top of `alpine` image now A follow-up after the commit `ff624c9125` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/522	2020-06-26 13:52:25 +03:00
Aliaksandr Valialkin	a586b8b6d4	app/vminsert/netstorage: do not re-route every time series to more than two vmstorage nodes when certain vmstorage nodes are temporarily slower than the rest of them Previously vminsert may spread data for a single time series across all the available vmstorage nodes when vmstorage nodes couldn't handle the given ingestion rate. This could lead to increased usage of CPU and memory on every vmstorage node, since every vmstorage node had to register all the time series seen in the cluster. Now a time series may spread to maximum two vmstorage nodes under heavy load. Every time series is routed to a single vmstorage node under normal load.	2020-06-25 16:42:37 +03:00
Aliaksandr Valialkin	12b87b2088	app/vmselect/netstorage: reset big result values every 10 seconds instead of after processing every time series This should reduce GC pressure when processing time series with big number of rows	2020-06-24 19:37:35 +03:00
nicbaz	46c5c0772c	vmselect: fix label_replace when mismatch (#579) As per documentation on `label_replace` function: "If the regular expression doesn't match then the timeseries is returned unchanged". Currently this behavior is not enforced, if a regexp on an existing tag doesn't match then the tag value is copied as-is in the destination tag. This fix first checks that the regular expression matches the source tag before applying anything. Given the current implementation, this fix also changes the behavior of the MetricsQL `label_transform` function which does not document this behavior at the moment.	2020-06-23 23:54:29 +03:00
nicbaz	ea2ed4b7e8	vmalert: add support for TLS configuration (#578) app/vmalert: add support for TLS configuration Add support for TLS optional configuration in a similar fashion to what is currently supported in other vmutils such as vmagent. TLS configuration options are distinct for datasource, remoteRead, remoteWrite as well as notifier.	2020-06-23 22:47:23 +03:00
Aliaksandr Valialkin	0fdbe5de25	app/vmselect/netstorage: increase concurrency when processing small number of time series with big number of data points per each time series Previously VictoriaMetrics was processing up to 32 time series in a single goroutine. This could be slow if each time series contains big number of data points (10M+ or more), since only a single CPU core could be loaded with work, while other CPU cores were idle. Fix this by launching GOMAXPROCS workers for time series processing. This should help with https://github.com/VictoriaMetrics/VictoriaMetrics/issues/572	2020-06-23 22:45:57 +03:00
Aliaksandr Valialkin	3a444bb7bb	lib/promrelabel: add support for `keep_if_equal` and `drop_if_equal` actions to relabel configs These actions may be useful for filtering out unneeded targets and/or metrics if they contain equal label values. For example, the following rule would leave the target only if __meta_kubernetes_annotation_prometheus_io_port equals __meta_kubernetes_pod_container_port_number: - action: keep_if_equal source_labels: [__meta_kubernetes_annotation_prometheus_io_port, __meta_kubernetes_pod_container_port_number]	2020-06-23 17:29:19 +03:00
kreedom	f227799c87	Support of custom URL path for alert (#560) app/vmalert: Support custom URL for alerts source Add flag `external.alert.source` for configuring custom URL for alert's source. This may be handy to re-point default source URL to other systems like Grafana. Updates #517	2020-06-21 16:33:58 +03:00

746 Commits