VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-15 08:23:34 +01:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	978c1e930e	app/vmselect/promql: optimize `buckets_limit(k, buckets)` for big number of buckets	2020-07-25 13:24:33 +03:00
Aliaksandr Valialkin	51cbf27077	app/vmselect/promql: improve the accuracy of `buckets_limit(k, buckets)` function Now it properly merges the bucket with the previous bucket after deletion.	2020-07-24 17:07:30 +03:00
Aliaksandr Valialkin	cf69b1ea6f	app/vmselect/promql: add `buckets_limit(k, buckets)` function, which limits the number of buckets per time series to `k` This function works with both Prometheus-style and VictoriaMetrics-style buckets. The function removes buckets with the lowest values in order to preserve the highest precision. The function is useful for building heatmaps in Grafana from too many buckets.	2020-07-24 16:14:12 +03:00
Aliaksandr Valialkin	45334f61de	app/vmselect: fix tests for rate_over_sum	2020-07-24 02:35:09 +03:00
Aliaksandr Valialkin	3526e8768a	app/vmselect/promql: typo fix after `3e557c9861`	2020-07-24 02:15:23 +03:00
Aliaksandr Valialkin	8d1721d128	app/vmselect/promql: add `rate_over_sum(m[d])` function to MetricsQL, which returns rate over sum of `m` values over `d` duration Something like `sum_over_time(m[d]) / d`, but more accurate.	2020-07-24 01:17:15 +03:00
Aliaksandr Valialkin	88e8bed0c9	app/vmselect/promql: allow setting `[d]` window smaller than the interval between raw points for `avg_over_time` This makes `avg_over_time` behavior consistent with `sum_over_time` and `count_over_time` behaviors. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/636	2020-07-23 22:25:33 +03:00
Aliaksandr Valialkin	fb3d1380ac	lib/storage: respect `-search.maxQueryDuration` when searching for time series in inverted index Previously the time spent on inverted index search could exceed the configured `-search.maxQueryDuration`. This commit stops searching in inverted index on query timeout.	2020-07-23 21:22:05 +03:00
Aliaksandr Valialkin	dbf3038637	lib/storage: add more fine-grained pace limiting for search	2020-07-23 19:21:49 +03:00
Aliaksandr Valialkin	16a4b1b20c	app/vmselect/netstorage: protect from too smart compiler, which may break memory usage optimization in tmpBlocksFileWrapper.WriteBlocks	2020-07-23 17:57:24 +03:00
Aliaksandr Valialkin	0750d2cec1	app/vminsert: export `vm_relabel_metrics_dropped_total` metric that shows the number of metrics dropped due to relabeling	2020-07-23 14:58:02 +03:00
Aliaksandr Valialkin	55ed07add1	app/vmselect: typo fix after 0168e21fe32776e2f7f003f88e0e6e490eb2dcb0g	2020-07-23 14:11:15 +03:00
Aliaksandr Valialkin	7aa5b48508	app/vmselect: reduce memory usage when querying big number of time series with long labels Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/646	2020-07-23 13:48:58 +03:00
Aliaksandr Valialkin	49a0011837	app/vminsert: do not call ApplyRelabeling function if relabeling is disabled This should reduce CPU usage a bit when `-relabelConfig` isn't set	2020-07-23 13:35:36 +03:00
Aliaksandr Valialkin	c91ccce50c	app/vminsert: fix relabeling for metrics ingested via Influx line protocol Previously the enabled relabeling with `-relabelConfig` command-line flag could result in missing labels if a single Influx line protocol message contains multiple field values. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/638	2020-07-23 13:25:37 +03:00
Aliaksandr Valialkin	b8303afcd8	lib/storage: improve prioritizing of data ingestion over querying Prioritize also small merges over big merges. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648	2020-07-23 01:40:38 +03:00
Aliaksandr Valialkin	20d0c41ac5	app/vmselect/prometheus: support `d`, `w` and `y` suffixes for durations passed to `step` in `/api/v1/query_range` like Prometheus does Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/641	2020-07-22 16:27:27 +03:00
Aliaksandr Valialkin	bd4299fafe	app/vmselect/netstorage: reduce memory allocations when unpacking time series data by using a pool for unpackWork entries This should slightly reduce load on GC when processing queries that touch big number of time series. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/646 according to the provided memory profile	2020-07-22 15:04:42 +03:00
Aliaksandr Valialkin	a3f48e395e	app/vmagent: add `-remoteWrite.decimalPlaces` command-line flag, which may be used for reducing disk space usage on the remote storage	2020-07-21 21:55:42 +03:00
Aliaksandr Valialkin	5bb4fe1ba4	app/vmselect: take into account the time spent in wait queue before query execution as time spent on the query	2020-07-21 19:00:00 +03:00
Aliaksandr Valialkin	0755cb3b50	app/vmselect/promql: skip the first value in time series passed to `increase()` if it exceeds by more than 10x the delta between the next value and the first value This should prevent inflated `increase()` results for time series that start from big initial values. Such cases may occur when a label value changes in a metric without counter reset.	2020-07-21 17:24:28 +03:00
Aliaksandr Valialkin	71eba8dcf5	app/vmselect: log the total available memory for concurrent requests on `not enough memory` errors This should simplify root cause analysis	2020-07-20 19:51:58 +03:00
Aliaksandr Valialkin	3b246aa569	app/vmagent: add `-remoteWrite.proxyURL` command-line option This option allows writing data to `-remoteWrite.url` via http, https or socks5 proxy. This is similar to `proxy_url` option in `remote_write` section of Prometheus. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write	2020-07-20 19:31:08 +03:00
Aliaksandr Valialkin	8bee3ef91b	docs/vmagent.md: sync with app/vmagent/README.md	2020-07-20 17:09:30 +03:00
Roman Khavronenko	8949ec961d	app/vmagent: mention grafana dashboard in README (#639)	2020-07-20 17:09:27 +03:00
Aliaksandr Valialkin	86b54f3768	app/vmagent/remotewrite: allow passing empty `-remoteWrite.urlRelabelConfig` entries	2020-07-20 15:49:13 +03:00
Aliaksandr Valialkin	141e84b5a4	app/vmselect/prometheus: do not return time series with empty list of datapoints from /api/v1/query_range This matches Prometheus behaviour. This should fix https://github.com/jacksontj/promxy/issues/329	2020-07-20 15:30:13 +03:00
Aliaksandr Valialkin	4d2011a87d	app/vmselect/promql: add `mode()` aggregate function	2020-07-20 15:30:11 +03:00
Aliaksandr Valialkin	31ef39e8da	lib/httpserver: log remote address in error message from `httpserver.Errorf` This should improve detection of the root cause of errors. Thanks to Anant for the idea.	2020-07-20 14:06:29 +03:00
Aliaksandr Valialkin	427fa43ce2	app/vmselect/promql: add `mode_over_time(m[d])` function See https://en.wikipedia.org/wiki/Mode_(statistics) and https://stackoverflow.com/questions/61134078/promql-query-to-return-the-value-from-a-range-vector-which-occurs-maximum-no-of	2020-07-17 18:29:10 +03:00
Aliaksandr Valialkin	eb402a17bd	app/vmselect/promql: optimize `group(rollup(m))` calculations	2020-07-17 16:47:30 +03:00
Aliaksandr Valialkin	ea8dc85ba8	app/vmselect/promql: check that `any()` doesn't touch metric name	2020-07-17 16:23:11 +03:00
Aliaksandr Valialkin	fc8fe38a82	app/vmselect/promql: add `group()` aggregate function to MetricsQL This function has been added in Prometheus 2.20. See https://github.com/prometheus/prometheus/pull/7480	2020-07-17 15:17:38 +03:00
Aliaksandr Valialkin	c64914a7e4	app/vmselect/promql: keep all labels for time series from `any()` call	2020-07-17 15:17:37 +03:00
Aliaksandr Valialkin	f9b38f7f2d	app/vminsert/influx: properly handle the case when certain labels with empty values are removed by ApplyRelabeling() call Previously this could lead to `out of range` panic	2020-07-17 00:05:24 +03:00
Aliaksandr Valialkin	14dc426b45	app/vmselect: fix `nil pointer dereference` panic when unsuccessfully querying `vmstorage`	2020-07-16 19:15:18 +03:00
Aliaksandr Valialkin	ce381b3868	app/vmalert: consistently use "%w" instead of "%s" in `fmt.Errorf` when wrapping errors	2020-07-15 13:55:13 +03:00
Aliaksandr Valialkin	e6d96bb0bd	docs/vmagent.md: make filtering rules for init container pods less confusing	2020-07-14 20:33:19 +03:00
Aliaksandr Valialkin	c2b4b9138d	app/vmagent/remotewrite: return proper value from `tssRelabelPool.New` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/599	2020-07-14 14:28:14 +03:00
Aliaksandr Valialkin	86044f6561	app/{vminsert,vmagent}: add `-influxSkipMeasurement` command-line flag for using field name as metric name See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/626	2020-07-14 14:18:40 +03:00
Aliaksandr Valialkin	0e7b2008b2	app/vmselect/prometheus: do not adjust last points in time series with timestamps exceeding the current time Such timestamps usually mean that the query contains `offset`. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/625	2020-07-14 12:56:21 +03:00
Aliaksandr Valialkin	3898cc0285	app/vmselect/prometheus: minimize the diff for the change `1033dc7e2a` over `619b0a25c9`	2020-07-13 21:41:17 +03:00
faceair	bf39e67ade	fix empty response template (#617)	2020-07-13 21:41:15 +03:00
Aliaksandr Valialkin	b6a5c29549	docs/vmagent.md: sync with app/vmagent/README.md	2020-07-13 21:26:00 +03:00
ofen	9ffa688846	Update README.md (#621) Troubleshooting section updated to help out with duplicate targets detection	2020-07-13 21:25:59 +03:00
Aliaksandr Valialkin	4353ff7ef1	app/vmagent: fix data race when multiple `-remoteWrite.urlRelabelConfig` options are set Previously multiple goroutines could access remoteWriteCtx.tss concurrently, which could lead to data race and improper relabeling. Now each goroutine has its own copy of tss during relabeling. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/599	2020-07-10 15:17:23 +03:00
Aliaksandr Valialkin	805a90f642	app/vmagent/remotewrite: typo fix in `-remoteWrite.showURL` help message	2020-07-10 14:07:14 +03:00
Aliaksandr Valialkin	6373d377ef	app/{vminsert,vmagent}: add ability to import data in Prometheus exposition format via `/api/v1/import/prometheus`	2020-07-10 12:13:28 +03:00
Aliaksandr Valialkin	d449d0a0e1	app/vmselect/promql: add missing tests for `ifnot` binary operation	2020-07-09 13:24:12 +03:00
Aliaksandr Valialkin	7e706eea13	app/vmselect/promql: refactor implementations for `and` and `unless` binary operations, so they are closer to `or` implementation	2020-07-09 13:06:01 +03:00
Aliaksandr Valialkin	6c1a47b5e0	app/vmselect/promql/active_queries.go: simplify code a bit by inlining getNextActiveQueryID function	2020-07-09 11:18:53 +03:00
Aliaksandr Valialkin	fb86071552	app/vmselect: add `/api/v1/status/active_queries` page with the list of currently running queries This is a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/575 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/528	2020-07-08 19:09:31 +03:00
DexterZhang	9930ce1fa9	Feat/query list vmselect (#575) * feat(vmselect): add support for listing current running queries and canceling specific query * fix(vmselect): change current queries' pid from int64 counter to uuid * feat(vmselect): add auth to internal operations like `/resetRollupResultCache`, `/query/list` and `/query/kill`. add flag `internalAuthKey` for these auth * fix(vmselect): add more info to current queries * review: delete some unnecessary code and use function instead of init * review: return queriesMap in newQueriesMap review: delete unused var in struct queriesMap, add comments to exported functions * review: add return if error occurs * feat(vmselect): truncate query string in current running query list API since the size of query string might be large; use query string's pointer in struct `query` for the same reason; add query info API to get full access of query's info;	2020-07-08 19:04:29 +03:00
Aliaksandr Valialkin	0bff96fe4b	lib/storage: prioritize data ingestion over heavy queries Heavy queries could result in the lack of CPU resources for processing the current data ingestion stream. Prevent this by delaying queries' execution until free resources are available for data ingestion. Expose `vm_search_delays_total` metric, which may be used for alerting when there are not enough CPU resources for data ingestion and/or for executing heavy queries. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291	2020-07-05 19:44:04 +03:00
Roman Khavronenko	9afd19d375	app/vmalert: add retries to remotewrite (#605) * app/vmalert: add retries to remotewrite Remotewrite pkg now does limited number of retries if write request failed. This is supposed to make vmalert state persisting more reliable. New metrics were added to remotewrite in order to track rows/bytes sent/dropped. defaultFlushInterval was increased from 1s to 5s for sanity reasons. * fix * wip * wip * wip * fix bits alignment bug for 32-bit systems * fix mistakenly dropped field	2020-07-05 18:47:38 +03:00
Aliaksandr Valialkin	82871fb7a5	app/vmselect/prometheus: small fixes on top of `8bb762124a`	2020-07-05 18:17:53 +03:00
faceair	17f175ff5a	fix adjust last points avoid influence earlier value (#606)	2020-07-05 18:17:52 +03:00
Ween	d28fb0baf9	[VMAlert] Fix error log when remoteWrite queue size is full (#602) * Fix Auto metrics relabeled errors * Finalize auto-generated Labels * Fix Test Errors * fix error logs when queue is full Co-authored-by: xinyulong <xinyulong@kuaishou.com>	2020-07-03 16:50:43 +03:00
Aliaksandr Valialkin	8bb3622e9d	app/vminsert: prevent from adding and/or selecting labels with empty values Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/600	2020-07-02 23:17:12 +03:00
Aliaksandr Valialkin	6ebac3ab63	app/vminsert: add ability to apply relabeling to all the incoming metrics if `-relabelConfig` command-line arg points to a file with a list of `relabel_config` entries See https://victoriametrics.github.io/#relabeling	2020-07-02 20:36:33 +03:00
Aliaksandr Valialkin	a45856570b	all: typo fix: exptected -> expected	2020-07-02 18:06:21 +03:00
Aliaksandr Valialkin	f10e8809c0	app/vmselect: add `interpolate` function for filling gaps with linearly interpolated values See https://stackoverflow.com/q/62565021/274937 for details	2020-07-02 14:54:46 +03:00
Aliaksandr Valialkin	2361ad8ab4	lib/promscrape: add ability to set `disable_compression` and `disable_keepalive` options in `scrape_config` section of the config passed to `-promscrape.config` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/580	2020-07-02 14:19:34 +03:00
BigFish	aa26b94f33	fix: spelling mistakes (#594) Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>	2020-07-01 01:36:40 +03:00
Aliaksandr Valialkin	4cb3e7595c	app/vmstorage: add `-denyQueriesOutsideRetention` command-line flag for denying queries outside the configured retention	2020-07-01 00:58:42 +03:00
Aliaksandr Valialkin	0c4e8aeb2b	all: use `errors.As` for inspecting errors that implement httpserver.ErrorWithStatusCode	2020-07-01 00:03:11 +03:00
Aliaksandr Valialkin	d962568e93	all: use %w instead of %s for wrapping errors in `fmt.Errorf` This will simplify examining the returned errors such as httpserver.ErrorWithStatusCode . See https://blog.golang.org/go1.13-errors for details.	2020-06-30 23:33:46 +03:00
Roman Khavronenko	156c83d112	app/vmalert: support multiple notifier urls (#584) (#590) * app/vmalert: support multiple notifier urls (#584) User now can set multiple notifier URLs in the same fashion as for other vmutils (e.g. vmagent). The same is correct for TLS setting for every configured URL. Alerts sending is done in sequential way for respecting the specified URLs order. * app/vmalert: add basicAuth support for notifier client (#585) The change adds possibility to set basicAuth creds for notifier client in the same fashion as for remote write/read and datasource.	2020-06-29 22:21:56 +03:00
Roman Khavronenko	bbeab70de6	app/vmalert: move flags description and initialization into subpackages The change adds no new functionality and aims to move flags definitions to subpackages that are using them. This should improve readability of the main function.	2020-06-29 22:18:29 +03:00
kreedom	63c36e2e69	app/vmalert: properly set transport for HTTP clients Fixes issue #586	2020-06-29 22:18:25 +03:00
Aliaksandr Valialkin	2b504f17de	docs: update the info that docker images are built on top of `alpine` image now A follow-up after the commit `ff624c9125` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/522	2020-06-26 13:52:25 +03:00
Aliaksandr Valialkin	a586b8b6d4	app/vminsert/netstorage: do not re-route every time series to more than two vmstorage nodes when certain vmstorage nodes are temporarily slower than the rest of them Previously vminsert may spread data for a single time series across all the available vmstorage nodes when vmstorage nodes couldn't handle the given ingestion rate. This could lead to increased usage of CPU and memory on every vmstorage node, since every vmstorage node had to register all the time series seen in the cluster. Now a time series may spread to maximum two vmstorage nodes under heavy load. Every time series is routed to a single vmstorage node under normal load.	2020-06-25 16:42:37 +03:00
Aliaksandr Valialkin	12b87b2088	app/vmselect/netstorage: reset big result values every 10 seconds instead of after processing every time series This should reduce GC pressure when processing time series with big number of rows	2020-06-24 19:37:35 +03:00
nicbaz	46c5c0772c	vmselect: fix label_replace when mismatch (#579) As per documentation on `label_replace` function: "If the regular expression doesn't match then the timeseries is returned unchanged". Currently this behavior is not enforced, if a regexp on an existing tag doesn't match then the tag value is copied as-is in the destination tag. This fix first checks that the regular expression matches the source tag before applying anything. Given the current implementation, this fix also changes the behavior of the MetricsQL `label_transform` function which does not document this behavior at the moment.	2020-06-23 23:54:29 +03:00
nicbaz	ea2ed4b7e8	vmalert: add support for TLS configuration (#578) app/vmalert: add support for TLS configuration Add support for TLS optional configuration in a similar fashion to what is currently supported in other vmutils such as vmagent. TLS configuration options are distinct for datasource, remoteRead, remoteWrite as well as notifier.	2020-06-23 22:47:23 +03:00
Aliaksandr Valialkin	0fdbe5de25	app/vmselect/netstorage: increase concurrency when processing small number of time series with big number of data points per each time series Previously VictoriaMetrics was processing up to 32 time series in a single goroutine. This could be slow if each time series contains big number of data points (10M+ or more), since only a single CPU core could be loaded with work, while other CPU cores were idle. Fix this by launching GOMAXPROCS workers for time series processing. This should help with https://github.com/VictoriaMetrics/VictoriaMetrics/issues/572	2020-06-23 22:45:57 +03:00
Aliaksandr Valialkin	3a444bb7bb	lib/promrelabel: add support for `keep_if_equal` and `drop_if_equal` actions to relabel configs These actions may be useful for filtering out unneeded targets and/or metrics if they contain equal label values. For example, the following rule would leave the target only if __meta_kubernetes_annotation_prometheus_io_port equals __meta_kubernetes_pod_container_port_number: - action: keep_if_equal source_labels: [__meta_kubernetes_annotation_prometheus_io_port, __meta_kubernetes_pod_container_port_number]	2020-06-23 17:29:19 +03:00
kreedom	f227799c87	Support of custom URL path for alert (#560) app/vmalert: Support custom URL for alerts source Add flag `external.alert.source` for configuring custom URL for alert's source. This may be handy to re-point default source URL to other systems like Grafana. Updates #517	2020-06-21 16:33:58 +03:00
Aliaksandr Valialkin	70bf8218bb	app/vmselect/promql: properly override label values from `group_left` and `group_right` lists like Prometheus does Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/577	2020-06-21 16:32:27 +03:00
Aliaksandr Valialkin	2fc2679a3f	app/vminsert/netstorage: remove possible race condition when broken connection may be recovered before acquiring storageNode.bcLock	2020-06-20 16:38:08 +03:00
Aliaksandr Valialkin	9409a31c07	docs/vmauth.md: mention that we can provide custom integration with SAML	2020-06-19 13:13:53 +03:00
Aliaksandr Valialkin	4400700832	app/vminsert: properly replicate data for the last `RF-1` storage nodes for `-replicationFactor=RF` Previously the data for the last `RF-1` storage nodes has been incorrectly replicated to the first storage node.	2020-06-19 12:40:22 +03:00
Aliaksandr Valialkin	4f673a5201	app/vminsert: export metrics for determining ingested rows with dropped or truncated labels Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/565	2020-06-19 01:12:44 +03:00
Aliaksandr Valialkin	6939e36fdd	app/vmselect/promql: fill gaps on right side with values from left side of `or` operator in the same way as Prometheus does Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/552	2020-06-18 23:05:23 +03:00
Aliaksandr Valialkin	85c1ccb8b8	app/vminsert/netstorage: add missing `return` in storageNode.checkHealth on connection failure	2020-06-18 20:51:51 +03:00
Aliaksandr Valialkin	464682f380	app/vminsert/netstorage: periodically check for each `-storageNode` health, so it could be marked as healthy when it is ready to accept data This fixes uneven data routing in cluster version when `-replicationFactor` is set to 1 (default value), i.e. when the replication is disabled. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/546	2020-06-18 20:42:43 +03:00
Roman Khavronenko	1a01fe2cf2	vmalert-537: allow name duplication for rules within one group. (#559) Uniqueness of rule is now defined by combination of its name, expression and labels. The hash of the combination is now used as rule ID and identifies rule within the group. Set of rules from coreos/kube-prometheus was added for testing purposes to verify compatibility. The check also showed that `vmalert` doesn't support `query` template function that was mentioned as limitation in README.	2020-06-18 18:54:35 +03:00
Aliaksandr Valialkin	87151e825e	docs/vmbackup.md: mention that backups from single-node and cluster versions are incompatible	2020-06-18 18:54:34 +03:00
Aliaksandr Valialkin	cc2225cc49	app/vmselect: fix the error after `936f35920a`	2020-06-12 22:00:45 +03:00
Aliaksandr Valialkin	936f35920a	app/vmselect/prometheus: allow returning partial response from `/api/v1/export` if `-search.denyPartialResponse=false` This makes `/api/v1/export` behaviour consistent with other `/api/v1/*` handlers.	2020-06-12 21:11:48 +03:00
Clémence Saussez	0b53e380cf	app/vmalert: fix link to testdata (#547) Fix broken link to vmalert test data Signed-off-by: Clemence Saussez <clemence@zen.ly>	2020-06-10 19:37:21 +03:00
Roman Khavronenko	d71b6e6584	vmalert-491: allow to configure concurrent rules execution per group. (#542) The feature allows to speed up group rules execution by executing them concurrently. Change also contains README changes to reflect configuration details.	2020-06-09 15:22:11 +03:00
Roman Khavronenko	5c049bf4dd	vmalert-521: allow to disable rules expression validation. (#536) This feature may be useful for using `vmalert` with PromQL compatible datasources like Loki.	2020-06-09 15:19:25 +03:00
Aliaksandr Valialkin	c1be462d42	app/vmauth: disable automatic response compression/uncompression, since it may work improperly in some cases See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/535	2020-06-05 20:14:07 +03:00
Aliaksandr Valialkin	7680b7155d	app/vmauth: emit fatal errors instead of panics when incorrect command-line flags are set	2020-06-05 20:14:05 +03:00
Aliaksandr Valialkin	01719f4949	app/vmstorage/transport: simplify setupTfss in order to prevent the possibility of nil tfs Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/534	2020-06-05 13:17:26 +03:00
Aliaksandr Valialkin	e4cef1b678	app/vmstorage: prevent from serving conns from vminsert and vmselect after the server is closed Previously it was possible that the connection is served after the server is closed if the following steps are performed: 1) Server accepts new connection. 2) Server.MustClose() is called and successfully finished. 3) Server starts processing the connection accepted at step 1. There could be various crashes like in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/534 since the storage may be already closed. Now the server closes the connection at step 3 without processing it.	2020-06-05 11:55:48 +03:00
Aliaksandr Valialkin	58069f5a6a	app/vmalert: print brief usage info for `vmalert -help`	2020-06-05 10:43:24 +03:00
Aliaksandr Valialkin	3848ea3a4a	app/vmauth: print brief usage info for `vmauth -help`	2020-06-05 10:40:11 +03:00
Aliaksandr Valialkin	8ad8ca350a	app/vmagent: print brief usage info for `vmagent -help`	2020-06-05 10:40:10 +03:00
Aliaksandr Valialkin	3d0a0b3785	lib/fs: optimize MustGetFreeSpace performance by caching the results for up to 2 seconds	2020-06-04 13:14:04 +03:00
DexterZhang	fa103875a0	feat(vmselect): add tmp block dir size metrics `vm_tmp_blocks_files_size_total` (#527) * feat(vmselect): add tmp block dir size metrics `vm_tmp_blocks_files_size_total` * refactor(vmselect): use free space instead of used space in tmp block file metrics * fix: add `bytes` suffix to tmp dir free space metric	2020-06-04 13:05:50 +03:00
Aliaksandr Valialkin	faea804b88	app/vmauth: log when -auth.config is reloaded in SIGHUP	2020-06-03 23:22:20 +03:00
Aliaksandr Valialkin	045b87c662	app/vmalert: fix comment for UpdateWith exported methods	2020-06-01 14:35:03 +03:00
Aliaksandr Valialkin	43b14b9569	app/vminsert/netstorage: free up unused memory in buffer after memory usage spikes	2020-06-01 14:33:35 +03:00
Roman Khavronenko	44c51c627f	vmalert: Add recording rules support. (#519) * vmalert: Add recording rules support. Recording rules support required additional service refactoring since it wasn't planned to support them from the very beginning. The list of changes is following: * new entity RecordingRule was added for writing results of MetricsQL expressions into remote storage; * interface Rule now unites both recording and alerting rules; * configuration parser was moved to separate package and now performs more strict validation; * new endpoint for listing all groups and rules in json format was added; * evaluation interval may be set to every particular group; * vmalert: uncomment tests * vmalert: rm outdated TODO * vmalert: fix typos in README	2020-06-01 13:53:46 +03:00
Aliaksandr Valialkin	37aa4fe282	app/vmagent: reload -remoteWrite.relabelConfig and -remoteWrite.urlRelabelConfig on SIGHUP and on `/-/reload` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/518	2020-05-30 14:37:02 +03:00
Aliaksandr Valialkin	a646131a33	app/vmagent: log fatal errors instead of panics when improper command-line flags are passed to vmagent	2020-05-30 14:22:38 +03:00
Aliaksandr Valialkin	f41a01332a	app/vminsert/netstorage: evenly distribute rerouted rows among all the available storage nodes Previously such rows were distributed to the original storage node or to the next storage node. This could result in uneven load among the remaining storage nodes.	2020-05-30 13:51:09 +03:00
Aliaksandr Valialkin	02b2064d8e	app/vminsert/netstorage: do not increment vm_rpc_rows_lost_total when all the vmstorage nodes are unavailable, since vminsert retries sending the data instead of dropping it	2020-05-28 22:36:56 +03:00
Aliaksandr Valialkin	7a61357b5d	app/vminsert/netstorage: make sure that the data is always replicated among -replicationFactor vmstorage nodes Previously vminsert could write multiple copies of the data to a single vmstorage node when the ingestion rate exceeds the maximum throughput for connections to vmstorage nodes.	2020-05-28 19:59:07 +03:00
Aliaksandr Valialkin	77e5165e7b	app/vminsert: add `-replicationFactor` command-line flag for enabling data replication among available -storageNode instances	2020-05-27 17:29:44 +03:00
Aliaksandr Valialkin	b4e3bffe4b	app/vminsert/netstorage: emit warnings instead of errors when re-routing data to healthy storage nodes	2020-05-27 16:31:41 +03:00
Aliaksandr Valialkin	75f2f3b09d	app/vminsert/netstorage: improve ingestion performance when a single vmstorage node is slower than other vmstorage nodes Previously the ingestion performance has been limited by the slowest vmstorage node. Now vminsert should re-route data from the slowest vmstorage node to the remaining nodes.	2020-05-27 15:08:22 +03:00
Aliaksandr Valialkin	9844845d79	app/vminsert: tune the maximum summary buffer size for pending data to 1/4 of available RAM, since 1/2 of RAM is too big considering GOGC overhead	2020-05-25 02:00:37 +03:00
Aliaksandr Valialkin	4a82631e44	app/vminsert: limit the summary buffer sizes for all the storage nodes to a half of the allowed memory	2020-05-25 01:39:33 +03:00
Aliaksandr Valialkin	4bd3d4b148	app/vminsert/netstorage: do not return error from storageNode.flushBufLocked when the buffer has been successfully re-routed to healthy nodes This should reduce the number of false errors in the log and the number of falsely lost rows	2020-05-22 18:29:43 +03:00
Aliaksandr Valialkin	6edc33d9bb	app/vminsert/netstorage: capture the first error instead of the last error when sending data to vmstorage The first error has more chances to point to the real root cause of the issue.	2020-05-22 17:49:33 +03:00
Aliaksandr Valialkin	bb4a2bf1aa	app/vmauth: fix `make run-vmauth` command	2020-05-22 16:45:19 +03:00
Aliaksandr Valialkin	dcbdc009f5	app/vmagent: check for error returned from flag.Set	2020-05-21 16:30:48 +03:00
Aliaksandr Valialkin	b59e089ac7	app/vmagent: add `-dryRun` option for checking all the configs mentioned in command-line flags without running `vmagent` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/362	2020-05-21 15:23:18 +03:00
Aliaksandr Valialkin	901093279e	app/vmstorage/transport: update stale comment - vmstorage now sends small `ack` packets to `vminsert`	2020-05-21 14:04:52 +03:00
kreedom	2752d6cb26	vmalert add quotes escape function (#510) * vmalert add quotes escape function Co-authored-by: kreedom	2020-05-21 12:10:35 +03:00
Aaron France	b26245c48b	Update README.md	2020-05-21 12:10:33 +03:00
Aliaksandr Valialkin	d83c68ca03	app/vmselect/promql: add `ascent_over_time(m[d])` and `descent_over_time(m[d])` functions These functions could be useful in GPS tracking apps for calculating the summary for height gain/loss over the given duration `d`.	2020-05-21 12:06:34 +03:00
Aliaksandr Valialkin	8ff28f5b91	app/vmselect/promql: update numbers after the upgrade of github.com/VictoriaMetrics/metrics from v1.11.2 to v1.11.3	2020-05-20 03:07:07 +03:00
Aliaksandr Valialkin	ddc9e69bd6	docs/vmagent.md: mention an alternative to `refresh_interval` option in scrape configs	2020-05-19 23:10:16 +03:00
Aliaksandr Valialkin	7d46dd452a	app/vmselect/promql: move common code from aggrFuncOutliersK and newAggrFuncRangeTopK into getRangeTopKTimeseries	2020-05-19 16:11:03 +03:00
Aliaksandr Valialkin	37068064dd	app/vmselect/promql: fix `outliersk` calculations	2020-05-19 14:45:10 +03:00
Aliaksandr Valialkin	fc81ea38d4	app/vmselect/promql: add `outliersk(N, m)` aggregate function for anomaly detection across groups of similar time series	2020-05-19 13:52:44 +03:00
Aliaksandr Valialkin	9ca781b8f0	app/vmalert/notifier: go fmt	2020-05-19 13:00:18 +03:00
kreedom	27911ae179	vmalert - add expr to variables, add escape functions (#495) * vmalert - add expr to variables, add escape functions Co-authored-by: kreedom	2020-05-19 11:55:03 +03:00
Roman Khavronenko	c7f3e58032	vmalert: avoid sending resolves for pending alerts (#498) Before the change we were sending notifications to notifier if following conditions are met: * alert is in Fire state * alert is in Inactive state We were sending Inactive notifications to resolve alert ASAP. Unfortunately, we were sending resolves for Pending alerts that become Inactive, which is wrong. In this change we delete alert from the active list if it was Pending and become Inactive. In this way we now have Inactive alerts only if they were in state Fire before. See test change for example.	2020-05-19 11:55:00 +03:00
Roman Khavronenko	e5f5342e18	vmalert: fix potential race during configuration reloads (#497) Configuration reload and rules evaluation can't be executed at the same time now. This may make reload time longer but prevents potential races.	2020-05-19 11:54:55 +03:00
Aliaksandr Valialkin	b99d03a956	app/vmalert: run `make quicktemplate-gen` from the root dir of the repository	2020-05-16 22:45:45 +03:00
Aliaksandr Valialkin	2784015a4d	all: print `--help` output to stdout instead of stderr This is easier to grep and pipe	2020-05-16 12:03:06 +03:00
Aliaksandr Valialkin	dbf8048134	app/vmrestore: document better that `vmrestore` works like `rsync --delete`, i.e. it deletes files in `-storageDataPath`, which are missing in the backup	2020-05-16 09:02:46 +03:00
Aliaksandr Valialkin	e544155a82	app/vmagent/Makefile: fix `make run-vmagent` rule	2020-05-15 19:35:16 +03:00
Aliaksandr Valialkin	6c43ba1cb1	app/vmagent/remotewrite: remove unused import after the commit `93267f143f`	2020-05-15 17:42:31 +03:00
Aliaksandr Valialkin	1d71253653	app/vmagent/remotewrite: allow ingesting time series with multiple samples at once Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/481	2020-05-15 17:37:27 +03:00
Aliaksandr Valialkin	a853869e75	app/vmstorage/transport: prevent from uncontrolled memory usage growth when `vminsert` sends big packets with too long labels Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/490	2020-05-15 15:42:54 +03:00
Aliaksandr Valialkin	1e5c1d7eaa	app/vmstorage: add `vm_slow_metric_name_loads_total` metric, which could be used as an indicator when more RAM is needed for improving query performance	2020-05-15 14:12:24 +03:00
Aliaksandr Valialkin	d6b9a49481	app/vmstorage: add `vm_slow_row_inserts_total` and `vm_slow_per_day_index_inserts_total` metrics for determining whether VictoriaMetrics required more RAM for the current number of active time series	2020-05-15 13:46:57 +03:00
Roman Khavronenko	e850bf0eff	vmalert: fix the access to rules slice element by wrong index (#486 ) During group's update rules deletion was causing slice mutations while slice index was assumed to be unchanged. This caused "slice bounds out of range" errors when multiple rules were deleted sequentially.	2020-05-15 13:26:06 +03:00
hagen1778	d369450f27	vmalert: update README	2020-05-15 13:26:04 +03:00
Aliaksandr Valialkin	3845420a8f	lib: extract common code for returning fast unix timestamp into lib/fasttime	2020-05-14 23:06:50 +03:00
Roman Khavronenko	e208e76222	vmalert: check if remoteRead object was initialized before calling Restore (#473 ) The check for non-nil remoteRead was mistakenly dropped during refactoring which caused panics when `vmalert` wasn't configured with `remoteRead` flag.	2020-05-13 22:57:26 +03:00
Roman Khavronenko	1523890742	vmalert: fix flag names and description in README (#475 ) Change also adds the recommendation for `remotewrite` queue error.	2020-05-13 22:57:20 +03:00
肖贝贝	8c3e9adf7f	Feat/vmalert add max queue size (#472 ) * feat: add remoteWrite.maxQueueSize to reduce queue full * rename remote(write\|read) flags to remote(Write\|Read) for the sake of consistency Co-authored-by: xiaobeibei <xiaobeibei@bigo.sg>	2020-05-13 22:57:16 +03:00
Aliaksandr Valialkin	bac9a684e8	docs/vmbackup.md: add a link to vmbackuper tool	2020-05-13 22:57:11 +03:00
Aliaksandr Valialkin	f3d9a5b0ec	app/vmselect/promql: suppress "SA4006: this value of `dstValues` is never used" error in golangci-lint	2020-05-13 11:46:05 +03:00
Aliaksandr Valialkin	3b0f66a227	app/vmagent: fix a bug with improper relabeling when multiple `-remoteWrite.urlRelableConfig` args are set This bug could result in incorrect relabeling and metrics' drop. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/467	2020-05-12 22:03:45 +03:00
Aliaksandr Valialkin	18a0caee43	app/vmselect/promql: fix `any(..)` calculations - return all the data points instead of the first one	2020-05-12 20:36:49 +03:00
Aliaksandr Valialkin	3d3f41b961	app/vmstorage/transport: fix panic during server stop on 32-bit arches See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212	2020-05-12 20:21:40 +03:00
Aliaksandr Valialkin	81b8811cf4	app/vmselect/promql: remove `-search.maxPointsPerTimeseries` command-line flag Limit the estimated time series count after aggregation with grouping by the number of source time series.	2020-05-12 19:54:44 +03:00
Aliaksandr Valialkin	408ade27a9	app/vmselect/promql: add `any(x) by (y)` aggregate function, which returns any time series from `q` for each group `y`	2020-05-12 19:50:29 +03:00
Aliaksandr Valialkin	21c2982ac8	app/vmselect/promql: support for `sum(x) by (y) limit N` syntax in order to limit the number of output time series after aggregation	2020-05-12 19:50:12 +03:00
Aliaksandr Valialkin	f341c6fcc4	Revert "app/vmselect: add `-search.estimatedSeriesCountAfterAggregation` command-line flag for tuning the probability of OOMs or false-positive `not enough memory` errors" This reverts commit fbb7986dd2380fce2fc8633b7eda8b67f419e74c. Reason for revert: this commit has been removed from single-node version	2020-05-12 19:50:08 +03:00
Aliaksandr Valialkin	d54a93fc81	app/vmagent: fix scraping mTLS targets, which has been broken in v1.35.1 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/470	2020-05-12 17:23:43 +03:00
Aliaksandr Valialkin	405cf44aed	app/vmagent,lib/promscrape: do not set HostClient.DialDualStack, since it isn't used if HostClient.Dial is set	2020-05-12 15:24:53 +03:00
Aliaksandr Valialkin	da6a84e147	app/vmagent/remotewrite: properly dial TCP6 addresses set via `-remoteWrite.url` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/469	2020-05-12 15:24:50 +03:00
Aliaksandr Valialkin	4e237b4670	app/vminsert/influx: support passing AccountID and ProjectID via plain TCP and UDP Now `vminsert` accepts AccountID and ProjectID via `VictoriaMetrics_AccountID` and `VictoriaMetrics_ProjectID` tags when reading Influx line protocol data via plain TCP or UDP (i.e. when `-influxListenAddr` is set).	2020-05-12 13:13:04 +03:00
Aliaksandr Valialkin	f7753b1469	lib/storage: gradually pre-populate per-day inverted index for the next day This should prevent from CPU usage spikes at 00:00 UTC every day when inverted index for new day must be quickly created for all the active time series. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/430	2020-05-12 12:13:32 +03:00
Roman Khavronenko	0157566fdb	vmalert: cleanup and restructure of code to improve maintainability (#471 ) The change introduces new entity `manager` which replaces `watchdog`, decouples requestHandler and groups. Manager supposed to control life cycle of groups, rules and config reloads. Groups export an ID method which returns a hash from filename and group name. ID supposed to be unique identifier across all loaded groups. Some tests were added to improve coverage. Bug with wrong annotation value if $value is used in templates after metrics being restored fixed. Notifier interface was extended to accept context. New set of metrics was introduced for config reload.	2020-05-11 14:35:55 +03:00
Nikolay Khramchikhin	0e8c345ffb	vmalert config reload added config hot reload for vmalert with sighup and api call	2020-05-11 14:35:50 +03:00
Aliaksandr Valialkin	6646b380ef	docs/vmauth.md: fix a link to docker images	2020-05-08 14:11:10 +03:00
Aliaksandr Valialkin	28ad350a31	app/vmagent: return 200 from `/-/reload` endpoint as Prometheus does	2020-05-07 19:29:48 +03:00
Aliaksandr Valialkin	3052b479b7	lib/httpserver: reduce typical duration for http server graceful shutdown Previously the duration for graceful shutdown for http server could take more than a minute because of improperly set timeouts in setNetworkTimeout. Now typical duration for graceful shutdown should be reduced to less than 5 seconds.	2020-05-07 14:16:38 +03:00
Aliaksandr Valialkin	dc04040781	docs/{vmagent,vmauth}: small clarifications in the docs	2020-05-07 12:55:06 +03:00
Aliaksandr Valialkin	2b403d3f42	app/vmauth: prevent from attacks with `..` in path for accessing resources outside the configured `url_prefix`	2020-05-07 12:55:04 +03:00
Aliaksandr Valialkin	20538a2a5d	app/vmagent: allow setting independent auth configs per each configured `-remoteWrite.url`	2020-05-06 16:52:32 +03:00
Aliaksandr Valialkin	12dbb9e22c	app/vmagent: properly set client-side TLS certificates for `-remoteWrite.url`. Previously they were mistakenly set as server-side	2020-05-06 16:50:37 +03:00
Aliaksandr Valialkin	8665c2edb1	docs/vmagent.md: small fixes	2020-05-06 14:49:25 +03:00
Aliaksandr Valialkin	8ab5e47b5c	lib/promscrape: add Prometheus-compatible DNS-based service discovery aka `dns_sd_configs`	2020-05-06 00:02:41 +03:00
Aliaksandr Valialkin	21b91599c2	docs/{vmauth,vmagent}: fix ports for profiling	2020-05-05 20:16:09 +03:00
Aliaksandr Valialkin	309700ab8c	docs/vmauth.md: mention that we can help creating customized proxy	2020-05-05 12:34:08 +03:00
Aliaksandr Valialkin	20e958789a	docs/{vmagent,vmauth}: add `Profiling` section	2020-05-05 11:45:29 +03:00
Aliaksandr Valialkin	1153f30fee	docs: add vmauth.md	2020-05-05 11:17:45 +03:00
Aliaksandr Valialkin	782fb30cd0	app/vmauth: build fixes	2020-05-05 11:03:25 +03:00
Aliaksandr Valialkin	de31d16154	app/vmauth: add initial version of vmauth. See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmauth/README.md for details	2020-05-05 10:56:20 +03:00
Aliaksandr Valialkin	61df59b9ea	docs/vmagent.md: `/targets` page doesn't expose information about improperly configured scrape configs now. It is written in error log instead	2020-05-05 10:56:18 +03:00
Roman Khavronenko	abce2b092f	app/vmalert: restore alerts state from datasource metrics (#461 ) * app/vmalert: restore alerts state from datasource metrics Vmalert will restore alerts state for rules that have `rule.For` > 0 from previously written timeseries via `remotewrite.url` flag. * app/vmalert: mention remotewrite and remoteread configuration in README	2020-05-05 00:52:19 +03:00
Aliaksandr Valialkin	89aa6dbf56	lib/promscrape: add Prometheus-compatible service discovery for Consul aka `consul_sd_configs` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/330	2020-05-04 20:53:06 +03:00
Aliaksandr Valialkin	b21b73115a	app/vminsert: add `/-/reload` handler in the same way as for `vmagent`	2020-04-30 02:18:08 +03:00
DexterZhang	ae215e5538	feat(vmagent): add promscrape config reload support via http (#450 ) * feat(vmagent): add promscrape config reload support via http endpoint `/-/reload` * fix: typo fix	2020-04-30 02:18:01 +03:00
Artem Navoiev	121f7e1d56	Update README.md	2020-04-29 17:41:04 +03:00
Aliaksandr Valialkin	b6d88bac04	vendor: use github.com/VictoriaMetrics/fasthttp instead of github.com/fasthttp/fasthttp The upstream fasthttp may contain issues like `996610f021` , plus a code that isn't used by VictoriaMetrics. So let's use a private copy under our control instead.	2020-04-29 16:43:09 +03:00
Aliaksandr Valialkin	9ed4951ec8	lib/metricsql: move it to a separate repository - github.com/VictoriaMetrics/metrics	2020-04-28 15:30:06 +03:00
Aliaksandr Valialkin	cd1145e5f4	app/vmselect: add `-search.estimatedSeriesCountAfterAggregation` command-line flag for tuning the probability of OOMs or false-positive `not enough memory` errors	2020-04-28 12:51:48 +03:00
Aliaksandr Valialkin	a858b7e393	app/vmalert: added missing comments for public entities	2020-04-28 11:19:48 +03:00
Aliaksandr Valialkin	716bbe79d4	app/vminsert/netstorage: increase timeout for waiting for `ack` message after sending big data block to vmstorage	2020-04-28 11:19:46 +03:00
Aliaksandr Valialkin	50af16baf2	app/vmalert: fix build	2020-04-28 00:34:01 +03:00
Aliaksandr Valialkin	e3db2c73a6	app/vmalert: sync with master branch	2020-04-28 00:19:42 +03:00
Aliaksandr Valialkin	7644f40763	app/vmalert: include it into the next release	2020-04-28 00:11:41 +03:00
Aliaksandr Valialkin	86a1d9cb0c	lib/promscrape: add initial support for Prometheus-compatible service discovery for Amazon EC2 aka `ec2_sd_configs`	2020-04-27 19:29:22 +03:00
Aliaksandr Valialkin	0daa37fa02	lib/promscrape/discovery/gce: allow empty project and zone for gce_sd_config	2020-04-27 11:45:45 +03:00
Aliaksandr Valialkin	989d84cf3f	app/{vminsert,vmstorage}: wait for `ack` from `vmstorage` after each packet sent to it from `vminsert` This should protect from possible data loss when `vmstorage` is stopped while the packet is sent from `vminsert`. This commit switches to new protocol between vminsert and vmstorage, which is incompatible with the previous protocol. So it is required that both vminsert and vmstorage nodes are updated.	2020-04-27 09:53:26 +03:00
Aliaksandr Valialkin	e933cbac16	lib/storage: postpone reading data from blocks during search This eliminates the need for storing block data into temporary files on a single-node VictoriaMetrics during heavy queries, which touch big number of time series over long time ranges. This improves single-node VM performance on heavy queries by up to 2x.	2020-04-27 08:44:01 +03:00
Aliaksandr Valialkin	23a310cc68	app/vmselect/netstorage: substitute sorting packedTimeseries with the natural order of the fetched blocks This should minimize the number of disk seeks when reading data from temporary file.	2020-04-26 16:46:17 +03:00
Aliaksandr Valialkin	31861c5b8e	lib/promscrape/discovery/gce: allow empty `zone` arg in `gce_sd_config` - in this case zones for the given project are automatically discovered	2020-04-26 14:37:38 +03:00

724 Commits