VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-22 16:36:27 +01:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	5ae47e8940	app/vmselect/prometheus: properly adjust too big `time` on `/api/v1/query` Too big `time` must be adjusted to `now()-queryOffset`.	2019-11-19 00:42:07 +02:00
Aliaksandr Valialkin	6ca4b94511	lib/storage: increase the number of created time series in BenchmarkHeadPostingForMatchers in order to be on par with Prometheus
The previous commit was accidentally creating a 10x smaller number of time series than Prometheus, and this led to invalid benchmark results. The updated benchmark results:
benchmark                                                      old ns/op   new ns/op  delta
BenchmarkHeadPostingForMatchers/n="1"                          272756688   6194893    -97.73%
BenchmarkHeadPostingForMatchers/n="1",j="foo"                  138132923   10781372   -92.19%
BenchmarkHeadPostingForMatchers/j="foo",n="1"                  134723762   10632834   -92.11%
BenchmarkHeadPostingForMatchers/n="1",j!="foo"                 195823953   10679975   -94.55%
BenchmarkHeadPostingForMatchers/i=~"."                         7962582919  100118510  -98.74%
BenchmarkHeadPostingForMatchers/i=~".+"                        7589543864  154955671  -97.96%
BenchmarkHeadPostingForMatchers/i=~""                          1142371741  258003769  -77.42%
BenchmarkHeadPostingForMatchers/i!=""                          9964150263  159783895  -98.40%
BenchmarkHeadPostingForMatchers/n="1",i=~".",j="foo"           216995884   10937895   -94.96%
BenchmarkHeadPostingForMatchers/n="1",i=~".",i!="2",j="foo"    202541348   10990027   -94.57%
BenchmarkHeadPostingForMatchers/n="1",i!=""                    486285711   87004349   -82.11%
BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo"            350776931   53342793   -84.79%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo"          380888565   54256156   -85.76%
BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo"         89500296    21823279   -75.62%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo"   379529654   46671359   -87.70%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.",j="foo"  424563825   53915842   -87.30%
VictoriaMetrics uses 1GB of RAM during the benchmark (vs 3.5GB of RAM for Prometheus)	2019-11-18 19:48:27 +02:00
Aliaksandr Valialkin	6f61fd367a	lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus
See the corresponding benchmark in Prometheus - `23c0299d85/tsdb/head_bench_test.go (L52)`
The benchmark allows performing an apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since this benchmark didn't exist yet. Fix it.
Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots:
- Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers
- VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers
Benchmark results:
benchmark                                                      old ns/op   new ns/op  delta
BenchmarkHeadPostingForMatchers/n="1"                          272756688   364977     -99.87%
BenchmarkHeadPostingForMatchers/n="1",j="foo"                  138132923   1181636    -99.14%
BenchmarkHeadPostingForMatchers/j="foo",n="1"                  134723762   1141578    -99.15%
BenchmarkHeadPostingForMatchers/n="1",j!="foo"                 195823953   1148056    -99.41%
BenchmarkHeadPostingForMatchers/i=~"."                         7962582919  8716755    -99.89%
BenchmarkHeadPostingForMatchers/i=~".+"                        7589543864  12096587   -99.84%
BenchmarkHeadPostingForMatchers/i=~""                          1142371741  16164560   -98.59%
BenchmarkHeadPostingForMatchers/i!=""                          9964150263  12230021   -99.88%
BenchmarkHeadPostingForMatchers/n="1",i=~".",j="foo"           216995884   1173476    -99.46%
BenchmarkHeadPostingForMatchers/n="1",i=~".",i!="2",j="foo"    202541348   1299743    -99.36%
BenchmarkHeadPostingForMatchers/n="1",i!=""                    486285711   11555193   -97.62%
BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo"            350776931   5607506    -98.40%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo"          380888565   6380335    -98.32%
BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo"         89500296    2078970    -97.68%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo"   379529654   6561368    -98.27%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.",j="foo"  424563825   6757132    -98.41%
The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.	2019-11-18 18:47:02 +02:00
Aliaksandr Valialkin	77bb66a5be	app/vmselect/promql: properly calculate `integrate(q[d])`	2019-11-13 21:11:03 +02:00
Aliaksandr Valialkin	c33640664a	app/vmselect/promql: use universal approach for determining maxByteSliceLen on 32-bit and 64-bit archs Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/235	2019-11-13 20:26:07 +02:00
Aliaksandr Valialkin	d297b65089	lib/storage: add `vm_cache_size_bytes{type="storage/hour_metric_ids"}` metric	2019-11-13 20:26:05 +02:00
Aliaksandr Valialkin	31376fd353	deployment/docker: update docker image tag from v1.29.2-cluster to v1.29.3-cluster	2019-11-13 18:32:08 +02:00
Aliaksandr Valialkin	494ad0fdb3	lib/storage: remove inmemory index for recent hour, since it uses too much memory Production workload shows that the index requires ~4Kb of RAM per active time series. This is too much for high number of active time series, so let's delete this index. Now the queries should fall back to the index for the current day instead of the index for the recent hour. The query performance for the current day index should be good enough given the 100M rows/sec scan speed per CPU core.	2019-11-13 18:08:58 +02:00
Aliaksandr Valialkin	90bde025f0	deployment/docker: update image tag from v1.29.0-cluster to v1.29.2-cluster	2019-11-13 15:24:44 +02:00
Aliaksandr Valialkin	633dd81bb5	lib/storage: add `-disableRecentHourIndex` flag for disabling inmemory index for recent hour This may be useful for saving RAM on high number of time series aka high cardinality	2019-11-13 15:10:12 +02:00
Aliaksandr Valialkin	f1620ba7c0	lib/storage: fix inmemory inverted index issues found in v1.29 Issues fixed: - Slow startup times. Now the index is loaded from cache during start. - High memory usage related to superfluous index copies every 10 seconds.	2019-11-13 13:35:38 +02:00
Aliaksandr Valialkin	87b39222be	Revert "lib/fs: do not postpone directory removal on NFS error" This reverts commit 21aeb02b46649ac9906cb37733f7b155a77a0db9.	2019-11-12 16:29:50 +02:00
Mike Poindexter	955a592106	Add test for invalid caching of tsids (#232) * Add test for invalid caching of tsids * Clean up error handling	2019-11-12 15:52:46 +02:00
Roman Khavronenko	ce8cc76a42	add links and fix cache metric name (#233)	2019-11-12 15:06:56 +02:00
Aliaksandr Valialkin	6afb7a50a9	deployment/docker: upgrade Grafana release from v6.4.3 to v6.4.4	2019-11-12 03:50:54 +02:00
Aliaksandr Valialkin	5b677a57e3	deployment/docker: upgrade Go from v1.13.4 to v1.13.4	2019-11-12 03:49:07 +02:00
Aliaksandr Valialkin	d420871d79	deployment/docker: upgrade docker image tag from v1.28.3-cluster to v1.29.0-cluster	2019-11-12 03:44:45 +02:00
Aliaksandr Valialkin	584d8362c8	deployment: update Prometheus from v2.13.0 to v2.14.0	2019-11-12 03:43:59 +02:00
Roman Khavronenko	828f0a2a4b	prepare dashboard for external sharing (#231)	2019-11-12 00:23:24 +02:00
Oleg Kovalov	74ba42d111	fix misspelled words (#229)	2019-11-12 00:18:24 +02:00
Aliaksandr Valialkin	c48e39eea9	lib/storage: add tests for dateMetricIDCache	2019-11-11 13:21:05 +02:00
Aliaksandr Valialkin	bdc9045485	README.md: mention that replication doesn't save from disaster	2019-11-11 00:58:08 +02:00
Aliaksandr Valialkin	01801e9e03	dashboards: there will be no 1.28.4 release. It will be 1.29.0	2019-11-10 22:05:10 +02:00
Aliaksandr Valialkin	6bdde0d6d4	lib/storage: eliminate data race when updating lastSyncTime in dateMetricIDCache.Has	2019-11-10 22:04:23 +02:00
Roman Khavronenko	7247a7862d	add description, churn rate panel, storage.ingestion rate panel (#228)	2019-11-10 20:32:10 +02:00
Aliaksandr Valialkin	5f52eb7653	lib/fs: do not postpone directory removal on NFS error
Continue trying to remove NFS directory on temporary errors for up to a minute. The previous async removal process breaks in the following case during VictoriaMetrics start:
- VictoriaMetrics opens index, finds incomplete merge transactions and starts replaying them.
- The transaction instructs removing old directories for parts, which were already merged into bigger part.
- VictoriaMetrics removes these directories, but their removal is delayed due to NFS errors.
- VictoriaMetrics scans partition directory after all the incomplete merge transactions are finished and finds directories, which should be removed, but still weren't removed due to NFS errors.
- VictoriaMetrics panics when it finds unexpected empty directory.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162	2019-11-10 13:27:16 +02:00
Aliaksandr Valialkin	9ea2bd822e	lib/storage: implement per-day inverted index	2019-11-10 00:20:32 +02:00
Aliaksandr Valialkin	5d8de72414	app/vmrestore: the upcoming release would be 1.29.0	2019-11-10 00:20:18 +02:00
Aliaksandr Valialkin	dea2f3efed	lib/storage: use specialized cache for (date, metricID) entries This improves ingestion performance.	2019-11-09 23:09:18 +02:00
Aliaksandr Valialkin	9a43902bd8	lib/storage: remove unused code from getMetricIDsForTimeRange: it is expected that time range is always non-zero	2019-11-09 19:03:51 +02:00
Aliaksandr Valialkin	c16e17dede	lib/storage: properly set time range when deleting time series	2019-11-09 18:50:02 +02:00
Aliaksandr Valialkin	8126007c15	lib/storage: obtain all the time series ids from (tag->metricIDs) rows instead of (metricID->TSID) rows, since this is much faster	2019-11-09 18:04:26 +02:00
Aliaksandr Valialkin	50773348d3	lib/storage: small code prettifying	2019-11-09 14:01:24 +02:00
Aliaksandr Valialkin	44fa8226df	lib/uint64set: remove superfluous check for item existence before deleting it in Set.Subtract	2019-11-09 14:01:24 +02:00
Aliaksandr Valialkin	0bc54c23ce	lib/storage: inmemoryInvertedIndex prettifying	2019-11-09 14:01:24 +02:00
Aliaksandr Valialkin	46e67bb78c	lib/storage: export `vm_new_timeseries_created_total` metric for determining time series churn rate	2019-11-08 19:58:21 +02:00
Aliaksandr Valialkin	0063c857f5	lib/storage: add inmemory inverted index for the last hour It should improve performance for `last N hours` dashboards with update intervals smaller than 1 hour.	2019-11-08 19:37:46 +02:00
Aliaksandr Valialkin	33abbec6b4	app/vmselect/promql: adjust memory limits calculations for incremental aggregate functions Incremental aggregate functions don't keep all the selected time series in memory - they keep only up to GOMAXPROCS time series for incremental aggregations. Take into account that the number of time series in RAM can be higher if they are split into many groups with `by (...)` or `without (...)` modifiers. This should reduce the number of `not enough memory for processing ... data points` false positive errors.	2019-11-08 19:37:43 +02:00
Aliaksandr Valialkin	7d7fbf890e	app/{vmbackup,vmrestore}: add `vmbackup` and `vmrestore` tools for creating backups on s3 or gcs from instant snapshots Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/203 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/38	2019-11-07 21:26:43 +02:00
Roman Khavronenko	4e7a2a41a4	Cluster dashboard (#222) * add dashboard for cluster version * fix queries and panels * review fixes * use resident memory for memory usage panel * fix job selectors	2019-11-07 12:09:27 +02:00
Aliaksandr Valialkin	89c03a5464	lib/storage: populate partition names from both `small` and `big` directories Certain partition directories may be missing after restoring from backups if they had no data. Re-create such directories on start.	2019-11-06 19:50:21 +02:00
Aliaksandr Valialkin	1c777e0245	lib/storage: substitute error message about unsorted items in the index block after metricIDs merge with counter The origin of the error has been detected and documented in the code, so it is enough to export a counter for such errors at `vm_index_blocks_with_metric_ids_incorrect_order_total`, so it could be monitored and alerted on high error rates. Export also the counter for processed index blocks with metricIDs - `vm_index_blocks_with_metric_ids_processed_total`, so its rate could be compared to `rate(vm_index_blocks_with_metric_ids_incorrect_order_total)`.	2019-11-06 14:32:41 +02:00
Aliaksandr Valialkin	c567a4353a	lib/storage: take into account the requested time range when caching TSIDs for the given tag filters	2019-11-06 14:32:41 +02:00
Aliaksandr Valialkin	c6564c5d26	lib/storage: dump incorrectly sorted items on a single line; this should simplify error reporting	2019-11-05 18:41:50 +02:00
Aliaksandr Valialkin	2ef5082ead	deployment/docker: update docker images from v1.28.2-cluster to v1.28.3-cluster	2019-11-05 18:08:50 +02:00
Aliaksandr Valialkin	a10c4cad85	lib/storage: return finalPartsToMerge from 2 back to 3 in order to prevent excessive merges in old partitions	2019-11-05 17:28:57 +02:00
Aliaksandr Valialkin	e5b1fa0c38	lib/storage: separate the max inverted index scan loops per metric into fast and slow loops Slow loops could require seeks and expensive regexp matching, while fast loops just scan all the metricIDs for the given `tag=value` prefix. So these operations must have separate max loops multiplier.	2019-11-05 17:28:57 +02:00
Aliaksandr Valialkin	f93c4f2493	lib/storage: skip repeated useless work when intersection of metricIDs with the given filter is too expensive This should improve performance for query filters over big number of time series.	2019-11-05 14:35:55 +02:00
Aliaksandr Valialkin	f48e97263c	lib/storage: reduce the maximum number of inverted index scans before falling back to label filter matching by metric name The new value reduces the amount of wasted work during index scans over big number of time series.	2019-11-05 14:35:53 +02:00
Aliaksandr Valialkin	d2f688c550	lib/storage: try potentially faster tag filters at first, then apply slower tag filters The fastest tag filters are non-negative non-regexp, since they are the most specific. The slowest tag filters are negative regexp, since they require scanning all the entries for the given label.	2019-11-05 14:35:48 +02:00
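
The ordering rule described in commit d2f688c550 (apply the most specific tag filters first, the most expensive ones last) can be illustrated with a minimal Go sketch. This is not the code from lib/storage: the `tagFilter` type, the `cost` function and `sortByCost` are hypothetical names introduced here for illustration. The commit only names the fastest class (non-negative non-regexp) and the slowest class (negative regexp); the relative order of the two middle classes below is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// tagFilter is a hypothetical, simplified tag filter.
// It is NOT the type used in VictoriaMetrics' lib/storage.
type tagFilter struct {
	Key        string
	Value      string
	IsNegative bool // true for `!=` and `!~`
	IsRegexp   bool // true for `=~` and `!~`
}

// cost returns a rank for the filter: lower means "expected to be faster".
// 0: non-negative non-regexp (fastest according to the commit message)
// 3: negative regexp (slowest according to the commit message)
// Ranks 1 and 2 for the remaining classes are an assumption of this sketch.
func cost(tf tagFilter) int {
	c := 0
	if tf.IsRegexp {
		c++
	}
	if tf.IsNegative {
		c += 2
	}
	return c
}

// sortByCost reorders filters in place so that cheaper filters come first.
// The sort is stable, so filters of equal cost keep their original order.
func sortByCost(tfs []tagFilter) {
	sort.SliceStable(tfs, func(i, j int) bool {
		return cost(tfs[i]) < cost(tfs[j])
	})
}

func main() {
	// Filters corresponding to n="1",i=~".+",i!~"2.",j="foo", given in shuffled order.
	tfs := []tagFilter{
		{Key: "i", Value: "2.", IsNegative: true, IsRegexp: true},
		{Key: "i", Value: ".+", IsRegexp: true},
		{Key: "n", Value: "1"},
		{Key: "j", Value: "foo"},
	}
	sortByCost(tfs)
	for _, tf := range tfs {
		fmt.Printf("key=%s value=%q cost=%d\n", tf.Key, tf.Value, cost(tf))
	}
}
```

A stable sort is used so that the caller's order is preserved among filters of the same class; whether the real implementation preserves such order is not stated in the commit message.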


8918 Commits