VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-22 16:36:27 +01:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	972713bd79	lib/storage: add fast path for the previous indexdb search if it doesn't contain per-day inverted index yet	2020-03-31 12:51:21 +03:00
Aliaksandr Valialkin	5d99ca6cfc	lib/storage: optimize per-day inverted index search for tag filters matching big number of time series - Sort tag filters in the ascending number of matching time series in order to apply the most specific filters first. - Fall back to metricName search for filters matching big number of time series (usually these are negative filters or regexp filters).	2020-03-31 00:48:35 +03:00
Aliaksandr Valialkin	df91d2d91f	lib/storage: remove obsolete code	2020-03-13 22:48:17 +02:00
Aliaksandr Valialkin	f9289b804a	lib/storage: reduce memory allocations when merging metricID sets	2020-01-17 22:10:44 +02:00
Aliaksandr Valialkin	a247236f61	lib/storage: fall back to global inverted index if a filter matches too many time series in per-day index Previously this resulted in an error message. The query may succeed via search in global index.	2019-12-03 14:48:31 +02:00
Aliaksandr Valialkin	f52874dab4	lib/storage: optimize regexp filter search	2019-12-03 00:43:12 +02:00
Aliaksandr Valialkin	20812008a7	lib/storage: remove metricID with missing metricID->metricName entry The metricID->metricName entry can be missing in the indexdb after unclean shutdown when only a part of entries for new time series is written into indexdb. Recover from such a situation by removing the broken metricID. A new metricID will be automatically created for the time series with the given metricName when a new data point arrives for it.	2019-12-02 20:46:44 +02:00
Aliaksandr Valialkin	da98703748	app/vmselect/promql: optimize binary search over big number of samples during rollup calculations	2019-11-25 14:01:46 +02:00
Aliaksandr Valialkin	f652c0f40f	lib/storage: move non-matching tag filters to the top at matchTagFilters This should reduce the amount of useless work needed for matching the next metricNames.	2019-11-21 21:35:13 +02:00
Aliaksandr Valialkin	b8cde6cce1	lib/storage: speed up time series search for queries with multiple filters Use optimized specialized binary search for uint64 metricIDs instead of generic sort.Search.	2019-11-21 18:43:17 +02:00
Aliaksandr Valialkin	2ab4cea5e5	lib/storage: always start using per-day inverted index on the next day after its creation The current day could miss entries for already stopped time series before enabling per-day index. This fixes the issue when queries return empty results during the first hour after upgrading to v1.29.*	2019-11-16 12:11:25 +02:00
Aliaksandr Valialkin	86a1cd700b	lib/storage: remove inmemory index for recent hour, since it uses too much memory Production workload shows that the index requires ~4Kb of RAM per active time series. This is too much for high number of active time series, so let's delete this index. Now the queries should fall back to the index for the current day instead of the index for the recent hour. The query performance for the current day index should be good enough given the 100M rows/sec scan speed per CPU core.	2019-11-13 17:58:07 +02:00
Aliaksandr Valialkin	33895d4a0f	lib/storage: add missing increment for recentHourInvertedIndexSearchCalls	2019-11-13 15:13:51 +02:00
Aliaksandr Valialkin	c57eb0ff83	lib/storage: add `-disableRecentHourIndex` flag for disabling inmemory index for recent hour This may be useful for saving RAM on high number of time series aka high cardinality	2019-11-13 15:02:51 +02:00
Aliaksandr Valialkin	ca259864e2	lib/storage: return back inmemory inverted index for recent hour Issues fixed: - Slow startup times. Now the index is loaded from cache during start. - High memory usage related to superfluous index copies every 10 seconds.	2019-11-13 13:11:04 +02:00
Aliaksandr Valialkin	01bb3c06c7	lib/storage: remove inmemory inverted index for recent hours Production load with >10M active time series showed it could slow down VictoriaMetrics startup times and could eat all the memory leading to OOM. Remove inmemory inverted index for recent hours until thorough testing on production data shows it works OK.	2019-11-13 10:45:53 +02:00
Aliaksandr Valialkin	8e8f98f712	lib/storage: add tests for dateMetricIDCache	2019-11-11 13:21:57 +02:00
Aliaksandr Valialkin	3956003dd0	lib/storage: reorganize the code in getStartDateForPerDayInvertedIndex according to golangci-lint	2019-11-10 00:38:59 +02:00
Aliaksandr Valialkin	ee7765b10d	lib/storage: implement per-day inverted index	2019-11-10 00:02:46 +02:00
Aliaksandr Valialkin	5810ba57c2	lib/storage: use specialized cache for (date, metricID) entries This improves ingestion performance.	2019-11-09 23:06:11 +02:00
Aliaksandr Valialkin	e573ef2126	lib/storage: remove unused code from getMetricIDsForTimeRange: it is expected that time range is always non-zero	2019-11-09 19:03:34 +02:00
Aliaksandr Valialkin	823fa085ef	lib/storage: properly set time range when deleting time series	2019-11-09 18:49:49 +02:00
Aliaksandr Valialkin	695c1dc5eb	lib/storage: obtain all the time series ids from (tag->metricIDs) rows instead of (metricID->TSID) rows, since this is much faster	2019-11-09 18:04:33 +02:00
Aliaksandr Valialkin	cdbe848102	lib/storage: small code prettifying	2019-11-09 14:19:52 +02:00
Aliaksandr Valialkin	6ad7fe8eeb	lib/storage: export `vm_new_timeseries_created_total` metric for determining time series churn rate	2019-11-08 21:21:07 +02:00
Aliaksandr Valialkin	d888b21657	lib/storage: add inmemory inverted index for the last hour It should improve performance for `last N hours` dashboards with update intervals smaller than 1 hour.	2019-11-08 21:21:07 +02:00
Aliaksandr Valialkin	e472f0b23b	lib/storage: substitute error message about unsorted items in the index block after metricIDs merge with counter The origin of the error has been detected and documented in the code, so it is enough to export a counter for such errors at `vm_index_blocks_with_metric_ids_incorrect_order_total`, so it could be monitored and alerted on high error rates. Export also the counter for processed index blocks with metricIDs - `vm_index_blocks_with_metric_ids_processed_total`, so its rate could be compared to `rate(vm_index_blocks_with_metric_ids_incorrect_order_total)`.	2019-11-06 14:28:11 +02:00
Aliaksandr Valialkin	c51ca04a43	lib/storage: take into account the requested time range when caching TSIDs for the given tag filters	2019-11-06 14:28:11 +02:00
Aliaksandr Valialkin	e37f06dc52	lib/storage: dump incorrectly sorted items on a single line; this should simplify error reporting	2019-11-05 18:44:22 +02:00
Aliaksandr Valialkin	885ba17905	lib/storage: separate the max inverted index scan loops per metric into fast and slow loops Slow loops could require seeks and expensive regexp matching, while fast loops just scan all the metricIDs for the given `tag=value` prefix. So these operations must have separate max loops multipliers.	2019-11-05 17:27:48 +02:00
Aliaksandr Valialkin	b9a06e8e74	lib/storage: skip repeated useless work when intersection of metricIDs with the given filter is too expensive This should improve performance for query filters over big number of time series.	2019-11-05 14:19:13 +02:00
Aliaksandr Valialkin	30c8301b11	lib/storage: reduce the maximum inverted index scans before giving up to label filters matching by metric name The new value reduces the amount of wasted work during index scans over big number of time series.	2019-11-05 14:19:06 +02:00
Aliaksandr Valialkin	e53f9e553d	lib/storage: try potentially faster tag filters at first, then apply slower tag filters The fastest tag filters are non-negative non-regexp, since they are the most specific. The slowest tag filters are negative regexp, since they require scanning all the entries for the given label.	2019-11-05 14:19:01 +02:00
Aliaksandr Valialkin	02e0b19a62	lib/storage: tune the returned value from adjustMaxMetricsAdaptive	2019-11-04 00:44:37 +02:00
Aliaksandr Valialkin	6be4456d88	lib/{storage,uint64set}: add Set.Union() function and use it	2019-11-04 00:44:37 +02:00
Aliaksandr Valialkin	97ce4e03a5	all: add support for GOARCH=386 and fix all the issues related to 32-bit architectures such as GOARCH=arm Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212	2019-10-17 18:23:23 +03:00
Aliaksandr Valialkin	f6334bffa1	lib/storage: harden the check that the original items are sorted after mergeTagToMetricIDsRows fails to preserve sort order	2019-10-09 12:13:17 +03:00
Aliaksandr Valialkin	c1cf7d9f93	lib/storage: add tests for mergeTagToMetricIDsRows and return the original items if the function breaks items' ordering. This should save from data corruption issues revealed in the previous releases up to v1.28.0-beta5.	2019-10-08 16:27:35 +03:00
Aliaksandr Valialkin	c39355921e	lib/storage: verify whether items are sorted at the end of the call to mergeTagToMetricIDsRows This should prevent inverted index corruption if a bug in mergeTagToMetricIDsRows is discovered.	2019-09-26 13:13:41 +03:00
Aliaksandr Valialkin	2444433d83	lib/storage: add missing break in removeDuplicateMetricIDs	2019-09-25 18:23:43 +03:00
Aliaksandr Valialkin	ea4c828bae	lib/storage: remove duplicate MetricIDs in `tag->metricIDs` items before writing them into inverted index	2019-09-25 17:55:13 +03:00
Aliaksandr Valialkin	aebc45ad26	lib/{mergeset,storage}: do not cache inverted index blocks containing `tag->metricIDs` items This should reduce the amounts of used RAM during queries with filters over big number of time series.	2019-09-25 14:02:15 +03:00
Aliaksandr Valialkin	b986516fbe	lib/storage: create and use `lib/uint64set` instead of `map[uint64]struct{}` This should improve inverted index search performance for filters matching big number of time series, since `lib/uint64set.Set` is faster than `map[uint64]struct{}` for both `Add` and `Has` calls. See the corresponding benchmarks in `lib/uint64set`.	2019-09-24 21:17:55 +03:00
Aliaksandr Valialkin	ef2296e420	lib/storage: typo fix: return dstData instead of data from mergeTagToMetricIDsRows	2019-09-24 19:32:34 +03:00
Aliaksandr Valialkin	a6086cde78	lib/storage: limit the number of metricIDs in tag->metricIDs row This reduces the overhead on index and metaindex in lib/mergeset	2019-09-24 00:49:51 +03:00
Aliaksandr Valialkin	c9063ece66	lib/storage: share tsids across all the partSearch instances This should reduce memory usage when big number of time series matches the given query.	2019-09-23 22:35:15 +03:00
Aliaksandr Valialkin	4e26ad869b	lib/{storage,mergeset}: verify PrepareBlock callback results Do not touch the first and the last item passed to PrepareBlock in order to preserve sort order of mergeset blocks.	2019-09-23 20:43:13 +03:00
Aliaksandr Valialkin	0adebae1f8	lib/storage: generate the first tag->metricIDs item in a mergeset block with a single metricID The first item from each mergeset block goes into index (lib/mergeset.blockHeader), so it must be short in order to reduce index size.	2019-09-22 19:21:33 +03:00
Aliaksandr Valialkin	0686ac52c3	lib/{storage,mergeset}: merge `tag->metricID` rows into `tag->metricIDs` rows for common `tag` values This should improve lookup performance if the same `label=value` pair exists in big number of time series. This should also reduce memory usage for mergeset data cache, since `tag->metricIDs` rows occupy less space than the original `tag->metricID` rows.	2019-09-20 22:06:41 +03:00
Aliaksandr Valialkin	a544f49c2b	lib/storage: optimize selecting all the metricIDs by scanning MetricID->TSID entries instead of tag->MetricID entries The number of MetricID->TSID entries is smaller than the number of tag->MetricID entries and MetricID->TSID entries are usually shorter than tag->MetricID entries. This should improve performance when selecting all the metricIDs.	2019-09-20 11:54:10 +03:00

83 Commits
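
Commit b8cde6cce1 above mentions replacing the generic sort.Search with a specialized binary search over uint64 metricIDs. The following is a minimal sketch of that general idea, not the code from lib/storage: the function name binarySearchUint64 and the sample metricIDs are assumptions made for illustration. A specialized loop like this avoids the per-step closure call that sort.Search requires, which is one plausible source of the speedup the commit describes.

```go
package main

import "fmt"

// binarySearchUint64 is a hypothetical helper (not the lib/storage function).
// It returns the smallest index i in the sorted slice a such that a[i] >= v,
// or len(a) if every element is smaller than v.
func binarySearchUint64(a []uint64, v uint64) int {
	lo, hi := 0, len(a)
	for lo < hi {
		// Unsigned shift avoids overflow when computing the midpoint.
		mid := int(uint(lo+hi) >> 1)
		if a[mid] < v {
			lo = mid + 1
		} else {
			hi = mid
		}
	}
	return lo
}

func main() {
	// Sample sorted metricIDs, made up for this example.
	metricIDs := []uint64{3, 7, 7, 12, 40}
	fmt.Println(binarySearchUint64(metricIDs, 7))  // first index holding 7
	fmt.Println(binarySearchUint64(metricIDs, 8))  // insertion point for 8
	fmt.Println(binarySearchUint64(metricIDs, 41)) // len(metricIDs): not found
}
```

The return convention matches sort.Search, so a caller checks `i < len(a) && a[i] == v` to test for membership.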