VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-15 16:30:55 +01:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	540f65cc49	lib/storage: move the conversion of tag filters to composite tag filters into indexSearch.searchMetricIDsInternal This makes the code less fragile - it is harder to skip the convertToCompositeTagFilterss() call now. While at it, call indexSearch.containsTimeRange() inside indexSearch.searchMetricIDsInternal() in order to quickly terminate search of time series in the old indexdb for new time ranges. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5055 This is a follow-up for `2d31fd7855`	2024-03-12 02:59:04 +02:00
Aliaksandr Valialkin	293f03f2dd	lib/storage: use composite indexes (metricName, label=value) when searching for matching time series at /api/v1/labels, /api/v1/label/.../values and /api/v1/status/tsdb This should improve query performance when match[], extra_filters[] or extra_label args are passed to these APIs Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5055	2024-03-12 02:56:35 +02:00
Aliaksandr Valialkin	22acd84019	lib/storage: use unsafe.Slice instead of deprecated reflect.SliceHeader	2024-02-29 17:24:44 +02:00
Aliaksandr Valialkin	a1baf25c2e	lib/storage: consistently use atomic.* types instead of atomic.* function calls on ordinary types See `ea9e2b19a5`	2024-02-24 00:33:07 +02:00
Aliaksandr Valialkin	d0538d11d3	lib/mergeset: consistently use atomic.* types instead of atomic.* function calls on ordinary types See `ea9e2b19a5`	2024-02-24 00:29:12 +02:00
Aliaksandr Valialkin	e7dfcdfff6	lib/storage: consistently use atomic.* type for refCount and mustDrop fields in indexDB, table and partition structs See `ea9e2b19a5`	2024-02-24 00:26:26 +02:00
Aliaksandr Valialkin	68d76b1436	lib/storage: compress metricIDs, which match the given filters, before storing them in tagFiltersToMetricIDsCache This allows reducing the indexdb/tagFiltersToMetricIDs cache size by 8 on average. The cache size can be checked via vm_cache_size_bytes{type="indexdb/tagFiltersToMetricIDs"} metric exposed at /metrics page.	2024-01-23 16:13:25 +02:00
Aliaksandr Valialkin	36a1fdca6c	all: consistently use %w instead of %s in when error is passed to fmt.Errorf() This allows consistently using errors.Is() for verifying whether the given error wraps some other known error.	2023-10-26 09:44:40 +02:00
Aliaksandr Valialkin	281eb0c377	lib/storage: log fatal error inside searchMetricName() instead of propagating it to the caller This simplifies the code a bit at searchMetricName() and searchMetricNameWithCache() call sites This is a result of investigating https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4972	2023-09-22 11:37:55 +02:00
Aliaksandr Valialkin	24cec79763	lib/storage: handle fatal errors inside indexSearch.getTSIDByMetricID() instead of returning them to the caller This simplifies the code a bit at caller side	2023-09-15 12:01:06 +02:00
Aliaksandr Valialkin	d8afd7fe98	Makefile: update golangci-lint from v1.51.2 to v1.54.2 See https://github.com/golangci/golangci-lint/releases/tag/v1.54.2	2023-09-01 10:25:49 +02:00
Aliaksandr Valialkin	bde876f7c9	all: refer to https://docs.victoriametrics.com/#resource-usage-limits in the error message about -search.max* limit Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4827	2023-08-14 02:02:12 -07:00
Nikolay	30b32583f4	lib/storage: pre-create timeseries before indexDB rotation (#4652 ) * lib/storage: pre-create timeseries before indexDB rotation during an hour before indexDB rotation start creating records at the next indexDB it must improve performance during switch for the next indexDB and remove ingestion issues. Since there is no need for creation new index records for timeseries already ingested into current indexDB https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4563 * lib/storage: further work on indexdb rotation optimization - Document the change at docs/CHAGNELOG.md - Move back various caches from indexDB to Storage. This makes the change less intrusive. The dateMetricIDCache now takes into account indexDB generation, so it stores (date, metricID) entries for both the current and the next indexDB. - Consolidate the code responsible for idbNext pre-filling into prefillNextIndexDB() function. This improves code readability and maintainability a bit. - Rewrite and simplify the code responsible for calculating the next retention timestamp. Add various tests for corner cases of this code. - Remove indexdb pre-filling from RegisterMetricNames() function, since this function is rarely called. It is OK to add indexdb entries on demand in this function. This simplifies the code. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 * docs/CHANGELOG.md: refer to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4563 --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-07-22 15:23:14 -07:00
Aliaksandr Valialkin	fbddb4ad32	lib/storage: typo fix after `e1cf962bad` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401	2023-07-13 21:29:02 -07:00
Aliaksandr Valialkin	e1cf962bad	lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for `1f28b46ae9`	2023-07-13 17:03:50 -07:00
Aliaksandr Valialkin	0ebfb91aba	lib/storage: revert the migration from global to per-day index for (MetricName -> TSID) This reverts the following commits: - `e0e16a2d36` - `2ce02a7fe6` The reason for revert: the updated logic breaks assumptions made when fixing https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 . For example, if a time series stop receiving new samples during the first day after the indexdb rotation, there are chances that the time series won't be registered in the new indexdb. This is OK until the next indexdb rotation, since the time series is registered in the previous indexdb, so it can be found during queries. But the time series will become invisible for search after the next indexdb rotation, while its data is still there. There is also incompletely solved issue with the increased CPU and disk IO resource usage just after the indexdb rotation. There was an attempt to fix it, but it didn't fix it in full, while introducing the issue mentioned above. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 TODO: to find out the solution, which simultaneously solves the following issues: - increased memory usage for setups high churn rate and long retention (e.g. what the reverted commit does) - increased CPU and disk IO usage during indexdb rotation ( https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 ) - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698	2023-05-18 11:28:54 -07:00
Aliaksandr Valialkin	67beb8c856	lib/storage: follow-up after `2ce02a7fe6` - Document the change at docs/CHANGELOG.md - Clarify comments for non-trivial code touched by the commit - Improve the logic behind maybeCreateIndexes(): - Correctly create per-day indexes if the indexdb rotation is performed during the first hour or the last hour of the day by UTC. Previously there was a possibility of missing index entries on that day. - Increase the duration for creating new indexes in the current indexdb for up to 22 hours after indexdb rotation. This should reduce the increased resource usage after indexdb rotation. It is safe to postpone index creation for the current day until the last hour of the current day after indexdb rotation by UTC, since the corresponding (date, ...) entries exist in the previous indexdb. - Search for TSID by (date, MetricName) in both the current and the previous indexdb. Previously the search was performed only in the current indexdb. This could lead to excess creation of per-day indexes for the current day just after indexdb rotation. - Search for (date, metricID) entries in both the current and the previous indexdb. Previously the search was performed only in the current indexdb. This could lead to excess creation of per-day indexes for the current day just after indexdb rotation.	2023-05-16 23:31:59 -07:00
Roman Khavronenko	c3b1d9ee21	lib/storage: introduce per-day MetricName=>TSID index (#4252 ) The new index substitutes global MetricName=>TSID index used for locating TSIDs on ingestion path. For installations with high ingestion and churn rate, global MetricName=>TSID index can grow enormously making index lookups too expensive. This also results into bigger than expected cache growth for indexdb blocks. New per-day index supposed to be much smaller and more efficient. This should improve ingestion speed and reliability during re-routings in cluster. The negative outcome could be occupied disk size, since per-day index is more expensive comparing to global index. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-05-16 23:18:11 -07:00
Aliaksandr Valialkin	cf4701db65	lib/fs: add MustReadDir() function Use fs.MustReadDir() instead of os.ReadDir() across the code in order to reduce the code verbosity. The fs.MustReadDir() logs the error with the directory name and the call stack on error before exit. This information should be enough for debugging the cause of the error.	2023-04-14 22:11:40 -07:00
Oleksandr Redko	0e1c395609	app,lib: fix typos in comments (#3804 )	2023-02-13 09:32:35 -08:00
Aliaksandr Valialkin	eb9a542c1f	lib/storage: simplify the fix from `488940502c` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3566	2023-01-07 01:11:35 -08:00
Dmytro Kozlov	f739e44802	lib/storage: fix returning camelcase label names (#3608 ) * lib/storage: fix returning camelcase label names * doc: add change log * Update docs/CHANGELOG.md * Update docs/CHANGELOG.md Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-01-07 01:11:10 -08:00
Aliaksandr Valialkin	1ff62629f4	lib/storage: clear the err if it is set to io.EOF when searching for the TSID by metricID This is expected error after when recently added indexdb data isn't available for search yet or wasn't flushed to disk after unclean shutdown of VictoriaMetrics. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3515	2022-12-20 14:05:53 -08:00
Aliaksandr Valialkin	2184de3bf2	lib/storage: do not check for the result returned by db.doExtDB() where this isn't necessary This simplifies the code a bit	2022-12-19 13:23:30 -08:00
Aliaksandr Valialkin	11bd290201	lib/storage: search for TSIDs for the given metricIDs in the previous indexdb if they aren't found in the current indexdb The issue triggers after the indexdb rotation for time series, which stop receiving new samples. This results in missing data for such time series in query responses. This commit should address the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3502 The issue has been introduced in `2dd93449d8`	2022-12-19 11:56:49 -08:00
Aliaksandr Valialkin	512c73cef9	lib/storage: properly set buf capacity inside marshalMetricID Previously it was always set to 0. In theory this could result into incorrect marshaling of metricIDs. The issue has been introduced in `5e4dfe50c6`	2022-12-19 10:31:38 -08:00
Aliaksandr Valialkin	a13d21513e	lib/mergeset: panic when too long item is passed to Table.AddItems()	2022-12-03 23:37:20 -08:00
Zakhar Bessarab	e407e7243a	{app/vmstorage,app/vmselect}: add API to get list of existing tenants (#3348 ) * {app/vmstorage,app/vmselect}: add API to get list of existing tenants * {app/vmstorage,app/vmselect}: add API to get list of existing tenants * app/vmselect: fix error message * {app/vmstorage,app/vmselect}: fix error messages * app/vmselect: change log level for error handling * wip Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-11-25 10:32:45 -08:00
Aliaksandr Valialkin	5d0a91afd5	lib/storage: properly pass uint64 constant to fmt.Errorf on 32-bit platforms	2022-10-23 12:48:43 +03:00
Aliaksandr Valialkin	2dd93449d8	lib/storage: subsitute searchTSIDs functions with more lightweight searchMetricIDs function The searchTSIDs function was searching for metricIDs matching the the given tag filters and then was locating the corresponding TSID entries for the found metricIDs. The TSID entries aren't needed when searching for time series names (aka MetricName), so this commit removes the uneeded TSID search from the implementation of /api/v1/series API. This improves perfromance of /api/v1/series calls. This commit also improves performance a bit for /api/v1/query and /api/v1/query_range calls, since now these calls cache small metricIDs instead of big TSID entries in the indexdb/tagFilters cache (now this cache is named indexdb/tagFiltersToMetricIDs) without the need to compress the saved entries in order to save cache space. This commit also removes concurrency limiter during searching for matching time series, which was introduced in `8f16388428`, since the concurrency for all the read queries is already limited with -search.maxConcurrentRequests command-line flag. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648	2022-10-23 12:43:44 +03:00
Aliaksandr Valialkin	fe52378f45	lib/storage: substitute remaining calls to fs.MustRemoveAll with fs.MustRemoveDirAtomic	2022-09-13 15:49:25 +03:00
Roman Khavronenko	2c59c83191	lib/storage: fix the search for empty label name (#2991 ) * lib/storage: fix the search for empty label name Signed-off-by: hagen1778 <roman@victoriametrics.com> * Apply suggestions from code review Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-08-19 11:05:09 +03:00
Aliaksandr Valialkin	1a363192ff	lib/storage: typo fix in comments after `f830edc0bc`	2022-08-16 13:45:32 +03:00
Aliaksandr Valialkin	dc929e0d16	lib/storage: improve performance for /api/v1/labels and /api/v1/label/.../values endpoints when `match[]` filter matches small number of time series Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2978	2022-08-16 13:34:23 +03:00
Aliaksandr Valialkin	9f95099cf4	lib/storage: explain why the GetOrCreateTSIDByName function doesnt check whether the per-day entry for the given date exists if TSID is found in global index	2022-08-02 09:13:41 +03:00
Aliaksandr Valialkin	586d267a44	lib/storage: do not compress small number of tsids when storing them in tagFiltersCache This speeds up tsids retreival from the cache for 0-2 tsids	2022-07-30 00:11:14 +03:00
Aliaksandr Valialkin	5afa54e845	lib/vmselectapi: use string type for tagKey and tagValuePrefix args at TagValueSuffixes() This improves the API consistency	2022-07-06 12:46:21 +03:00
Aliaksandr Valialkin	78f9a8aafd	lib/storage: put the (date, metricID) entry in dateMetricIDCache just after the corresponding series is registered in the per-day inverted index Previously the time series could be put into dateMetricIDCache without registering in the per-day inverted index if GetOrCreateTSIDByName finds TSID entry in the global index. This could lead to missing series in query results. The issue has been introduced in the commit `55e7afae3a`, which has been included in VictoriaMetrics v1.78.0	2022-07-05 14:56:55 +03:00
Aliaksandr Valialkin	4fb0f15322	all: readability improvements for query traces - show dates in human-readable format, e.g. 2022-05-07, instead of a numeric value - limit the maximum length of queries and filters shown in trace messages	2022-06-30 18:19:43 +03:00
Aliaksandr Valialkin	926fccbb8d	lib/storage: add querytracer to more contexts querytracer has been added to the following storage.Storage methods: - RegisterMetricNames - DeleteMetrics - SearchTagValueSuffixes - SearchGraphitePaths	2022-06-27 12:53:49 +03:00
Aliaksandr Valialkin	270ad39359	lib/storage: properly take into account already registered series when `-storage.maxHourlySeries` or `-storage.maxDailySeries` limits are enabled The commit `5fb45173ae` takes into account only newly registered series when applying cardinality limits. This means that the cardinality limit could be exceeded with already registered series. This commit returns back accounting for already registered series when applying cardinality limits.	2022-06-20 13:53:41 +03:00
Aliaksandr Valialkin	7a79e7c0ef	lib/storage: create per-day indexes together with global indexes when registering new time series Previously the creation of per-day indexes and global indexes for the newly registered time series was decoupled. Now global indexes and per-day indexes for the current day are created toghether for new time series. This should speed up registering new time series a bit.	2022-06-19 22:32:41 +03:00
Aliaksandr Valialkin	88e1221b35	lib/storage: do not register new series if `-storage.maxHourlySeries` or `-storage.maxDailySeries` limits are exceeded Previously samples for new series weren't added as expected when series limits were reached, but new series were still registered in indexdb.	2022-06-19 22:03:02 +03:00
Aliaksandr Valialkin	45fa9d798d	app/vmselect: accept `focusLabel` query arg at /api/v1/status/tsdb	2022-06-14 18:39:00 +03:00
Aliaksandr Valialkin	fb77843639	lib/storage: show top labels with the highest number of series in cardinality explorer	2022-06-14 16:34:13 +03:00
Aliaksandr Valialkin	3167fbc21d	lib/storage: improve error message when -search.max* command-line flag values are exceeded	2022-06-14 13:28:21 +03:00
Aliaksandr Valialkin	61e03f172b	app/vmselect: optimize `/api/v1/labels` and `/api/v1/label/.../values` handlers when `match[]` query arg is passed to them	2022-06-12 14:06:24 +03:00
Aliaksandr Valialkin	cb39eada77	all: improve query tracing coverage for indexdb search Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1403	2022-06-09 20:04:02 +03:00
Dmytro Kozlov	f2754c3e90	Cardinality explorer (#2625 ) * Cardinality explorer * vmui, vmselect: updated field name, added description to spinner * make vmui-update * updated const name, make vmui-update * lib/storage: changes calculation for totalSeries values * added static files * wip * wip * wip * wip * docs/CHANGELOG.md: document cardinality explorer feature See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2233 Co-authored-by: f41gh7 <nik@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-06-08 18:54:27 +03:00
Roman Khavronenko	e9ee043879	lib/storage: make `indexdb/tagFilters` cache size configurable (#2667 ) The default size of `indexdb/tagFilters` now can be overridden via `storage.cacheSizeIndexDBTagFilters` flag. Please, be careful with changing default size since it may lead to inefficient work of the vmstorage or OOM exceptions. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2663 Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2022-06-01 14:57:39 +03:00

1 2 3 4 5

241 Commits