VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-19 15:06:25 +01:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	7094fa38bc	lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in the global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of the metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into the `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses an in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from the on-disk `MetricName -> TSID` index in the following cases: - If the `storage/tsid` cache capacity isn't enough for active time series. In this case increase the available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If a new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult the on-disk `MetricName -> TSID` index, since it doesn't know that the index has no corresponding entry either. This is a typical event under high churn rate, when old time series are constantly substituted with new time series.
Reading the data from the `MetricName -> TSID` index is slow, so inserts that lead to reading this index are counted as slow inserts, and they can be monitored via the `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, i.e. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in the `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy a significant share of available memory for storing recently accessed blocks of the `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from the global `MetricName -> TSID` index to a per-day index. This allows significantly reducing the amount of data that needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when a new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do not change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 .
At the same time the change fixes the issue that could result in lost access to time series which stop receiving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 . The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 . This is a follow-up for `1f28b46ae9`	2023-07-13 16:07:30 -07:00
Aliaksandr Valialkin	1f28b46ae9	lib/storage: revert the migration from global to per-day index for (MetricName -> TSID) This reverts the following commits: - `e0e16a2d36` - `2ce02a7fe6` The reason for revert: the updated logic breaks assumptions made when fixing https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 . For example, if a time series stops receiving new samples during the first day after the indexdb rotation, there are chances that the time series won't be registered in the new indexdb. This is OK until the next indexdb rotation, since the time series is registered in the previous indexdb, so it can be found during queries. But the time series will become invisible for search after the next indexdb rotation, while its data is still there. There is also an incompletely solved issue with the increased CPU and disk IO resource usage just after the indexdb rotation. There was an attempt to fix it, but it didn't fix it in full, while introducing the issue mentioned above. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 TODO: find a solution that simultaneously solves the following issues: - increased memory usage for setups with high churn rate and long retention (i.e. what the reverted commit does) - increased CPU and disk IO usage during indexdb rotation ( https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 ) - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 Possible solution - to create the new indexdb one hour before the indexdb rotation and to gradually pre-populate it with the needed index data during the last hour before indexdb rotation. Then the new indexdb will contain all the needed data just after the rotation, so it won't trigger increased CPU and disk IO.	2023-05-18 11:30:49 -07:00
Haleygo	1531d757ea	fix lint check	2023-05-17 13:51:36 +02:00
Aliaksandr Valialkin	e0e16a2d36	lib/storage: follow-up after `2ce02a7fe6` - Document the change at docs/CHANGELOG.md - Clarify comments for non-trivial code touched by the commit - Improve the logic behind maybeCreateIndexes(): - Correctly create per-day indexes if the indexdb rotation is performed during the first hour or the last hour of the day by UTC. Previously there was a possibility of missing index entries on that day. - Increase the duration for creating new indexes in the current indexdb for up to 22 hours after indexdb rotation. This should reduce the increased resource usage after indexdb rotation. It is safe to postpone index creation for the current day until the last hour of the current day after indexdb rotation by UTC, since the corresponding (date, ...) entries exist in the previous indexdb. - Search for TSID by (date, MetricName) in both the current and the previous indexdb. Previously the search was performed only in the current indexdb. This could lead to excess creation of per-day indexes for the current day just after indexdb rotation. - Search for (date, metricID) entries in both the current and the previous indexdb. Previously the search was performed only in the current indexdb. This could lead to excess creation of per-day indexes for the current day just after indexdb rotation.	2023-05-16 23:19:27 -07:00
Roman Khavronenko	2ce02a7fe6	lib/storage: introduce per-day MetricName=>TSID index (#4252) The new index substitutes the global MetricName=>TSID index used for locating TSIDs on the ingestion path. For installations with high ingestion and churn rate, the global MetricName=>TSID index can grow enormously, making index lookups too expensive. This also results in bigger than expected cache growth for indexdb blocks. The new per-day index is supposed to be much smaller and more efficient. This should improve ingestion speed and reliability during re-routings in cluster. The negative outcome could be increased disk space usage, since the per-day index is more expensive compared to the global index. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-05-16 15:46:42 -07:00
Aliaksandr Valialkin	3727251910	lib/fs: add MustReadDir() function Use fs.MustReadDir() instead of os.ReadDir() across the code in order to reduce the code verbosity. The fs.MustReadDir() logs the error with the directory name and the call stack on error before exit. This information should be enough for debugging the cause of the error.	2023-04-14 22:10:46 -07:00
Oleksandr Redko	9fff48c3e3	app,lib: fix typos in comments (#3804 )	2023-02-13 13:27:13 +01:00
Aliaksandr Valialkin	41e00a0df7	lib/storage: simplify the fix from `488940502c` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3566	2023-01-07 01:04:43 -08:00
Dmytro Kozlov	488940502c	lib/storage: fix returning camelcase label names (#3608 ) * lib/storage: fix returning camelcase label names * doc: add change log * Update docs/CHANGELOG.md * Update docs/CHANGELOG.md Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-01-07 00:50:14 -08:00
Aliaksandr Valialkin	4e55b67a44	lib/storage: clear the err if it is set to io.EOF when searching for the TSID by metricID This is an expected error when recently added indexdb data isn't available for search yet or wasn't flushed to disk after unclean shutdown of VictoriaMetrics. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3515	2022-12-20 14:05:29 -08:00
Aliaksandr Valialkin	944effca54	lib/storage: do not check for the result returned by db.doExtDB() where this isn't necessary This simplifies the code a bit	2022-12-19 13:23:13 -08:00
Aliaksandr Valialkin	6c98b56935	lib/storage: search for TSIDs for the given metricIDs in the previous indexdb if they aren't found in the current indexdb The issue triggers after the indexdb rotation for time series, which stop receiving new samples. This results in missing data for such time series in query responses. This commit should address the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3502 The issue has been introduced in `2dd93449d8`	2022-12-19 12:03:09 -08:00
Aliaksandr Valialkin	057fb2120b	lib/storage: properly set buf capacity inside marshalMetricID Previously it was always set to 0. In theory this could result in incorrect marshaling of metricIDs. The issue has been introduced in `5e4dfe50c6`	2022-12-19 10:14:38 -08:00
Aliaksandr Valialkin	33dda2809b	lib/mergeset: panic when too long item is passed to Table.AddItems()	2022-12-03 23:32:16 -08:00
Aliaksandr Valialkin	d0a9ca1bc2	lib/storage: properly pass uint64 constant to fmt.Errorf on 32-bit platforms	2022-10-23 12:48:00 +03:00
Aliaksandr Valialkin	5e4dfe50c6	lib/storage: substitute searchTSIDs functions with more lightweight searchMetricIDs function The searchTSIDs function was searching for metricIDs matching the given tag filters and then was locating the corresponding TSID entries for the found metricIDs. The TSID entries aren't needed when searching for time series names (aka MetricName), so this commit removes the unneeded TSID search from the implementation of /api/v1/series API. This improves performance of /api/v1/series calls. This commit also improves performance a bit for /api/v1/query and /api/v1/query_range calls, since now these calls cache small metricIDs instead of big TSID entries in the indexdb/tagFilters cache (now this cache is named indexdb/tagFiltersToMetricIDs) without the need to compress the saved entries in order to save cache space. This commit also removes concurrency limiter during searching for matching time series, which was introduced in `8f16388428`, since the concurrency for all the read queries is already limited with -search.maxConcurrentRequests command-line flag. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648	2022-10-23 12:23:47 +03:00
Aliaksandr Valialkin	042a532f70	lib/storage: substitute remaining calls to fs.MustRemoveAll with fs.MustRemoveDirAtomic Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038	2022-09-13 16:17:38 +03:00
Roman Khavronenko	31f922944e	lib/storage: fix the search for empty label name (#2991 ) * lib/storage: fix the search for empty label name Signed-off-by: hagen1778 <roman@victoriametrics.com> * Apply suggestions from code review Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-08-17 21:32:25 +03:00
Aliaksandr Valialkin	b0e1bb517e	lib/storage: typo fix in comments after `f830edc0bc`	2022-08-16 13:44:45 +03:00
Aliaksandr Valialkin	f830edc0bc	lib/storage: improve performance for /api/v1/labels and /api/v1/label/.../values endpoints when `match[]` filter matches small number of time series Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2978	2022-08-16 13:32:40 +03:00
Aliaksandr Valialkin	5a4c58f9a2	lib/storage: explain why the GetOrCreateTSIDByName function doesn't check whether the per-day entry for the given date exists if TSID is found in global index	2022-08-02 09:12:29 +03:00
Aliaksandr Valialkin	78520f2702	lib/storage: do not compress small number of tsids when storing them in tagFiltersCache This speeds up tsids retrieval from the cache for 0-2 tsids	2022-07-30 00:08:51 +03:00
Aliaksandr Valialkin	a60e03b3a7	lib/vmselectapi: use string type for tagKey and tagValuePrefix args at TagValueSuffixes() This improves the API consistency	2022-07-06 12:37:53 +03:00
Aliaksandr Valialkin	edc76286ac	lib/storage: put the (date, metricID) entry in dateMetricIDCache just after the corresponding series is registered in the per-day inverted index Previously the time series could be put into dateMetricIDCache without registering in the per-day inverted index if GetOrCreateTSIDByName finds TSID entry in the global index. This could lead to missing series in query results. The issue has been introduced in the commit `55e7afae3a`, which has been included in VictoriaMetrics v1.78.0	2022-07-05 14:54:03 +03:00
Aliaksandr Valialkin	3e2dd85f7d	all: readability improvements for query traces - show dates in human-readable format, e.g. 2022-05-07, instead of a numeric value - limit the maximum length of queries and filters shown in trace messages	2022-06-30 18:20:33 +03:00
Aliaksandr Valialkin	ba514284f1	lib/storage: add querytracer to more contexts querytracer has been added to the following storage.Storage methods: - RegisterMetricNames - DeleteMetrics - SearchTagValueSuffixes - SearchGraphitePaths	2022-06-27 13:45:51 +03:00
Aliaksandr Valialkin	b958fc7846	lib/storage: properly take into account already registered series when `-storage.maxHourlySeries` or `-storage.maxDailySeries` limits are enabled The commit `5fb45173ae` takes into account only newly registered series when applying cardinality limits. This means that the cardinality limit could be exceeded with already registered series. This commit restores the accounting for already registered series when applying cardinality limits.	2022-06-20 13:47:47 +03:00
Aliaksandr Valialkin	55e7afae3a	lib/storage: create per-day indexes together with global indexes when registering new time series Previously the creation of per-day indexes and global indexes for the newly registered time series was decoupled. Now global indexes and per-day indexes for the current day are created together for new time series. This should speed up registering new time series a bit.	2022-06-19 22:42:10 +03:00
Aliaksandr Valialkin	5fb45173ae	lib/storage: do not register new series if `-storage.maxHourlySeries` or `-storage.maxDailySeries` limits are exceeded Previously samples for new series weren't added as expected when series limits were reached, but new series were still registered in indexdb.	2022-06-19 22:42:09 +03:00
Aliaksandr Valialkin	ec7963208d	app/vmselect: accept `focusLabel` query arg at /api/v1/status/tsdb This allows filling the seriesCountByFocusLabelValue list in the /api/v1/status/tsdb response with label values for the specified focusLabel, which contain the highest number of time series. TODO: add this to Cardinality explorer at VMUI - https://docs.victoriametrics.com/#cardinality-explorer	2022-06-14 18:36:54 +03:00
Aliaksandr Valialkin	b6c1ca12b7	lib/storage: show top labels with the highest number of series in cardinality explorer	2022-06-14 16:32:38 +03:00
Aliaksandr Valialkin	a75e59700f	lib/storage: improve error message when -search.max* command-line flag values are exceeded	2022-06-14 13:27:59 +03:00
Aliaksandr Valialkin	374beb350e	app/vmselect: optimize `/api/v1/labels` and `/api/v1/label/.../values` handlers when `match[]` query arg is passed to them	2022-06-12 04:32:13 +03:00
Aliaksandr Valialkin	2bcb960f17	all: improve query tracing coverage for indexdb search Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1403	2022-06-09 20:07:07 +03:00
Dmytro Kozlov	018d2303c4	Cardinality explorer (#2625 ) * Cardinality explorer * vmui, vmselect: updated field name, added description to spinner * make vmui-update * updated const name, make vmui-update * lib/storage: changes calculation for totalSeries values * added static files * wip * wip * wip * wip * docs/CHANGELOG.md: document cardinality explorer feature See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2233 Co-authored-by: f41gh7 <nik@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-06-08 18:43:05 +03:00
Aliaksandr Valialkin	ea06d2fd3c	lib/storage: stop background merge when storage enters read-only mode This should prevent `no space left on device` errors when VictoriaMetrics under-estimates the additional disk space needed for background merge. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2603	2022-06-01 14:36:45 +03:00
Roman Khavronenko	642eb1c534	lib/storage: make `indexdb/tagFilters` cache size configurable (#2667 ) The default size of `indexdb/tagFilters` now can be overridden via `storage.cacheSizeIndexDBTagFilters` flag. Please, be careful with changing default size since it may lead to inefficient work of the vmstorage or OOM exceptions. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2663 Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2022-06-01 10:07:53 +02:00
Aliaksandr Valialkin	41958ed5dd	all: add initial support for query tracing See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#query-tracing Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1403	2022-06-01 02:29:23 +03:00
Aliaksandr Valialkin	a1add5c2c7	lib/storage: `make fmt`	2022-05-31 12:54:37 +03:00
Aliaksandr Valialkin	bac75ea8a2	lib/storage: do not take into account series from the next day when `match[]` filter is passed to /api/v1/status/tsdb	2022-05-31 12:15:26 +03:00
Aliaksandr Valialkin	54de0531a4	app/vmstorage: properly handle `maxSeries` limit passed from vmselect to vmstorage	2022-04-12 11:23:04 +03:00
Aliaksandr Valialkin	50cf74ce4b	lib/storage: reuse sync.WaitGroup objects This reduces GC load by up to 10% according to memory profiling	2022-04-06 13:34:04 +03:00
Aliaksandr Valialkin	6e364e19ef	app/vmselect: add fine-grained limits for the number of returned/scanned time series for various APIs	2022-03-26 11:29:49 +02:00
jduncan0000	e5868b9c29	Fix for issue #2255 - matchTagFilters for positive empty-match filters (#2304 ) * fix for issue 2255 - matchTagFilters for positive empty-match filters * add example to comments * formatting * add test for positive empty match * formatting	2022-03-18 12:58:22 +02:00
Aliaksandr Valialkin	7e99bbb967	lib/storage: document why job-like and instance-like labels must be stored at mn.Tags[0] and mn.Tags[1] Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2244	2022-02-25 13:21:07 +02:00
Aliaksandr Valialkin	8bf3fb917a	lib/storage: add a comment to indexSearch.containsTimeRange() on why it allows false positives Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2239	2022-02-24 12:47:27 +02:00
Aliaksandr Valialkin	62b46007c5	lib/workingsetcache: reduce the default cache rotation period from hour to 20 minutes This should reduce memory usage under high time series churn rate	2022-02-23 13:41:45 +02:00
Aliaksandr Valialkin	f72b35665f	lib/storage: optimize `/api/v1/status/tsdb` call by skipping all the artificially created tag entries at once This is a follow-up for `b71be42d90`	2022-02-21 18:23:35 +02:00
Aliaksandr Valialkin	2b87b4d183	lib/storage: typo fix after `c3affb0c4f`	2022-02-17 12:55:54 +02:00
Aliaksandr Valialkin	c3affb0c4f	lib/storage: simplify code for searching for label values This is a follow-up after `9dd191b27c`	2022-02-17 12:29:38 +02:00


228 Commits
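
The lookup path described in commit `7094fa38bc` can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the code from lib/storage: the `index` type, its fields, and `getOrCreateTSID` are hypothetical names, plain Go maps stand in for the `storage/tsid` cache and the on-disk per-day index, and a counter stands in for `vm_slow_row_inserts_total`. It shows that a cache miss consults only the entries for the given day, and that a static series ends up with one identical `MetricName -> TSID` entry per day.

```go
package main

import "fmt"

// TSID is a stand-in for the internal series id (the real one has more fields).
type TSID uint64

// dayKey addresses one series within one UTC day.
type dayKey struct {
	date       uint64 // day number by UTC
	metricName string // canonical metric name
}

// index is a hypothetical in-memory model, not the lib/storage implementation.
type index struct {
	cache       map[string]TSID // models the in-memory storage/tsid cache
	perDay      map[dayKey]TSID // models the on-disk per-day MetricName -> TSID index
	slowInserts int             // models vm_slow_row_inserts_total
	nextTSID    TSID
}

func newIndex() *index {
	return &index{cache: map[string]TSID{}, perDay: map[dayKey]TSID{}}
}

// getOrCreateTSID returns the TSID for metricName, registering it for the given day.
func (ix *index) getOrCreateTSID(date uint64, metricName string) TSID {
	k := dayKey{date: date, metricName: metricName}
	if tsid, ok := ix.cache[metricName]; ok {
		// Fast path: active series. Store the same TSID for this day,
		// which is why static series cost extra index space per day.
		ix.perDay[k] = tsid
		return tsid
	}
	// Slow path: cache miss, so consult only the index for this day.
	ix.slowInserts++
	tsid, ok := ix.perDay[k]
	if !ok {
		// New series: allocate a TSID and register it for this day.
		ix.nextTSID++
		tsid = ix.nextTSID
		ix.perDay[k] = tsid
	}
	ix.cache[metricName] = tsid
	return tsid
}

func main() {
	ix := newIndex()
	// The day numbers below are arbitrary illustrative values.
	a1 := ix.getOrCreateTSID(19551, `cpu_usage{host="a"}`)
	a2 := ix.getOrCreateTSID(19552, `cpu_usage{host="a"}`)
	b := ix.getOrCreateTSID(19552, `cpu_usage{host="b"}`)
	fmt.Println(a1 == a2, a1 != b, len(ix.perDay), ix.slowInserts)
}
```

In this model the static series keeps its TSID across both days but occupies two per-day entries, while each first sighting of a series counts as one slow insert.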