VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-16 00:41:24 +01:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	11bd290201	lib/storage: search for TSIDs for the given metricIDs in the previous indexdb if they aren't found in the current indexdb The issue triggers after the indexdb rotation for time series, which stop receiving new samples. This results in missing data for such time series in query responses. This commit should address the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3502 The issue has been introduced in `2dd93449d8`	2022-12-19 11:56:49 -08:00
Aliaksandr Valialkin	7d5c64eb7a	all: add `-inmemoryDataFlushInterval` command-line flag for controlling the frequency of saving in-memory data to disk The main purpose of this command-line flag is to increase the lifetime of low-end flash storage with the limited number of write operations it can perform. Such flash storage is usually installed on Raspberry PI or similar appliances. For example, `-inmemoryDataFlushInterval=1h` reduces the frequency of disk write operations to up to once per hour if the ingested one-hour worth of data fits the limit for in-memory data. The in-memory data is searchable in the same way as the data stored on disk. VictoriaMetrics automatically flushes the in-memory data to disk on graceful shutdown via SIGINT signal. The in-memory data is lost on unclean shutdown (hardware power loss, OOM crash, SIGKILL). Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337	2022-12-05 15:28:09 -08:00
Aliaksandr Valialkin	a13d21513e	lib/mergeset: panic when too long item is passed to Table.AddItems()	2022-12-03 23:37:20 -08:00
Aliaksandr Valialkin	106332cd9f	lib/storage: properly pass retentionMsecs to OpenStorage() at TestIndexDBRepopulateAfterRotation	2022-12-03 23:03:30 -08:00
Zakhar Bessarab	e407e7243a	{app/vmstorage,app/vmselect}: add API to get list of existing tenants (#3348 ) * {app/vmstorage,app/vmselect}: add API to get list of existing tenants * {app/vmstorage,app/vmselect}: add API to get list of existing tenants * app/vmselect: fix error message * {app/vmstorage,app/vmselect}: fix error messages * app/vmselect: change log level for error handling * wip Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-11-25 10:32:45 -08:00
Aliaksandr Valialkin	a6d4711ac6	lib/storage: add support for retention filters (aka multiple retentions for distinct sets of time series) Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/143 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/289	2022-10-24 16:41:59 +03:00
Aliaksandr Valialkin	2fc82b846e	lib/storage: do not pass retentionMsecs and isReadOnly args explicitly - access them via Storage arg This makes code easier to read. This is a follow-up after `d2d30581a0`	2022-10-24 01:32:56 +03:00
Aliaksandr Valialkin	2dd93449d8	lib/storage: subsitute searchTSIDs functions with more lightweight searchMetricIDs function The searchTSIDs function was searching for metricIDs matching the the given tag filters and then was locating the corresponding TSID entries for the found metricIDs. The TSID entries aren't needed when searching for time series names (aka MetricName), so this commit removes the uneeded TSID search from the implementation of /api/v1/series API. This improves perfromance of /api/v1/series calls. This commit also improves performance a bit for /api/v1/query and /api/v1/query_range calls, since now these calls cache small metricIDs instead of big TSID entries in the indexdb/tagFilters cache (now this cache is named indexdb/tagFiltersToMetricIDs) without the need to compress the saved entries in order to save cache space. This commit also removes concurrency limiter during searching for matching time series, which was introduced in `8f16388428`, since the concurrency for all the read queries is already limited with -search.maxConcurrentRequests command-line flag. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648	2022-10-23 12:43:44 +03:00
Aliaksandr Valialkin	fe52378f45	lib/storage: substitute remaining calls to fs.MustRemoveAll with fs.MustRemoveDirAtomic	2022-09-13 15:49:25 +03:00
Aliaksandr Valialkin	78f9a8aafd	lib/storage: put the (date, metricID) entry in dateMetricIDCache just after the corresponding series is registered in the per-day inverted index Previously the time series could be put into dateMetricIDCache without registering in the per-day inverted index if GetOrCreateTSIDByName finds TSID entry in the global index. This could lead to missing series in query results. The issue has been introduced in the commit `55e7afae3a`, which has been included in VictoriaMetrics v1.78.0	2022-07-05 14:56:55 +03:00
Aliaksandr Valialkin	270ad39359	lib/storage: properly take into account already registered series when `-storage.maxHourlySeries` or `-storage.maxDailySeries` limits are enabled The commit `5fb45173ae` takes into account only newly registered series when applying cardinality limits. This means that the cardinality limit could be exceeded with already registered series. This commit returns back accounting for already registered series when applying cardinality limits.	2022-06-20 13:53:41 +03:00
Aliaksandr Valialkin	7a79e7c0ef	lib/storage: create per-day indexes together with global indexes when registering new time series Previously the creation of per-day indexes and global indexes for the newly registered time series was decoupled. Now global indexes and per-day indexes for the current day are created toghether for new time series. This should speed up registering new time series a bit.	2022-06-19 22:32:41 +03:00
Aliaksandr Valialkin	45fa9d798d	app/vmselect: accept `focusLabel` query arg at /api/v1/status/tsdb	2022-06-14 18:39:00 +03:00
Aliaksandr Valialkin	fb77843639	lib/storage: show top labels with the highest number of series in cardinality explorer	2022-06-14 16:34:13 +03:00
Aliaksandr Valialkin	4af43a4a75	lib/storage: test GetTSDBStatusWithFiltersForDate on a global time range	2022-06-12 14:28:37 +03:00
Aliaksandr Valialkin	61e03f172b	app/vmselect: optimize `/api/v1/labels` and `/api/v1/label/.../values` handlers when `match[]` query arg is passed to them	2022-06-12 14:06:24 +03:00
Aliaksandr Valialkin	cb39eada77	all: improve query tracing coverage for indexdb search Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1403	2022-06-09 20:04:02 +03:00
Dmytro Kozlov	f2754c3e90	Cardinality explorer (#2625 ) * Cardinality explorer * vmui, vmselect: updated field name, added description to spinner * make vmui-update * updated const name, make vmui-update * lib/storage: changes calculation for totalSeries values * added static files * wip * wip * wip * wip * docs/CHANGELOG.md: document cardinality explorer feature See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2233 Co-authored-by: f41gh7 <nik@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-06-08 18:54:27 +03:00
Aliaksandr Valialkin	fedfc9e686	lib/storage: stop background merge when storage enters read-only mode This should prevent from `no space left on device` errors when VictoriaMetrics under-estimates the additional disk space needed for background merge. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2603	2022-06-01 14:22:12 +03:00
Aliaksandr Valialkin	afced37c0b	all: add initial support for query tracing See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#query-tracing Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1403	2022-06-01 02:31:44 +03:00
Aliaksandr Valialkin	b843f0e229	app/vmselect: add fine-grained limits for the number of returned/scanned time series for various APIs	2022-03-26 11:28:14 +02:00
Aliaksandr Valialkin	a6d65fc824	lib/storage: typo fix after `e7831ae154`	2022-03-18 16:53:19 +02:00
jduncan0000	e7831ae154	Fix for issue #2255 - matchTagFilters for positive empty-match filters (#2304 ) * fix for issue 2255 - matchTagFilters for positive empty-match filters * add example to comments * formatting * add test for positive empty match * formatting	2022-03-18 13:08:54 +02:00
Aliaksandr Valialkin	244c23ea2c	lib/workingsetcache: reduce the default cache rotation period from hour to 20 minutes This should reduce memory usage under high time series churn rate	2022-02-23 13:42:27 +02:00
Roman Khavronenko	d107f86fbc	lib/index: reduce read/write load after indexDB rotation (#2177 ) * lib/index: reduce read/write load after indexDB rotation IndexDB in VM is responsible for storing TSID - ID's used for identifying time series. The index is stored on disk and used by both ingestion and read path. IndexDB is stored separately to data parts and is global for all stored data. It can't be deleted partially as VM deletes data parts. Instead, indexDB is rotated once in `retention` interval. The rotation procedure means that `current` indexDB becomes `previous`, and new freshly created indexDB struct becomes `current`. So in any time, VM holds indexDB for current and previous retention periods. When time series is ingested or queried, VM checks if its TSID is present in `current` indexDB. If it is missing, it checks the `previous` indexDB. If TSID was found, it gets copied to the `current` indexDB. In this way `current` indexDB stores only series which were active during the retention period. To improve indexDB lookups, VM uses a cache layer called `tsidCache`. Both write and read path consult `tsidCache` and on miss the relad lookup happens. When rotation happens, VM resets the `tsidCache`. This is needed for ingestion path to trigger `current` indexDB re-population. Since index re-population requires additional resources, every index rotation event may cause some extra load on CPU and disk. While it may be unnoticeable for most of the cases, for systems with very high number of unique series each rotation may lead to performance degradation for some period of time. This PR makes an attempt to smooth out resource usage after the rotation. The changes are following: 1. `tsidCache` is no longer reset after the rotation; 2. Instead, each entry in `tsidCache` gains a notion of indexDB to which they belong; 3. On ingestion path after the rotation we check if requested TSID was found in `tsidCache`. Then we have 3 branches: 3.1 Fast path. It was found, and belongs to the `current` indexDB. Return TSID. 3.2 Slow path. It wasn't found, so we generate it from scratch, add to `current` indexDB, add it to `tsidCache`. 3.3 Smooth path. It was found but does not belong to the `current` indexDB. In this case, we add it to the `current` indexDB with some probability. The probability is based on time passed since the last rotation with some threshold. The more time has passed since rotation the higher is chance to re-populate `current` indexDB. The default re-population interval in this PR is set to `1h`, during which entries from `previous` index supposed to slowly re-populate `current` index. The new metric `vm_timeseries_repopulated_total` was added to identify how many TSIDs were moved from `previous` indexDB to the `current` indexDB. This metric supposed to grow only during the first `1h` after the last rotation. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 Signed-off-by: hagen1778 <roman@victoriametrics.com> * wip * wip Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-02-12 00:34:44 +02:00
Aliaksandr Valialkin	6d6cf1b6e0	lib/storage: verify that the tsidsFound contain the needed tsids in tests added at `f4dead529f`	2021-09-11 11:02:56 +03:00
Aliaksandr Valialkin	c2f37f049b	lib/storage: properly search series by multiple tag filters matching empty labels such as foo{bar=~"baz\|",x=~"y\|"} Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1601 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/395	2021-09-09 21:12:53 +03:00
Aliaksandr Valialkin	c473d8ffe1	li/storage: re-use the per-day inverted index search code for searching in global index This allows removing a big pile of outdated code for global index search. This may help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1486	2021-07-30 10:28:20 +03:00
Aliaksandr Valialkin	b133de1e37	lib/storage: move deletedMetricIDs set from indexDB to Storage This makes consitent the list of deleted metricIDs when it is used from both the current indexDB and the previous indexDB (aka extDB). This should fix the issue, which could lead to storing new samples under deleted metricIDs after indexDB rotation. See more details at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347#issuecomment-861232136 . Thanks to @tangqipengleoo for the initial analysis and the pull request - https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 . This commit resolves the issue in more generic way compared to https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 . The downside of the commit is the deletedMetricIDs set isn't cleaned from the metricIDs outside the retention. It needs app restart. This should be OK in most cases.	2021-06-15 15:07:54 +03:00
Aliaksandr Valialkin	ce10bdc82a	lib/storage: reset cache on disk during series deletion and during indexdb rotation This should prevent from inconsistent behavior (aka partially missing data for some time series) after unclean shutdown. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347	2021-06-11 12:54:36 +03:00
Aliaksandr Valialkin	402a8ca710	lib/storage: do not populate MetricID->MetricName cache during data ingestion This cache isn't needed during data ingestion, so there is no need in spending RAM on it. This reduces RAM usage on data ingestion path by 30%	2021-05-24 03:06:40 +03:00
Aliaksandr Valialkin	2839055513	lib/storage: substitute GetTSDBStatusForDate with GetTSDBStatusWithFiltersForDate with nil tfss	2021-05-13 09:01:05 +03:00
Nikolay	be87be34a4	Adds tsdb match filters (#1282 ) * init work on filters * init propose for status filters * fixes tsdb status adds test * fix bug * removes checks from test	2021-05-12 17:16:58 +03:00
Aliaksandr Valialkin	40e47935e7	lib/storage: improve Search.NextMetricBlock performance by using MetricID->MetricName cache	2021-03-22 23:02:59 +02:00
Aliaksandr Valialkin	587132555f	lib/mergeset: reduce memory usage for inmemoryBlock by using more compact items representation This also should reduce CPU time spent by GC, since inmemoryBlock.items don't have pointers now, so GC doesn't need visiting them.	2021-02-21 22:09:10 +02:00
Aliaksandr Valialkin	148422bcba	lib/storage: disable composite index usage when querying old data	2021-02-10 14:57:58 +02:00
Aliaksandr Valialkin	c5e6c5f5a6	app/vmselect: optimize querying for `/api/v1/labels` and `/api/v1/label/<name>/values` when `start` and `end` args are set	2020-11-05 01:19:29 +02:00
Aliaksandr Valialkin	f3a7e6f6e3	lib/storage: remove obsolete code	2020-11-02 19:17:30 +02:00
Aliaksandr Valialkin	fd7dd5064a	lib/storage: code cleanup after `10f2eedee0` Remove the code that uses metricIDs caches for the current and the previous hour during metricIDs search, since this code became unused after implementing per-day inverted index almost a year ago. While at it, fix a bug, which could prevent from finding time series with names containing dots (aka Graphite-like names such as `foo.bar.baz`).	2020-10-01 19:12:04 +03:00
Aliaksandr Valialkin	94cc677b0c	lib/storage: slightly reduce code difference between single-node and cluster versions	2020-07-24 01:18:05 +03:00
Aliaksandr Valialkin	fb3d1380ac	lib/storage: respect `-search.maxQueryDuration` when searching for time series in inverted index Previously the time spent on inverted index search could exceed the configured `-search.maxQueryDuration`. This commit stops searching in inverted index on query timeout.	2020-07-23 21:22:05 +03:00
Aliaksandr Valialkin	be0ab4fbfe	lib/storage: reset `MetricName->TSID` cache after marking metricIDs as deleted This is a follow-up commit after `12b16077c4` , which didn't reset the `tsidCache` in all the required places. This could result in indefinite errors like: missing metricName by metricID ...; this could be the case after unclean shutdown; deleting the metricID, so it could be re-created next time Fix this by resetting the cache inside deleteMetricIDs function.	2020-07-14 14:05:19 +03:00
Aliaksandr Valialkin	d962568e93	all: use %w instead of %s for wrapping errors in `fmt.Errorf` This will simplify examining the returned errors such as httpserver.ErrorWithStatusCode . See https://blog.golang.org/go1.13-errors for details.	2020-06-30 23:33:46 +03:00
Aliaksandr Valialkin	c40f29f783	lib/storage: properly match `{tag!="\|foo"}` filters Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/546	2020-06-10 19:34:37 +03:00
Aliaksandr Valialkin	eca1afdc20	lib/storage: fix Graphite wildcard matching, which has been broken in v1.36.0	2020-05-28 11:58:47 +03:00
Aliaksandr Valialkin	b0131c79b6	lib/storage: improve search speed for time series matching Graphite whildcards such as `foo..bar.baz` Add index for reverse Graphite-like metric names with dots. Use this index during search for filters like `__name__=~"foo\\.[^.]\\.bar\\.baz"` which end with non-empty suffix with dots, i.e. `.bar.baz` in this case. This change may "hide" historical time series during queries. The workaround is to add `[.]` to the end of regexp label filter, i.e. "foo\\.[^.]\\.bar\\.baz" should be substituted with "foo\\.[^.]\\.bar\\.baz[.]".	2020-05-27 21:48:08 +03:00
Aliaksandr Valialkin	67e331ac62	lib/storage: optimize ingestion pefrormance for new time series	2020-05-15 12:12:19 +03:00
Aliaksandr Valialkin	f9526809e5	app/vmselect: add `/api/v1/status/tsdb` page with useful stats for locating root cause for high cardinality issues See https://prometheus.io/docs/prometheus/latest/querying/api/#tsdb-stats Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/425 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/268	2020-04-22 22:03:23 +03:00
Aliaksandr Valialkin	0ad7aaf535	lib/storage: add missing reset for tagFilter.matchesEmptyValue on tagFilter.Init	2020-04-01 17:40:27 +03:00
Aliaksandr Valialkin	494ad0fdb3	lib/storage: remove inmemory index for recent hour, since it uses too much memory Production workload shows that the index requires ~4Kb of RAM per active time series. This is too much for high number of active time series, so let's delete this index. Now the queries should fall back to the index for the current day instead of the index for the recent hour. The query performance for the current day index should be good enough given the 100M rows/sec scan speed per CPU core.	2019-11-13 18:08:58 +02:00

1 2

67 Commits