VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-15 08:23:34 +01:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	11bd290201	lib/storage: search for TSIDs for the given metricIDs in the previous indexdb if they aren't found in the current indexdb The issue triggers after the indexdb rotation for time series, which stop receiving new samples. This results in missing data for such time series in query responses. This commit should address the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3502 The issue has been introduced in `2dd93449d8`	2022-12-19 11:56:49 -08:00
Aliaksandr Valialkin	512c73cef9	lib/storage: properly set buf capacity inside marshalMetricID Previously it was always set to 0. In theory this could result into incorrect marshaling of metricIDs. The issue has been introduced in `5e4dfe50c6`	2022-12-19 10:31:38 -08:00
Aliaksandr Valialkin	a13d21513e	lib/mergeset: panic when too long item is passed to Table.AddItems()	2022-12-03 23:37:20 -08:00
Zakhar Bessarab	e407e7243a	{app/vmstorage,app/vmselect}: add API to get list of existing tenants (#3348 ) * {app/vmstorage,app/vmselect}: add API to get list of existing tenants * {app/vmstorage,app/vmselect}: add API to get list of existing tenants * app/vmselect: fix error message * {app/vmstorage,app/vmselect}: fix error messages * app/vmselect: change log level for error handling * wip Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-11-25 10:32:45 -08:00
Aliaksandr Valialkin	5d0a91afd5	lib/storage: properly pass uint64 constant to fmt.Errorf on 32-bit platforms	2022-10-23 12:48:43 +03:00
Aliaksandr Valialkin	2dd93449d8	lib/storage: subsitute searchTSIDs functions with more lightweight searchMetricIDs function The searchTSIDs function was searching for metricIDs matching the the given tag filters and then was locating the corresponding TSID entries for the found metricIDs. The TSID entries aren't needed when searching for time series names (aka MetricName), so this commit removes the uneeded TSID search from the implementation of /api/v1/series API. This improves perfromance of /api/v1/series calls. This commit also improves performance a bit for /api/v1/query and /api/v1/query_range calls, since now these calls cache small metricIDs instead of big TSID entries in the indexdb/tagFilters cache (now this cache is named indexdb/tagFiltersToMetricIDs) without the need to compress the saved entries in order to save cache space. This commit also removes concurrency limiter during searching for matching time series, which was introduced in `8f16388428`, since the concurrency for all the read queries is already limited with -search.maxConcurrentRequests command-line flag. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648	2022-10-23 12:43:44 +03:00
Aliaksandr Valialkin	fe52378f45	lib/storage: substitute remaining calls to fs.MustRemoveAll with fs.MustRemoveDirAtomic	2022-09-13 15:49:25 +03:00
Roman Khavronenko	2c59c83191	lib/storage: fix the search for empty label name (#2991 ) * lib/storage: fix the search for empty label name Signed-off-by: hagen1778 <roman@victoriametrics.com> * Apply suggestions from code review Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-08-19 11:05:09 +03:00
Aliaksandr Valialkin	1a363192ff	lib/storage: typo fix in comments after `f830edc0bc`	2022-08-16 13:45:32 +03:00
Aliaksandr Valialkin	dc929e0d16	lib/storage: improve performance for /api/v1/labels and /api/v1/label/.../values endpoints when `match[]` filter matches small number of time series Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2978	2022-08-16 13:34:23 +03:00
Aliaksandr Valialkin	9f95099cf4	lib/storage: explain why the GetOrCreateTSIDByName function doesnt check whether the per-day entry for the given date exists if TSID is found in global index	2022-08-02 09:13:41 +03:00
Aliaksandr Valialkin	586d267a44	lib/storage: do not compress small number of tsids when storing them in tagFiltersCache This speeds up tsids retreival from the cache for 0-2 tsids	2022-07-30 00:11:14 +03:00
Aliaksandr Valialkin	5afa54e845	lib/vmselectapi: use string type for tagKey and tagValuePrefix args at TagValueSuffixes() This improves the API consistency	2022-07-06 12:46:21 +03:00
Aliaksandr Valialkin	78f9a8aafd	lib/storage: put the (date, metricID) entry in dateMetricIDCache just after the corresponding series is registered in the per-day inverted index Previously the time series could be put into dateMetricIDCache without registering in the per-day inverted index if GetOrCreateTSIDByName finds TSID entry in the global index. This could lead to missing series in query results. The issue has been introduced in the commit `55e7afae3a`, which has been included in VictoriaMetrics v1.78.0	2022-07-05 14:56:55 +03:00
Aliaksandr Valialkin	4fb0f15322	all: readability improvements for query traces - show dates in human-readable format, e.g. 2022-05-07, instead of a numeric value - limit the maximum length of queries and filters shown in trace messages	2022-06-30 18:19:43 +03:00
Aliaksandr Valialkin	926fccbb8d	lib/storage: add querytracer to more contexts querytracer has been added to the following storage.Storage methods: - RegisterMetricNames - DeleteMetrics - SearchTagValueSuffixes - SearchGraphitePaths	2022-06-27 12:53:49 +03:00
Aliaksandr Valialkin	270ad39359	lib/storage: properly take into account already registered series when `-storage.maxHourlySeries` or `-storage.maxDailySeries` limits are enabled The commit `5fb45173ae` takes into account only newly registered series when applying cardinality limits. This means that the cardinality limit could be exceeded with already registered series. This commit returns back accounting for already registered series when applying cardinality limits.	2022-06-20 13:53:41 +03:00
Aliaksandr Valialkin	7a79e7c0ef	lib/storage: create per-day indexes together with global indexes when registering new time series Previously the creation of per-day indexes and global indexes for the newly registered time series was decoupled. Now global indexes and per-day indexes for the current day are created toghether for new time series. This should speed up registering new time series a bit.	2022-06-19 22:32:41 +03:00
Aliaksandr Valialkin	88e1221b35	lib/storage: do not register new series if `-storage.maxHourlySeries` or `-storage.maxDailySeries` limits are exceeded Previously samples for new series weren't added as expected when series limits were reached, but new series were still registered in indexdb.	2022-06-19 22:03:02 +03:00
Aliaksandr Valialkin	45fa9d798d	app/vmselect: accept `focusLabel` query arg at /api/v1/status/tsdb	2022-06-14 18:39:00 +03:00
Aliaksandr Valialkin	fb77843639	lib/storage: show top labels with the highest number of series in cardinality explorer	2022-06-14 16:34:13 +03:00
Aliaksandr Valialkin	3167fbc21d	lib/storage: improve error message when -search.max* command-line flag values are exceeded	2022-06-14 13:28:21 +03:00
Aliaksandr Valialkin	61e03f172b	app/vmselect: optimize `/api/v1/labels` and `/api/v1/label/.../values` handlers when `match[]` query arg is passed to them	2022-06-12 14:06:24 +03:00
Aliaksandr Valialkin	cb39eada77	all: improve query tracing coverage for indexdb search Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1403	2022-06-09 20:04:02 +03:00
Dmytro Kozlov	f2754c3e90	Cardinality explorer (#2625 ) * Cardinality explorer * vmui, vmselect: updated field name, added description to spinner * make vmui-update * updated const name, make vmui-update * lib/storage: changes calculation for totalSeries values * added static files * wip * wip * wip * wip * docs/CHANGELOG.md: document cardinality explorer feature See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2233 Co-authored-by: f41gh7 <nik@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-06-08 18:54:27 +03:00
Roman Khavronenko	e9ee043879	lib/storage: make `indexdb/tagFilters` cache size configurable (#2667 ) The default size of `indexdb/tagFilters` now can be overridden via `storage.cacheSizeIndexDBTagFilters` flag. Please, be careful with changing default size since it may lead to inefficient work of the vmstorage or OOM exceptions. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2663 Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2022-06-01 14:57:39 +03:00
Aliaksandr Valialkin	fedfc9e686	lib/storage: stop background merge when storage enters read-only mode This should prevent from `no space left on device` errors when VictoriaMetrics under-estimates the additional disk space needed for background merge. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2603	2022-06-01 14:22:12 +03:00
Aliaksandr Valialkin	afced37c0b	all: add initial support for query tracing See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#query-tracing Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1403	2022-06-01 02:31:44 +03:00
Aliaksandr Valialkin	945e9fa8c4	lib/storage: `make fmt`	2022-05-31 12:42:48 +03:00
Aliaksandr Valialkin	727cc119b6	lib/storage: do not take into account series from the next day when `match[]` filter is passed to /api/v1/status/tsdb	2022-05-31 12:42:48 +03:00
Aliaksandr Valialkin	81b7a31cb1	app/vmstorage: properly handle `maxSeries` limit passed from vmselect to vmstorage	2022-04-12 11:19:07 +03:00
Aliaksandr Valialkin	123a88bb65	lib/storage: reuse sync.WaitGroup objects This reduces GC load by up to 10% according to memory profiling	2022-04-06 14:00:50 +03:00
Aliaksandr Valialkin	b843f0e229	app/vmselect: add fine-grained limits for the number of returned/scanned time series for various APIs	2022-03-26 11:28:14 +02:00
jduncan0000	e7831ae154	Fix for issue #2255 - matchTagFilters for positive empty-match filters (#2304 ) * fix for issue 2255 - matchTagFilters for positive empty-match filters * add example to comments * formatting * add test for positive empty match * formatting	2022-03-18 13:08:54 +02:00
Aliaksandr Valialkin	28b610db07	lib/storage: document why job-like and instance-like labels must be stored at mn.Tags[0] and mn.Tags[1] Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2244	2022-02-25 13:21:53 +02:00
Aliaksandr Valialkin	d1881fa582	lib/storage: add a comment to indexSearch.containsTimeRange() on why it allows false positives Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2239	2022-02-24 12:48:33 +02:00
Aliaksandr Valialkin	244c23ea2c	lib/workingsetcache: reduce the default cache rotation period from hour to 20 minutes This should reduce memory usage under high time series churn rate	2022-02-23 13:42:27 +02:00
Aliaksandr Valialkin	fcaa0c5202	lib/storage: optimize `/api/v1/status/tsdb` call by skipping all the artificially created tag entries at once This is a follow-up for `b71be42d90`	2022-02-21 19:00:04 +02:00
Aliaksandr Valialkin	5260d7a954	lib/storage: typo fix after `c3affb0c4f`	2022-02-17 12:56:33 +02:00
Aliaksandr Valialkin	d9bdb42219	lib/storage: simplify code for searching for label values This is a follow-up after `9dd191b27c`	2022-02-17 12:39:14 +02:00
Aliaksandr Valialkin	2ebc3d21c3	lib/storage: properly skip composite tag entries when searching for tag names or tag values This is a follow-up for `b71be42d90` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2200	2022-02-16 23:02:18 +02:00
Aliaksandr Valialkin	ee066aa0d5	lib/storage: use binary search instead of full scan for skipping artificial tags when searching for tag names or tag values This should improve performance for /api/v1/labels and /api/v1/label/<label_name>/values See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2200	2022-02-16 18:17:27 +02:00
Aliaksandr Valialkin	53c2135d2a	lib/storage: tune the logic for pre-populating of the per-day inverted index for the next day - Postpone the pre-poulation to the last hour of the current day. This should reduce the number of useless entries in the next per-day index, which shouldn't be created there, when the corresponding time series are stopped to be pushed during the current day. - Make the pre-population more smooth in time by using the hash of MetricID instead of MetricID itself when calculating the need for for the given MetricID pre-population. - Sync the logic for pre-population of the next day inverted index with the logic of pre-populating tsid cache after indexdb rotation. This should improve code maintainability. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/430 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401	2022-02-12 16:39:33 +02:00
Roman Khavronenko	d107f86fbc	lib/index: reduce read/write load after indexDB rotation (#2177 ) * lib/index: reduce read/write load after indexDB rotation IndexDB in VM is responsible for storing TSID - ID's used for identifying time series. The index is stored on disk and used by both ingestion and read path. IndexDB is stored separately to data parts and is global for all stored data. It can't be deleted partially as VM deletes data parts. Instead, indexDB is rotated once in `retention` interval. The rotation procedure means that `current` indexDB becomes `previous`, and new freshly created indexDB struct becomes `current`. So in any time, VM holds indexDB for current and previous retention periods. When time series is ingested or queried, VM checks if its TSID is present in `current` indexDB. If it is missing, it checks the `previous` indexDB. If TSID was found, it gets copied to the `current` indexDB. In this way `current` indexDB stores only series which were active during the retention period. To improve indexDB lookups, VM uses a cache layer called `tsidCache`. Both write and read path consult `tsidCache` and on miss the relad lookup happens. When rotation happens, VM resets the `tsidCache`. This is needed for ingestion path to trigger `current` indexDB re-population. Since index re-population requires additional resources, every index rotation event may cause some extra load on CPU and disk. While it may be unnoticeable for most of the cases, for systems with very high number of unique series each rotation may lead to performance degradation for some period of time. This PR makes an attempt to smooth out resource usage after the rotation. The changes are following: 1. `tsidCache` is no longer reset after the rotation; 2. Instead, each entry in `tsidCache` gains a notion of indexDB to which they belong; 3. On ingestion path after the rotation we check if requested TSID was found in `tsidCache`. Then we have 3 branches: 3.1 Fast path. It was found, and belongs to the `current` indexDB. Return TSID. 3.2 Slow path. It wasn't found, so we generate it from scratch, add to `current` indexDB, add it to `tsidCache`. 3.3 Smooth path. It was found but does not belong to the `current` indexDB. In this case, we add it to the `current` indexDB with some probability. The probability is based on time passed since the last rotation with some threshold. The more time has passed since rotation the higher is chance to re-populate `current` indexDB. The default re-population interval in this PR is set to `1h`, during which entries from `previous` index supposed to slowly re-populate `current` index. The new metric `vm_timeseries_repopulated_total` was added to identify how many TSIDs were moved from `previous` indexDB to the `current` indexDB. This metric supposed to grow only during the first `1h` after the last rotation. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 Signed-off-by: hagen1778 <roman@victoriametrics.com> * wip * wip Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-02-12 00:34:44 +02:00
Aliaksandr Valialkin	80fc3fda07	lib/storage: follow-up for `38bf5fc136`	2022-01-05 16:02:17 +02:00
weng zhao	1e0fe615ad	vmstorage: fix query like `{foo=~"bar\|"}` return extra timeseries cause by negative filter transformation malfunction (#2032 ) 1. L2749 make kb.B remain the value of comonPrefix instead of tf.prefix 2. L2762 avoid change tf.value from "bar\|" to ".+r\|"	2022-01-05 15:57:54 +02:00
Aliaksandr Valialkin	ab4be24397	app/vmstorage: export vm_cache_size_max_bytes metrics for determining capacity of various caches The vm_cache_size_max_bytes metric can be used for determining caches which reach their capacity via the following query: vm_cache_size_bytes / vm_cache_size_max_bytes > 0.9	2021-12-02 10:30:01 +02:00
Aliaksandr Valialkin	c2f37f049b	lib/storage: properly search series by multiple tag filters matching empty labels such as foo{bar=~"baz\|",x=~"y\|"} Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1601 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/395	2021-09-09 21:12:53 +03:00
Aliaksandr Valialkin	c473d8ffe1	li/storage: re-use the per-day inverted index search code for searching in global index This allows removing a big pile of outdated code for global index search. This may help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1486	2021-07-30 10:28:20 +03:00
Aliaksandr Valialkin	a846febc89	Revert "lib/uint64set: allow reusing bucket16 structs inside uint64set.Set via uint64set.Release method" This reverts commit `7c6d3981bf`. Reason for revert: high contention at bucket16Pool on systems with big number of CPU cores. This slows down query processing significantly.	2021-07-06 18:26:56 +03:00

1 2 3 4 5

217 Commits