VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-11-23 12:31:07 +01:00

Author	SHA1	Message	Date
Nikolay	1985110de2	lib/storage: properly check for minMissingTimestamps After changes at commit `787b9cd`. Minimal timestamps for extDB check was performed without context of the index search prefix. It worked fine for Single node version, but for cluster version a different prefix was used for metricID search requests. It may lead to incomplete results, if minimal missing timestamp was cached for the tenant with different ingestion patterns. Minimal reproducible case is: - metrics were ingested for tenants 0 and 1 - at some point in time metrics ingestion for tenant 1 stopped - index records have the following timestamps layout: tenant 0: 1,2,3,4,5,6 tenant 1: 1,2,3,4 - after indexDB rotation, containsTimeRange lookups may produce incorrect results: time range request for tenant 1 - 5:6 caches 5 as min timestamp request for the same or smaller time range for tenant 0 now returns empty results. Second case: - requests for the tenant without metrics always updates atomic value with incorrect minimal time range for other tenants. This commit replaces single atomic with map of search prefix keys. It should have slight performance overhead, but work consistently for cluster version. minMissingTimestamp is cached by prefix search key, which included tenantID. Since it will be only populated at runtime, it doesn't hold unused tenants for queries. Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7417	2024-11-15 16:25:13 +01:00
Zakhar Bessarab	9f9cc24e4c	Revert "lib/mergeset: add sparse indexdb cache (#7269 )" This reverts commit `837d0d136d`.	2024-11-04 10:29:14 -03:00
Zakhar Bessarab	4e50d6eed3	lib/storage/partition: prevent panic in case resulting in-memory part is empty after merge (#7329 ) Some checks failed publish-docs / Build (push) Waiting to run Details build / Build (push) Has been cancelled Details CodeQL Go / Analyze (push) Has been cancelled Details main / lint (push) Has been cancelled Details main / test (test-full) (push) Has been cancelled Details main / test (test-full-386) (push) Has been cancelled Details main / test (test-pure) (push) Has been cancelled Details It is possible for in-memory part to be empty if ingested samples are removed by retention filters. In this case, data will not be discarded due to retention before creating in memory part. After in-memory parts merge samples will be removed resulting in creating completely empty part at destination. This commit checks for resulting part and skips it, if it's empty. --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2024-10-27 20:40:13 +01:00
Zakhar Bessarab	837d0d136d	lib/mergeset: add sparse indexdb cache (#7269 ) Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7182 - add a separate index cache for searches which might read through large amounts of random entries. Primary use-case for this is retention and downsampling filters, when applying filters background merge needs to fetch large amount of random entries which pollutes an index cache. Using different caches allows to reduce effect on memory usage and cache efficiency of the main cache while still having high cache hit rate. A separate cache size is 5% of allowed memory. - reduce size of indexdb/dataBlocks cache in order to free memory for new sparse cache. Reduced size by 5% and moved this to a separate cache. - add a separate metricName search which does not cache metric names - this is needed in order to allow disabling metric name caching when applying downsampling/retention filters. Applying filters during background merge accesses random entries, this fills up cache and does not provide an actual improvement due to random access nature. Merge performance and memory usage stats before and after the change: - before ![image](https://github.com/user-attachments/assets/485fffbb-c225-47ae-b5c5-bc8a7c57b36e) - after ![image](https://github.com/user-attachments/assets/f4ba3440-7c1c-4ec1-bc54-4d2ab431eef5) --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2024-10-24 15:21:17 +02:00
Artem Fetishev	6b9f57e5f7	lib/storage: Fix flaky test: TestStorageRotateIndexDB (#7267 ) Some checks are pending build / Build (push) Waiting to run Details CodeQL Go / Analyze (push) Waiting to run Details main / lint (push) Waiting to run Details main / test (test-full) (push) Blocked by required conditions Details main / test (test-full-386) (push) Blocked by required conditions Details main / test (test-pure) (push) Blocked by required conditions Details publish-docs / Build (push) Waiting to run Details This commit fixes the TestStorageRotateIndexDB flaky test reported at: #6977. Sample test failure: https://pastebin.com/bTSs8HP1 The test fails because one goroutine adds items to the indexDB table while another goroutine is closing that table. This may happen if indexDB rotation happens twice during one Storage.add() operation: - Storage.add() takes the current indexDB and adds index recods to it - First index db rotation makes the current index DB a previous one (still ok at this point) - Second index db rotation removes the indexDB that was current two rotations earlier. It does this by setting the mustDrop flag to true and decrementing the ref counter. The ref counter reaches zero which cases the underlying indexdb table to release its resources gracefully. Graceful release assumes that the table is not written anymore. But Storage.add() still adds items to it. The solution is to increment the indexDB ref counters while it is used inside add(). The unit test has been changed a little so that the test fails reliably. The idea is to make add() function invocation to last much longer, therefore the test inserts not just one record at a time but thouthands of them. To see the test fail, just replace the idbsLocked() func with: ```go unc (s Storage) idbsLocked2() (indexDB, *indexDB, func()) { return s.idbCurr.Load(), s.idbNext.Load(), func() {} } ``` --------- Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>	2024-10-23 11:48:21 +02:00
Zhu Jiekun	8c50c38a80	vmstorage: auto calculate maxUniqueTimeseries based on resources (#6961 ) ### Describe Your Changes Add support for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6930 Calculate `-search.maxUniqueTimeseries` by `-search.maxConcurrentRequests` and remaining memory if it's not set or less equal than 0. The remaining memory is affected by `-memory.allowedPercent`, `-memory.allowedBytes` and cgroup memory limit. ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> (cherry picked from commit `85f60237e2`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-10-18 14:00:14 +02:00
Roman Khavronenko	0d4f4b8f7d	(app\|lib)/vmstorage: do not increment `vm_rows_ignored_total` on NaNs (#7166 ) `vm_rows_ignored_total` metric is a metric for users to signalize about ingestion issues, such as bad timestamp or parsing error. In commit `a5424e95b3` this metric started to increment each time vmstorage gets NaN. But NaN is a valid value for Prometheus data model and for Prometheus metrics exposition format. Exporters from Prometheus ecosystem could expose NaNs as values for metrics and these values will be delivered to vmstorage and increment the metric. Since there is nothing user can do with this, in opposite to parsing errors or bad timestamps, there is not much sense in incrementing this metric. So this commit rolls-back `reason="nan_value"` increments. ### Describe Your Changes Please provide a brief description of the changes you made. Be as specific as possible to help others understand the purpose and impact of your modifications. ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-10-02 12:37:27 +02:00
Artem Fetishev	ed5da38ede	Introduce a flag for limiting the number of time series to delete (#7091 ) ### Describe Your Changes Introduce the `-search.maxDeleteSeries` flag that limits the number of time series that can be deleted with a single `/api/v1/admin/tsdb/delete_series` call. Currently, any number can be deleted and if the number is big (millions) then the operation may result in unaccounted CPU and memory usage spikes which in some cases may result in OOM kill (see #7027). The flag limits the number to 30k by default and the users may override it if needed at the vmstorage start time. --------- Signed-off-by: Artem Fetishev <rtm@victoriametrics.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2024-09-30 10:02:21 +02:00
Artem Fetishev	55febc0920	lib/storage: restore ability to put empty metric ID list into tagFiltersToMetricIDsCache (#7064 ) ### Describe Your Changes Currently it the metricID list is empty it won't be mashalled and as the result won't be put into the tagFiltersToMetricIDsCache which causes the cache misses for the corresponding tagFilters. In some setups this causes severe search speed detradation (see #7009). The empty metric IDs was covered before but then was accidentally removed in `6c21439`. This PR restores the coverage of this case. A new unit test can be used as a proof that empty metricID lists are not added to the cache (just remove the fix in index_db.go and run the test to see the result) Also a benchmark has been added to see the implications of the compression. ``` user@laptop:~/p/github.com/rtm0/VictoriaMetrics/01/src$ go test ./lib/storage/ -run=NONE -bench BenchmarkMarshalUnmarshalMetricIDs --loggerLevel=ERROR goos: linux goarch: amd64 pkg: github.com/VictoriaMetrics/VictoriaMetrics/lib/storage cpu: 13th Gen Intel(R) Core(TM) i7-1355U BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-0-12 3237240 363.5 ns/op 0 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-1-12 2831049 451.8 ns/op 0.4706 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-10-12 1152764 1009 ns/op 1.667 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-100-12 297055 3998 ns/op 5.755 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-1000-12 31172 34566 ns/op 8.484 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-10000-12 4900 289659 ns/op 9.416 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-100000-12 447 2341173 ns/op 9.456 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-1000000-12 42 24926928 ns/op 9.468 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-10000000-12 5 204098872 ns/op 9.467 compression-rate PASS ok github.com/VictoriaMetrics/VictoriaMetrics/lib/storage 15.018s ``` ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-09-20 17:21:53 +02:00
Aliaksandr Valialkin	787b9cd9a0	lib/storage: improve performance for indexSearch.containsTimeRange() The indexSearch.containsTimeRange() function is called for the current indexDB and the previous indexDB every time when searching for metricIDs by label filters. This function consumes a lot of additional CPU time for cases when queries with lightweight label filters are sent to VictoriaMetrics at high rate (e.g. thousands of RPS), like in the issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7009 . Optimize indexSearch.containsTimeRange() function in the following ways: - Unconditionally return true if this function is called for the current indexDB, since there are very high chances that the current indexDB contains the data with timestamps in the requested time range. - Cache the minimum timestamp, which is missing in the indexed data for the previous indexDB. This is safe to do, since the previous indexDB is readonly. This optimization eliminates potentially slow lookup in the previous indexDB for typical use cases when the requested time range is close to the current time.	2024-09-20 13:07:20 +02:00
Aliaksandr Valialkin	6f61e9d49d	lib/storage: simplify indexDB.doExtDB() usage by removing the returned value Previously indexDB.doExtDB() was returning boolean value, which was indicating whether f callback was called. There is no need in returning this boolean value, since the f callback can determine on itself whether it was called. This simplifies the code a bit. While at it, document indexDB.doExtDB().	2024-09-20 11:59:57 +02:00
Roman Khavronenko	218c533874	lib/storage: follow-up after `d8f8822fa5` (#7036 ) Make function name and comments more clear. `d8f8822fa5` Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2024-09-20 11:50:47 +02:00
Nikolay	d8f8822fa5	lib/storage: consistently check for missing metricID index records (#6967 ) * Previously, only metricID->metricName missing index records were tracked with deadline But it was possible a case for missing metricID->TSID index records. IndexDB metrics fix exposed misleading metric for such missing records. * This commit adds check for metricID->TSID missing index records. And delete missing metricID entry if it hit 60 second deadline. Related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6931 Signed-off-by: f41gh7 <nik@victoriametrics.com>	2024-09-16 10:05:08 +02:00
Artem Fetishev	a5424e95b3	lib/storage: adds metrics that count records that failed to insert ### Describe Your Changes Add storage metrics that count records that failed to insert: - `RowsReceivedTotal`: the number of records that have been received by the storage from the clients - `RowsAddedTotal`: the number of records that have actually been persisted. This value must be equal to `RowsReceivedTotal` if all the records have been valid ones. But it will be smaller otherwise. The values of the metrics below should provide the insight of why some records hasn't been added - `NaNValueRows`: the number of records whose value was `NaN` - `StaleNaNValueRows`: the number of records whose value was `Stale NaN` - `InvalidRawMetricNames`: the number of records whose raw metric name has failed to unmarshal. The following metrics existed before this PR and are listed here for completeness: - `TooSmallTimestampRows`: the number of records whose timestamp is negative or is older than retention period - `TooBigTimestampRows`: the number of records whose timestamp is too far in the future. - `HourlySeriesLimitRowsDropped`: the number of records that have not been added because the hourly series limit has been exceeded. - `DailySeriesLimitRowsDropped`: the number of records that have not been added because the daily series limit has been exceeded. --- Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com>	2024-09-06 17:57:21 +02:00
Artem Fetishev	39294b4919	lib/storage: do not drop stale NaN samples (#6936 ) This patch reverts `1fd3385` After discussing it we've come to conclusion that this is a valid behavior which can be avoided by deleting the time series only once the corresponding stale NaNs have been received. On the other hand, the fix leads to lost stale NaNs in some rare but valid use cases. For example: - In a cluster configuration the samples for a given time series are normally sent to the same vmstorage replica. However, wminsert may reroute the samples to another replica because the original one is down or is overloaded. In this case the stale NaN may end up on a replica that has no data for that time series, but we still want to record that sample. Thus, reverting that fix. --- related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5069 Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com>	2024-09-05 16:45:09 +02:00
Hui Wang	b48f5f3e59	lib/storage: fix metric `vm_object_references{type="indexdb"}` (#6937 ) follow up `4ecc370acb` ### Describe Your Changes Please provide a brief description of the changes you made. Be as specific as possible to help others understand the purpose and impact of your modifications. ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).	2024-09-05 16:42:49 +02:00
rtm0	4df243d530	lib/storage: improve the message of the tooManyTimeseries error (#6893 ) ### Describe Your Changes This is a follow-up for #6836. Per @valyala's [comment](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6836#discussion_r1730291704), the error message does not reflect which flag needs to be adjusted. ### Checklist The following checks are mandatory: - [x ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com>	2024-09-03 10:28:03 +02:00
rtm0	2c856c6951	tests: check Metrics.RowsAddedTotal in unit tests (#6895 ) ### Describe Your Changes This is a follow-up PR: Unit tests introduced in #6872 can now use RowsAddedTotal counter whose scope was fixed in #6841. ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2024-08-30 14:31:15 +02:00
Nikolay	4ecc370acb	lib/storage: properly add previous indexDB metrics (#6890 ) Previously, some extIndexDB metrics were not registered. It resulted into missing metrics, if metric value was added to the extIndexDB. It's a usual case for search requests at both indexes. Current commit updates all metrics from extIndexDB according to the current IndexDB. It must fix such cases Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6868 ### Describe Your Changes Please provide a brief description of the changes you made. Be as specific as possible to help others understand the purpose and impact of your modifications. ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).	2024-08-28 11:14:28 +02:00
rtm0	9fcfba3927	lib/storage: properly handle maxMetrics limit at metricID search `TL;DR` This PR improves the metric IDs search in IndexDB: - Avoid seaching for metric IDs twice when `maxMetrics` limit is exceeded - Use correct error type for indicating that the `maxMetrics` limit is exceded - Simplify the logic of deciding between per-day and global index search A unit test has been added to ensure that this refactoring does not break anything. --- Function calls before the fix: ``` idb.searchMetricIDs \|__ is.searchMetricIDs \|__ is.searchMetricIDsInternal \|__ is.updateMetricIDsForTagFilters \|__ is.tryUpdatingMetricIDsForDateRange \| \| \|__ is.getMetricIDsForDateAndFilters ``` - `searchMetricIDsInternal` searches metric IDs for each filter set. It maintains a metric ID set variable which is updated every time the `updateMetricIDsForTagFilters` function is called. After each successful call, the function checks the length of the updated metric ID set and if it is greater than `maxMetrics`, the function returns `too many timeseries` error. - `updateMetricIDsForTagFilters` uses either per-day or global index to search metric IDs for the given filter set. The decision of which index to use is made is made within the `tryUpdatingMetricIDsForDateRange` function and if it returns `fallback to global search` error then the function uses global index by calling `getMetricIDsForDateAndFilters` with zero date. - `tryUpdatingMetricIDsForDateRange` first checks if the given time range is larger than 40 days and if so returns `fallback to global search` error. Otherwise it proceeds to searching for metric IDs within that time range by calling `getMetricIDsForDateAndFilters` for each date. - `getMetricIDsForDateAndFilters` searches for metric IDs for the given date and returns `fallback to global search` error if the number of found metric IDs is greater than `maxMetrics`. Problems with this solution: 1. The `fallback to global search` error returned by `getMetricIDsForDateAndFilters` in case when maxMetrics is exceeded is misleading. 2. If `tryUpdatingMetricIDsForDateRange` proceeds to date range search and returns `fallback to global search` error (because `getMetricIDsForDateAndFilters` returns it) then this will trigger global search in `updateMetricIDsForTagFilters`. However the global search uses the same maxMetrics value which means this search is destined to fail too. I.e. the same search is performed twice and fails twice. 3. `too many timeseries` error is already handled in `searchMetricIDsInternal` and therefore handing this error in `updateMetricIDsForTagFilters` is redundant 4. updateMetricIDsForTagFilters is a better place to make a decision on whether to use per-day or global index. Solution: 1. Use a dedicated error for `too many timeseries` case 2. Handle `too many timeseries` error in `searchMetricIDsInternal` only 3. Move the per-day or global search decision from `tryUpdatingMetricIDsForDateRange` to `updateMetricIDsForTagFilters` and remove `fallback to global search` error. --------- Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2024-08-27 21:39:03 +02:00
rtm0	eef6943084	lib/storage: properly register index records with RegisterMetricNames Once the timeseries is in tsidCache, new entries won't be created in per-day index because the RegisterMetricNames() code does consider different dates for the same timeseries. So this case has been added. The same bug exists for AddRows() but it is not manifested because the index entries are finally created in updatePerDateData(). RegisterMetricNames also updated to increase the newTimeseriesCreated counter because it actually creates new time series in index. A unit tests has been added that check all possible data patterns (different metric names and dates) and code branches in both RegisterMetricNames and AddRows. The total number of new unit tests is around 100 which increaded the running time of storage tests by 50%. --------- Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com> Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>	2024-08-27 21:33:53 +02:00
rtm0	30f98916f9	Move rowsAddedTotal counter to Storage (#6841 ) ### Describe Your Changes Reduced the scope of rowsAddedTotal variable from global to Storage. This metric clearly belongs to a given Storage object as it counts the number of records added by a given Storage instance. Reducing the scope improves the incapsulation and allows to reset this variable during the unit tests (i.e. every time a new Storage object is created by a test, that object gets a new variable). Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com>	2024-08-27 21:30:37 +02:00
Aliaksandr Valialkin	8551fbe9f3	Revert "refactor(vmstorage): Refactor the code to reduce the time complexity of `MustAddRows` and improve readability (#6629 )" This reverts commit `e280d90e9a`. Reason for revert: the updated code doesn't improve the performance of table.MustAddRows for the typical case when rows contain timestamps belonging to ptws[0]. The performance may be improved in theory for the case when all the rows belong to partiton other than ptws[0], but this partition is automatically moved to ptws[0] by the code at lines `6aad1d43e9/lib/storage/table.go (L287-L298)` , so the next time the typical case will work. Also the updated code makes the code harder to follow, since it introduces an additional level of indirection with non-trivial semantics inside table.MustAddRows - the partition.TimeRangeInPartition() function. This function needs to be inspected and understood when reading the code at table.MustAddRows(). This function depends on minTsInRows and maxTsInRows vars, which are defined and initialized many lines above the partition.TimeRangeInPartition() call. This complicates reading and understanding the code even more. The previous code was using clearer loop over rows with the clear call to partition.HasTimestamp() for every timestamp in the row. The partition.HasTimestamp() call is used in the table.MustAddRows() function multiple times. This makes the use of partition.HasTimestamp() call more consistent, easier to understand and easier to maintain comparing to the mix of partition.HasTimestamp() and partition.TimeRangeInPartition() calls. Aslo, there is no need in documenting some hardcore software engineering refactoring at docs/CHANGLELOG.md, since the docs/CHANGELOG.md is intended for VictoriaMetrics users, who may not know software engineering. The docs/CHANGELOG.md must document user-visible changes, and the docs must be concise and clear for VictoriaMetrics users. See https://docs.victoriametrics.com/contributing/#pull-request-checklist for more details. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6629	2024-07-25 14:32:09 +02:00
Ruixiang Tan	e280d90e9a	refactor(vmstorage): Refactor the code to reduce the time complexity of `MustAddRows` and improve readability (#6629 ) ### Describe Your Changes The original logic is not only highly complex but also poorly readable, so it can be modified to increase readability and reduce time complexity. --------- Co-authored-by: Zhu Jiekun <jiekun@victoriametrics.com>	2024-07-25 08:55:12 +02:00
rtm0	bdc0e688e8	Fix inconsistent error handling in Storage.AddRows() (#6583 ) ### Describe Your Changes `Storage.AddRows()` returns an error only in one case: when `Storage.updatePerDateData()` fails to unmarshal a `metricNameRaw`. But the same error is treated as a warning when it happens inside `Storage.add()` or returned by `Storage.prefillNextIndexDB()`. This commit fixes this inconsistency by treating the error returned by `Storage.updatePerDateData()` as a warning as well. As a result `Storage.add()` does not need a return value anymore and so doesn't `Storage.AddRows()`. Additionally, this commit adds a unit test that checks all cases that result in a row not being added to the storage. --------- Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2024-07-17 12:07:14 +02:00
Aliaksandr Valialkin	784327ea30	lib/uint64set: optimize Set.Has() for nil Set - it should be inlined now This makes unnecessary the checkDeleted variable at lib/storage/index_db.go This is a follow-up for `b984f4672e` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6342	2024-07-15 23:59:20 +02:00
Aliaksandr Valialkin	c995ccad93	lib/{storage,mergeset}: do not allow setting dataFlushInterval to values smaller than pending{Items,Rows}FlushInterval Pending rows and items unconditionally remain in memory for up to pending{Items,Rows}FlushInterval, so there is no any sense in setting dataFlushInterval (the interval for guaranteed flush of in-memory data to disk) to values smaller than pending{Items,Rows}FlushInterval, since this doesn't affect the interval for flushing pending rows and items from memory to disk. This is a follow-up for `4c80b17027` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6221	2024-07-15 10:08:15 +02:00
Aliaksandr Valialkin	3c02937a34	all: consistently use 'any' instead of 'interface{}' 'any' type is supported starting from Go1.18. Let's consistently use it instead of 'interface{}' type across the code base, since `any` is easier to read than 'interface{}'.	2024-07-10 00:20:37 +02:00
rtm0	a42bd59ee4	Fix Date metricid cache consistency under concurrent use (#6534 ) ### Describe Your Changes Fix Date metricid cache consistency under concurrent use. When one goroutine calls Has() and does not find the cache entry in the immutable map it will acquire a lock and check the mutable map. And it is possible that before that lock is acquired, the entry is moved from the mutable map to the immutable map by another goroutine causing a cache miss. The fix is to check the immutable map again once the lock is acquired. ### Checklist The following checks are mandatory: - [x ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2024-06-26 17:33:38 +02:00
Roman Khavronenko	b984f4672e	lib/storage: filter deleted label names and values from `/api/v1/labe… (#6342 ) …ls` and `/api/v1/label/.../values` Check for deleted metrics when `match[]` filter matches small number of time series (optimized path). The issue was introduced [v1.81.0](https://docs.victoriametrics.com/changelog_2022/#v1810). Related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6300 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2978 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-05-29 14:07:44 +02:00
Aliaksandr Valialkin	4b458370c1	lib/logstorage: work-in-progress	2024-05-24 03:06:55 +02:00
Nikolay	a5d1013042	lib/storage: change default value for maxLabelValueLen to 1024 (#6313 ) * It must reduce memory usage for misbehaving clients. Since VictoriaMetrics stores sparse index inmemory. * Reduce disk space usage for indexdb. * Prevent possible indexDB items drops. * It may trigger slow insert and new timeseries registration due to default value for flag change https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6176 --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-05-22 21:53:53 +02:00
Aliaksandr Valialkin	ad505a7a9a	lib/logstorage: work-in-progress	2024-05-20 04:08:30 +02:00
Aliaksandr Valialkin	cc2647d212	lib/encoding: optimize UnmarshalVarUint64, UnmarshalVarInt64 and UnmarshalBytes a bit Change the return values for these functions - now they return the unmarshaled result plus the size of the unmarshaled result in bytes, so the caller could re-slice the src for further unmarshaling. This improves performance of these functions in hot loops of VictoriaLogs a bit.	2024-05-14 01:23:54 +02:00
Hui Wang	4c80b17027	storage: correctly apply `-inmemoryDataFlushInterval` when it's set t… (#6221 ) …o minimum supported value 1s pendingRowsFlushInterval was bumped to 2s in `73f0a805e2`	2024-05-13 16:44:30 +02:00
Aliaksandr Valialkin	590160ddbb	lib/slicesutil: add helper functions for setting slice length and extending its capacity The added helper functions - SetLength() and ExtendCapacity() - replace error-prone code with simple function calls.	2024-05-12 11:32:17 +02:00
Aliaksandr Valialkin	f20d452196	lib/storage: remove outdated misleading comments	2024-05-12 10:24:04 +02:00
Aliaksandr Valialkin	6b1cc9b946	lib/storage: search for all the values for the given label before applying filters and limits It is incorrect applying the limit on the number of values to search without applying filters, since the returned subset of label values may miss the label values matching the given filters. This is a follow-up for `66630c7960`	2024-04-18 20:29:36 +02:00
Aliaksandr Valialkin	66630c7960	lib/storage: improve performance for /api/v1/label/labelName/values when match[] contains only a single filter on labelName This speeds up auto-suggestion for metric names in VMUI and Grafana, which use the following query in this case: /api/v1/label/__name__/values?match[]={__name__=~".some_value."} When the user types `some_value` in the query input field.	2024-04-18 01:15:20 +02:00
Aliaksandr Valialkin	85d09e5a2d	lib/{mergeset,storage}: log deleting directories inside partitions if they are missing in parts.json This should improve debuggability of unexpected deletion of directories inside partitions. While at it, log the proper path to parts.json when the directory for big part is missing in the partition. parts.json is located inside directory with small parts, and there is no parts.json file inside directory with big parts.	2024-04-16 19:11:32 +02:00
Aliaksandr Valialkin	6bcc6c938b	lib/storage: improve comments inside functions responsible for creating indexes for newly registered time series	2024-04-16 19:11:32 +02:00
Aliaksandr Valialkin	918cccaddf	all: fix golangci-lint(revive) warnings after `0c0ed61ce7` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6001	2024-04-02 23:16:29 +03:00
Aliaksandr Valialkin	c3a72b6cdb	lib/storage: consistently use stopCh instead of stop	2024-04-02 21:24:57 +03:00
Zakhar Bessarab	af3922b1df	lib/storage: add ability to use downsampling for the given series filter (#733 ) * lib/storage: add ability to use downsampling for the given series filter Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs: add information about downsampling filters Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs: fix MetricsQL filter Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/downsampling: treat missing downsampling filter as a bug Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/part_header: verify correctness of downsampling filters when opening partition Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/downsampling: save only appliable rules in part metadata Filter and save only rules which are appliable to partition based on MinTimestamp of stored data. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/downsampling: update log messages for final dedup Properly specify a reason of re-running deduplication for partition. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage: consistently use MaxTimestamp to determine deduplication/downsampling rules Using MinTimestamp leads to applying downsampling to parts which are only partially covered by downsampling rule. For example, partition covers range [1000-2000]. At t=2100 and rule offset 500 data with t=2100-500 => 1600 must be downsampled. The range check against MinTimestamp evaluates to true even though partition contains range which must not be downsampled - [1600:2000]. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * Follow-up - Apply the first matching downsampling period if multiple filters match the given time series. This allows fine-tuning the downsampling config for the specific needs. - Take into account downsampling filters during search queries. - Reduce the difference between community and enterprise branches. This should simplify further maintenance of these branches. - Properly parse series filters with colons inside them. - Document the feature at docs/CHANGELOG.md. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4960 --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-03-30 04:12:23 +02:00
Aliaksandr Valialkin	131f357098	lib/storage/table.go: reduce the difference with enterprise branch	2024-03-30 03:22:51 +02:00
Aliaksandr Valialkin	4001ca36b8	lib/storage/partition.go: reduce code difference a bit with enterprise branch	2024-03-30 01:39:27 +02:00
Nikolay	a05303eaa0	lib/storage: adds metrics for downsampling (#382 ) * lib/storage: adds metrics for downsampling vm_downsampling_partitions_scheduled - shows the number of parts, that must be downsampled vm_downsampling_partitions_scheduled_size_bytes - shows total size in bytes for parts, the must be donwsampled These two metrics answer the questions - is downsampling running? how many parts scheduled for downsampling and how many of them currently downsampled? Storage space that it occupies. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2612 * wip Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-03-30 01:11:49 +02:00
Aliaksandr Valialkin	4a359d5f67	lib/storage: follow-up for `76f00cea6b` Store the deadline when the metricID entries must be deleted from indexdb if metricID->metricName entry isn't found after the deadline. This should make the code more clear comparing the the previous version, where the timestamp of the first metricID->metricName lookup miss was stored in missingMetricIDs. Remove the misleading comment about the importance of the order for creating entries in the inverted index when registering new time series. The order doesn't matter, since any subset of the created entries can become visible for search before any other subset after registering in indexdb. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5948 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5959	2024-03-27 11:41:28 +02:00
Zakhar Bessarab	51f5ac1929	lib/storage/table: wait for merges to be completed when closing a table (#5965 ) * lib/storage/table: properly wait for force merges to be completed during shutdown Properly keep track of running background merges and wait for merges completion when closing the table. Previously, force merge was not in sync with overall storage shutdown which could lead to holding ptw ref. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs: add changelog entry Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2024-03-26 13:49:09 +01:00
Aliaksandr Valialkin	76f00cea6b	lib/storage: wait for up to 60 seconds before deciding to delete metricID entries from indexdb if metricID->metricName entry is missing during search The metricID->metricName entry can remain invisible for search for some time after registering new metricName. This is expected condition. So wait for up to 60 seconds in the hope that the metricID->metricName entry will become visible before deleting all the entries from indexdb, which are associated with the given metricID. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5959 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5948 See also `20812008a7`	2024-03-18 00:34:32 +02:00

1 2 3 4 5 ...

793 Commits