VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-11-23 20:37:12 +01:00

Author	SHA1	Message	Date
Artem Fetishev	ed5da38ede	Introduce a flag for limiting the number of time series to delete (#7091 ) Introduce the `-search.maxDeleteSeries` flag that limits the number of time series that can be deleted with a single `/api/v1/admin/tsdb/delete_series` call. Currently, any number can be deleted, and if the number is big (millions) then the operation may result in unaccounted CPU and memory usage spikes, which in some cases may result in an OOM kill (see #7027). The flag limits the number to 30k by default, and users may override it if needed at vmstorage start time. --------- Signed-off-by: Artem Fetishev <rtm@victoriametrics.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2024-09-30 10:02:21 +02:00
YuDong Tang	295f2aa8ca	app/vmselect: add command-line flag -search.inmemoryBufSizeBytes (#6869 ) Add command-line flag `-search.inmemoryBufSizeBytes` for configuring the size of in-memory buffers used by vmselect during processing of vmstorage responses. A new summary metric `vm_tmp_blocks_inmemory_file_size_bytes` is exposed to show the size of the buffer during request processing. The new setting can be used by experienced users to adjust memory usage by vmselect when processing many small read requests. Instead of allocating 4MB buffers each time, vmselect can be instructed to lower the buffer size via `-search.inmemoryBufSizeBytes`. To decide whether this flag needs to be adjusted, users can consult `vm_tmp_blocks_inmemory_file_size_bytes`, which shows the actual size of buffers used during query processing. ---------- The detailed information of this PR can be found in https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6851 --------- Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `cab3ef8294`)	2024-08-26 14:48:53 +02:00
hagen1778	f283126084	fix typos in comments Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-08-06 14:54:49 +02:00
Aliaksandr Valialkin	3c02937a34	all: consistently use 'any' instead of 'interface{}' 'any' type is supported starting from Go1.18. Let's consistently use it instead of 'interface{}' type across the code base, since `any` is easier to read than 'interface{}'.	2024-07-10 00:20:37 +02:00
Aliaksandr Valialkin	6a0cf2cd29	app/vmselect/netstorage: add a comment explaining why all the samples in block are taken into account when checking the -search.maxSamplesPerQuery limit Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5851 This is a follow-up for `b07a02c516`	2024-06-25 03:01:43 +02:00
Aliaksandr Valialkin	b07a02c516	Revert "app/vmselect: fix the way of counting raw samples in single query (#6464 )" This reverts commit `6e395048d3`. Reason for revert: the previous logic was correct. The purpose of the `-search.maxSamplesPerQuery` command-line flag is to limit the amount of CPU resources that a single query can take - see https://docs.victoriametrics.com/#resource-usage-limits . VictoriaMetrics processes samples in blocks during querying - it reads the block, then unpacks it, then filters out samples outside the selected time range. This means that it _spends CPU time_ on reading and unpacking _all the samples_ in every block on the requested time range, even if only a single sample per block matches the given time range. The previous logic was effectively limiting the CPU time a single query could take. The new logic fails to limit the CPU time a single query could take in some pathological cases when only a small fraction of samples per requested block fits the requested time range. This allows performing multiplication DoS-attacks by querying very narrow time ranges over historical blocks, which tend to be full. For example, if `-search.maxSamplesPerQuery` equals a billion, and the query requests a single sample out of 8K samples per block, this means that the query may unpack a billion such blocks without exceeding the limit, e.g. it may unpack and process 8K*1e9=8e12 samples. This is not what the resource usage limits were created for originally - see https://docs.victoriametrics.com/#resource-usage-limits Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5851 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6464	2024-06-25 02:43:57 +02:00
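The accounting difference described in the revert above can be illustrated with a minimal Go sketch. The `block` type and all function names here are hypothetical and are not taken from the VictoriaMetrics code base; the sketch only contrasts "samples unpacked" with "samples matching the time range" for a narrow query.

```go
package main

import "fmt"

// block is a hypothetical stand-in for a storage block; it is NOT the
// VictoriaMetrics type. It only holds sample timestamps.
type block struct {
	timestamps []int64
}

// makeBlocks builds n blocks, each holding timestamps 0..samplesPerBlock-1.
func makeBlocks(n, samplesPerBlock int) []block {
	blocks := make([]block, n)
	for i := range blocks {
		ts := make([]int64, samplesPerBlock)
		for j := range ts {
			ts[j] = int64(j)
		}
		blocks[i].timestamps = ts
	}
	return blocks
}

// unpackedSamples counts every sample in every scanned block. This is the
// quantity that tracks the CPU time spent on reading and unpacking.
func unpackedSamples(blocks []block) int {
	n := 0
	for _, b := range blocks {
		n += len(b.timestamps)
	}
	return n
}

// matchedSamples counts only the samples inside [start, end].
func matchedSamples(blocks []block, start, end int64) int {
	n := 0
	for _, b := range blocks {
		for _, ts := range b.timestamps {
			if ts >= start && ts <= end {
				n++
			}
		}
	}
	return n
}

func main() {
	blocks := makeBlocks(3, 8)
	// A narrow query that selects a single timestamp per block.
	fmt.Println("unpacked:", unpackedSamples(blocks))
	fmt.Println("matched: ", matchedSamples(blocks, 3, 3))
}
```

With a limit applied to the matched count, the narrow query above would be charged for 3 samples while 24 are unpacked; scaled to 8K samples per block, this is the 8K multiplier mentioned in the commit message.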
Hui Wang	6e395048d3	app/vmselect: fix the way of counting raw samples in single query (#6464 ) The limit is specified with command-line flag `-search.maxSamplesPerQuery`. Previously, samples might be over-counted and query can't be fixed by reducing time range. address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5851	2024-06-14 15:40:30 +02:00
Aliaksandr Valialkin	c79bf3925c	Revert "app/vmselect: make vmselect resilient to absence of cache folder (#5987 )" This reverts commit `cb23685681`. Reason for revert: the "fix" may hide programming bugs related to incorrect creation of folders before their use. This may complicate detecting and fixing such bugs in the future. There are the following fixes for the issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5985 : - Configure the OS not to drop data from the system-wide temporary directory (aka /tmp). - Run VictoriaMetrics with the -cacheDataPath command-line flag pointing to a directory that cannot be removed automatically by the OS. The case when the user accidentally deletes the directory with some files created by VictoriaMetrics shouldn't be considered as expected, so VictoriaMetrics shouldn't try resolving this case automatically. From the operation and debuggability point of view, it is much better to crash with the clear `directory doesn't exist` error in this case.	2024-03-30 07:29:24 +02:00
Zakhar Bessarab	af3922b1df	lib/storage: add ability to use downsampling for the given series filter (#733 ) * lib/storage: add ability to use downsampling for the given series filter Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs: add information about downsampling filters Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs: fix MetricsQL filter Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/downsampling: treat missing downsampling filter as a bug Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/part_header: verify correctness of downsampling filters when opening partition Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/downsampling: save only appliable rules in part metadata Filter and save only rules which are appliable to partition based on MinTimestamp of stored data. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/downsampling: update log messages for final dedup Properly specify a reason of re-running deduplication for partition. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage: consistently use MaxTimestamp to determine deduplication/downsampling rules Using MinTimestamp leads to applying downsampling to parts which are only partially covered by downsampling rule. For example, partition covers range [1000-2000]. At t=2100 and rule offset 500 data with t=2100-500 => 1600 must be downsampled. The range check against MinTimestamp evaluates to true even though partition contains range which must not be downsampled - [1600:2000]. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * Follow-up - Apply the first matching downsampling period if multiple filters match the given time series. This allows fine-tuning the downsampling config for the specific needs. - Take into account downsampling filters during search queries. 
- Reduce the difference between community and enterprise branches. This should simplify further maintenance of these branches. - Properly parse series filters with colons inside them. - Document the feature at docs/CHANGELOG.md. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4960 --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-03-30 04:12:23 +02:00
Roman Khavronenko	cb23685681	app/vmselect: make vmselect resilient to absence of cache folder (#5987 ) vmselect uses a cache folder in file system for two purposes: 1. Storing rollup cache results on shutdown; 2. Storing temporary search results from vmstorage during query executions. It could happen that cache folder is deleted accidentally by user, or by OS during cleanup routines. This would cause vmselect to: 1. panic on /metrics call, because `MustGetFreeSpace` will fail; 2. return query error user, as it won't be able to store temporary search results. The changes in this commit are the following: 1. Make `MustGetFreeSpace` to try re-creating the cache folder if it is missing; 2. Make vmselect to try re-creating the cache folder if it can't persist tmp search results. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5985 Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2024-03-26 12:59:50 +01:00
Aliaksandr Valialkin	146fccc22d	app/vmselect/netstorage: use unsafe.SliceData instead of deprecated reflect.SliceHeader	2024-02-29 17:36:28 +02:00
Aliaksandr Valialkin	6697da73e5	app: consistently use atomic.* types instead of atomic.* functions See `ea9e2b19a5`	2024-02-24 02:44:24 +02:00
Aliaksandr Valialkin	f46eaf92eb	app/vmselect: add -search.maxLabelsAPIDuration and -search.maxLabelsAPISeries options for fine-tuning CPU and RAM usage for /api/v1/series , /api/v1/labels and /api/v1/label/.../values This commit restores the limits for these endpoints, which were removed at `5d66ee88bd` , since it turned out that missing limits result in high CPU usage, while the introduced concurrency limiter results in failed lightweight requests to these endpoints because of timeouts when heavyweight requests are executed. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5055	2024-02-23 02:57:16 +02:00
Dan Dascalescu	17cf031fa1	app/vmselect: simplify wording for `too many samples` error (#5827 )	2024-02-20 16:26:38 +01:00
Aliaksandr Valialkin	0e3c532bf7	app/vmselect/netstorage: prevent disk write IO when closing temporary files Remove the temporary file before closing it in order to signal the OS that it shouldn't store the file contents from page cache to disk when the file is closed. Gracefully handle the case when the file cannot be removed before being closed - in this case remove the file after closing it. This allows working on Windows. Also remove superfluous opening of the temporary file for reading - re-use the already opened file handle for writing. This is a follow-up for `9b1e002287` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4020 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2024-02-01 19:12:44 +02:00
Aliaksandr Valialkin	db4623efc2	app/vmselect/netstorage: properly handle the case when an empty brsPool points to the end of brs.brs This case is possible after a new brsPool is allocated. The fix is to verify whether len(brsPool) >= len(brs.brs) before trying to append a new item to brsPool and sharing its contents with brs.brs. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5733	2024-01-31 10:27:50 +02:00
Aliaksandr Valialkin	1c58c00618	app/vmselect/netstorage: limit the initial size for brsPoolCap with 32Kb This should reduce the number of expensive memory allocations with sizes bigger than 32Kb	2024-01-23 22:29:39 +02:00
Aliaksandr Valialkin	43ecd5d258	app/vmselect/netstorage: pre-allocate memory for metricNamesBuf This should reduce the number of metricNamesBuf re-allocations in append()	2024-01-23 21:34:16 +02:00
Aliaksandr Valialkin	41456d9569	app/vmselect/netstorage: limit the maximum brsPool size to 32Kb at ProcessSearchQuery() This avoids slow path in Go runtime for allocating objects bigger than 32Kb - see `704401ffa0/src/runtime/malloc.go (L11)` This also reduces memory usage a bit for vmselect and single-node VictoriaMetrics after the commit `5dd37ad836` . Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527	2024-01-23 14:04:49 +02:00
Aliaksandr Valialkin	1f1768d7af	app/vmselect/netstorage: limit the size of metricNamesBuf to 32Kb in order to avoid slow path at Go runtime for allocating a byte slice of bigger size See `704401ffa0/src/runtime/malloc.go (L11)` This also reduces the average memory usage a bit for vmselect and single-node VictoriaMetrics after the commit `508c608062` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527	2024-01-23 13:46:37 +02:00
Aliaksandr Valialkin	e0399ec29a	app/vmselect/netstorage: remove tswPool, since it isn't efficient	2024-01-23 02:28:30 +02:00
Aliaksandr Valialkin	72a838a2a1	app/vmselect/netstorage: avoid metricName->blockRef lookup when processing multiple blocks for the same time series This saves a few CPU cycles for common case	2024-01-23 02:28:29 +02:00
Aliaksandr Valialkin	5dd37ad836	app/vmselect/netstorage: use []blockRef from blockRefPool in order to reduce memory allocations	2024-01-23 02:28:29 +02:00
Aliaksandr Valialkin	7345567c29	app/vmselect/netstorage: substitute pointer to blockRefs by brssPool index at the metricName->blockRefs map This should reduce the pressure on Go GC, since it will see lower number of pointers. This change has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527	2024-01-23 02:28:29 +02:00
Aliaksandr Valialkin	678234e9f0	app/vmselect/netstorage: reduce the number of allocations for blockRefs objects in ProcessSearchQuery() This should reduce pressure on Go GC at vmselect The change has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527	2024-01-23 02:28:28 +02:00
Aliaksandr Valialkin	508c608062	app/vmselect/netstorage: reduce the number of memory allocations in ProcessSearchQuery() by storing all the metric names in a single byte slice This reduces the number of memory allocations at the cost of possible memory usage increase, since now different metric name strings may hold references to the previous byte slice. This is good tradeoff, since ProcessSearchQuery is called in vmselect, and vmselect isn't usually limited by memory. This change has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527	2024-01-23 02:28:28 +02:00
Nikolay	c9f39fd51f	app/vmselect/netstorage (#5649 ) * app/vmselect/netstorage correctly handle errGlobal set * wip Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5649 --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-21 02:47:29 +02:00
Roman Khavronenko	b8b6e120ff	app/vmselect: limit the number of parallel workers by 32 (#5195 ) * app/vmselect: limit the number of parallel workers by 32 The change should improve performance and memory usage during query processing on machines with big number of CPU cores. The number of parallel workers for query processing is controlled via `-search.maxWorkersPerQuery` command-line flag. By default, the number of workers is limited by the number of available CPU cores, but not more than 32. The limit can be increased via `-search.maxWorkersPerQuery`. Signed-off-by: hagen1778 <roman@victoriametrics.com> * wip - The `-search.maxWorkersPerQuery` command-line flag doesn't limit resource usage, so move it from the `resource usage limits` to `troubleshooting` chapter at docs/Single-server-VictoriaMetrics.md - Make more clear the description for the `-search.maxWorkersPerQuery` command-line flag - Add the description of `-search.maxWorkersPerQuery` to docs/Cluster-VictoriaMetrics.md - Limit the maximum value, which can be passed to `-search.maxWorkersPerQuery`, to GOMAXPROCS, because bigger values may worsen query performance and increase CPU usage - Improve the the description of the change at docs/CHANGELOG.md. Mark it as FEATURE instead of BUGFIX, since it is closer to a feature than to a bugfix. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5087 --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-10-18 19:51:37 +02:00
Aliaksandr Valialkin	214be01dfa	app/vmselect/netstorage: remove duplicate `see` word from the error message This is a follow-up for `ac6c40e896` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4827	2023-08-14 02:05:44 -07:00
Aliaksandr Valialkin	ac6c40e896	all: refer to https://docs.victoriametrics.com/#resource-usage-limits in the error message about -search.max* limit Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4827	2023-08-14 01:57:34 -07:00
Aliaksandr Valialkin	45e345806c	app/vmselect/netstorage: remove runtime.Gosched() call from unpackWorker() This should improve scalability of unpackWorker() on systems with many CPU cores. This is a follow-up for `a2ecf4fa4a` and `16f3b279a2` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966	2023-07-06 10:05:58 -07:00
Aliaksandr Valialkin	036a7b7365	lib/fs: replace MkdirAllIfNotExist->MustMkdirIfNotExist and MkdirAllFailIfExist->MustMkdirFailIfExist Callers of these functions log the returned error and then exit. The returned error already contains the path to the directory that failed to be created. So let's just log the error together with the call stack inside these functions. This leaves the debuggability of the returned error at the same level while simplifying the code at the callers' side. While at it, properly use MustMkdirFailIfExist instead of MustMkdirIfNotExist inside inmemoryPart.MustStoreToDisk(). It is expected that inmemoryPart.MustStoreToDisk() must fail if there is already a directory under the given path.	2023-04-13 22:11:59 -07:00
Nikolay	9b1e002287	app/vmselect: properly remove temp files at windows system (#4020 ) With non-posix compliant systems it's not possible to remove unclosed files. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-27 18:10:15 -07:00
Aliaksandr Valialkin	5832242b44	app/vmselect/netstorage: reduce the contention at fs.ReaderAt stats collection on systems with big number of CPU cores This optimization is based on the profile provided at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966#issuecomment-1483208419	2023-03-25 16:37:07 -07:00
Aliaksandr Valialkin	a1e496ced6	app/vmselect/netstorage: document why runtime.Gosched() is removed at `28f054bb00` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966	2023-03-25 16:36:51 -07:00
Zakhar Bessarab	28f054bb00	vmselect/netstorage: remove direct calls to `Gosched` to reduce amount of locks for global scope using `runtime.Gosched` requires acquiring global lock to check if there are any other goroutines to perform tasks. with the latest versions of runtime it can pause running goroutines automatically without requiring to call `Gosched` directly. Updates #3966 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-03-25 16:34:03 -07:00
Aliaksandr Valialkin	70959d5dab	app/vmselect/netstorage: reduce the number of calls to runtime.Gosched() at timeseriesWorker() and unpackWorker() Call runtime.Gosched() only when there is work to steal from other workers. Simplify the timeseriesWorker() and unpackWorker() code a bit by inlining stealTimeseriesWork() and stealUnpackWork(). This should reduce CPU usage when processing queries on systems with a big number of CPU cores. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966	2023-03-20 20:31:02 -07:00
Aliaksandr Valialkin	4856a4cf5a	app/vmselect: optimize incremental aggregates a bit Substitute sync.Map with an ordinary slice indexed by workerID. This should reduce the overhead when updating the incremental aggregate state	2023-03-20 15:37:06 -07:00
Aliaksandr Valialkin	b5db69fe05	app/vmselect/netstorage: do not intern string representation of MetricName for time series received from vmstorage It turned out that this interning may lead to increased memory usage and increased CPU usage when vmselect performs queries that select a big number of time series. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3692 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3863	2023-03-12 00:52:35 -08:00
Oleksandr Redko	9fff48c3e3	app,lib: fix typos in comments (#3804 )	2023-02-13 13:27:13 +01:00
Aliaksandr Valialkin	be8fba9b6a	app/vmselect/netstorage: tune the number of blocks per series which should be unpacked by a single goroutine instead of spinning up multiple goroutines This reduces overhead on time series data unpacking for typical cases, thus reducing CPU usage at vmselect	2023-01-12 09:31:44 -08:00
Aliaksandr Valialkin	53d871d0b1	app/vmselect/netstorage: reduce tail latency during query processing Previously the selected time series were split evenly among available CPU cores for further processing - e.g. unpacking the data and applying the given rollup function to the unpacked data. Some time series could be processed slower than others. This could result in uneven work distribution among available CPU cores, e.g. some CPU cores could complete their work sooner than others. This could slow down query execution. The new algorithm allows stealing time series to process from other CPU cores when all the local work is done. This should reduce the maximum time needed for query execution (aka tail latency). The new algorithm should also scale better on systems with many CPU cores, since every CPU processes locally assigned time series without inter-CPU communications. The inter-CPU communications are used only when all the local work is finished and the pending work from other CPUs needs to be stolen.	2023-01-10 13:43:14 -08:00
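The work-stealing idea in the commit above can be sketched in Go as follows. This is only an illustration under stated assumptions: `workQueue`, `take` and `processAll` are hypothetical names, and the sketch claims items through a per-queue atomic counter, which is a simplification and not necessarily how the real netstorage workers are implemented.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// workQueue holds the items assigned to one worker (hypothetical type).
type workQueue struct {
	items []int
	next  atomic.Int64 // index of the next unclaimed item
}

// take claims the next unclaimed item from the queue, if any is left.
func (q *workQueue) take() (int, bool) {
	i := q.next.Add(1) - 1
	if i >= int64(len(q.items)) {
		return 0, false
	}
	return q.items[i], true
}

// processAll splits items evenly among workers. Every worker drains its own
// queue first and then steals the remaining items from the other queues.
// It returns the sum of process(item) over all the items.
// The workers argument must be positive.
func processAll(items []int, workers int, process func(int) int) int64 {
	queues := make([]*workQueue, workers)
	for i := range queues {
		queues[i] = &workQueue{}
	}
	for i, it := range items {
		q := queues[i%workers]
		q.items = append(q.items, it)
	}

	var total atomic.Int64
	var wg sync.WaitGroup
	for id := 0; id < workers; id++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			// off == 0 is the local queue; off > 0 visits the other queues.
			for off := 0; off < workers; off++ {
				q := queues[(id+off)%workers]
				for {
					it, ok := q.take()
					if !ok {
						break
					}
					total.Add(int64(process(it)))
				}
			}
		}(id)
	}
	wg.Wait()
	return total.Load()
}

func main() {
	items := make([]int, 100)
	for i := range items {
		items[i] = i + 1
	}
	sum := processAll(items, 4, func(v int) int { return v })
	fmt.Println("sum:", sum)
}
```

Because each item is claimed exactly once via the atomic counter, a slow worker no longer determines the total duration: idle workers pick up whatever remains in its queue.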
Aliaksandr Valialkin	e640ff72f1	app/vmselect/netstorage: reduce memory allocations when unpacking time series Unpack time series with less than 400K samples in the currently running goroutine. Previously a new goroutine was being started for unpacking the samples. This was requiring additional memory allocations.	2023-01-09 23:18:17 -08:00
Aliaksandr Valialkin	df2a494a7c	app/vmselect/netstorage: pre-allocate 4 block references per each time series during querying Usually the number of blocks returned per each time series during queries is around 4. So it is a good idea to pre-allocate 4 block references per time series in order to reduce the number of memory allocations.	2023-01-09 22:03:23 -08:00
Aliaksandr Valialkin	c5e0f527bc	app/vmselect/netstorage: cache canonical MetricName for time series returned from the storage This reduces memory allocations for repeated queries, which return (almost) the same set of time series.	2023-01-09 21:53:10 -08:00
Aliaksandr Valialkin	7afcca0c51	all: use metricsql.CompileRegexp instead of regexp.Compile for compiling regexps used in graphite queries This should speed up repeated queries, since metricsql.CompileRegexp returns regexps from the cache on subsequent calls for the same input regexp.	2023-01-09 21:43:08 -08:00
Aliaksandr Valialkin	c38a10e143	app/vmselect/netstorage: eliminate memory allocation for sortBlocksHeap arg when calling mergeSortBlocks()	2023-01-09 21:08:51 -08:00
Aliaksandr Valialkin	1f9d605988	app/vmselect/netstorage: consistently select the sample with the biggest value out of samples with identical timestamps Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3333 This fix is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3620 , but doesn't slow down the common case with merging replicated data blocks so significantly. Benchmark results: Before the change: BenchmarkMergeSortBlocks/replicationFactor-1-4 13968 85643 ns/op 956.53 MB/s 1700 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-2-4 10806 109171 ns/op 1500.77 MB/s 2191 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-3-4 8887 130623 ns/op 1881.45 MB/s 2660 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-4-4 7440 157348 ns/op 2082.52 MB/s 3174 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-5-4 6534 184473 ns/op 2220.38 MB/s 3612 B/op 1 allocs/op BenchmarkMergeSortBlocks/overlapped-blocks-bestcase-4 13419 85205 ns/op 961.44 MB/s 2213 B/op 1 allocs/op BenchmarkMergeSortBlocks/overlapped-blocks-worstcase-4 579 1894900 ns/op 43.23 MB/s 46760 B/op 1 allocs/op After the change: BenchmarkMergeSortBlocks/replicationFactor-1-4 13832 85298 ns/op 960.40 MB/s 1716 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-2-4 8833 134222 ns/op 1220.66 MB/s 2675 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-3-4 6487 184830 ns/op 1329.65 MB/s 3636 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-4-4 4977 236318 ns/op 1386.61 MB/s 4733 B/op 1 allocs/op BenchmarkMergeSortBlocks/replicationFactor-5-4 4088 296734 ns/op 1380.36 MB/s 5761 B/op 1 allocs/op BenchmarkMergeSortBlocks/overlapped-blocks-bestcase-4 14083 84067 ns/op 974.47 MB/s 2110 B/op 1 allocs/op BenchmarkMergeSortBlocks/overlapped-blocks-worstcase-4 536 2043534 ns/op 40.09 MB/s 50511 B/op 1 allocs/op	2023-01-09 13:01:48 -08:00
Roman Khavronenko	7c0ae3a86a	lib/storage: keep sample with the biggest value on timestamp conflict (#3421 ) The change leaves raw sample with the biggest value for identical timestamps per each `-dedup.minScrapeInterval` discrete interval when the deduplication is enabled. ``` benchstat old.txt new.txt name old time/op new time/op delta DeduplicateSamples/minScrapeInterval=1s-10 817ns ± 2% 832ns ± 3% ~ (p=0.052 n=10+10) DeduplicateSamples/minScrapeInterval=2s-10 1.56µs ± 1% 2.12µs ± 0% +35.19% (p=0.000 n=9+7) DeduplicateSamples/minScrapeInterval=5s-10 1.32µs ± 3% 1.65µs ± 2% +25.57% (p=0.000 n=10+10) DeduplicateSamples/minScrapeInterval=10s-10 1.13µs ± 2% 1.50µs ± 1% +32.85% (p=0.000 n=10+10) name old speed new speed delta DeduplicateSamples/minScrapeInterval=1s-10 10.0GB/s ± 2% 9.9GB/s ± 3% ~ (p=0.052 n=10+10) DeduplicateSamples/minScrapeInterval=2s-10 5.24GB/s ± 1% 3.87GB/s ± 0% -26.03% (p=0.000 n=9+7) DeduplicateSamples/minScrapeInterval=5s-10 6.22GB/s ± 3% 4.96GB/s ± 2% -20.37% (p=0.000 n=10+10) DeduplicateSamples/minScrapeInterval=10s-10 7.28GB/s ± 2% 5.48GB/s ± 1% -24.74% (p=0.000 n=10+10) ``` https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3333 Signed-off-by: hagen1778 <roman@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-12-08 18:06:11 -08:00
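The tie-breaking rule from the commit above (keep the sample with the biggest value when timestamps are identical) can be sketched in Go. The `sample` type and `dedup` function are hypothetical, the input is assumed to be sorted by non-negative timestamp, and the interval bucketing via `ts / interval` is an assumption made for brevity that may differ from the real interval boundaries.

```go
package main

import "fmt"

// sample is a hypothetical raw sample; it is not a VictoriaMetrics type.
type sample struct {
	ts    int64
	value float64
}

// dedup keeps one sample per discrete interval: the sample with the biggest
// timestamp, and on identical timestamps the one with the biggest value.
// The samples must be sorted by timestamp; interval must be positive.
func dedup(samples []sample, interval int64) []sample {
	var out []sample
	for _, s := range samples {
		n := len(out)
		if n > 0 && out[n-1].ts/interval == s.ts/interval {
			last := &out[n-1]
			// Same interval: prefer the later sample, break ties by value.
			if s.ts > last.ts || (s.ts == last.ts && s.value > last.value) {
				*last = s
			}
			continue
		}
		out = append(out, s)
	}
	return out
}

func main() {
	in := []sample{{0, 1}, {5, 2}, {5, 7}, {5, 3}, {10, 4}}
	fmt.Println(dedup(in, 10))
}
```

Without the value comparison, the survivor among the three samples at timestamp 5 would depend on their arrival order; with it, the result is deterministic.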
Aliaksandr Valialkin	cae0f37edd	app/vmselect/netstorage: remove superfluous map lookup at ProcessSearchQuery This should reduce CPU usage a bit during querying	2022-11-18 13:40:04 +02:00


164 Commits