VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-21 07:56:26 +01:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	934a7f485c	app/vmselect: log locations of sendPrometheusError() calls Previously the location inside the sendPrometheusError() was logged. This could make hard investigating error locations via `vm_log_messages_total` metric.	2023-05-18 20:39:50 -07:00
Aliaksandr Valialkin	1ff67bb036	app/vmselect/vmui: run `make vmui-update` after `39c1b0f8d1`	2023-05-18 12:15:22 -07:00
Alexander Marshalov	d321ea91f2	fixed typos in documentation and commandline flags descriptions (#4275 )	2023-05-10 02:22:06 -07:00
Aliaksandr Valialkin	8703b2fa87	app/vmselect: small cleanup after `4f3f9950d0` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3807	2023-05-09 22:45:02 -07:00
Aliaksandr Valialkin	fbc28810b1	app/vmselect: small cleanup after `68e31a6000` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3811	2023-05-09 22:43:59 -07:00
Aliaksandr Valialkin	5dbaffe2c6	app/{vmselect,vmctl}: move ParseTime() to lib/promutils Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4091 This is a follow-up for `e2053baf32`	2023-05-09 22:42:35 -07:00
Roman Khavronenko	5bc8d8f290	vmselect: exit early from queue on context cancel (#4223 ) * vmselect: exit early from queue on context cancel When `-search.maxConcurrentRequests` is reached, vmselect puts request in the queue. It is expected, that requests in the queue will be processed as soon as it would be enough capacity to do so. However, it could happen that while request was waiting its turn, the client could have already cancel it (close the connection, or just close the tab with UI). In this case, we should de-queue such requests to avoid spending extra resources on them. Signed-off-by: hagen1778 <roman@victoriametrics.com> * app/vmselect: address review comments Signed-off-by: hagen1778 <roman@victoriametrics.com> --------- Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-05-08 22:58:05 -07:00
Aliaksandr Valialkin	1a7794735e	app/vmselect: fix the build after fb8889820aba710508033cbf6826eb63a357532a	2023-05-08 17:32:18 -07:00
Yury Molodov	de35cbf251	vmui: Integrate WITH template playground (#3831 ) * feat: add WithTemplate page * app/vmselect/prometheus: enable json mode for expand with expr API * app/vmselect/prometheus: enable CORS and add content type * feat: add api for expand with templates * fix: remove console from useExpandWithExprs * app/vmselect/prometheus: fix escaping * vmui: integrate WITH template * app/vmctl: check content type instead of form param * fix: add content-type for fetch with-exprs * fix: add a header to the server's response that allows the "Content-Type" header * app/vmctl: added comment and cleanup * app/vmctl: use format query param --------- Co-authored-by: dmitryk-dk <kozlovdmitriyy@gmail.com>	2023-05-08 14:35:35 -07:00
Aliaksandr Valialkin	cf4701db65	lib/fs: add MustReadDir() function Use fs.MustReadDir() instead of os.ReadDir() across the code in order to reduce the code verbosity. The fs.MustReadDir() logs the error with the directory name and the call stack on error before exit. This information should be enough for debugging the cause of the error.	2023-04-14 22:11:40 -07:00
Aliaksandr Valialkin	c4638553a3	lib/fs: rename WriteFileAtomically to MustWriteAtomic Callers of this function log the returned error and exit. So let's just log the error with the given filepath and the call stack inside the function itself and then exit. This simplifies the code at callers' place while leaves the same level of debuggability in case of errors.	2023-04-13 22:43:30 -07:00
Aliaksandr Valialkin	aac3dccfd1	lib/fs: replace MkdirAllIfNotExist->MustMkdirIfNotExist and MkdirAllFailIfExist->MustMkdirFailIfExist Callers of these functions log the returned error and then exit. The returned error already contains the path to directory, which was failed to be created. So let's just log the error together with the call stack inside these functions. This leaves the debuggability of the returned error at the same level while allows simplifying the code at callers' side. While at it, properly use MustMkdirFailIfExist instead of MustMkdirIfNotExist inside inmemoryPart.MustStoreToDisk(). It is expected that the inmemoryPart.MustStoreToDick() must fail if there is already a directory under the given path.	2023-04-13 22:22:08 -07:00
Aliaksandr Valialkin	26b361f4c3	app/vmselect/vmui: run `make vmui-update` after `01fc228fb0`	2023-04-06 15:11:54 -07:00
Aliaksandr Valialkin	a241485262	app/vmselect/vmui: run `make vmui-update` after `a1601929ec`	2023-04-06 03:20:16 -07:00
Yury Molodov	7871ee0e43	vmui: implement heatmap improvements (#4078 ) * fix: disabled limits for histogram * fix: add sorted buckets by upper bound * refactor: move line chart components to folder * feat: implement heatmap improvements (https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3384#issuecomment-1484023162) * app/vmselect/vmui: `make vmui-update` --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-04-05 22:15:23 -07:00
Aliaksandr Valialkin	fa2ba7b07b	app/vmselect/vmui: run `make vmui-update` after `edb45d7fc1`	2023-04-02 21:22:17 -07:00
Aliaksandr Valialkin	7b10af4846	app/vmselect/vmui: run `make vmui-update` after `42087518ba`	2023-04-01 00:41:03 -07:00
Aliaksandr Valialkin	db8fda4ec6	app/vmselect/graphite: open source Graphite Render API	2023-03-31 23:37:40 -07:00
Nikolay	b38a145cfd	app/vmselect: properly remove temp files at windows system (#4020 ) With non-posix compliant systems it's not possible to remove unclosed files. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-27 18:10:44 -07:00
Aliaksandr Valialkin	54b9537a76	app/vmselect/promql: follow-up for `79e1c6a6fc` - Document the fix at docs/CHANGELOG.md - Add tests with multiple adjancent zero buckets - Simplify the fix a bit Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/296 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4021	2023-03-27 18:04:30 -07:00
Ze'ev Klapow	680a661ec0	fix le buckets when adjacent vmrange is empty (#4021 ) There is a bug here where if you have a single bucket like: foo{vmrange="4.084e+02...4.642e+02"} 2 123 The expected output is three le encoded buckets like: foo{le="4.084e+02"} 0 123 foo{le="4.642e+02"} 2 123 foo{le="+Inf"} 2 123 This correctly encodes the start and end of the vmrange. If however, the input contains the previous bucket, and that bucket is empty then you only get the end le and +Inf out currently, i.e: foo{vmrange="7.743e+05...8.799e+05"} 5 123 foo{vmrange="6.813e+05...7.743e+05"} 0 123 results in: foo{le="8.799e+05"} 5 123 foo{le="+Inf"} 5 123 This causes issues when you go to compute a quantile because this means that the assumed lower bound of the buckets is 0 and this we interpolate between 0->end rather than the vmrange start->end as expected.	2023-03-27 18:04:29 -07:00
Aliaksandr Valialkin	9387793f47	app/vmselect: follow-up for `10ab086366` - Expose stats.seriesFetched at `/api/v1/query_range` responses too for the sake of consistency. - Initialize QueryStats when it is needed and pass it to EvalConfig then. This guarantees that the QueryStats is properly collected when the query contains some subqueries.	2023-03-27 15:11:42 -07:00
Roman Khavronenko	10ab086366	app/vmselect: export `seriesFetched` stat for /query responses (#3925 ) The change adds a new field `seriesFetched` to EvalConfig object. Since EvalConfig object can be copied inside `Exec`, `seriesFetched` is a pointer which can be updated by all copied objects. The reason for having stats is that other components, like vmalert, could benefit from this information. Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-03-27 08:51:33 -07:00
Yury Molodov	86a98fa131	vmui: heatmap (#3780 ) * fix: add stroke and font for all axes * feat: add util for generate gradient * feat: add heatmap plugin * feat: add heatmap legend * feat: add heatmap graph (#3384) * vmui: add heatmap graph (#3384) * feat: add convert Prometheus to VictoriaMetrics histogram * fix: prevent re-render graph * feat: reset step for heatmap * feat: normalize heatmap data * fix: format heatmap legend * wip * app/vmselect/vmui: run `make vmui-update` --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-03-26 00:31:21 -07:00
Aliaksandr Valialkin	db3bcbe56a	app/vmselect/netstorage: reduce the contention at fs.ReaderAt stats collection on systems with big number of CPU cores This optimization is based on the profile provided at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966#issuecomment-1483208419	2023-03-25 16:38:39 -07:00
Aliaksandr Valialkin	a2ecf4fa4a	app/vmselect/netstorage: document why runtime.Gosched() is removed at `28f054bb00` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966	2023-03-25 16:38:28 -07:00
Zakhar Bessarab	16f3b279a2	vmselect/netstorage: remove direct calls to `Gosched` to reduce amount of locks for global scope using `runtime.Gosched` requires acquiring global lock to check if there are any other goroutines to perform tasks. with the latest versions of runtime it can pause running goroutines automatically without requiring to call `Gosched` directly. Updates #3966 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-03-25 16:37:58 -07:00
Aliaksandr Valialkin	740fa57fdc	app/vmselect/promql: typo fix after `e7f46a0aab`	2023-03-24 23:47:11 -07:00
Aliaksandr Valialkin	7aff6f872f	app/vmselect/promql: follow-up for `7205c79c5a` - Allocate and initialize seriesByWorkerID slice in a single go instead of initializing every item in the list separately. This should reduce CPU usage a bit. - Properly set anti-false sharing padding at timeseriesWithPadding structure - Document the change at docs/CHANGELOG.md Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966	2023-03-24 23:39:43 -07:00
Zakhar Bessarab	fec87e3ada	app/vmselect/promql: use lock-less approach to gather results of parallel processing for `evalRollup` funcs (#4004 ) vmselect/promql: refactor `evalRollupNoIncrementalAggregate` to use lock-less approach for parallel workers computation Locking there is causing issues when running on highly multi-core system as it introduces lock contention during results merge. New implementation uses lock less approach to store results per workerID and merges final result in the end, this is expected to significantly reduce lock contention and CPU usage for systems with high number of cores. Related: #3966 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * vmselect/promql: add pooling for `timeseriesWithPadding` to reduce allocations Related: #3966 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * vmselect/promql: refactor `evalRollupFuncWithSubquery` to avoid using locks Uses same approach as `evalRollupNoIncrementalAggregate` to remove locking between workers and reduce lock contention. Related: #3966 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-03-24 23:39:41 -07:00
Aliaksandr Valialkin	b9632023c4	app/vmselect/vmui: run `make vmui-update` after `dc2c712a29`	2023-03-24 18:08:51 -07:00
Aliaksandr Valialkin	79d8f0e7c6	app/vmselect/promql: pass workerID to the callback inside doParallel() This opens the possibility to remove tssLock from evalRollupFuncWithSubquery() in the follow-up commit from @zekker6 in order to speed up the code for systems with many CPU cores. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966	2023-03-20 20:57:34 -07:00
Aliaksandr Valialkin	e749a015a9	app/vmselect/promql: fix TestIncrementalAggr test on systems less than 3 CPU cores This is a follow-up for `4856a4cf5a`	2023-03-20 20:37:44 -07:00
Aliaksandr Valialkin	08da383eac	app/vmselect/netstorage: reduce the number of calls to runtime.Gosched() at timeseriesWorker() and unpackWorker() Call runtime.Gosched() only when there is a work to steal from other workers. Simplify the timeseriesWorker() and unpackWroker() code a bit by inlining stealTimeseriesWork() and stealUnpackWork(). This should reduce CPU usage when processing queries on systems with big number of CPU cores. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966	2023-03-20 20:32:56 -07:00
Aliaksandr Valialkin	18af01c387	app/vmselect: optimize incremental aggregates a bit Substitute sync.Map with an ordinary slice indexed by workerID. This should reduce the overhead when updating the incremental aggregate state	2023-03-20 15:42:13 -07:00
Aliaksandr Valialkin	7a1e2f49cc	app/vmselect/vmui: `make vmui-update` after `d4525bd2d0`	2023-03-20 14:35:17 -07:00
Aliaksandr Valialkin	fc3d826d7f	all: add Windows build for VictoriaMetrics This commit changes background merge algorithm, so it becomes compatible with Windows file semantics. The previous algorithm for background merge: 1. Merge source parts into a destination part inside tmp directory. 2. Create a file in txn directory with instructions on how to atomically swap source parts with the destination part. 3. Perform instructions from the file. 4. Delete the file with instructions. This algorithm guarantees that either source parts or destination part is visible in the partition after unclean shutdown at any step above, since the remaining files with instructions is replayed on the next restart, after that the remaining contents of the tmp directory is deleted. Unfortunately this algorithm doesn't work under Windows because it disallows removing and moving files, which are in use. So the new algorithm for background merge has been implemented: 1. Merge source parts into a destination part inside the partition directory itself. E.g. now the partition directory may contain both complete and incomplete parts. 2. Atomically update the parts.json file with the new list of parts after the merge, e.g. remove the source parts from the list and add the destination part to the list before storing it to parts.json file. 3. Remove the source parts from disk when they are no longer used. This algorithm guarantees that either source parts or destination part is visible in the partition after unclean shutdown at any step above, since incomplete partitions from step 1 or old source parts from step 3 are removed on the next startup by inspecting parts.json file. This algorithm should work under Windows, since it doesn't remove or move files in use. This algorithm has also the following benefits: - It should work better for NFS. - It fits object storage semantics. The new algorithm changes data storage format, so it is impossible to downgrade to the previous versions of VictoriaMetrics after upgrading to this algorithm. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3236 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3821 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-19 23:28:26 -07:00
oliverpool	8c708ca1e9	app/vmselect/promql: add test to ensure 8-byte alignment (#3948 ) See `0af9e2b693`	2023-03-16 22:07:13 -07:00
Aliaksandr Valialkin	3b4a3583bc	app/vmselect/promql: prevent from `cannot unmarshal timeseries from rollupResultCache` panic after the upgrade to v1.89.0	2023-03-12 19:09:11 -07:00
Aliaksandr Valialkin	cf7d8811f6	app/vmselect/vmui: `make vmui-update` after `00a0816ab1`	2023-03-12 17:22:28 -07:00
Aliaksandr Valialkin	a6a4beb89a	app/vmselect: remove data race on updating EvalConfig.IsPartialResponse from concurrently running goroutines This properly returns `is_partial: true` for partial responses.	2023-03-12 16:53:03 -07:00
Aliaksandr Valialkin	5cd60c54d3	app/vmselect/promql: prevent from SIGBUS crash on architecures, which deny unaligned access to 8-byte words (e.g. ARM) Thanks to @oliverpool for nailing down the root cause of the issue and for the initial attempt to fix it at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3927	2023-03-12 16:29:18 -07:00
Aliaksandr Valialkin	e491fee1f4	app/vmselect/netstorage: do not intern string representation of MetricName for time series received from vmstorage It has been appeared that this interning may lead to increased memory usage and increased CPU usage when vmselect performs queries, which select big number of time series. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3692 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3863	2023-03-12 00:44:08 -08:00
Aliaksandr Valialkin	54fe207cc0	all: follow-up for `7a3e16e774` - Sync the description for -httpListenAddr.useProxyProtocol command-line flag at vmagent and vmauth, so it is consistent with the description at vmauth and victoria-metrics - Add a sample of panic text to docs/CHANGELOG.md, so it could be googled - Mention the -httpListenAddr.useProxyProtocol command-line flag in the description for the bugfix Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3335	2023-03-08 01:42:58 -08:00
Aliaksandr Valialkin	bc24d35153	app/vmselect/vmui: `make vmui-update` after `bbf8e459a0`	2023-03-08 01:40:18 -08:00
Aliaksandr Valialkin	d3605ad072	app/vmselect/promql: fix panic when calculating `aggr_func(rollup*())` The panic has been introduced in `dac21d874b`	2023-02-27 11:48:38 -08:00
Aliaksandr Valialkin	bbd5914eb1	all: add makefile rules for GOARCH=s390x for all the VictoriaMetrics components This is a follow-up for `007530f882`	2023-02-26 12:38:48 -08:00
Aliaksandr Valialkin	18dd0d1dbf	.golangci.yml: properly enable `revive` linter and fix all the warnings it detects	2023-02-26 12:19:58 -08:00
Roman Khavronenko	66d0b45651	vmselect/promql: check for deadline in `count_values` fn (#3806 ) * vmselect/promql: check for deadline in `count_values` fn `count_values` could be very slow during the data processing. Checking for deadline between iterations supposed to reduce probability of exceeding `search.maxQueryDuration`. The change also adds a new trace record, which captures the time spent in aggregation function. Before that, the trace for aggr funcs could be confusing since it doesn't account for all the places where time was spent. Signed-off-by: hagen1778 <roman@victoriametrics.com> * wip --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-02-24 17:10:38 -08:00
Roman Khavronenko	79eb33556e	metricsql: support optional 2nd argument for rollup functions (#3841 ) * metricsql: support optional 2nd argument for rollup functions Support optional 2nd argument `min`, `max` or `avg` for rollup functions: * rollup * rollup_delta * rollup_deriv * rollup_increase * rollup_rate * rollup_scrape_interval If second argument is passed, then rollup function will return only the selected aggregation type. This change can be useful for situations where only one type of rollup calculation is needed. For example, `rollup_rate(requests_total[5m], "max")`. Signed-off-by: hagen1778 <roman@victoriametrics.com> * wip --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-02-24 13:48:30 -08:00

1 2 3 4 5 ...

1027 Commits