VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-18 14:40:26 +01:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	01c8e12370	app/vlselect: add /select/logsql/stats_query endpoint, which is going to be used by vmalert Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6942 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6706	2024-09-06 23:00:58 +02:00
Aliaksandr Valialkin	e90e809c00	deployment: update Go builder from Go1.23.0 to Go1.23.1 See https://github.com/golang/go/issues?q=milestone%3AGo1.23.1+label%3ACherryPickApproved	2024-09-06 22:57:56 +02:00
f41gh7	395894688c	app/*/multiarch: return back empty value for TARGETARCH follow-up after `91456ab5bb` docker buildx uses special variables, such as TARGETARCH and it shouldn't be overwritten. See this article for details https://www.docker.com/blog/faster-multi-platform-builds-dockerfile-cross-compilation-guide/ Signed-off-by: f41gh7 <nik@victoriametrics.com>	2024-09-06 18:15:22 +02:00
Artem Fetishev	85bf768013	lib/storage: adds metrics that count records that failed to insert ### Describe Your Changes Add storage metrics that count records that failed to insert: - `RowsReceivedTotal`: the number of records that have been received by the storage from the clients - `RowsAddedTotal`: the number of records that have actually been persisted. This value must be equal to `RowsReceivedTotal` if all the records are valid; otherwise it is smaller. The values of the metrics below should provide insight into why some records haven't been added - `NaNValueRows`: the number of records whose value was `NaN` - `StaleNaNValueRows`: the number of records whose value was `Stale NaN` - `InvalidRawMetricNames`: the number of records whose raw metric name has failed to unmarshal. The following metrics existed before this PR and are listed here for completeness: - `TooSmallTimestampRows`: the number of records whose timestamp is negative or is older than retention period - `TooBigTimestampRows`: the number of records whose timestamp is too far in the future. - `HourlySeriesLimitRowsDropped`: the number of records that have not been added because the hourly series limit has been exceeded. - `DailySeriesLimitRowsDropped`: the number of records that have not been added because the daily series limit has been exceeded. --- Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com>	2024-09-06 18:13:48 +02:00
f41gh7	64361c2d7a	follow-up after `01430a155c` * properly check SeverityNumber in the FormatSeverity function: it could be negative, which could cause a panic in VictoriaLogs	2024-09-04 15:39:55 +02:00
Andrii Chubatiuk	711f2cc4f2	vlinsert: added opentelemetry logs support Commit adds the following changes: * Adds support of OpenTelemetry logs for Victoria Logs with protobuf encoded messages * json encoding is not supported for the following reasons: - It brings a lot of fragile code, which works inefficiently. - json encoding is impossible to use with language SDK. * splits metrics and logs structures at lib/protoparser/opentelemetry/pb package. * adds docs with examples for opentelemetry logs. --- Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4839 Co-authored-by: AndrewChubatiuk <andrew.chubatiuk@gmail.com> Co-authored-by: f41gh7 <nik@victoriametrics.com>	2024-09-03 20:24:01 +02:00
f41gh7	dcc525b388	follow-up after `1731c0eabf` * updates change log * adds VL-Debug http header * updates doc * extracts only the first value of http headers for VL-Stream-Fields and VL-Ignore-Fields. This makes the behaviour the same as for query string args and makes client applications easy to configure, since most client collectors don't support multi-value headers. Signed-off-by: f41gh7 <nik@victoriametrics.com>	2024-09-03 20:24:01 +02:00
Andrii Chubatiuk	d5fe4566e5	app/vlinsert: support getting _msg_field, _time_field, _stream_fields and _ignore_fields from headers * Many collectors don't support forwarding url query params to the remote system. This makes it impossible to define stream fields for them. A workaround with a proxy between VictoriaLogs and the log shipper is too complicated a solution. * This commit adds the following changes: * Adds fallback to header params, if the query param is empty, for: _msg_field -> VL-Msg-Field _stream_fields -> VL-Stream-Fields _ignore_fields -> VL-Ignore-Fields _time_field -> VL-Time-Field * removes deprecations from victorialogs compose files, adds more output format examples for logstash, telegraf, fluent-bit related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5310	2024-09-03 20:24:00 +02:00
Aliaksandr Valialkin	ac507466c3	all: suppress InvalidDefaultArgInFrom warning emitted by `docker build` when building Docker packages via `make package-` command Recent versions of `docker build` started generating the InvalidDefaultArgInFrom warning if Dockerfile contains an ARG without default value. While this warning doesn't affect building Docker packages via `make package-` commands, it is better suppressing the warning, so it doesn't clutter `make package-*` output with the noise, which can hide real issues in the future.	2024-09-03 14:05:43 +02:00
Hui Wang	a21aea5dd4	stream aggregation: perform deduplication for all received data when specifying `-streamAggr.dedupInterval` or `-remoteWrite.streamAggr.dedupInterval` command-line flag (#6711 ) [The documentation](https://docs.victoriametrics.com/stream-aggregation/) contains conflicting descriptions regarding deduplication for non-matched series when `-remoteWrite.streamAggr.config` and / or `-streamAggr.config` are set: 1. Statement below says all the received data is deduplicated: >[vmagent](https://docs.victoriametrics.com/vmagent/) supports relabeling, deduplication and stream aggregation for all the received data, scraped or pushed. Then, the collected data will be forwarded to specified -remoteWrite.url destinations. The data processing order is the following: >1. all the received data is relabeled according to the specified [-remoteWrite.relabelConfig](https://docs.victoriametrics.com/vmagent/#relabeling) (if it is set) >2. all the received data is deduplicated according to specified [-streamAggr.dedupInterval](https://docs.victoriametrics.com/stream-aggregation/#deduplication) (if it is set to duration bigger than 0) 2. Another statement says the deduplication is performed individually for the matching samples >The de-deduplication is performed after applying [relabeling](https://docs.victoriametrics.com/vmagent/#relabeling) and before performing the aggregation. If the -remoteWrite.streamAggr.config and / or -streamAggr.config is set, then the de-duplication is performed individually per each [stream aggregation config](https://docs.victoriametrics.com/stream-aggregation/#stream-aggregation-config) for the matching samples after applying [input_relabel_configs](https://docs.victoriametrics.com/stream-aggregation/#relabeling). Considering the following deduplication use cases: 1. To apply deduplication (globally or for specific remoteWrite destination) for all the received data, scraped or pushed --- using `-streamAggr.dedupInterval` or `-remoteWrite.streamAggr.dedupInterval`. 2. To deduplicate and aggregate metrics that match the rule `match` filters --- using `-remoteWrite.streamAggr.config` and specifying `dedup_interval` option in [stream aggregation config](https://docs.victoriametrics.com/stream-aggregation/#stream-aggregation-config). 3. To deduplicate all the received data while having `streamAggr.config` for some metrics --- no way for a single vmagent now, need to set up two level vmagents This PR implements case 3. --------- Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> (cherry picked from commit `d523015f27`)	2024-09-03 10:49:38 +02:00
dufucun	1aa9f7be4e	tests: fix slice init length (#6897 ) ### Describe Your Changes fix slice init length ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). Signed-off-by: dufucun <dufuchun@sohu.com> (cherry picked from commit `95bafc8caf`)	2024-08-30 11:18:21 +02:00
hagen1778	681dc7bb7d	app/{vmselect,vlselect}: run make vmui-update vmui-logs-update Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `9a343b3613`)	2024-08-28 13:38:28 +02:00
YuDong Tang	cab3ef8294	app/vmselect:add command-line flag -search.inmemoryBufSizeBytes (#6869 ) add command-line flag `-search.inmemoryBufSizeBytes` for configuring size of in-memory buffers used by vmselect during processing of vmstorage responses. A new summary metric `vm_tmp_blocks_inmemory_file_size_bytes` is exposed to show the size of the buffer during requests processing. The new setting can be used by experienced users to adjust memory usage by vmselect when processing many small read requests. Instead of allocating 4MB buffers each time, vmselect can be instructed to lower the buffer size via `-search.inmemoryBufSizeBytes`. To make the decision whether this flag needs to be adjusted users can consult with `vm_tmp_blocks_inmemory_file_size_bytes` which shows the actual size of buffers used during query processing. ---------- The detailed information of this PR can be found in https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6851 ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-08-26 14:37:45 +02:00
Yury Akudovich	f759371c00	app/vmagent: add `remoteWrite.retryMinInterval` and `remoteWrite.retryMaxTime` flags (#6289 ) ## Describe Your Changes Add RemoteWrite Retry Controls This PR introduces two new flags to the remote write functionality: - remoteWrite.retryMinInterval - remoteWrite.retryMaxTime These flags provide finer control over the retry behavior for remoteWrite operations, allowing users to customize the minimum interval between retries and the maximum duration for retry attempts. Fixes #5486. ## Checklist - [x] The following checks are mandatory: My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: Yury Akudovich <ya@matterlabs.dev> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `d0f5a9d77a`)	2024-08-23 15:28:44 +02:00
Roman Khavronenko	f599f3ad86	deployment/docker: update Go builder from Go1.22.5 to Go1.23.0 (#6861 ) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-08-22 23:56:12 +02:00
Roman Khavronenko	c6e0780f4b	app/vmalert: update parsing for instant responses (#6859 ) This change is made in attempt to reduce memory usage by vmalert when parsing big instant responses from VM/Prometheus. In `a5c427bac4` vmalert switched from std json lib to fastjson lib in order to reduce amount of allocations, as according to highloaded profiles of vmalert the CPU is mostly spent on GC. But switching to fastjson resulted into excessive memory usage for cases when vmalert has to parse long json lines, which usually happens when instant response contains many `metric` objects. In this change we do a mixed parsing: 1. Slice of `metric` objects is parsed with std lib to keep mem low 2. Each `metric` object is parsed with fastjson to reduce allocs The benchmark results are the following: ``` pkg: github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource BenchmarkParsePrometheusResponse/Instant_std+fastjson-10 1760 668959 ns/op 280147 B/op 5781 allocs/op MBs allocated at heap: 493.078392 mallocs: 18655472 BenchmarkParsePrometheusResponse/Instant_fastjson-10 6109 198258 ns/op 172839 B/op 5548 allocs/op MBs allocated at heap: 1056.384464 mallocs: 34457184 BenchmarkParsePrometheusResponse/Instant_std-10 1287 950987 ns/op 451677 B/op 9619 allocs/op MBs allocated at heap: 580.802976 mallocs: 13351636 ``` The benchmark function code with mem measurement is available here https://gist.github.com/hagen1778/b9c3ca7f8ca7d6b21aec9777112c5810 The benchmark contains 3 results: 1. Instant_std+fastjson is the implementation in this change 2. Instant_fastjson-10 is the implementation from `a5c427bac4` 3. BenchmarkParsePrometheusResponse/Instant_std-10 is implementation before `a5c427bac4` According to these results, this new implementation is slower than previous, but faster than before switching to fastjson. It also has lower number of allocations and roughly the same memory allocation on heap with GC turned off. --------- Other changes: 1. rm BenchmarkMetrics as it doesn't measure anything 2. simplify BenchmarkParsePrometheusResponse into BenchmarkPromInstantUnmarshal ### Describe Your Changes Please provide a brief description of the changes you made. Be as specific as possible to help others understand the purpose and impact of your modifications. ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-08-22 23:56:11 +02:00
Yury Molodov	a3509add4d	vmui: add column search in table settings (#6804 ) ### Describe Your Changes Add search functionality to the column display settings in the table #6668 ![image](https://github.com/user-attachments/assets/e9bd52c3-6428-4d4f-8b7f-d83dd80b6912) ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `e35237920a`)	2024-08-22 16:59:18 +02:00
Dima Lazerka	f1325531a1	vmui: Fix initial serverUrl for vmanomaly (#6834 ) - fix TS lint - anomaly: remove /vmui - anomaly: minor inspections fix - docs: fix broken links to headings ### Describe Your Changes Initially vmanomaly opened with `/vmui` in serverUrl, remove it. (cherry picked from commit `535a9ed059`)	2024-08-21 14:12:07 +02:00
jackyin	98758ef18e	vmui: fix not found index.js in VictoriaLogs (#6770 ) fix #6764 the index.js file is for [this feature](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/app/vmui#predefined-dashboards), the feature is just for victoriametrics. so the index.js is deleted in victorialogs. i just add an empty index.js to fix it. --------- Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `3ebdd3bcb8`)	2024-08-20 17:14:00 +02:00
Hui Wang	e8f5dbd598	vmalert: add command line flag `-notifier.headers` (#6751 ) to allow configuring additional headers in each request to the corresponding notifier. Other flags like `-datasource.headers`, `-remoteWrite.headers` already use `^^` as delimiter, it's consistent to use it in `-notifier.headers` as well. related https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3260 vmalert can integrate with alertmanager that supports multi-tenant by adding tenantID header`X-Scope-OrgID` in requests. In multitenancy, vmalert can also filter alerts which send to different notifier addresses(or with different header settings) using `alert_relabel_configs`. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3260 --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `0f1ec33892`)	2024-08-19 21:41:57 +02:00
hagen1778	bd6405df01	make go vet happy Address `non-constant format string in call` check: https://github.com/golang/go/issues/60529 Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `febba3971b`)	2024-08-19 21:41:44 +02:00
Roman Khavronenko	d4240c4a3e	lib/httputils: parse URL before creating HTTP transport (#6820 ) https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6740 --------- Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-08-16 11:34:49 +02:00
Zakhar Bessarab	84b8ea7337	app/vmselect/promql: fix calculation of histogram buckets This issue was introduced in `6a4bd5049b` See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6714 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2024-08-15 10:13:54 +02:00
Nikolay	f255800da3	app/vminsert: returns back memory optimisation (#6794 ) Production workload shows that it's a useful optimisation. A channel-based object pool makes it possible to handle irregular data ingestion requests and makes memory allocations smoother. It improves sync.Pool efficiency, since objects are removed from sync.Pool after 2 GC cycles. With GOGC=30, GC runs significantly more often. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6733 ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: f41gh7 <nik@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-08-13 10:49:09 -04:00
ccliu	8729052623	vmagent: resolve the issue where usePromCompatibleNaming is not working (#6776 ) Describe Your Changes When I use usePromCompatibleNaming with vmagent to process data that needs to be formatted from different sources such as InfluxDB, I find that it doesn't work. However, it works in vminsert. I found that vminsert uses the HasRelabeling method to determine whether to relabel. ```go func HasRelabeling() bool { pcs := pcsGlobal.Load() return pcs.Len() > 0 || usePromCompatibleNaming } ``` In vmagent, the decision to relabel is determined only by pcsGlobal.Len() > 0. However, in the applyRelabeling method, the usePromCompatibleNaming logic is also used to determine whether to relabel in the error handling. ```go func (rctx relabelCtx) applyRelabeling(tss []prompbmarshal.TimeSeries, pcs promrelabel.ParsedConfigs) []prompbmarshal.TimeSeries { if pcs.Len() == 0 && !usePromCompatibleNaming { // Nothing to change. return tss } ``` So I think that the logic for determining whether to relabel in vmagent is not as expected. Checklist The following checks are mandatory: [✅]My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Co-authored-by: Roman Khavronenko <hagen1778@gmail.com> (cherry picked from commit `d134a310f3`)	2024-08-13 10:33:55 -04:00
jackyin	11233364b6	vlogs: add select/deselect all button to table settings in UI (#6680 ) fix #6668, just add select all and "unselect all" func. https://github.com/user-attachments/assets/0c31385b-def0-4618-aa9c-5ba4bb6f56c3 --------- Co-authored-by: Yury Molodov <yurymolodov@gmail.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `5f5bc46b3e`)	2024-08-13 10:33:54 -04:00
Zhu Jiekun	27a6be6630	docs: add more details to -cacheDataPath vmselect flag (#6708 ) vmselect will create `./tmp` dir under `cacheDataPath`. If `cacheDataPath` is set to `/`, vmselect will use `/tmp`. content under `/tmp` dir might be auto removed based on the OS behaviour. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5770 - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-08-13 09:17:43 -04:00
Hui Wang	e74d5f266e	stream aggregation: do not allow to enable `-stream.keepInput` and `keep_metric_names` options in stream aggregation config together (#6723 ) With aggregated data and raw data under the same metric, results would be confusing. --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `62d19369a3`)	2024-08-13 09:08:27 -04:00
Anton L	79008b712f	app/vmselect/graphite: respect denyPartialResponse for graphite requests (#6748 ) VM returns different responses to equivalent MetricsQL and GraphiteQL queries when one of the vmstorage nodes in the cluster cannot be accessed. For GraphiteQL, the denyPartialResponse feature is not used: it is always true, which is not always correct (depending on the configuration). In this PR I have removed the hardcoded denyPartialResponse for GraphiteQL, so it is handled just like in MetricsQL. - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).	2024-08-07 12:34:23 +02:00
Hui Wang	13a21a3ba0	app/vmagent/remotewrite: make `-remoteWrite.streamAggr.ignoreFirstIntervals` of array type (#6744 ) Make `-remoteWrite.streamAggr.ignoreFirstIntervals` of array type so it could accept multiple values which can be applied to the corresponding`-remoteWrite.url`. --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `8f5c26d788`)	2024-08-07 09:57:49 +02:00
Hui Wang	71ac65996b	app/vmagent/remotewrite: fix `-streamAggr.dropInputLabels` behavior (#6743 ) Fix `-streamAggr.dropInputLabels` behavior when global deduplication is enabled without `-streamAggr.config`. Previously, `-remoteWrite.streamAggr.dropInputLabels` is misapplied. --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `4863605469`)	2024-08-07 09:57:49 +02:00
hagen1778	ec05e70742	app/vmalert: rm unnecessary err check The error check was needed before `a84491324d` It was kept by mistake and makes no sense to have rn. Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `9726e6c1a2`)	2024-08-07 09:57:48 +02:00
Yury Molodov	b4aec9ee05	vmui/logs: add display top streams in the hits graph (#6647 ) ### Describe Your Changes - Adds support for displaying the top 5 log streams in the hits graph, grouping the remaining streams into an "other" label. #6545 - Adds options to customize the graph display with bar, line, stepped line, and points views. ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). (cherry picked from commit `04c2232e45`)	2024-08-06 16:30:12 +02:00
Zakhar Bessarab	a3a0bafe76	app/vlinsert/elasticsearch: add fake response for logstash requests (#6742 ) ### Describe Your Changes This is needed in order to support standard Elasticsearch output in Logstash pipelines. See: https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6660 ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> (cherry picked from commit `58b6c54da2`)	2024-08-06 16:30:11 +02:00
Hui Wang	9f84c4fdfa	vmalert: respect HTTP headers defined in notifier configuration file (#6762 ) Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> (cherry picked from commit `c1b54779a2`)	2024-08-06 16:30:10 +02:00
hagen1778	c99700ae15	fix typos in comments Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `f283126084`)	2024-08-06 16:30:10 +02:00
Zakhar Bessarab	0b1def6e24	app/{vminsert,vmagent}: add healthcheck for influx ingestion endpoints (#6749 ) ### Describe Your Changes This is useful for clients which validate InfluxDB is available before data ingestion can be started. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6653 ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `9877a5e7d5`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-08-05 09:45:32 +02:00
Dmytro Kozlov	fdad3e94f5	vmctl: add `--backoff-retries`, `--backoff-factor`, `--backoff-min-duration` global command-line flags (#6639 ) ### Describe Your Changes Added `--vm-backoff-retries`, `--vm-backoff-factor`, `--vm-backoff-min-duration` and `--vm-native-backoff-retries`, `--vm-native-backoff-factor`, `--vm-native-backoff-min-duration` command-line flags to the `vmctl` app. Those changes will help to configure the retry backoff policy for different situations. Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6622 ### Checklist The following checks are mandatory: - [X] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `6f401daacb`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-08-03 19:34:03 +02:00
Yury Molodov	00b108ca04	vmui/logs: improve UI functionality (#6688 ) * add a toggle button to the "Group" tab that allows users to expand or collapse all groups at once * introduce the ability to select a key for grouping logs within the "Group" tab * display the number of entries within each log group. * move the Markdown toggle to the general settings panel in the upper left corner. (cherry picked from commit `e06a19d85f`)	2024-08-02 15:58:07 +02:00
Yury Molodov	a93ee27a85	vmui/logs: add fields for tenant configuration (#6661 ) Added fields for configuring AccountID and ProjectID #6631	2024-08-02 11:14:42 +02:00
f41gh7	115a76d28c	make vmui-update	2024-08-01 14:45:29 +02:00
Yury Molodov	7d37ca3159	vmui: fix auto-completion triggers (#6566 ) ### Describe Your Changes - Fixes auto-complete triggers according to [these comments](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5866#issuecomment-2065273421). - Fixes loading and displaying suggestions when there is no metric in the expression. Related issue: #6153 - Adds quotes when inserting label values. Related issue: #6260 - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). (cherry picked from commit `53919327b2`)	2024-07-31 16:09:18 +02:00
Aliaksandr Valialkin	9a3f44e79c	app/{vmselect,vlselect}: run `make vmui-update vmui-logs-update` after `efd70b2c52`	2024-07-27 13:51:02 +02:00
Aliaksandr Valialkin	5dc7ec058f	app/vmauth: verify how backend response headers are propagated to vmauth client	2024-07-27 13:45:07 +02:00
Hui Wang	e0c62e5c50	security: upgrade base docker image (Alpine) from 3.20.1 to 3.20.2 (#6684 ) See https://www.alpinelinux.org/posts/Alpine-3.20.1-released.html >including security fix for: OpenSSL CVE-2024-5535	2024-07-25 11:02:23 +02:00
Zakhar Bessarab	9f5eb25150	app/vmauth: change response code when all backend are not available (#6676 ) ### Describe Your Changes Change response code to 502 to align it with behaviour of other existing reverse proxies. Currently, the following reverse proxies will return 502 in case an upstream is not available: nginx, traefik, caddy, apache. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2024-07-25 10:24:05 +02:00
Aliaksandr Valialkin	339821f5ce	app/vmauth: test how User-Agent header is set in requests to backend	2024-07-20 11:43:36 +02:00
Aliaksandr Valialkin	b31bd5613f	app/vmauth: verify the correctness of X-Forwarded-For header processing at TestRequestHandler()	2024-07-20 11:43:36 +02:00
Aliaksandr Valialkin	8f3fd62f50	app/vmauth: add missing tests for requestHandler()	2024-07-20 11:22:54 +02:00
Aliaksandr Valialkin	37fdba6897	app/vmauth: add more tests for requestHandler()	2024-07-20 10:19:57 +02:00
Aliaksandr Valialkin	28af963940	docs/vmauth.md: document the case with default url_prefix additionally to url_map	2024-07-20 09:46:31 +02:00
Aliaksandr Valialkin	a50a29500f	app/vmauth: properly proxy requests to backend paths ending with / Previously the trailing / was incorrectly removed when proxying requests from http://vmauth/ While at it, add more tests for requestHandler()	2024-07-19 17:29:17 +02:00
Aliaksandr Valialkin	4e3acfbe9a	app/vmauth: properly proxy HTTP requests without body The Request.Body for requests without body can be nil. This could break readTrackingBody.Read() logic, which could incorrectly return "cannot read data after closing the reader" error in this case. Fix this by initializing the readTrackingBody.r with zeroReader. While at it, properly set Host header if it is specified in 'headers' section. It must be set net/http.Request.Host instead of net/http.Request.Header.Set(), since the net/http.Client overwrites the Host header with the value from req.Host before sending the request. While at it, add tests for requestHandler(). Additional tests for various requestHandler() cases will be added in future commits. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6445 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5707 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5240 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6525	2024-07-19 16:26:07 +02:00
Yury Molodov	be2a61c244	vmui/logs: switched requests to sequential execution (#6624 ) ### Describe Your Changes This PR changes `/select/logsql/query` and `/select/logsql/hits` to execute sequentially Fixed https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6558#issuecomment-2219298984 ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).	2024-07-18 11:56:19 +02:00
Aliaksandr Valialkin	7e0fff224e	app/vmselect/vmui: run `make vmui-update` after `959a4383c5`	2024-07-17 23:09:25 +02:00
Aliaksandr Valialkin	97d696ae8b	all: substitute double "the the" with "the" This is a follow-up for `8786a08d27` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6600	2024-07-17 14:29:05 +02:00
Aliaksandr Valialkin	f8aa445945	all: consistently use stringsutil.JSONString() for formatting JSON strings with fmt.* functions instead of using "%q" formatter The %q formatter may result in incorrectly formatted JSON string if the original string contains special chars such as \x1b . They must be encoded as \u001b , otherwise the resulting JSON string cannot be parsed by JSON parsers. This is a follow-up for `c0caa69939` See https://github.com/VictoriaMetrics/victorialogs-datasource/issues/24	2024-07-17 14:01:37 +02:00
rtm0	1b03d7e6de	Fix inconsistent error handling in Storage.AddRows() (#6583 ) `Storage.AddRows()` returns an error only in one case: when `Storage.updatePerDateData()` fails to unmarshal a `metricNameRaw`. But the same error is treated as a warning when it happens inside `Storage.add()` or returned by `Storage.prefillNextIndexDB()`. This commit fixes this inconsistency by treating the error returned by `Storage.updatePerDateData()` as a warning as well. As a result `Storage.add()` does not need a return value anymore and so doesn't `Storage.AddRows()`. Additionally, this commit adds a unit test that checks all cases that result in a row not being added to the storage. --------- Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2024-07-17 12:55:07 +02:00
Aliaksandr Valialkin	3087749aa9	app/vmauth: properly handle the case when zero backend hosts are resolved at SRV DNS When zero backend hosts are resolved, then vmauth must return 'no backend hosts' error instead of crashing with panic This is a follow-up for `590aeccd7d` and `3a45bbb4e0` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6401	2024-07-17 11:34:33 +02:00
Aliaksandr Valialkin	31b8e9054d	app/vmauth: pool readTrackingBody structs in order to reduce pressure on Go GC - use pool for readTrackingBody structs in order to reduce pressure on Go GC - allow re-reading partially read request body - add missing tests for various cases of readTrackingBody usage This is a follow-up for `ad6af95183` and `4d66e042e3`. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6445 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6446 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6533	2024-07-17 11:34:32 +02:00
Aliaksandr Valialkin	d427a4cfaa	app/vmauth: use more clear names for the field and function added at `e666d64f1d` - Rename overrideHostHeader() function to hasEmptyHostHeader() - Rename overrideHostHeader field at UserInfo to useBackendHostHeader This should simplify the future maintenance of the code Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6525	2024-07-17 11:34:32 +02:00
Aliaksandr Valialkin	111f7da946	Revert "app/vmauth: reader pool to reduce gc & mem alloc (#6533 )" This reverts commit `4d66e042e3`. Reasons for revert: - The commit makes unrelated invalid changes to docs/CHANGELOG.md - The changes at app/vmauth/main.go are too complex. It is better splitting them into two parts: - pooling readTrackingBody struct for reducing pressure on GC - avoiding to use readTrackingBody when -maxRequestBodySizeToRetry command-line flag is set to 0 Let's make this in the follow-up commits! Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6445 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6533	2024-07-17 11:34:31 +02:00
Aliaksandr Valialkin	8b1c38abde	app/vmauth: follow-up for `3a45bbb4e0` - Move the test for SRV discovery into a separate function. This allows verifying round-robin discovery across SRV records. - Restore the original netutil.Resolver after the test finishes, so it doesn't interfere with other tests. - Move the description of the bugfix into the correct place at docs/CHANGELOG.md - it should be placed under v1.102.0-rc2 instead of v1.102.0-rc1. - Remove unneeded code in URLPrefix.sanitizeAndInitialize(), since it is expected this function is called only once for finishing URLPrefix initialization. In this case URLPrefix.nextDiscoveryDeadline and URLPrefix.n are equal to 0 according to https://pkg.go.dev/sync/atomic#Uint64 - Properly fix the bug at URLPrefix.discoverBackendAddrsIfNeeded() - it is expected that hostToAddrs map uses the original hostname keys, including 'srv+' prefix, so it shouldn't be removed when looping over up.busOriginal. Instead, the 'srv+' prefix must be removed from the hostname only locally before passing the hostname to netutil.Resolver.LookupSRV. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6401	2024-07-16 10:41:08 +02:00
Aliaksandr Valialkin	468c04d3c2	app/vmauth: clarify the description for -idleConnTimeout command-line flag This is a follow-up for `d44058bcd6` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6388	2024-07-16 09:40:01 +02:00
Aliaksandr Valialkin	8b76a40715	lib/httpserver: skip basic auth check for additional request paths, which should call httpserver.CheckAuthFlag() This is a follow-up for `61dce6f2a1` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6338 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6329	2024-07-16 01:08:41 +02:00
Aliaksandr Valialkin	aa52d6cd9b	app/vminsert: increase default value for -maxLabelValueLen command-line flag from 1KiB to 4KiB It turned out that the standard Kubernetes monitoring can generate labels with sizes up to 4KiB This is a follow-up for `a5d1013042` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6176	2024-07-15 23:32:54 +02:00
Aliaksandr Valialkin	476bf400ac	lib/{httputils,netutil}: move httputils.GetStatDialFunc to netutil.NewStatDialFunc - Rename GetStatDialFunc to NewStatDialFunc, since it returns new function with every call - NewStatDialFunc isn't related to http in any way, so it must be moved from lib/httputils to lib/netutil - Simplify the implementation of NewStatDialFunc by removing sync.Map from there. - Use netutil.NewStatDialFunc at app/vmauth and lib/promscrape/discoveryutils - Use gauge instead of counter type for *_conns metric This is a follow-up for `d7b5062917` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6299	2024-07-15 23:05:46 +02:00
Aliaksandr Valialkin	353766061b	app/{vminsert,vmselect}: pass proper args to metrics.UnregisterSet() after `a8356f3a26`	2024-07-15 20:27:40 +02:00
Aliaksandr Valialkin	cbc637d1dd	app/vmagent/remotewrite: follow-up for `f153f54d11` - Move the remaining code responsible for stream aggregation initialization from remotewrite.go to streamaggr.go . This improves code maintainability a bit. - Properly shut down streamaggr.Aggregators initialized inside remotewrite.CheckStreamAggrConfigs(). This prevents from potential resource leaks. - Use separate functions for initializing and reloading of global stream aggregation and per-remoteWrite.url stream aggregation. This makes the code easier to read and maintain. This also fixes INFO and ERROR logs emitted by these functions. - Add an ability to specify `name` option in every stream aggregation config. This option is used as `name` label in metrics exposed by stream aggregation at /metrics page. This simplifies investigation of the exposed metrics. - Add `path` label additionally to `name`, `url` and `position` labels at metrics exposed by streaming aggregation. This label should simplify investigation of the exposed metrics. - Remove `match` and `group` labels from metrics exposed by streaming aggregation, since they have little practical applicability: it is hard to use these labels in query filters and aggregation functions. - Rename the metric `vm_streamaggr_flushed_samples_total` to less misleading `vm_streamaggr_output_samples_total` . This metric shows the number of samples generated by the corresponding streaming aggregation rule. This metric has been added in the commit `861852f262` . See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6462 - Remove the metric `vm_streamaggr_stale_samples_total`, since it is unclear how it can be used in practice. This metric has been added in the commit `861852f262` . 
See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6462 - Remove Alias and aggrID fields from streamaggr.Options struct, since these fields aren't related to optional params, which could modify the behaviour of the constructed streaming aggregator. Convert the Alias field to regular argument passed to LoadFromFile() function, since this argument is mandatory. - Pass Options arg to LoadFromFile() function by reference, since this structure is quite big. This also allows passing nil instead of Options when default options are enough. - Add `name`, `path`, `url` and `position` labels to `vm_streamaggr_dedup_state_size_bytes` and `vm_streamaggr_dedup_state_items_count` metrics, so they have consistent set of labels comparing to the rest of streaming aggregation metrics. - Convert aggregator.aggrStates field type from `map[string]aggrState` to `[]aggrOutput`, where `aggrOutput` contains the corresponding `aggrState` plus all the related metrics (currently only `vm_streamaggr_output_samples_total` metric is exposed with the corresponding `output` label per each configured output function). This simplifies and speeds up the code responsible for updating per-output metrics. This is a follow-up for the commit `2eb1bc4f81` . See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6604 - Added missing urls to docs ( https://docs.victoriametrics.com/stream-aggregation/ ) in error messages. These urls help users figuring out why VictoriaMetrics or vmagent generates the corresponding error messages. The urls were removed for unknown reason in the commit `2eb1bc4f81` . - Fix incorrect update for `vm_streamaggr_output_samples_total` metric in flushCtx.appendSeriesWithExtraLabel() function. While at it, reduce memory usage by limiting the maximum number of samples per flush to 10K. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5467 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6268	2024-07-15 20:25:36 +02:00
Aliaksandr Valialkin	a8356f3a26	vendor: update github.com/VictoriaMetrics/metrics from v1.34.1 to v1.35.0 Fix potential memory leaks across VictoriaMetrics codebase after metrics.UnregisterSet(s) call because of missing s.UnregisterAllMetrics() call. This is a follow-up for `6a6e34ab8e` . It is OK if some vmauth metrics aren't visible for a few microseconds when the previous metrics are unregistered and new metrics weren't registered yet. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6247 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4690 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6252 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5805	2024-07-15 10:45:39 +02:00
Aliaksandr Valialkin	3365dd508f	app/vmagent/remotewrite: do not spend CPU time on an attempt to send data to blocked queue if some queues are unblocked Previously remotewrite.TryPush() was trying to send data to remote storages with blocked persistent queues, if some persistent queues to other remote storage systems were unblocked. This resulted in excess CPU usage on relabeling and stream aggregation for the remote storage with blocked queues. The solution is to check whether some persistent storages have blocked queues and skip them before applying per- -remoteWrite.url relabeling and streaming aggregation. While at it, properly update per- -remoteWrite.url vmagent_remotewrite_samples_dropped_total and vmagent_remotewrite_push_failures_total counters when global streaming aggregation cannot send data to remote storage systems because of blocked queues. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5467 and https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6268 . This is a follow-up for `87fd400dfc` and `f153f54d11` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6248 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6065	2024-07-15 09:40:34 +02:00
Aliaksandr Valialkin	4921ec5604	docs/CHANGELOG.md: use new link to VictoriaMetrics cluster docs instead of old link The old link was changed globally to the new link in the commit `f4b1cbfef0` . Unfortunately, old links are still posted in new commits :( This is a follow-up for `680b8c25c8` . While at it, remove duplicate 'len(*remoteWriteURLs) > 0' check in the remotewrite.Init() functions, since this check is already made at the beginning of the function. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6253	2024-07-13 03:04:20 +02:00
Aliaksandr Valialkin	bc1f92d7f5	app/vmagent/remotewrite: follow-up for `87fd400dfc` - Drop samples and return true from remotewrite.TryPush() at fast path when all the remote storage systems are configured with the disabled on-disk queue, every in-memory queue is full and -remoteWrite.dropSamplesOnOverload is set to true. This case is quite common, so it should be optimized. Previously additional CPU time was spent on per-remoteWriteCtx relabeling and other processing in this case. - Properly count the number of dropped samples inside remoteWriteCtx.pushInternalTrackDropped(). Previously dropped samples were counted only if -remoteWrite.dropSamplesOnOverload flag is set. In reality, the samples are dropped when they couldn't be sent to the queue because in-memory queue is full and on-disk queue is disabled. The remoteWriteCtx.pushInternalTrackDropped() function is called by streaming aggregation for pushing the aggregated data to the remote storage. Streaming aggregation cannot wait until the remote storage processes pending data, so it drops aggregated samples in this case. - Clarify the description for -remoteWrite.disableOnDiskQueue command-line flag at -help output, so it is clear that this flag can be set individually per each -remoteWrite.url. - Make the -remoteWrite.dropSamplesOnOverload flag global. If some of the remote storage systems are configured with the disabled on-disk queue, then there is no sense in keeping samples on some of these systems, while dropping samples on the remaining systems, since this will result in global stall on the remote storage system with the disabled on-disk queue and with the -remoteWrite.dropSamplesOnOverload=false flag. vmagent will always return false from remotewrite.TryPush() in this case. This will result in infinite duplicate samples written to the remaining remote storage systems. That's why the -remoteWrite.dropSamplesOnOverload is forcibly set to true if more than one -remoteWrite.disableOnDiskQueue flag is set. 
This allows proceeding with newly scraped / pushed samples by sending them to the remaining remote storage systems, while dropping them on overloaded systems with the -remoteWrite.disableOnDiskQueue flag set. - Verify that the remoteWriteCtx.TryPush() returns true in the TestRemoteWriteContext_TryPush_ImmutableTimeseries test. - Mention in vmagent docs that the -remoteWrite.disableOnDiskQueue command-line flag can be set individually per each -remoteWrite.url. See https://docs.victoriametrics.com/vmagent/#disabling-on-disk-persistence Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6248 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6065	2024-07-13 02:30:10 +02:00
Aliaksandr Valialkin	5c7345b8ce	app/victoria-logs/Makefile: add `make victoria-logs-linux-loong64` build rule This is a follow-up for `80f3644ee3` The https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6222 missed build rule for VictoriaLogs.	2024-07-12 23:13:19 +02:00
Aliaksandr Valialkin	43fc1183b9	app/vmalert: switch from table-driven tests to f-tests This makes test code more clear and reduces the number of code lines by 500. This also simplifies debugging tests. See https://itnext.io/f-tests-as-a-replacement-for-table-driven-tests-in-go-8814a8b19e9e While at it, consistently use t.Fatal* instead of t.Error* across tests, since t.Error* requires more boilerplate code, which can result in additional bugs inside tests. While t.Error* allows writing logging errors for the same, this doesn't simplify fixing broken tests most of the time. This is a follow-up for `a9525da8a4`	2024-07-12 22:45:50 +02:00
Aliaksandr Valialkin	04a304fd39	app/vmctl: switch from table-driven tests to f-tests This simplifies debugging tests and makes the test code more clear and concise. See https://itnext.io/f-tests-as-a-replacement-for-table-driven-tests-in-go-8814a8b19e9e While at it, consistently use t.Fatal* instead of t.Error* across tests, since t.Error* requires more boilerplate code, which can result in additional bugs inside tests. While t.Error* allows writing logging errors for the same, this doesn't simplify fixing broken tests most of the time. This is a follow-up for `a9525da8a4`	2024-07-12 22:45:49 +02:00
Aliaksandr Valialkin	7c97cef95c	app: consistently use t.Fatal* instead of t.Error* (except of app/vmalert and app/vmctl - these packages will be processed in a separate commit) Consistently using t.Fatal* simplifies the test code and makes it less fragile, since it is common error to forget to make proper cleanup after t.Error* call. Also t.Error* calls do not provide any practical benefits when some tests fail. They just clutter test output with additional noise information, which do not help in fixing failing tests most of the time. This is a follow-up for `a9525da8a4`	2024-07-11 16:01:25 +02:00
Zhu Jiekun	2ea575e776	vmalert: [bug] fixed System hyperlink 404 redirect (#6620 ) ### Describe Your Changes As mentioned in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6603, some hyperlinks under the `vmalert` -> `System` section are not working as expected. Pages and redirection: - For page `http://127.0.0.1:8880/`: `flags` button will redirect to `http://127.0.0.1:8880/flags` - For page `http://127.0.0.1:8880/vmalert`: `http://127.0.0.1:8880/flags` - For page `http://127.0.0.1:8880/vmalert/`: `http://127.0.0.1:8880/vmalert/flags` (page does not exist) - Similar redirection could be observed with `-http.pathPrefix` Two potential ways to avoid 404 redirection: 1. avoid visiting `/vmalert/` (I'm trying to do this). 2. provide support for `/vmalert/flags`. `/vmalert/` can be visited only when the user clicks another navigation item (e.g. Group) and then clicks vmalert again: ![Peek 2024-07-10 10-07](https://github.com/VictoriaMetrics/VictoriaMetrics/assets/30280396/13d7b147-a1b6-4e93-9ee0-26f881a16bef) Because: `http://127.0.0.1:8880/vmalert/groups?search=` + `<a class="nav-link" href=".">` = `http://127.0.0.1:8880/vmalert/` So I'm trying to change the `href="."` to `href="../vmalert"`. ### Checklist The following checks are mandatory: - [X] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). (cherry picked from commit `cadf1eb5ab`)	2024-07-11 12:40:23 +02:00
Zakhar Bessarab	401ae72587	app/vmselect/promql: propagate lower bucket values when fixing a histogram (#6547 ) ### Describe Your Changes In most cases histograms are exposed in sorted manner with lower buckets being first. This means that during scraping buckets with lower bounds have higher chance of being updated earlier than upper ones. Previously, values were propagated from upper to lower bounds, which means that in most cases that would produce results higher than expected once all buckets will become updated. Propagating from upper bound effectively limits highest value of histogram to the value of previous scrape. Once the data will become consistent in the subsequent evaluation this causes spikes in the result. Changing propagation to be from lower to higher buckets reduces value spikes in most cases due to nature of the original inconsistency. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4580 An example histogram with previous(red) and updated(blue) versions: ![1719565540](https://github.com/VictoriaMetrics/VictoriaMetrics/assets/1367798/605c5e60-6abe-45b5-89b2-d470b60127b8) This also makes logic of filling nan values with lower buckets values: [1 2 3 nan nan nan] => [1 2 3 3 3 3] obsolete. Since buckets are now fixed from lower ones to upper this happens in the main loop, so there is no need in a second one. --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Andrii Chubatiuk <andrew.chubatiuk@gmail.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `6a4bd5049b`)	2024-07-10 15:17:08 +02:00
Aliaksandr Valialkin	a1decb5ca1	app/vlinsert/loki: use easyproto instead for parsing Loki protobuf messages	2024-07-10 03:05:55 +02:00
Aliaksandr Valialkin	32ae40410c	app/vlselect/vmui: run `make vmui-logs-update` after `662e026279`	2024-07-10 03:05:55 +02:00
Aliaksandr Valialkin	b8a8d3d6f1	lib/logstorage: drop all the pipes from the query when calculating the number of matching logs at /select/logsql/hits API	2024-07-10 00:39:16 +02:00
Aliaksandr Valialkin	d6415b2572	all: consistently use 'any' instead of 'interface{}' 'any' type is supported starting from Go1.18. Let's consistently use it instead of 'interface{}' type across the code base, since `any` is easier to read than 'interface{}'.	2024-07-10 00:23:26 +02:00
Aliaksandr Valialkin	73ca22bb7d	app/vlinsert/loki: remove unused functions from the generated protobuf code	2024-07-10 00:22:10 +02:00
Yury Molodov	33bd5ccbab	vmui/logs: add spinner to bar chart (#6577 ) Add a spinner to the bar chart https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6558 --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `662e026279`)	2024-07-09 18:27:23 +02:00
Hui Wang	6f602a4ef5	security: upgrade base docker image (Alpine) from 3.20.0 to 3.20.1 See https://www.alpinelinux.org/posts/Alpine-3.20.1-released.html >including security fixes for: OPENSSL [CVE-2024-4741](https://security.alpinelinux.org/vuln/CVE-2024-4741) BUSYBOX [CVE-2023-42364](https://security.alpinelinux.org/vuln/CVE-2023-42364) [CVE-2023-42365](https://security.alpinelinux.org/vuln/CVE-2023-42365) (cherry picked from commit `8e9f98e725`)	2024-07-09 11:38:44 +02:00
Artem Navoiev	7b508a9334	fix typo Signed-off-by: Artem Navoiev <tenmozes@gmail.com> (cherry picked from commit `4527020a68`)	2024-07-09 10:52:50 +02:00
Yury Molodov	7fc9912d15	vmui: add compact JSON display (#6582 ) ### Describe Your Changes If a JSON element has only one field, it will be displayed on a single line. #6559 \| Old Display \| New Display \| \|-------------\|-------------\| \| ![image](https://github.com/VictoriaMetrics/VictoriaMetrics/assets/29711459/8866517b-a49d-450f-904c-19117397a078) \| ![image](https://github.com/VictoriaMetrics/VictoriaMetrics/assets/29711459/8e222b43-a4cb-4f32-9a79-6199778404d3) \| ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `959a4383c5`)	2024-07-05 09:49:12 +02:00
Hui Wang	bbd49a1a61	vmalert: allow omitting `-replay.timeTo` in replay mode, default value is the current timestamp (#6575 ) address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6492 --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `3169524fb7`)	2024-07-05 09:49:06 +02:00
Roman Khavronenko	b13c363f12	app/vmalert: add examples for `source` override (#6561 ) The change adds a new docs section with examples on how source can be overridden. It should address questions like https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6536 While there, fix the example in `external.alert.source` cmd-line flag and docker-compose examples. ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `c429bbf889`)	2024-07-05 09:49:03 +02:00
Aliaksandr Valialkin	172ae1adf7	Revert `c6c5a5a186` and `b2765c45d0` Reason for revert: Many statsd servers exist: - https://github.com/statsd/statsd - classical statsd server - https://docs.datadoghq.com/developers/dogstatsd/ - statsd server from DataDog built into DataDog Agent ( https://docs.datadoghq.com/agent/ ) - https://github.com/avito-tech/bioyino - high-performance statsd server - https://github.com/atlassian/gostatsd - statsd server in Go - https://github.com/prometheus/statsd_exporter - statsd server, which exposes the aggregated data as Prometheus metrics These servers can be used for efficient aggregating of statsd data and sending it to VictoriaMetrics according to https://docs.victoriametrics.com/#how-to-send-data-from-graphite-compatible-agents-such-as-statsd ( the https://github.com/prometheus/statsd_exporter can be scraped as usual Prometheus target according to https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter ). Adding support for statsd data ingestion protocol into VictoriaMetrics makes sense only if it provides significant advantages over the existing statsd servers, while having no significant drawbacks compared to existing statsd servers. The main advantage of statsd server built into VictoriaMetrics and vmagent - getting rid of additional statsd server. The main drawback is non-trivial and inconvenient streaming aggregation configs, which must be used for the ingested statsd metrics ( see https://docs.victoriametrics.com/stream-aggregation/ ). These configs are incompatible with the configs for standalone statsd servers. So you need to manually translate configs of the used statsd server to stream aggregation configs when migrating from standalone statsd server to statsd server built into VictoriaMetrics (or vmagent). 
Another important drawback is that it is very easy to shoot yourself in the foot when using built-in statsd server with the -statsd.disableAggregationEnforcement command-line flag or with improperly configured streaming aggregation. In this case the ingested statsd metrics will be stored to VictoriaMetrics as is without any aggregation. This may result in high CPU usage during data ingestion, high disk space usage for storing all the unaggregated statsd metrics and high CPU usage during querying, since all the unaggregated metrics must be read, unpacked and processed during querying. P.S. Built-in statsd server can be added to VictoriaMetrics and vmagent after figuring out more ergonomic specialized configuration for aggregating of statsd metrics. The main requirements for this configuration: - easy to write, read and update (ideally it should work out of the box for most cases without additional configuration) - hard to misconfigure (e.g. hard to shoot yourself in the foot) It would be great if this configuration is compatible with the configuration of the most widely used statsd server. In the meantime it is recommended to continue using an external statsd server. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6265 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5053 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5052 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/206 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4600	2024-07-03 23:57:49 +02:00
Aliaksandr Valialkin	cd152693c6	Revert "Exemplar support (#5982 )" This reverts commit `5a3abfa041`. Reason for revert: exemplars aren't in wide use because they have numerous issues which prevent their adoption (see below). Adding support for exemplars into VictoriaMetrics introduces non-trivial code changes. These code changes need to be supported forever once the release of VictoriaMetrics with exemplar support is published. That's why I don't think this is a good feature despite that the source code of the reverted commit has an excellent quality. See https://docs.victoriametrics.com/goals/ . Issues with Prometheus exemplars: - Prometheus still has only experimental support for exemplars after more than three years since they were introduced. It stores exemplars in memory, so they are lost after Prometheus restart. This doesn't look like a production-ready feature. See `0a2f3b3794/content/docs/instrumenting/exposition_formats.md (L153-L159)` and https://prometheus.io/docs/prometheus/latest/feature_flags/#exemplars-storage - It is very non-trivial to expose exemplars alongside metrics in your application, since the official Prometheus SDKs for metrics' exposition ( https://prometheus.io/docs/instrumenting/clientlibs/ ) either have very hard-to-use API for exposing histograms or do not have this API at all. For example, try figuring out how to expose exemplars via https://pkg.go.dev/github.com/prometheus/client_golang@v1.19.1/prometheus . - It looks like exemplars are supported for Histogram metric types only - see https://pkg.go.dev/github.com/prometheus/client_golang@v1.19.1/prometheus#Timer.ObserveDurationWithExemplar . Exemplars aren't supported for Counter, Gauge and Summary metric types. - Grafana has very poor support for Prometheus exemplars. It looks like it supports exemplars only when the query contains histogram_quantile() function. 
It queries exemplars via special Prometheus API - https://prometheus.io/docs/prometheus/latest/querying/api/#querying-exemplars - (which is still marked as experimental, btw.) and then displays all the returned exemplars on the graph as special dots. The issue is that this doesn't work in production in most cases when the histogram_quantile() is calculated over thousands of histogram buckets exposed by big number of application instances. Every histogram bucket may expose an exemplar on every timestamp shown on the graph. This makes the graph unusable, since it is literally filled with thousands of exemplar dots. Neither the Prometheus API nor Grafana provides the ability to filter out unneeded exemplars. - Exemplars are usually connected to traces. While traces are good for some I doubt exemplars will become production-ready in the near future because of the issues outlined above. Alternative to exemplars: Exemplars are marketed as a silver bullet for the correlation between metrics, traces and logs - just click the exemplar dot on some graph in Grafana and instantly see the corresponding trace or log entry! This doesn't work as expected in production as shown above. Are there better solutions, which work in production? Yes - just use time-based and label-based correlation between metrics, traces and logs. Assign the same `job` and `instance` labels to metrics, logs and traces, so you can quickly find the needed trace or log entry by these labels on the time range with the anomaly on metrics' graph. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5982	2024-07-03 16:09:18 +02:00
Aliaksandr Valialkin	a5d60ad78e	app/vmagent/remotewrite,lib/streamaggr: re-use common code in tests after `879771808b` - Export streamaggr.LoadFromData() function, so it could be used in tests outside the lib/streamaggr package. This allows removing a hack with creation of temporary files at TestRemoteWriteContext_TryPush_ImmutableTimeseries. - Move common code for mustParsePromMetrics() function into lib/prompbmarshal package, so it could be used in tests for building []prompbmarshal.TimeSeries from string. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6205 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6206	2024-07-03 15:22:51 +02:00
Aliaksandr Valialkin	4268a310c1	app/vmagent/remotewrite/remotewrite.go: make remoteWriteCtx.TryPush code easier to follow Move the code responsible for relabelCtx clearing into deferred function. This allows making more clear the remoteWriteCtx.TryPush code. This is a follow-up for `879771808b` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6205 While at it, clarify the description of the bugfix at docs/CHANGELOG.md	2024-07-03 14:18:51 +02:00
Aliaksandr Valialkin	f406764ccc	app/vmagent/remotewrite/streamaggr.go: clarify the description for -remoteWrite.streamAggr.* command-line flags, so they are applied to the corresponding -remoteWrite.url	2024-07-03 14:18:51 +02:00
Aliaksandr Valialkin	bb7406e9c0	app/vmselect/promql: follow-up for `dd0d2c77c8` and `6149adbe10` Use metricsql.IsLikelyInvalid() function for determining whether the given query is likely invalid, e.g. there is high chance the query is incorrectly written, so it will return unexpected results. The query is invalid most of the time if it passes something other than series selector into rollup function. For example: - rate(sum(foo)) - rate(foo + bar) - rate(foo > bar) Important note: the query is considered valid if it misses the lookbehind window in square brackets inside rollup function, e.g. rate(foo), since this is very convenient MetricsQL extension to PromQL, and this query returns the expected results most of the time. Other unsafe query types can be added in the future into metricsql.IsLikelyInvalid(). TODO: probably, the -search.disableImplicitConversion command-line flag must be set by default in the future releases of VictoriaMetrics. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4338 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6180 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6450	2024-07-03 00:46:56 +02:00
Aliaksandr Valialkin	82748b2b9d	deployment/docker: update Go builder from Go1.22.4 to Go1.22.5 See https://github.com/golang/go/issues?q=milestone%3AGo1.22.5+label%3ACherryPickApproved	2024-07-03 00:07:55 +02:00
LHHDZ	c8431c8e4d	app/vmauth: reader pool to reduce gc & mem alloc (#6533 ) follow up https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6446 issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6445 --------- Signed-off-by: f41gh7 <nik@victoriametrics.com> Co-authored-by: f41gh7 <nik@victoriametrics.com> (cherry picked from commit `4d66e042e3`)	2024-07-02 14:37:15 +02:00
Aliaksandr Valialkin	0912a652d5	app/vlinsert/insertutils: flush the ingested logs from in-memory buffer to storage every second Previously the in-memory buffer could remain unflushed for long periods of time under low ingestion rate. The ingested logs weren't visible for search during this time.	2024-07-02 01:39:45 +02:00
Aliaksandr Valialkin	ab28a1f93e	app/vlinsert/syslog: add an ability to use log ingestion time as the _time field	2024-07-02 01:39:45 +02:00
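
The `%q` problem described in commit `f8aa445945` above can be reproduced with the Go standard library alone. The sketch below is illustrative and is not code from the repository: `quoteWithQ`, `quoteAsJSON` and `isValidJSONString` are hypothetical helper names, and `encoding/json` is used as a stand-in for `stringsutil.JSONString()`, whose implementation is not shown in this log.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// quoteWithQ formats s with the %q verb, which produces a Go string literal.
// Go escapes the ESC byte as \x1b, and JSON has no \x escape.
func quoteWithQ(s string) string {
	return fmt.Sprintf("%q", s)
}

// quoteAsJSON formats s as a JSON string literal via encoding/json.
// Assumption: this only stands in for stringsutil.JSONString() from the commit.
func quoteAsJSON(s string) string {
	b, err := json.Marshal(s)
	if err != nil {
		// Marshaling a plain Go string is not expected to fail.
		panic(err)
	}
	return string(b)
}

// isValidJSONString reports whether data can be parsed as a JSON string.
func isValidJSONString(data string) bool {
	var s string
	return json.Unmarshal([]byte(data), &s) == nil
}

func main() {
	s := "esc:\x1b"
	for _, quoted := range []string{quoteWithQ(s), quoteAsJSON(s)} {
		fmt.Printf("%s valid JSON: %v\n", quoted, isValidJSONString(quoted))
	}
}
```

Running it should show that the `%q` form is rejected by the JSON parser while the JSON-encoded form round-trips, which matches the motivation given in the commit message.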

3477 Commits