Commit Graph

3485 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
7e0ff1ee46
app/vlselect/logsql: call Query.Optimize() on the cloned query in order to replace * filter with filterNoop inside getLastNQueryResults()
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6785

(cherry picked from commit e86891b010)
2024-09-19 15:48:07 +02:00
Aliaksandr Valialkin
4e00e4428e
app/vmselect/promql: properly calculate c1 and c2 and c1 or c2 by upgrading github.com/VictoriaMetrics/metricsql to v0.79.0
The fix is in the https://github.com/VictoriaMetrics/metricsql/pull/34
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6637

(cherry picked from commit b82e2cabc5)
2024-09-19 15:48:06 +02:00
Dima Lazerka
465c7ad045
docs: fixes misspelled typos
Also tried to make it catch "Authorisation" in the future, fixed a lot
of other misspells along the way, but didn't make it catch
"Authorisation" anyway.

- Fix misspelled "Authorization" header name
- Fix misspelled "organization"
- Fix more misspells
2024-09-13 13:19:03 +02:00
Hui Wang
c7fc0d0d2f
vmalert: do not send message to alertmanager when alert has no label … (#6823)
…pair

`alert_relabel_configs` in [notifier
config](https://docs.victoriametrics.com/vmalert/#notifier-configuration-file)
can drop alert labels when used to filter different tenant alert message
to different notifier.
alertmanager would report error like `msg="Failed to validate alerts"
err="at least one label pair required"` in this case, but the rest of
the alerts inside one request would still be valid in alertmanager, so
it's not severe.

(cherry picked from commit ae4d376e41)
2024-09-09 16:06:44 +02:00
Aliaksandr Valialkin
8eef397d29
deployment/docker: update base Alpine docker image from 3.20.2 to 3.20.3
See https://alpinelinux.org/posts/Alpine-3.17.10-3.18.9-3.19.4-3.20.3-released.html
2024-09-08 19:27:05 +02:00
Aliaksandr Valialkin
cad236003b
app/vlselect: consistently reuse the original query timestamp when executing /select/logsql/query with positive limit=N query arg
Previously the query could return incorrect results, since the query timestamp was updated with every Query.Clone() call
during iterative search for the time range with up to limit=N rows.

While at it, optimize queries, which find low number of matching logs, while spend a lot of CPU time for searching
across big number of logs. The optimization reduces the upper bound of the time range to search if the current time range
contains zero matching rows.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6785
2024-09-08 14:34:46 +02:00
Aliaksandr Valialkin
4b49b62a58
lib/logstorage: improve error logging for incorrect queries passed to /select/logsql/stats_query and /select/logsql/stats_query_range functions 2024-09-08 12:28:33 +02:00
Aliaksandr Valialkin
c448189f69
app/vlselect: add /select/logsql/stats_query_range endpoint for building time series panels in VictoriaLogs plugin for Grafana
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6943
Updates https://github.com/VictoriaMetrics/victorialogs-datasource/issues/61
2024-09-07 00:44:34 +02:00
Aliaksandr Valialkin
01c8e12370
app/vlselect: add /select/logsql/stats_query endpoint, which is going to be used by vmalert
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6942
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6706
2024-09-06 23:00:58 +02:00
Aliaksandr Valialkin
e90e809c00
deployment: update Go builder from Go1.23.0 to Go1.23.1
See https://github.com/golang/go/issues?q=milestone%3AGo1.23.1+label%3ACherryPickApproved
2024-09-06 22:57:56 +02:00
f41gh7
395894688c
app/*/multiarch: return back empty value for TARGETARCH
follow-up after 91456ab5bb

docker buildx uses special variables, such as TARGETARCH and it shouldn't be overwritten.

 See this article for details
https://www.docker.com/blog/faster-multi-platform-builds-dockerfile-cross-compilation-guide/

Signed-off-by: f41gh7 <nik@victoriametrics.com>
2024-09-06 18:15:22 +02:00
Artem Fetishev
85bf768013
lib/storage: adds metrics that count records that failed to insert
### Describe Your Changes

Add storage metrics that count records that failed to insert:

- `RowsReceivedTotal`: the number of records that have been received by
the storage from the clients
- `RowsAddedTotal`: the number of records that have actually been
persisted. This value must be equal to `RowsReceivedTotal` if all the
records have been valid ones. But it will be smaller otherwise. The
values of the metrics below should provide the insight of why some
records hasn't been added
-   `NaNValueRows`: the number of records whose value was `NaN`
- `StaleNaNValueRows`: the number of records whose value was `Stale NaN`
- `InvalidRawMetricNames`: the number of records whose raw metric name
has failed to unmarshal.

The following metrics existed before this PR and are listed here for
completeness:

- `TooSmallTimestampRows`: the number of records whose timestamp is
negative or is older than retention period
- `TooBigTimestampRows`: the number of records whose timestamp is too
far in the future.
- `HourlySeriesLimitRowsDropped`: the number of records that have not
been added because the hourly series limit has been exceeded.
- `DailySeriesLimitRowsDropped`: the number of records that have not
been added because the daily series limit has been exceeded.

---
Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com>
2024-09-06 18:13:48 +02:00
f41gh7
64361c2d7a
follow-up after 01430a155c
* properly check SeverityNumber at FormatSeverity function
 it could be negative, which could cause panic for victorialogs
2024-09-04 15:39:55 +02:00
Andrii Chubatiuk
711f2cc4f2
vlinsert: added opentelemetry logs support
Commit adds the following changes:

* Adds support of OpenTelemetry logs for Victoria Logs with protobuf encoded messages

*  json encoding is not supported for the following reasons:
   - It brings a lot of fragile code, which works inefficiently.
   - json encoding is impossible to use with language SDK.

* splits metrics and logs structures at lib/protoparser/opentelemetry/pb package.

* adds docs with examples for opentelemetry logs.

---
Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4839

Co-authored-by: AndrewChubatiuk <andrew.chubatiuk@gmail.com>
Co-authored-by: f41gh7 <nik@victoriametrics.com>
2024-09-03 20:24:01 +02:00
f41gh7
dcc525b388
follow-up after 1731c0eabf
* updates change log
* adds VL-Debug http header
* updates doc
* extracts only the first value of http headers for VL-Stream-Fields and VL-Ignore-Fields.
  It makes behaviour the same as Query string args. And allows to easily configure client applications.
  Since most of the client collectors don't support multi value headers.

Signed-off-by: f41gh7 <nik@victoriametrics.com>
2024-09-03 20:24:01 +02:00
Andrii Chubatiuk
d5fe4566e5
app/vlinsert: support getting _msg_field, _time_field, _stream_fields and _ignore_fields from headers
*  Many collectors don't support forwarding url query params to the remote system. It makes impossible to define stream fields for it. Workaround with proxy between VictoriaLogs and log shipper is too complicated solution.

* This commit adds the following changes:
 * Adds fallback to to headers params, if query param is empty for:
     _msg_field -> VL-Msg-Field
    _stream_fields -> VL-Stream-Fields
    _ignore_fields -> VL-Ignore-Fields
    _time_field -> VL-Time-Field
 * removes deprecations from victorialogs compose files, added more
output format examples for logstash, telegraf, fluent-bit

 related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5310
2024-09-03 20:24:00 +02:00
Aliaksandr Valialkin
ac507466c3
all: suppress InvalidDefaultArgInFrom warning emitted by docker build when building Docker packages via make package-* command
Recent versions of `docker build` started generating the InvalidDefaultArgInFrom warning if Dockerfile contains
an ARG without default value. While this warning doesn't affect building Docker packages via `make package-*` commands,
it is better suppressing the warning, so it doesn't clutter `make package-*` output with the noise,
which can hide real issues in the future.
2024-09-03 14:05:43 +02:00
Hui Wang
a21aea5dd4
stream aggregation: perform deduplication for all received data when … (#6711)
…specifying `-streamAggr.dedupInterval` or
`-remoteWrite.streamAggr.dedupInterval` command-line flag

[The
documentation](https://docs.victoriametrics.com/stream-aggregation/)
contains conflicting descriptions regarding deduplication for
non-matched series when `-remoteWrite.streamAggr.config` and / or
`-streamAggr.config` are set:
1. Statement below says **all the received data** is deduplicated:
>[vmagent](https://docs.victoriametrics.com/vmagent/) supports
relabeling, deduplication and stream aggregation for all the received
data, scraped or pushed. Then, the collected data will be forwarded to
specified -remoteWrite.url destinations. The data processing order is
the following:
>1. all the received data is relabeled according to the specified
[-remoteWrite.relabelConfig](https://docs.victoriametrics.com/vmagent/#relabeling)
(if it is set)
>2. all the received data is deduplicated according to specified
[-streamAggr.dedupInterval](https://docs.victoriametrics.com/stream-aggregation/#deduplication)
(if it is set to duration bigger than 0)

2. Another statement says the deduplication is performed individually
for the **matching samples**
>The de-deduplication is performed after applying
[relabeling](https://docs.victoriametrics.com/vmagent/#relabeling) and
before performing the aggregation. If the -remoteWrite.streamAggr.config
and / or -streamAggr.config is set, then the de-duplication is performed
individually per each [stream aggregation
config](https://docs.victoriametrics.com/stream-aggregation/#stream-aggregation-config)
for the matching samples after applying
[input_relabel_configs](https://docs.victoriametrics.com/stream-aggregation/#relabeling).

Considering the following deduplication use cases:
1. To apply deduplication(globally or for specific remoteWrite
destination) for all the received data, scraped or pushed
--- using `-streamAggr.dedupInterval` or
`-remoteWrite.streamAggr.dedupInterval`.
2. To deduplicate and aggregate metrics that match the rule `match`
filters
--- using `-remoteWrite.streamAggr.config` and specifiying
`dedup_interval` option in [stream aggregation
config](https://docs.victoriametrics.com/stream-aggregation/#stream-aggregation-config).
3. To deduplicate all the received data while having `streamAggr.config`
for some metrics
--- no way for a single vmagent now, need to set up two level vmagents

This PR implements case3.

---------

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
(cherry picked from commit d523015f27)
2024-09-03 10:49:38 +02:00
dufucun
1aa9f7be4e
tests: fix slice init length (#6897)
### Describe Your Changes

fix slice init length

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

Signed-off-by: dufucun <dufuchun@sohu.com>
(cherry picked from commit 95bafc8caf)
2024-08-30 11:18:21 +02:00
hagen1778
681dc7bb7d
app/{vmselect,vlselect}: run make vmui-update vmui-logs-update
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 9a343b3613)
2024-08-28 13:38:28 +02:00
YuDong Tang
cab3ef8294
app/vmselect:add command-line flag -search.inmemoryBufSizeBytes (#6869)
add command-line flag `-search.inmemoryBufSizeBytes` for configuring size of in-memory buffers used by vmselect during processing of vmstorage responses. A new summary metric `vm_tmp_blocks_inmemory_file_size_bytes` is exposed to show the size of the buffer during requests processing. 

The new setting can be used by experienced users to adjust memory usage by vmselect when processing
many small read requests. Instead of allocating 4MB buffers each time, vmselect can be instructed to lower
the buffer size via `-search.inmemoryBufSizeBytes`. To make the decision whether this flag needs to be adjusted
users can consult with `vm_tmp_blocks_inmemory_file_size_bytes` which shows the actual size of buffers used
during query processing.

----------

The detailed information of this PR can be found in
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6851

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Co-authored-by: hagen1778 <roman@victoriametrics.com>
2024-08-26 14:37:45 +02:00
Yury Akudovich
f759371c00
app/vmagent: add remoteWrite.retryMinInterval and remoteWrite.retryMaxTime flags (#6289)
## Describe Your Changes

Add RemoteWrite Retry Controls

This PR introduces two new flags to the remote write functionality:
- remoteWrite.retryMinInterval
- remoteWrite.retryMaxTime

These flags provide finer control over the retry behavior for
remoteWrite operations, allowing users to customize the minimum interval
between retries and the maximum duration for retry attempts.

Fixes #5486.

## Checklist

- [x] The following checks are mandatory:

My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: Yury Akudovich <ya@matterlabs.dev>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit d0f5a9d77a)
2024-08-23 15:28:44 +02:00
Roman Khavronenko
f599f3ad86
deployment/docker: update Go builder from Go1.22.5 to Go1.23.0 (#6861)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-08-22 23:56:12 +02:00
Roman Khavronenko
c6e0780f4b
app/vmalert: update parsing for instant responses (#6859)
This change is made in attempt to reduce memory usage by vmalert when
parsing big instant responses from VM/Prometheus.

In
a5c427bac4
vmalert switched from std json lib to fastjson lib in order to reduce
amount of allocations, as according to highloaded profiles of vmalert
the CPU is mostly spent on GC.

But switching to fastjson resulted into excessive memory usage for cases
when vmalert has to parse long json lines, which usually happens when
instant response contains many `metric` objects.

In this change we do a mixed parsing:
1. Slice of `metric` objects is parsed with std lib to keep mem low
2. Each `metric` object is parsed with fastjson to reduce allocs

The benchmark results are the following:
```
pkg: github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource
BenchmarkParsePrometheusResponse/Instant_std+fastjson-10                    1760            668959 ns/op          280147 B/op       5781 allocs/op
MBs allocated at heap: 493.078392
mallocs: 18655472
BenchmarkParsePrometheusResponse/Instant_fastjson-10                        6109            198258 ns/op          172839 B/op       5548 allocs/op
MBs allocated at heap: 1056.384464
mallocs: 34457184
BenchmarkParsePrometheusResponse/Instant_std-10                             1287            950987 ns/op          451677 B/op       9619 allocs/op
MBs allocated at heap: 580.802976
mallocs: 13351636
```
The benchmark function code with mem measurement is available here
https://gist.github.com/hagen1778/b9c3ca7f8ca7d6b21aec9777112c5810

The benchmark contains 3 results:
1. Instant_std+fastjson is the implementation in this change
2. Instant_fastjson-10 is the implementation from
a5c427bac4
3. BenchmarkParsePrometheusResponse/Instant_std-10 is implementation
before
a5c427bac4

According to these results, this new implementation is slower than
previous, but faster than before switching to fastjson. It also has
lower number of allocations and roughly the same memory allocation on
heap with GC turned off.

---------

Other changes:
1. rm BenchmarkMetrics as it doesn't measure anything
2. simplify BenchmarkParsePrometheusResponse into
BenchmarkPromInstantUnmarshal

### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-08-22 23:56:11 +02:00
Yury Molodov
a3509add4d
vmui: add column search in table settings (#6804)
### Describe Your Changes

Add search functionality to the column display settings in the table
 #6668

![image](https://github.com/user-attachments/assets/e9bd52c3-6428-4d4f-8b7f-d83dd80b6912)

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit e35237920a)
2024-08-22 16:59:18 +02:00
Dima Lazerka
f1325531a1
vmui: Fix initial serverUrl for vmanomaly (#6834)
- fix TS lint
- anomaly: remove /vmui
- anomaly: minor inspections fix
- docs: fix broken links to headings

### Describe Your Changes

Initially vmanomaly opened with `/vmui` in serverUrl, remove it.

(cherry picked from commit 535a9ed059)
2024-08-21 14:12:07 +02:00
jackyin
98758ef18e
vmui: fix not found index.js in VictoriaLogs (#6770)
fix #6764
the index.js file is for [this
feature](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/app/vmui#predefined-dashboards),
the feature is just for victoriametrics. so the index.js is deleted in
victorialogs.
i just add an empty index.js to fix it.

---------

Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 3ebdd3bcb8)
2024-08-20 17:14:00 +02:00
Hui Wang
e8f5dbd598
vmalert: add command line flag -notifier.headers (#6751)
to allow configuring additional headers in each request to the
corresponding notifier.
Other flags like `-datasource.headers`, `-remoteWrite.headers` already
use `^^` as delimiter, it's consistent to use it in `-notifier.headers`
as well.

related https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3260
vmalert can integrate with alertmanager that supports multi-tenant by
adding tenantID header`X-Scope-OrgID` in requests.
In multitenancy, vmalert can also filter alerts which send to different
notifier addresses(or with different header settings) using
`alert_relabel_configs`.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3260

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 0f1ec33892)
2024-08-19 21:41:57 +02:00
hagen1778
bd6405df01
make go vet happy
Address `non-constant format string in call` check:
https://github.com/golang/go/issues/60529

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit febba3971b)
2024-08-19 21:41:44 +02:00
Roman Khavronenko
d4240c4a3e
lib/httputils: parse URL before creating HTTP transport (#6820)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6740

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-08-16 11:34:49 +02:00
Zakhar Bessarab
84b8ea7337
app/vmseleсt/promql: fix calculation of histogram buckets
This issue was introduced in 6a4bd5049b

See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6714

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2024-08-15 10:13:54 +02:00
Nikolay
f255800da3
app/vminsert: returns back memory optimisation (#6794)
Production workload shows that it's useful optimisation.

Channel based objects pool allows to handle irregural data ingestion
requests and make memory allocations more smooth.
It's improves sync.Pool efficiency, since objects from sync.Pool removed
after 2 GC cycles. With GOGC=30 value, GC runs significantly more often.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6733

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: f41gh7 <nik@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2024-08-13 10:49:09 -04:00
ccliu
8729052623
vmagent: resolve the issue where usePromCompatibleNaming is not working (#6776)
Describe Your Changes
When I use usePromCompatibleNaming with vmagent to process data that
needs to be formatted from different sources such as InfluxDB, I find
that it doesn’t work

However, it works in vminsert. I found that vminsert uses the
HasRelabeling method to determine whether to relabel.
```go
func HasRelabeling() bool {
	pcs := pcsGlobal.Load()
	return pcs.Len() > 0 || *usePromCompatibleNaming
}
```
in vmagent, the decision to relabel is determined only by
pcsGlobal.Len() > 0. However, in the applyRelabeling method, the
usePromCompatibleNaming logic is also used to determine whether to
relabel in the error handling.
```go
func (rctx *relabelCtx) applyRelabeling(tss []prompbmarshal.TimeSeries, pcs *promrelabel.ParsedConfigs) []prompbmarshal.TimeSeries {
	if pcs.Len() == 0 && !*usePromCompatibleNaming {
		// Nothing to change.
		return tss
	}
```
So I think that the logic for determining whether to relabel in vmagent
is not as expected.

Checklist
The following checks are mandatory:

[]My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
(cherry picked from commit d134a310f3)
2024-08-13 10:33:55 -04:00
jackyin
11233364b6
vlogs: add select/deselect all button to table settings in UI (#6680)
fix #6668, just add **select all** and "unselect all" func.

https://github.com/user-attachments/assets/0c31385b-def0-4618-aa9c-5ba4bb6f56c3

---------

Co-authored-by: Yury Molodov <yurymolodov@gmail.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 5f5bc46b3e)
2024-08-13 10:33:54 -04:00
Zhu Jiekun
27a6be6630
docs: add more details to -cacheDataPath vmselect flag (#6708)
vmselect will create `./tmp` dir under `cacheDataPath`. If
`cacheDataPath` is set to `/`, vmselect will use `/tmp`.

content under `/tmp` dir might be auto removed based on the OS
behaviour. See:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5770

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2024-08-13 09:17:43 -04:00
Hui Wang
e74d5f266e
stream aggregation: do not allow to enable -stream.keepInput and `k… (#6723)
…eep_metric_names` options in stream aggregation config together

With aggregated data and raw data under the same metric, results would
be confusing.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 62d19369a3)
2024-08-13 09:08:27 -04:00
Anton L
79008b712f
app/vmselect/graphite: respect denyPartialResponse for graphite requests (#6748)
VM has different responses to equivalent queries for MetricsQL and
GraphiteQL in case of failed access to one of vmstorage node of the
cluster vmstorage nodes. For GraphiteQL, the denyPartialResponse feature
is not used, it is always true, which is not always correct (depending
on the configuration).

In the PR I have removed the hardcoded denyPartialResponse for
GraphiteQL, just like MetricsQL does.

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
2024-08-07 12:34:23 +02:00
Hui Wang
13a21a3ba0
app/vmagent/remotewrite: make -remoteWrite.streamAggr.ignoreFirstIntervals of array type (#6744)
Make `-remoteWrite.streamAggr.ignoreFirstIntervals` of array type so it could
 accept multiple values which can be applied to the corresponding`-remoteWrite.url`.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 8f5c26d788)
2024-08-07 09:57:49 +02:00
Hui Wang
71ac65996b
app/vmagent/remotewrite: fix -streamAggr.dropInputLabels behavior (#6743)
Fix `-streamAggr.dropInputLabels` behavior  when global deduplication is enabled without `-streamAggr.config`.
Previously, `-remoteWrite.streamAggr.dropInputLabels` is misapplied.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 4863605469)
2024-08-07 09:57:49 +02:00
hagen1778
ec05e70742
app/vmalert: rm unnecessary err check
The error check was needed before a84491324d
It was kept by mistake and makes no sense to have rn.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 9726e6c1a2)
2024-08-07 09:57:48 +02:00
Yury Molodov
b4aec9ee05
vmui/logs: add display top streams in the hits graph (#6647)
### Describe Your Changes

- Adds support for displaying the top 5 log streams in the hits graph,
grouping the remaining streams into an "other" label.
   #6545

- Adds options to customize the graph display with bar, line, stepped
line, and points views.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

(cherry picked from commit 04c2232e45)
2024-08-06 16:30:12 +02:00
Zakhar Bessarab
a3a0bafe76
app/vlinsert/elasticsearch: add fake response for logstash requests (#6742)
### Describe Your Changes

This is needed in order to support standard Elasticsearch output in
Logstash pipelines.

See: https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6660

### Checklist

The following checks are **mandatory**:

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
(cherry picked from commit 58b6c54da2)
2024-08-06 16:30:11 +02:00
Hui Wang
9f84c4fdfa
vmalert: respect HTTP headers defined in notifier configuration file (#6762)
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
(cherry picked from commit c1b54779a2)
2024-08-06 16:30:10 +02:00
hagen1778
c99700ae15
fix typos in comments
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit f283126084)
2024-08-06 16:30:10 +02:00
Zakhar Bessarab
0b1def6e24
app/{vminsert,vmagent}: add healthcheck for influx ingestion endpoints (#6749)
### Describe Your Changes

This is useful for clients which validate InfluxDB is available before
data ingestion can be started.

See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6653

### Checklist

The following checks are **mandatory**:

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 9877a5e7d5)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-08-05 09:45:32 +02:00
Dmytro Kozlov
fdad3e94f5
vmctl: add --backoff-retries, --backoff-factor, --backoff-min-duration global command-line flags (#6639)
### Describe Your Changes

Added `--vm-backoff-retries`, `--vm-backoff-factor`,
`--vm-backoff-min-duration` and `--vm-native-backoff-retries`,
`--vm-native-backoff-factor`, `--vm-native-backoff-min-duration`
command-line flags to the `vmctl` app. Those changes will help to
configure the retry backoff policy for different situations.

Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6622
### Checklist

The following checks are **mandatory**:

- [X] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 6f401daacb)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-08-03 19:34:03 +02:00
Yury Molodov
00b108ca04
vmui/logs: improve UI functionality (#6688)
* add a toggle button to the "Group" tab that allows users to expand or collapse all groups at once
* introduce the ability to select a key for grouping logs within the "Group" tab
* display the number of entries within each log group.
* move the Markdown toggle to the general settings panel in the upper left corner.

(cherry picked from commit e06a19d85f)
2024-08-02 15:58:07 +02:00
Yury Molodov
a93ee27a85
vmui/logs: add fields for tenant configuration (#6661)
Added fields for configuring AccountID and ProjectID
#6631
2024-08-02 11:14:42 +02:00
f41gh7
115a76d28c
make vmui-update 2024-08-01 14:45:29 +02:00
Yury Molodov
7d37ca3159
vmui: fix auto-completion triggers (#6566)
### Describe Your Changes

- Fixes auto-complete triggers according to [these
comments](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5866#issuecomment-2065273421).
- Fixes loading and displaying suggestions when there is no metric in
the expression.
   Related issue: #6153
- Adds quotes when inserting label values.
   Related issue: #6260

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

(cherry picked from commit 53919327b2)
2024-07-31 16:09:18 +02:00
Aliaksandr Valialkin
9a3f44e79c
app/{vmselect,vlselect}: run make vmui-update vmui-logs-update after efd70b2c52 2024-07-27 13:51:02 +02:00
Aliaksandr Valialkin
5dc7ec058f
app/vmauth: verify how backend response headers are propagated to vmauth client 2024-07-27 13:45:07 +02:00
Hui Wang
e0c62e5c50
security: upgrade base docker image (Alpine) from 3.20.1 to 3.20.2 (#6684)
See https://www.alpinelinux.org/posts/Alpine-3.20.1-released.html

>including security fix for:
OpenSSL CVE-2024-5535
2024-07-25 11:02:23 +02:00
Zakhar Bessarab
9f5eb25150
app/vmauth: change response code when all backend are not available (#6676)
### Describe Your Changes

Change response code to 502 to align it with behaviour of other existing
reverse proxies. Currently, the following reverse proxies will return
502 in case an upstream is not available: nginx, traefik, caddy, apache.


Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2024-07-25 10:24:05 +02:00
Aliaksandr Valialkin
339821f5ce
app/vmauth: test how User-Agent header is set in requests to backend 2024-07-20 11:43:36 +02:00
Aliaksandr Valialkin
b31bd5613f
app/vmauth: verify the correctness of X-Forwarded-For header processing at TestRequestHandler() 2024-07-20 11:43:36 +02:00
Aliaksandr Valialkin
8f3fd62f50
app/vmauth: add missing tests for requestHandler() 2024-07-20 11:22:54 +02:00
Aliaksandr Valialkin
37fdba6897
app/vmauth: add more tests for requestHandler() 2024-07-20 10:19:57 +02:00
Aliaksandr Valialkin
28af963940
docs/vmauth.md: document the case with default url_prefix additionally to url_map 2024-07-20 09:46:31 +02:00
Aliaksandr Valialkin
a50a29500f
app/vmauth: properly proxy requests to backend paths ending with /
Previously the traling / was incorrectly removed when proxying requests from http://vmauth/

While at it, add more tests for requestHandler()
2024-07-19 17:29:17 +02:00
Aliaksandr Valialkin
4e3acfbe9a
app/vmauth: properly proxy HTTP requests without body
The Request.Body for requests without body can be nil. This could break readTrackingBody.Read() logic,
which could incorrectly return "cannot read data after closing the reader" error in this case.
Fix this by initializing the readTrackingBody.r with zeroReader.

While at it, properly set Host header if it is specified in 'headers' section.
It must be set net/http.Request.Host instead of net/http.Request.Header.Set(),
since the net/http.Client overwrites the Host header with the value from req.Host
before sending the request.

While at it, add tests for requestHandler(). Additional tests for various requestHandler() cases
will be added in future commits.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6445
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5707
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5240
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6525
2024-07-19 16:26:07 +02:00
Yury Molodov
be2a61c244
vmui/logs: switched requests to sequential execution (#6624)
### Describe Your Changes

This PR changes `/select/logsql/query` and `/select/logsql/hits` to
execute sequentially
Fixed
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6558#issuecomment-2219298984

### Checklist

The following checks are **mandatory**:

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
2024-07-18 11:56:19 +02:00
Aliaksandr Valialkin
7e0fff224e
app/vmselect/vmui: run make vmui-update after 959a4383c5 2024-07-17 23:09:25 +02:00
Aliaksandr Valialkin
97d696ae8b
all: substitute double "the the" with "the"
This is a follow-up for 8786a08d27

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6600
2024-07-17 14:29:05 +02:00
Aliaksandr Valialkin
f8aa445945
all: consistently use stringsutil.JSONString() for formatting JSON strings with fmt.* functions instead of using "%q" formatter
The %q formatter may result in incorrectly formatted JSON string if the original string
contains special chars such as \x1b . They must be encoded as \u001b , otherwise the resulting JSON string
cannot be parsed by JSON parsers.

This is a follow-up for c0caa69939

See https://github.com/VictoriaMetrics/victorialogs-datasource/issues/24
2024-07-17 14:01:37 +02:00
rtm0
1b03d7e6de
Fix inconsistent error handling in Storage.AddRows() (#6583)
`Storage.AddRows()` returns an error only in one case: when
`Storage.updatePerDateData()` fails to unmarshal a `metricNameRaw`. But
the same error is treated as a warning when it happens inside
`Storage.add()` or returned by `Storage.prefillNextIndexDB()`.

This commit fixes this inconsistency by treating the error returned by
`Storage.updatePerDateData()` as a warning as well. As a result
`Storage.add()` does not need a return value anymore and so doesn't
`Storage.AddRows()`.

Additionally, this commit adds a unit test that checks all cases that
result in a row not being added to the storage.

---------

Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
2024-07-17 12:55:07 +02:00
Aliaksandr Valialkin
3087749aa9
app/vmauth: properly handle the case when zero backend hosts are resolved at SRV DNS
When zero backend hosts are resolved, then vmauth must return 'no backend hosts' error instead of crashing with panic

This is a follow-up for 590aeccd7d and 3a45bbb4e0

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6401
2024-07-17 11:34:33 +02:00
Aliaksandr Valialkin
31b8e9054d
app/vmauth: pool readTrackingBody structs in order to reduce pressure on Go GC
- use pool for readTrackingBody structs in order to reduce pressure on Go GC
- allow re-reading partially read request body
- add missing tests for various cases of readTrackingBody usage

This is a follow-up for ad6af95183 and 4d66e042e3.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6445
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6446
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6533
2024-07-17 11:34:32 +02:00
Aliaksandr Valialkin
d427a4cfaa
app/vmauth: use more clear names for the field and function added at e666d64f1d
- Rename overrideHostHeader() function to hasEmptyHostHeader()
- Rename overrideHostHeader field at UserInfo to useBackendHostHeader

This should simplify the future maintenance of the code

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6525
2024-07-17 11:34:32 +02:00
Aliaksandr Valialkin
111f7da946
Revert "app/vmauth: reader pool to reduce gc & mem alloc (#6533)"
This reverts commit 4d66e042e3.

Reasons for revert:

- The commit makes unrelated invalid changes to docs/CHANGELOG.md
- The changes at app/vmauth/main.go are too complex. It is better splitting them into two parts:
  - pooling readTrackingBody struct for reducing pressure on GC
  - avoiding to use readTrackingBody when -maxRequestBodySizeToRetry command-line flag is set to 0

Let's make this in the follow-up commits!

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6445
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6533
2024-07-17 11:34:31 +02:00
Aliaksandr Valialkin
8b1c38abde
app/vmauth: follow-up for 3a45bbb4e0
- Move the test for SRV discovery into a separate function. This allows verifying round-robin discovery across SRV records.
- Restore the original netutil.Resolver after the test finishes, so it doesn't interfere with other tests.
- Move the description of the bugfix into the correct place at docs/CHANGELOG.md - it should be placed under v1.102.0-rc2
  instead of v1.102.0-rc1.
- Remove unneeded code in URLPrefix.sanitizeAndInitialize(), since it is expected this function is called only once
  for finishing URLPrefix initializiation. In this case URLPrefix.nextDiscoveryDeadline and URLPrefix.n are equal to 0
  according to https://pkg.go.dev/sync/atomic#Uint64
- Properly fix the bug at URLPrefix.discoverBackendAddrsIfNeeded() - it is expected that hostToAddrs map uses
  the original hostname keys, including 'srv+' prefix, so it shouldn't be removed when looping over up.busOriginal.
  Instead, the 'srv+' prefix must be removed from the hostname only locally before passing the hostname to netutil.Resolver.LookupSRV.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6401
2024-07-16 10:41:08 +02:00
Aliaksandr Valialkin
468c04d3c2
app/vmauth: clarify the description for -idleConnTimeout command-line flag
This is a follow-up for d44058bcd6
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6388
2024-07-16 09:40:01 +02:00
Aliaksandr Valialkin
8b76a40715
lib/httpserver: skip basic auth check for additional request paths, which should call httpserver.CheckAuthFlag()
This is a follow-up for 61dce6f2a1

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6338
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6329
2024-07-16 01:08:41 +02:00
Aliaksandr Valialkin
aa52d6cd9b
app/vminsert: increase default value for -maxLabelValueLen command-line flag from 1KiB to 4KiB
It has been appeared that the standard Kubernetes monitoring can generate labels with sizes up to 4KiB

This is a follow-up for a5d1013042
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6176
2024-07-15 23:32:54 +02:00
Aliaksandr Valialkin
476bf400ac
lib/{httputils,netutil}: move httputils.GetStatDialFunc to netutil.NewStatDialFunc
- Rename GetStatDialFunc to NewStatDialFunc, since it returns new function with every call
- NewStatDialFunc isn't related to http in any way, so it must be moved from lib/httputils to lib/netutil
- Simplify the implementation of NewStatDialFunc by removing sync.Map from there.
- Use netutil.NewStatDialFunc at app/vmauth and lib/promscrape/discoveryutils
- Use gauge instead of counter type for *_conns metric

This is a follow-up for d7b5062917
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6299
2024-07-15 23:05:46 +02:00
Aliaksandr Valialkin
353766061b
app/{vminsert,vmselect}: pass proper args to metrics.UnregisterSet() after a8356f3a26 2024-07-15 20:27:40 +02:00
Aliaksandr Valialkin
cbc637d1dd
app/vmagent/remotewrite: follow-up for f153f54d11
- Move the remaining code responsible for stream aggregation initialization from remotewrite.go to streamaggr.go .
  This improves code maintainability a bit.

- Properly shut down streamaggr.Aggregators initialized inside remotewrite.CheckStreamAggrConfigs().
  This prevents from potential resource leaks.

- Use separate functions for initializing and reloading of global stream aggregation and per-remoteWrite.url stream aggregation.
  This makes the code easier to read and maintain. This also fixes INFO and ERROR logs emitted by these functions.

- Add an ability to specify `name` option in every stream aggregation config. This option is used as `name` label
  in metrics exposed by stream aggregation at /metrics page. This simplifies investigation of the exposed metrics.

- Add `path` label additionally to `name`, `url` and `position` labels at metrics exposed by streaming aggregation.
  This label should simplify investigation of the exposed metrics.

- Remove `match` and `group` labels from metrics exposed by streaming aggregation, since they have little practical applicability:
  it is hard to use these labels in query filters and aggregation functions.

- Rename the metric `vm_streamaggr_flushed_samples_total` to less misleading `vm_streamaggr_output_samples_total` .
  This metric shows the number of samples generated by the corresponding streaming aggregation rule.
  This metric has been added in the commit 861852f262 .
  See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6462

- Remove the metric `vm_streamaggr_stale_samples_total`, since it is unclear how it can be used in practice.
  This metric has been added in the commit 861852f262 .
  See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6462

- Remove Alias and aggrID fields from streamaggr.Options struct, since these fields aren't related to optional params,
  which could modify the behaviour of the constructed streaming aggregator.
  Convert the Alias field to regular argument passed to LoadFromFile() function, since this argument is mandatory.

- Pass Options arg to LoadFromFile() function by reference, since this structure is quite big.
  This also allows passing nil instead of Options when default options are enough.

- Add `name`, `path`, `url` and `position` labels to `vm_streamaggr_dedup_state_size_bytes` and `vm_streamaggr_dedup_state_items_count` metrics,
  so they have consistent set of labels comparing to the rest of streaming aggregation metrics.

- Convert aggregator.aggrStates field type from `map[string]aggrState` to `[]aggrOutput`, where `aggrOutput` contains the corresponding
  `aggrState` plus all the related metrics (currently only `vm_streamaggr_output_samples_total` metric is exposed with the corresponding
  `output` label per each configured output function). This simplifies and speeds up the code responsible for updating per-output
  metrics. This is a follow-up for the commit 2eb1bc4f81 .
  See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6604

- Added missing urls to docs ( https://docs.victoriametrics.com/stream-aggregation/ ) in error messages. These urls help users
  figuring out why VictoriaMetrics or vmagent generates the corresponding error messages. The urls were removed for unknown reason
  in the commit 2eb1bc4f81 .

- Fix incorrect update for `vm_streamaggr_output_samples_total` metric in flushCtx.appendSeriesWithExtraLabel() function.
  While at it, reduce memory usage by limiting the maximum number of samples per flush to 10K.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5467
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6268
2024-07-15 20:25:36 +02:00
Aliaksandr Valialkin
a8356f3a26
vendor: update github.com/VictoriaMetrics/metrics from v1.34.1 to v1.35.0
Fix potential memory leaks across VictoriaMetrics codebase after metrics.UnregisterSet(s) call
because of missing s.UnregisterAllMetrics() call.

This is a follow-up for 6a6e34ab8e . It is OK if some vmauth metrics
aren't visible for a few microseconds when the previous metrics are unregistered and new metrics
weren't registered yet.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6247
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4690
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6252
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5805
2024-07-15 10:45:39 +02:00
Aliaksandr Valialkin
3365dd508f
app/vmagent/remotewrite: do not spend CPU time on an attempt to send data to blocked queue if some queues are unblocked
Previously remotewrite.TryPush() was trying to send data to remote storages with blocked persistent queues,
if some persistent queues to other remote storage systems were unblocked. This resulted in excess CPU usage
on relabeling and stream aggregation for the remote storage with blocked queues.

The solution is to check whether some peristent storages have blocked queues and skip them before applying
per- -remoteWrite.url relabeling and streaming aggregation.

While at it, properly update per- -remoteWrite.url vmagent_remotewrite_samples_dropped_total and vmagent_remotewrite_push_failures_total
counters when global streaming aggregation cannot send data to remote storage systems because of blocked queues.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5467 and https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6268 .

This is a follow-up for 87fd400dfc and f153f54d11

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6248
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6065
2024-07-15 09:40:34 +02:00
Aliaksandr Valialkin
4921ec5604
docs/CHANGELOG.md: use new link to VictoriaMetrics cluster docs instead of old link
The old link was changed globally to the new link in the commit f4b1cbfef0 .
Unfortunately, old links are still posted in new commits :(

This is a follow-up for 680b8c25c8 .

While at it, remove duplicate 'len(*remoteWriteURLs) > 0' check in the remotewrite.Init() functions,
since this check is already made at the beginning of the function.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6253
2024-07-13 03:04:20 +02:00
Aliaksandr Valialkin
bc1f92d7f5
app/vmagent/remotewrite: follow-up for 87fd400dfc
- Drop samples and return true from remotewrite.TryPush() at fast path when all the remote storage
  systems are configured with the disabled on-disk queue, every in-memory queue is full
  and -remoteWrite.dropSamplesOnOverload is set to true. This case is quite common,
  so it should be optimized. Previously additional CPU time was spent on per-remoteWriteCtx
  relabeling and other processing in this case.

- Properly count the number of dropped samples inside remoteWriteCtx.pushInternalTrackDropped().
  Previously dropped samples were counted only if -remoteWrite.dropSamplesOnOverload flag is set.
  In reality, the samples are dropped when they couldn't be sent to the queue because in-memory queue is full
  and on-disk queue is disabled.
  The remoteWriteCtx.pushInternalTrackDropped() function is called by streaming aggregation for pushing
  the aggregated data to the remote storage. Streaming aggregation cannot wait until the remote storage
  processes pending data, so it drops aggregated samples in this case.

- Clarify the description for -remoteWrite.disableOnDiskQueue command-line flag at -help output,
  so it is clear that this flag can be set individually per each -remoteWrite.url.

- Make the -remoteWrite.dropSamplesOnOverload flag global. If some of the remote storage systems
  are configured with the disabled on-disk queue, then there is no sense in keeping samples
  on some of these systems, while dropping samples on the remaining systems, since this
  will result in global stall on the remote storage system with the disabled on-disk queue
  and with the -remoteWrite.dropSamplesOnOverload=false flag. vmagent will always return false
  from remotewrite.TryPush() in this case. This will result in infinite duplicate samples
  written to the remaining remote storage systems. That's why the -remoteWrite.dropSamplesOnOverload
  is forcibly set to true if more than one -remoteWrite.disableOnDiskQueue flag is set.
  This allows proceeding with newly scraped / pushed samples by sending them to the remaining
  remote storage systems, while dropping them on overloaded systems with the -remoteWrite.disableOnDiskQueue flag set.

- Verify that the remoteWriteCtx.TryPush() returns true in the TestRemoteWriteContext_TryPush_ImmutableTimeseries test.

- Mention in vmagent docs that the -remoteWrite.disableOnDiskQueue command-line flag can be set individually per each -remoteWrite.url.
  See https://docs.victoriametrics.com/vmagent/#disabling-on-disk-persistence

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6248
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6065
2024-07-13 02:30:10 +02:00
Aliaksandr Valialkin
5c7345b8ce
app/victoria-logs/Makefile: add make victoria-logs-linux-loong64 build rule
This is a follow-up for 80f3644ee3

The https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6222 missed build rule for VictoriaLogs.
2024-07-12 23:13:19 +02:00
Aliaksandr Valialkin
43fc1183b9
app/vmalert: switch from table-driven tests to f-tests
This makes test code more clear and reduces the number of code lines by 500.
This also simplifies debugging tests. See https://itnext.io/f-tests-as-a-replacement-for-table-driven-tests-in-go-8814a8b19e9e

While at it, consistently use t.Fatal* instead of t.Error* across tests, since t.Error*
requires more boilerplate code, which can result in additional bugs inside tests.
While t.Error* allows writing logging errors for the same, this doesn't simplify fixing
broken tests most of the time.

This is a follow-up for a9525da8a4
2024-07-12 22:45:50 +02:00
Aliaksandr Valialkin
04a304fd39
app/vmctl: switch from table-driven tests to f-tests
This simplifies debugging tests and makes the test code more clear and concise.
See https://itnext.io/f-tests-as-a-replacement-for-table-driven-tests-in-go-8814a8b19e9e

While at is, consistently use t.Fatal* instead of t.Error* across tests, since t.Error*
requires more boilerplate code, which can result in additional bugs inside tests.
While t.Error* allows writing logging errors for the same, this doesn't simplify fixing
broken tests most of the time.

This is a follow-up for a9525da8a4
2024-07-12 22:45:49 +02:00
Aliaksandr Valialkin
7c97cef95c
app: consistently use t.Fatal* instead of t.Error* (except of app/vmalert and app/vmctl - these packages will be processed in a separate commit)
Consistently using t.Fatal* simplifies the test code and makes it less fragile, since it is common error
to forget to make proper cleanup after t.Error* call. Also t.Error* calls do not provide any practical
benefits when some tests fail. They just clutter test output with additional noise information,
which do not help in fixing failing tests most of the time.

This is a follow-up for a9525da8a4
2024-07-11 16:01:25 +02:00
Zhu Jiekun
2ea575e776
vmalert: [bug] fixed System hyperlink 404 redirect (#6620)
### Describe Your Changes

As mentioned in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6603, some hyperlinks under `vmalert` -> `System`
section is not working as expected.

Pages and redirection:
- For page `http://127.0.0.1:8880/`: `flags` button will redirect to
`http://127.0.0.1:8880/flags`
- For page `http://127.0.0.1:8880/vmalert`:
`http://127.0.0.1:8880/flags`
- For page `http://127.0.0.1:8880/vmalert/`:
`http://127.0.0.1:8880/vmalert/flags` (page not exists)
- Similar redirection could be observed with `-http.pathPrefix`

Two potential ways to avoid 404 redirection:
1. **avoid visiting `/vmalert/`** (I'm trying to do this).
2. provide support for `/vmalert/flags`.

`/vmalert/` could be visit only when user click other navigator (e.g.
Group) and click vmalert again:
![Peek 2024-07-10
10-07](https://github.com/VictoriaMetrics/VictoriaMetrics/assets/30280396/13d7b147-a1b6-4e93-9ee0-26f881a16bef)
Because: `http://127.0.0.1:8880/vmalert/groups?search=` + `<a
class="nav-link" href=".">` = `http://127.0.0.1:8880/vmalert/`

So I'm trying to change the `href="."` to `href="../vmalert"`.

### Checklist

The following checks are **mandatory**:

- [X] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

(cherry picked from commit cadf1eb5ab)
2024-07-11 12:40:23 +02:00
Zakhar Bessarab
401ae72587
app/vmselect/promql: propagate lower bucket values when fixing a histogram (#6547)
### Describe Your Changes

In most cases histograms are exposed in sorted manner with lower buckets
being first. This means that during scraping buckets with lower bounds
have higher chance of being updated earlier than upper ones.

Previously, values were propagated from upper to lower bounds, which
means that in most cases that would produce results higher than expected
once all buckets will become updated.
Propagating from upper bound effectively limits highest value of
histogram to the value of previous scrape. Once the data will become
consistent in the subsequent evaluation this causes spikes in the
result.

Changing propagation to be from lower to higher buckets reduces value
spikes in most cases due to nature of the original inconsistency.

 See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4580

An example histogram with previous(red) and updated(blue) versions:

![1719565540](https://github.com/VictoriaMetrics/VictoriaMetrics/assets/1367798/605c5e60-6abe-45b5-89b2-d470b60127b8)

This also makes logic of filling nan values with lower buckets values: [1 2 3 nan nan nan] => [1 2 3 3 3 3] obsolete.
Since buckets are now fixed from lower ones to upper this happens in the main loop, so there is no need in a second one.

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Andrii Chubatiuk <andrew.chubatiuk@gmail.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 6a4bd5049b)
2024-07-10 15:17:08 +02:00
Aliaksandr Valialkin
a1decb5ca1
app/vlinsert/loki: use easyproto instead for parsing Loki protobuf messages 2024-07-10 03:05:55 +02:00
Aliaksandr Valialkin
32ae40410c
app/vlselect/vmui: run make vmui-logs-update after 662e026279 2024-07-10 03:05:55 +02:00
Aliaksandr Valialkin
b8a8d3d6f1
lib/logstorage: drop all the pipes from the query when calculating the number of matching logs at /select/logsql/hits API 2024-07-10 00:39:16 +02:00
Aliaksandr Valialkin
d6415b2572
all: consistently use 'any' instead of 'interface{}'
'any' type is supported starting from Go1.18. Let's consistently use it
instead of 'interface{}' type across the code base, since `any` is easier to read than 'interface{}'.
2024-07-10 00:23:26 +02:00
Aliaksandr Valialkin
73ca22bb7d
app/vlinsert/loki: remove unused functions from the generated protobuf code 2024-07-10 00:22:10 +02:00
Yury Molodov
33bd5ccbab
vmui/logs: add spinner to bar chart (#6577)
Add a spinner to the bar chart

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6558

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 662e026279)
2024-07-09 18:27:23 +02:00
Hui Wang
6f602a4ef5
security: upgrade base docker image (Alpine) from 3.20.0 to 3.20.1
See https://www.alpinelinux.org/posts/Alpine-3.20.1-released.html

>including security fixes for:
OPENSSL
[CVE-2024-4741](https://security.alpinelinux.org/vuln/CVE-2024-4741)
BUSYBOX
[CVE-2023-42364](https://security.alpinelinux.org/vuln/CVE-2023-42364)
[CVE-2023-42365](https://security.alpinelinux.org/vuln/CVE-2023-42365)

(cherry picked from commit 8e9f98e725)
2024-07-09 11:38:44 +02:00
Artem Navoiev
7b508a9334
fix typo
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
(cherry picked from commit 4527020a68)
2024-07-09 10:52:50 +02:00
Yury Molodov
7fc9912d15
vmui: add compact JSON display (#6582)
### Describe Your Changes
If a JSON element has only one field, it will be displayed on a single
line.
 #6559

| Old Display | New Display |
|-------------|-------------|
|
![image](https://github.com/VictoriaMetrics/VictoriaMetrics/assets/29711459/8866517b-a49d-450f-904c-19117397a078)
|
![image](https://github.com/VictoriaMetrics/VictoriaMetrics/assets/29711459/8e222b43-a4cb-4f32-9a79-6199778404d3)
|

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 959a4383c5)
2024-07-05 09:49:12 +02:00
Hui Wang
bbd49a1a61
vmalert: allow omitting -replay.timeTo in replay mode, default valu… (#6575)
…e is the current timestamp

address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6492

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 3169524fb7)
2024-07-05 09:49:06 +02:00
Roman Khavronenko
b13c363f12
app/vmalert: add examples for source override (#6561)
The change adds a new docs section with examples on how source can be
overridden. It should address questions like
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6536

While there, fix the example in `external.alert.source` cmd-line flag
and docker-compose examples.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit c429bbf889)
2024-07-05 09:49:03 +02:00
Aliaksandr Valialkin
172ae1adf7
Revert c6c5a5a186 and b2765c45d0
Reason for revert:

There are many statsd servers exist:

- https://github.com/statsd/statsd - classical statsd server
- https://docs.datadoghq.com/developers/dogstatsd/ - statsd server from DataDog built into DatDog Agent ( https://docs.datadoghq.com/agent/ )
- https://github.com/avito-tech/bioyino - high-performance statsd server
- https://github.com/atlassian/gostatsd - statsd server in Go
- https://github.com/prometheus/statsd_exporter - statsd server, which exposes the aggregated data as Prometheus metrics

These servers can be used for efficient aggregating of statsd data and sending it to VictoriaMetrics
according to https://docs.victoriametrics.com/#how-to-send-data-from-graphite-compatible-agents-such-as-statsd (
the https://github.com/prometheus/statsd_exporter can be scraped as usual Prometheus target
according to https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter ).

Adding support for statsd data ingestion protocol into VictoriaMetrics makes sense only if it provides
significant advantages over the existing statsd servers, while has no significant drawbacks comparing
to existing statsd servers.

The main advantage of statsd server built into VictoriaMetrics and vmagent - getting rid of additional statsd server.
The main drawback is non-trivial and inconvenient streaming aggregation configs, which must be used for the ingested statsd metrics (
see https://docs.victoriametrics.com/stream-aggregation/ ). These configs are incompatible with the configs for standalone statsd servers.
So you need to manually translate configs of the used statsd server to stream aggregation configs when migrating
from standalone statsd server to statsd server built into VictoriaMetrics (or vmagent).

Another important drawback is that it is very easy to shoot yourself in the foot when using built-in statsd server
with the -statsd.disableAggregationEnforcement command-line flag or with improperly configured streaming aggregation.
In this case the ingested statsd metrics will be stored to VictoriaMetrics as is without any aggregation.
This may result in high CPU usage during data ingestion, high disk space usage for storing all the unaggregated
statsd metrics and high CPU usage during querying, since all the unaggregated metrics must be read, unpacked and processed
during querying.

P.S. Built-in statsd server can be added to VictoriaMetrics and vmagent after figuring out more ergonomic
specialized configuration for aggregating of statsd metrics. The main requirements for this configuration:

- easy to write, read and update (ideally it should work out of the box for most cases without additional configuration)
- hard to misconfigure (e.g. hard to shoot yourself in the foot)

It would be great if this configuration will be compatible with the configuration of the most widely used statsd server.

In the mean time it is recommended continue using external statsd server.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6265
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5053
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5052
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/206
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4600
2024-07-03 23:57:49 +02:00
Aliaksandr Valialkin
cd152693c6
Revert "Exemplar support (#5982)"
This reverts commit 5a3abfa041.

Reason for revert: exemplars aren't in wide use because they have numerous issues which prevent their adoption (see below).
Adding support for examplars into VictoriaMetrics introduces non-trivial code changes. These code changes need to be supported forever
once the release of VictoriaMetrics with exemplar support is published. That's why I don't think this is a good feature despite
that the source code of the reverted commit has an excellent quality. See https://docs.victoriametrics.com/goals/ .

Issues with Prometheus exemplars:

- Prometheus still has only experimental support for exemplars after more than three years since they were introduced.
  It stores exemplars in memory, so they are lost after Prometheus restart. This doesn't look like production-ready feature.
  See 0a2f3b3794/content/docs/instrumenting/exposition_formats.md (L153-L159)
  and https://prometheus.io/docs/prometheus/latest/feature_flags/#exemplars-storage

- It is very non-trivial to expose exemplars alongside metrics in your application, since the official Prometheus SDKs
  for metrics' exposition ( https://prometheus.io/docs/instrumenting/clientlibs/ ) either have very hard-to-use API
  for exposing histograms or do not have this API at all. For example, try figuring out how to expose exemplars
  via https://pkg.go.dev/github.com/prometheus/client_golang@v1.19.1/prometheus .

- It looks like exemplars are supported for Histogram metric types only -
  see https://pkg.go.dev/github.com/prometheus/client_golang@v1.19.1/prometheus#Timer.ObserveDurationWithExemplar .
  Exemplars aren't supported for Counter, Gauge and Summary metric types.

- Grafana has very poor support for Prometheus exemplars. It looks like it supports exemplars only when the query
  contains histogram_quantile() function. It queries exemplars via special Prometheus API -
  https://prometheus.io/docs/prometheus/latest/querying/api/#querying-exemplars - (which is still marked as experimental, btw.)
  and then displays all the returned exemplars on the graph as special dots. The issue is that this doesn't work
  in production in most cases when the histogram_quantile() is calculated over thousands of histogram buckets
  exposed by big number of application instances. Every histogram bucket may expose an exemplar on every timestamp shown on the graph.
  This makes the graph unusable, since it is litterally filled with thousands of exemplar dots.
  Neither Prometheus API nor Grafana doesn't provide the ability to filter out unneeded exemplars.

- Exemplars are usually connected to traces. While traces are good for some

I doubt exemplars will become production-ready in the near future because of the issues outlined above.

Alternative to exemplars:

Exemplars are marketed as a silver bullet for the correlation between metrics, traces and logs -
just click the exemplar dot on some graph in Grafana and instantly see the corresponding trace or log entry!
This doesn't work as expected in production as shown above. Are there better solutions, which work in production?
Yes - just use time-based and label-based correlation between metrics, traces and logs. Assign the same `job`
and `instance` labels to metrics, logs and traces, so you can quickly find the needed trace or log entry
by these labes on the time range with the anomaly on metrics' graph.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5982
2024-07-03 16:09:18 +02:00