### Describe Your Changes
Updated Quickstart guide for VIctoriaMetrics and VictoriaMetrics Cluster to include instructions for installing the binaries by hand
### Checklist
The following checks are **mandatory**:
- [ x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
(cherry picked from commit 0e1dbdee28)
### Describe Your Changes
Fixed Custom Model guide according to newer `vmanomaly` versions
### Checklist
The following checks are **mandatory**:
- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
(cherry picked from commit 7478d2de4c)
### Describe Your Changes
docs: fix typos
### Checklist
The following checks are **mandatory**:
- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
(cherry picked from commit 8786a08d27)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
The change adds a new docs section with examples on how source can be
overridden. It should address questions like
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6536
While there, fix the example in `external.alert.source` cmd-line flag
and docker-compose examples.
### Checklist
The following checks are **mandatory**:
- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit c429bbf889)
Reason for revert:
There are many statsd servers exist:
- https://github.com/statsd/statsd - classical statsd server
- https://docs.datadoghq.com/developers/dogstatsd/ - statsd server from DataDog built into DatDog Agent ( https://docs.datadoghq.com/agent/ )
- https://github.com/avito-tech/bioyino - high-performance statsd server
- https://github.com/atlassian/gostatsd - statsd server in Go
- https://github.com/prometheus/statsd_exporter - statsd server, which exposes the aggregated data as Prometheus metrics
These servers can be used for efficient aggregating of statsd data and sending it to VictoriaMetrics
according to https://docs.victoriametrics.com/#how-to-send-data-from-graphite-compatible-agents-such-as-statsd (
the https://github.com/prometheus/statsd_exporter can be scraped as usual Prometheus target
according to https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter ).
Adding support for statsd data ingestion protocol into VictoriaMetrics makes sense only if it provides
significant advantages over the existing statsd servers, while has no significant drawbacks comparing
to existing statsd servers.
The main advantage of statsd server built into VictoriaMetrics and vmagent - getting rid of additional statsd server.
The main drawback is non-trivial and inconvenient streaming aggregation configs, which must be used for the ingested statsd metrics (
see https://docs.victoriametrics.com/stream-aggregation/ ). These configs are incompatible with the configs for standalone statsd servers.
So you need to manually translate configs of the used statsd server to stream aggregation configs when migrating
from standalone statsd server to statsd server built into VictoriaMetrics (or vmagent).
Another important drawback is that it is very easy to shoot yourself in the foot when using built-in statsd server
with the -statsd.disableAggregationEnforcement command-line flag or with improperly configured streaming aggregation.
In this case the ingested statsd metrics will be stored to VictoriaMetrics as is without any aggregation.
This may result in high CPU usage during data ingestion, high disk space usage for storing all the unaggregated
statsd metrics and high CPU usage during querying, since all the unaggregated metrics must be read, unpacked and processed
during querying.
P.S. Built-in statsd server can be added to VictoriaMetrics and vmagent after figuring out more ergonomic
specialized configuration for aggregating of statsd metrics. The main requirements for this configuration:
- easy to write, read and update (ideally it should work out of the box for most cases without additional configuration)
- hard to misconfigure (e.g. hard to shoot yourself in the foot)
It would be great if this configuration will be compatible with the configuration of the most widely used statsd server.
In the mean time it is recommended continue using external statsd server.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6265
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5053
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5052
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/206
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4600
Move the code responsible for relabelCtx clearing into deferred function.
This allows making more clear the remoteWriteCtx.TryPush code.
This is a follow-up for 879771808b
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6205
While at it, clarify the description of the bugfix at docs/CHANGELOG.md
Use metricsql.IsLikelyInvalid() function for determining whether the given query is likely invalid,
e.g. there is high change the query is incorrectly written, so it will return unexpected results.
The query is invalid most of the time if it passes something other than series selector into rollup function.
For example:
- rate(sum(foo))
- rate(foo + bar)
- rate(foo > bar)
Improtant note: the query is considered valid if it misses the lookbehind window in square brackes inside rollup function,
e.g. rate(foo), since this is very convenient MetricsQL extention to PromQL, and this query returns the expected results
most of the time.
Other unsafe query types can be added in the future into metricsql.IsLikelyInvalid().
TODO: probably, the -search.disableImplicitConversion command-line flag must be set by default in the future releases of VictoriaMetrics.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4338
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6180
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6450
- Clarify docs for `Ignore aggregation intervals on start` feature.
- Make more clear the code dealing with ignoreFirstIntervals at aggregator.runFlusher() functions.
It is better from readability and maintainability PoV using distinct a.flush() calls
for distinct cases instead of merging them into a single a.flush() call.
- Take into account the first incomplete interval when tracking the number of skipped aggregation intervals,
since this behaviour is easier to understand by the end users.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6137
Previously the in-memory buffer could remain unflushed for long periods of time under low ingestion rate.
The ingested logs weren't visible for search during this time.
### Describe Your Changes
Removed snap packages support as it requires time for maintenance and
it's not popular at all
### Checklist
The following checks are **mandatory**:
- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 0a42c8fd8b)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
### Describe Your Changes
- added stale metrics counters for input and output samples
- added labels for aggregator metrics =>
`name="{rwctx}:{aggrId}:{aggrSuffix}"`
- rwctx - global or number starting from 1
- aggrid - aggregator id starting from 1
- aggrSuffix - <interval>_(by|without)_label1_label2_labeln
e.g: `name="global:1:1m_without_instance_pod"`
### Checklist
The following checks are **mandatory**:
- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 861852f262)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
The set of log fields in the found logs may differ from the set of log fields present in the log stream.
So compare only the log fields in the found logs when searching for the matching log entry in the log stream.
While at it, return _stream field in the delimiter log entry, since this field is used by VictoriaLogs Web UI
for grouping logs by log streams.
Fix typo
### Describe Your Changes
Fix typo in the vmalert docs. In the docs it states rules at `VMAgent`'s
namespace when it should be `VMAlert`'s namespace.
### Checklist
The following checks are **mandatory**:
- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
(cherry picked from commit 8d01dd0514)
### Describe Your Changes
Small improvements to a QuickStart guide of `vmanomaly`
### Checklist
The following checks are **mandatory**:
- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
### Describe Your Changes
Fixes#6453
### Checklist
The following checks are **mandatory**:
- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
### Describe Your Changes
Fix Date metricid cache consistency under concurrent use.
When one goroutine calls Has() and does not find the cache entry in the
immutable map it will acquire a lock and check the mutable map. And it
is possible that before that lock is acquired, the entry is moved from
the mutable map to the immutable map by another goroutine causing a
cache miss.
The fix is to check the immutable map again once the lock is acquired.
### Checklist
The following checks are **mandatory**:
- [x ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
---------
Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
### Describe Your Changes
- Fixed the update of the relative time range when `Execute Query` is
clicked
- Optimized server requests: now, if an error occurs in the `/query`
request, the `/hits` request will not be executed.
#6345 (duplicates: #6440, #6312)
(cherry picked from commit 43342745ac)
Updating documentation around the opentelemetry endpoint for metrics and
the "How to use OpenTelemetry metrics with VictoriaMetrics" guide so
that it shows not only how to directly write but also how to write to
the otel collector and view metrics in vmui.
---------
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
(cherry picked from commit 6652fb630f)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
While there, add a hint about truncated error messages.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 6775a00622)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
- Fixed config example in QuickStart vmanomaly docs for 1.13 version
compatibility
- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
(cherry picked from commit 4b77778ff5)
The TryParseTimestampRFC3339Nano() must properly parse RFC3339 timestamps with timezone offsets.
While at it, make tryParseTimestampISO8601 function private in order to prevent
from improper usage of this function from outside the lib/logstorage package.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6508
While there, remove the obsolete files from the previous version of
docs.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 87edfa61c1)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
This reverts commit 5ecf439078.
Reason for revert: the previous logic was correct.
The purpose of `-search.maxSamplesPerQuery` command-line flag is to limit the amounts of CPU resources,
which could be taken by a single query - see https://docs.victoriametrics.com/#resource-usage-limits .
VictoriaMetrics processes samples in blocks during querying - it reads the block, then unpacks it,
then filters out samples outside the selected time range. This means that it _spends CPU time_
on reading and unpacking of _all the samples_ in every block on the requested time range,
even if only a single sample per each block matches the given time range.
The previous logic was effectively limiting CPU time a single query could take.
The new logic fails limiting CPU time a single query could take in some pathological cases
when only a small fraction of samples per each requested block fit the requested time range.
This allows performing multiplication DoS-attacks by querying very narrow time ranges over historical blocks,
which tend to be full. For example, if the `-search.maxSamplesPerQuery` equals to a billion,
and the query requests a single sample out of 8K samples per each block, this means that the query
may unpack a billion of such blocks without exceeding the limit, e.g. it may unpack and process 8K*1e9=8e12 samples.
This is not what the resource usage limits were created for originally - see https://docs.victoriametrics.com/#resource-usage-limits
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5851
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6464
This reverts commit 92b22581e6.
Reason for revert: too complex and slow approach for spellchecking task.
This approach may significantly slow down development pace. It also may take non-trivial
amounts of additional time and resources at CI/CD because of all this npm shit at cspell directory.
Note to @arkid15r : the idea with the ability to run spellchecker on all the VictoriaMetrics
codebase is great. But this shouldn't be mandatory pre-commit check. It is enough to have
a Makefile rule like `make spellcheck`, which could be run manually whenever spellcheck is needed
(e.g. once per month or once per quarter).
Reason for revert: this commit doesn't resolve real security issues,
while it complicates the resulting code in subtle ways (aka security circus).
Comparison of two strings (passwords, auth keys) takes a few nanoseconds.
This comparison is performed in non-trivial http handler, which takes thousands
of nanoseconds, and the request handler timing is non-deterministic because of Go runtime,
Go GC and other concurrently executed goroutines. The request handler timing is even
more non-deterministic when the application is executed in shared environments
such as Kubernetes, where many other applications may run on the same host and use
shared resources of this host (CPU, RAM bandwidth, network bandwidth).
Additionally, it is expected that the passwords and auth keys are passed via TLS-encrypted connections.
Establishing TLS connections takes additional non-trivial time (millions of nanoseconds),
which depends on many factors such as network latency, network congestion, etc.
This makes impossible to conduct timing attack on passwords and auth keys in VictoriaMetrics components.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6423/files
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6392
mention change for
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6457
### Describe Your Changes
Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.
### Checklist
The following checks are **mandatory**:
- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
(cherry picked from commit 1af13208c2)
### Describe Your Changes
With the recent release of Managed VictoriaMetrics users are able to
create and execute Alerting & Recording rules and send notifications via
hosted Alertmanager.
So, we're publishing Alertmanager configuration docs for Managed
VictoriaMetrics.
### Checklist
The following checks are **mandatory**:
- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
(cherry picked from commit 1106d91741)
### Describe Your Changes
Trimming content which is loaded from an external pass leads to obscure
issues in case user-defined input contained trimmed chars. For example.
user-defined password "foo\n" will become "foo" while user will expect
it to contain a new line.
---
For example, a user defines a password which ends with `\n`. This often
happens when user Kubernetes secrets and manually encodes value as
base64-encoded string.
In this case vmauth configuration might look like:
```
users:
- url_prefix:
- http://vminsert:8480/insert/0/prometheus/api/v1/write
name: foo
username: foo
password: "foobar\n"
```
vmagent configuration for this setup will use the following flags:
```
-remoteWrite.url=http://vmauth:8427/
-remoteWrite.basicAuth.passwordFile=/tmp/vmagent-password
-remoteWrite.basicAuth.username="foo"
```
Where `/tmp/vmagent-password` is a file with `foobar\n` password.
Before this change such configuration will result in `401 Unauthorized`
response received by vmagent since after file content will become
`foobar`.
---
An example with Kubernetes operator which uses a secret to reference the
same password in multiple configurations.
<details>
<summary>See full manifests</summary>
`Secret`:
```
apiVersion: v1
data:
name: Zm9v # foo
password: Zm9vYmFy # foobar\n
username: Zm9v= # foo
kind: Secret
metadata:
name: vmuser
```
`VMUser`:
```
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMUser
metadata:
name: vmagents
spec:
generatePassword: false
name: vmagents
targetRefs:
- crd:
kind: VMAgent
name: some-other-agent
namespace: example
username: foo
# note - the secret above is referenced to provide password
passwordRef:
name: vmagent
key: password
```
`VMAgent`:
```
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMAgent
metadata:
name: example
spec:
selectAllByDefault: true
scrapeInterval: 5s
replicaCount: 1
remoteWrite:
- url: "http://vmauth-vmauth-example:8427/api/v1/write"
# note - the secret above is referenced as well
basicAuth:
username:
name: vmagent
key: username
password:
name: vmagent
key: password
```
</details>
Since both config target exactly the same `Secret` object it is expected
to work, but apparently the result will be `401 Unauthrized` error.
### Checklist
The following checks are **mandatory**:
- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
---------
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 201fd6de1e)
- Added a bar chart displaying the number of log entries over a time
range.
#6404
- When `_msg` is empty, all fields are displayed in a single line.
- Added double quotes when copying pairs: `key: "value"`.
- Minor style adjustments.
(cherry picked from commit 32fbffedd9)
Check for ranged vector arguments in aggregate expressions when
`-search.disableImplicitConversion` or `-search.logImplicitConversion`
are enabled.
For example, `sum(up[5m])` will fail to execute if these flags are set.
### Describe Your Changes
Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.
### Checklist
The following checks are **mandatory**:
- [*] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 6149adbe10)
### Describe Your Changes
Added config example for `min_dev_from_expected` arg; also, small
styling fixes and alignments
### Checklist
The following checks are **mandatory**:
- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
The limit is specified with command-line flag
`-search.maxSamplesPerQuery`.
Previously, samples might be over-counted and query can't be fixed by
reducing time range.
address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5851
(cherry picked from commit 6e395048d3)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
### Describe Your Changes
Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.
### Checklist
The following checks are **mandatory**:
- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
(cherry picked from commit f11261dc44)
### Describe Your Changes
small fix of typos in v1.13 presets (vmanomaly docs)
### Checklist
The following checks are **mandatory**:
- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
(cherry picked from commit 1cf28598c6)
### Describe Your Changes
* check if `lastValue` was seen at least twice with different
timestamps. Otherwise, the difference between last timestamp and
previous timestamp could be `0` and will result into `NaN` calculation
* check if there items left in lastValue map after staleness cleanup.
Otherwise, `rate_avg` could have produce `NaN` result.
### Checklist
The following checks are **mandatory**:
- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 51d19485bb)
This change fixes the following panic:
```
2024-06-04T11:16:52.899Z warn app/vmauth/auth_config.go:353 cannot discover backend SRV records for http://srv+localhost:8080: lookup localhost on 10.100.10.4:53: server misbehaving; use it literally
panic: runtime error: integer divide by zero
goroutine 9 [running]:
github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver.handlerWrapper.func1()
/Users/lhhdz/wd/projects/go/VictoriaMetrics/lib/httpserver/httpserver.go:291 +0x58
panic({0x103115100?, 0x10338d700?})
/Users/lhhdz/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.3.darwin-arm64/src/runtime/panic.go:770 +0x124
main.getLeastLoadedBackendURL({0x0?, 0x22?, 0x1400014757b?}, 0x1400013c120?)
/Users/lhhdz/wd/projects/go/VictoriaMetrics/app/vmauth/auth_config.go:473 +0x210
main.(*URLPrefix).getBackendURL(0x140000aa080)
/Users/lhhdz/wd/projects/go/VictoriaMetrics/app/vmauth/auth_config.go:312 +0xb8
```
---------
Co-authored-by: Haley Wang <haley@victoriametrics.com>
### Describe Your Changes
Fix 404 relative img links for v1.13.0 update of vmanomaly docs
### Checklist
The following checks are **mandatory**:
- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
(cherry picked from commit 9a355c3512)
### Describe Your Changes
value→valid
### Checklist
The following checks are **mandatory**:
- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
Signed-off-by: Lapo Luchini <lapo@lapo.it>
(cherry picked from commit 14f586238f)
Add support for markdown format and emoji for the `_msg` field in the
"Group" view.
Add markdown rendering toggle. Disabled by default. Value is stored in
`localStorage`.
It is better to remove deprecated flag completely, so vmctl will
fail if this flag is used and user can immediately fix the issue.
Before, flag was ignored and it is worse then fail fast.
follow-up after 8b46bb0c41 (diff-2bfab3db5cc1baf4c6d3ff6b19901926e3bdf4411ec685dac973e5fcff1c723b)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 8d95522529)
* adds idleConnTimeout flag, which must reduce probability of `broken
pipe` and `connection reset` errors.
* one-time retry trivial network requests for the same backend
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit d44058bcd6)
* deprecate `--vm-disable-progress-bar` in favour of `--disable-progress-bar`
* new `--disable-progress-bar` consistently disables usage of progress bar
for all migration modes.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6367
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 8b46bb0c41)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
…pAuth.*
address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6329,
makes `reloadAuthKey`, `configAuthKey`, `flagsAuthKey`, `pprofAuthKey`
behavior the same way,
but keys like `-snapshotAuthKey`, `-forceMergeAuthKey` are still
protected by httpAuth.*. All the available key are listed in
https://docs.victoriametrics.com/single-server-victoriametrics/#security.
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 61dce6f2a1)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
- Use bytesutil.InternString() instead of strings.Clone() for inputKey and outputKey in aggregatorpushSamples().
This should reduce string allocation rate, since strings can be re-used between aggrState flushes.
- Reduce memory allocations at dedupAggrShard by storing dedupAggrSample by value in the active series map.
- Remove duplicate call to bytesutil.InternBytes() at Deduplicator, since it is already called inside dedupAggr.pushSamples().
- Add missing string interning at rateAggrState.pushSamples().
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6402
The main change is getting rid of interning of sample key. It was
discovered that for cases with many unique time series aggregated by
vmagent interned keys could grow up to hundreds of millions of objects.
This has negative impact on the following aspects:
1. It slows down garbage collection cycles, as GC has to scan all inuse
objects periodically. The higher is the number of inuse objects, the
longer it takes/the more CPU it takes.
2. It slows down the hot path of samples aggregation where each key
needs to be looked up in the map first.
The change makes code more fragile, but suppose to provide performance
optimization for heavy-loaded vmagents with stream aggregation enabled.
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
### Describe Your Changes
Added streamaggr metrics to:
- `vm_streamaggr_samples_lag_seconds` - samples lag
- `vm_streamaggr_ignored_samples_total{reason="nan"}` - ignored NaN
samples
- `vm_streamaggr_ignored_samples_total{reason="too_old"}` - ignored old
samples
(cherry picked from commit 185fac03b3)
### Describe Your Changes
Fix for issue #6396: according to rmdir manpage, ENOTEMPTY and EEXIST
should be treated equally
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6396
### Checklist
The following checks are **mandatory**:
- [x ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
---------
Co-authored-by: Ludovic Pollet <ludovic.pollet@exfo.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 3ddae77c63)
Unsupported path is already handled by `lib/httpserver`.
This prevents from misleading errors in logs caused by double-writing response headers.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit a5f81f67fd)
### Describe Your Changes
Scratch based images will be using a separate tag: "(version)-scratch"
and will be built for the same architecture as regular images.
This is useful for environments with higher security standards. In this
case using alpine as base layer requires updating images more frequently
in order to get the latest updates for the base image, even in case the
user did not need to update VictoriaMetrics version.
Tested that scratch images work for:
- vmagent - enterprise with kafka and opensource
- cluster
- single-node
No issues observed so far.
cc: @tenmozes
### Checklist
The following checks are **mandatory**:
- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
---------
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 7dc9124ba7)
* "*.idleConnTimeout" flags must reduce probability of `write: broken
pipe` and `read: connection reset by peer` errors Those errors may occur
if remote server closes TCP socket for connection, while it's still
exist at client.
* single time retries for `write: broken pipe` and `read: connection
reset by peer` must handle a case for incorrectly configured timeouts at
middleware proxies, mitigate minor network issues.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5661
### Describe Your Changes
Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.
---------
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
(cherry picked from commit b97916276f)
### Describe Your Changes
Fix typo in link.
### Checklist
The following checks are **mandatory**:
- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
It allows to create alert for possible item drops at indexdb. It may
happen, if ingested metric size exceeds max indexdb item size.
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 69d244e6fb)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Update wording to highlight that cache is not persistent if flag is
value is empty. Previously, it was not clear if cache is not used at all
or just not persistent.
* It must reduce memory usage for misbehaving clients. Since
VictoriaMetrics stores sparse index inmemory.
* Reduce disk space usage for indexdb.
* Prevent possible indexDB items drops.
* It may trigger slow insert and new timeseries registration due to
default value for flag change
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6176
---------
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
This PR fixes an issue where parsing long `_msg` values caused errors,
resulting in some log records not being displayed.
The error occurred due to partial processing of strings. In some cases,
a long record could be split into multiple chunks, causing only part of
the record to be processed instead of the entire entry.
#6281
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
This should improve precision of `restarts` and `version change` annotations when
zooming-in/zooming-out on the dashboards.
The change also makes `restarts` dashboard visible on the panels, so user can disable it from
displaying if needed. This could be useful when restarts overlap with version change events.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 9dd9b4442f)
* The new flag can be used for for skipping TLS certificates
verification when connecting to S3 endpoint. Affects vmbackup,
vmrestore, vmbackupmanager.
* replace deprecated `EndpointResolver` with `BaseEndpoint`
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1056
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit ac836bcf6c)
The error appears when executing annotations query against Prometheus backend
because the query itself hasn't specified look-behind window (which is allowed
in VictoriaMetrics query engine).
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6309
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit c746ba154d)
Prevent excessive resource usage when stream aggregation config file
contains no matchers by prevent pushing data into Aggregators object.
Before this change a lot of extra work was invoked without reason.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 7ce052b32d)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Occasionally, vmagent sends empty blocks to downstream servers. If a
downstream server returns an unexpected response, vmagent gets stuck in
a retry loop. While vmagent handles 400 and 409 errors, there are
various prometheus remote write implementations that return different
error codes. For example, vector returns a 422 error. To mitigate the
risk of vmagent getting stuck in a retry loop, it is advisable to skip
sending empty blocks to downstream servers.
Co-authored-by: hao.peng <hao.peng@smartx.com>
Co-authored-by: Zhu Jiekun <jiekun.dev@gmail.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 3661373cc2)
* adds datadog extensions for statsd:
- multiple packed values (v1.1)
- additional types distribution, histogram
* adds type check and append metric type to the labels with special tag
name `__statsd_metric_type__`. It simplifies streaming aggregation
config.
* remove statsd support from cluster, since cluster doesn't support
stream aggregation.
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit b2765c45d0)
Signed-off-by: hagen1778 <roman@victoriametrics.com>