Commit Graph

632 Commits

Author SHA1 Message Date
Andrii Chubatiuk
937ae2ca90
lib/streamaggr: added stale samples metric, added metrics labels (#6462)
### Describe Your Changes

- added stale metrics counters for input and output samples
- added labels for aggregator metrics =>
`name="{rwctx}:{aggrId}:{aggrSuffix}"`
   - rwctx - global or number starting from 1
   - aggrid - aggregator id starting from 1
   - aggrSuffix - <interval>_(by|without)_label1_label2_labeln
   e.g: `name="global:1:1m_without_instance_pod"`

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 861852f262)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-07-01 15:01:49 +02:00
Andrii Chubatiuk
516848783e
deployment: build image for vmagent streamaggr benchmark (#6515)
### Describe Your Changes

optionally build vmagent image for benchmark
needed for https://github.com/VictoriaMetrics/ops/pull/1297

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

(cherry picked from commit 6b128da811)
2024-06-24 16:29:14 +02:00
Hui Wang
028a80613f
lib/httpserver: allow reloadAuthKey and configAuthKey to override htt… (#6338)
…pAuth.*

address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6329,
makes `reloadAuthKey`, `configAuthKey`, `flagsAuthKey`, `pprofAuthKey`
behavior the same way,
but keys like `-snapshotAuthKey`, `-forceMergeAuthKey` are still
protected by httpAuth.*. All the available key are listed in
https://docs.victoriametrics.com/single-server-victoriametrics/#security.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 61dce6f2a1)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-06-10 12:41:29 +02:00
hagen1778
73c9981335
chore: follow-up after c740a8042e
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 6d8e02f278)
2024-06-03 11:53:37 +02:00
yumeiyin
95b8cf76f8
chore: remove redundant words (#6348)
(cherry picked from commit 9289c7512d)
2024-05-29 14:37:04 +02:00
Andrii Chubatiuk
2c4a42554a
app/vmagent: fixed streamaggr args (#6374)
use GetOptionalArg instead of index to fallback to a first argument if
index is absent for remotewrite.streamaggr.config

(cherry picked from commit 7e5a206057)
2024-05-29 14:04:24 +02:00
Hui Wang
5b8c3fc9d0
app/vmalert: support DNS SRV record in -remoteWrite.url (#6299)
part of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6053,
supports [DNS SRV](https://en.wikipedia.org/wiki/SRV_record) address in
`-remoteWrite.url` command-line option.

(cherry picked from commit d7b5062917)
2024-05-22 10:53:22 +02:00
Roman Khavronenko
3e8b5e74d5
lib/streamaggr: skip empty aggregators (#6307)
Prevent excessive resource usage when stream aggregation config file
contains no matchers by prevent pushing data into Aggregators object.
Before this change a lot of extra work was invoked without reason.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 7ce052b32d)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-05-20 14:46:36 +02:00
Roman Khavronenko
8daa1d9505
app/vmagent: fix panic on shutdown when no global deduplication is co… (#6308)
…nfigured

Follow-up for f153f54d11

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 7dc18bf67a)
2024-05-20 14:46:10 +02:00
viperstars
ab78f3c89d
app/vmagent/remotewrite: skip sending empty block to downstream server (#6241)
Occasionally, vmagent sends empty blocks to downstream servers. If a
downstream server returns an unexpected response, vmagent gets stuck in
a retry loop. While vmagent handles 400 and 409 errors, there are
various prometheus remote write implementations that return different
error codes. For example, vector returns a 422 error. To mitigate the
risk of vmagent getting stuck in a retry loop, it is advisable to skip
sending empty blocks to downstream servers.

Co-authored-by: hao.peng <hao.peng@smartx.com>
Co-authored-by: Zhu Jiekun <jiekun.dev@gmail.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 3661373cc2)
2024-05-17 14:57:07 +02:00
Andrii Chubatiuk
fe332c3419
app/vmagent: add global aggregator (#6268)
Add global stream aggregation for VMAgent

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5467
(cherry picked from commit f153f54d11)
2024-05-17 14:01:31 +02:00
Nikolay
ee4a94a371
follow-up for c6c5a5a186 (#6265)
* adds datadog extensions for statsd:
  - multiple packed values (v1.1)
  - additional types distribution, histogram

* adds type check and append metric type to the labels with special tag
name `__statsd_metric_type__`. It simplifies streaming aggregation
config.

* remove statsd support from cluster, since cluster doesn't support
stream aggregation.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit b2765c45d0)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-05-17 13:49:24 +02:00
Andrii Chubatiuk
ec2273b247
app/vmagent: removed deprecated -remoteWrite.multitenantURL flag support (#6253)
Removed deprecated `-remoteWrite.multitenantURL` flag to simplify global
stream aggregation

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 680b8c25c8)
2024-05-13 16:49:33 +02:00
Roman Khavronenko
0bed453737
Feature allow configuring disableOnDiskQueue and dropSamplesOnOverload per url (#6248)
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html):
allow configuring `-remoteWrite.disableOnDiskQueue` and
`-remoteWrite.dropSamplesOnOverload` cmd-line flags per each
`-remoteWrite.url`. See this [pull
request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6065).
Thanks to @rbizos for implementaion!
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add
labels `path` and `url` to metrics
`vmagent_remotewrite_push_failures_total` and
`vmagent_remotewrite_samples_dropped_total`. Now number of failed pushes
and dropped samples can be tracked per `-remoteWrite.url`.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Raphael Bizos <r.bizos@criteo.com>
(cherry picked from commit 87fd400dfc)
2024-05-10 14:32:23 +02:00
qiangxuhui
885fc4122a
Add build support for loong64 (#6222)
### Describe Your Changes

Added makefile rule for `GOARCH=loong64` to support building all
VictoriaMetrics components on the `loongarch64` platform.

### Checklist

The following checks are **mandatory**:

* [X] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

Signed-off-by: qiangxuhui <qiangxuhui@loongson.cn>

(cherry picked from commit 80f3644ee3)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-05-10 14:32:05 +02:00
Oleg
76af930e4a
Statsd protocol compatibility (#5053)
In this PR I added compatibility with [statsd
protocol](https://github.com/b/statsd_spec) with tags to be able to send
metrics directly from statsd clients to vmagent or directly to VM.
For example its compatible with
[statsd-instrument](https://github.com/Shopify/statsd-instrument) and
[dogstatsd-ruby](https://github.com/DataDog/dogstatsd-ruby) gems

Related issues: #5052, #206, #4600

(cherry picked from commit c6c5a5a186)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-05-10 14:27:31 +02:00
Ted Possible
0206a01d03
Exemplar support (#5982)
This code adds Exemplars to VMagent and the promscrape parser adhering
to OpenMetrics Specifications. This will allow forwarding of exemplars
to Prometheus and other third party apps that support OpenMetrics specs.

---------

Signed-off-by: Ted Possible <ted_possible@cable.comcast.com>
(cherry picked from commit 5a3abfa041)
2024-05-10 13:14:17 +02:00
Andrii Chubatiuk
e26b55db1e
app/vmagent/remotewrite: do not cleanup timeseries which are used in multiple remote write contexts (#6206)
When at least one remote write has deduplication configured it cleans up
timeseries while they can be in use by another remote write without
deduplication

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6205
---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 879771808b)
2024-05-06 12:10:45 +02:00
hagen1778
57c841669c
app/vmagent: mention corner case with dangling queues and identical URLs
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6140

We don't cover this corner case as it has low chance for reproduction.
Precisely, the requirements are following:
1. vmagent need to be configured with multiple identical `remoteWrite.url` flags;
2. At least one of the persistent queues need to be non-empty, which already
signalizes about issues with setup;
3. vmagent need to be restarted with removing of one of `remoteWrite.url` flags.

We do not document this case in vmagent.md as it seems to be a rare corner case
and its explanation will require too much of explanation and confuse users.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 4251292708)
2024-04-23 14:52:35 +02:00
hagen1778
342290275e
app/streamaggr: follow-up after c0e4ccb7b5
* rm vmagent mentions from vminsert flags
* improve documentation wording, add links to related sections
* mention `ignore_first_intervals` in the stream aggr options
* update flags description
* add basic test for config parsing validation

Signed-off-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit bae3874e6a)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-04-22 14:39:23 +02:00
Andrii Chubatiuk
131367fb59
lib/streamaggr: add option to ignore first N aggregation intervals (#6137)
Stream aggregation may yield inaccurate results if it processes incomplete data.
This issue can arise when data is sourced from clients that maintain a queue of unsent data, such as Prometheus or vmagent.
 If the queue isn't fully cleared within the aggregation interval, only a portion of the time series may be included in that period, leading to distorted calculations.
To mitigate this we add an option to ignore first N aggregation intervals. It is expected, that client queues
will be cleared during the time while aggregation ignores first N intervals and all subsequent aggregations
will be correct.

(cherry picked from commit c0e4ccb7b5)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-04-22 14:34:36 +02:00
Aliaksandr Valialkin
a249ab96b4
app/vmagent/influx: replace hybrid channel-based pool + sync.Pool with plain sync.Pool for pushCtx
Data ingestion benchmark doesn't show memory usage difference between two approaches,
so let's use simpler approach in order to improve code readability and maintainability.

This is a follow-up for 77c597738c
2024-04-20 21:38:25 +02:00
Aliaksandr Valialkin
d6da68ee90
app/vmagent/common: use plain sync.Pool instead of a mix of sync.Pool with channel-based pool for PushCtx
This scheme was used for reducing memory usage when vmagent runs on a machine with big number of CPU cores
and the ingestion rate isn't too big. The scheme with channel-based pool could reduce memory usage,
since it minimizes the number of PushCtx structs in the pool in this case.

Performance tests didn't reveal significant difference in memory usage under both low and high ingestion rate
between plain sync.Pool and the current hybrid scheme, so replace the scheme with plain sync.Pool in order
to simplify the code.
2024-04-20 21:31:14 +02:00
Aliaksandr Valialkin
9004bc098e
all: use clear() built-in Go function for clearing []prompbmarshal.TimeSeries and []prompbmarshal.Label slices
This makes the code a bit clear.
2024-04-20 21:00:24 +02:00
Aliaksandr Valialkin
5b29be1f4d
app/vmagent/remotewrite: add support for replication additionally to sharding when both -remoteWrite.shardByURL and -remoteWrite.shardByURLReplicas=RF command-line flags are set
This allows setting up data replication among failure domains if the replication factor is smaller than the number of failure domains.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6054

See https://docs.victoriametrics.com/vmagent/#sharding-among-remote-storages
2024-04-19 11:37:04 +02:00
Aliaksandr Valialkin
a21d1fcf57
all: replace old https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html url with the new one - https://docs.victoriametrics.com/cluster-victoriametrics/ 2024-04-18 02:56:28 +02:00
Aliaksandr Valialkin
728bcf0585
all: replace old https://docs.victoriametrics.com/stream-aggregation.html url with the new one - https://docs.victoriametrics.com/stream-aggregation/ 2024-04-18 02:20:00 +02:00
Aliaksandr Valialkin
0211a04a52
all: replace the outdated url https://docs.victoriametrics.com/vmagent.html with the new one - https://docs.victoriametrics.com/vmagent/ 2024-04-18 01:32:57 +02:00
Aliaksandr Valialkin
284d99e269
app/vmagent: support for DNS SRV urls at -remoteWrite.url, scrape target urls and service discovery urls
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6053
2024-04-17 20:56:23 +02:00
Aliaksandr Valialkin
d61f6c89a1
app/{vmagent,vminsert}: accept Prometheus remote write protocol requests at /prometheus/api/v1/push additionally to /api/v1/push
This is a follow-up for 7ccdb57ea4
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5990
2024-04-04 02:17:10 +03:00
Eugene Ma
7459a197c6
add "/api/v1/push" to request handler (#5990)
Co-authored-by: Eugene Ma <eugene.ma@airbnb.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-04-04 02:16:27 +03:00
Aliaksandr Valialkin
ecd782c75e
app/vmagent: follow-up for b3b29ba6ac
- Automatically reload changed TLS root CA pointed by -remoteWrite.tlsCAFile command-line flag
- Automatically reload changed TLS root CA configured via oauth2.tsl_config.ca_file option at -promscrape.config
- Document the change as a feature instead of a bug at docs/CHANGELOG.md
- Simplify the code at lib/promauth, which is responsible for reloading changed TLS root CA files.
- Simplify the usage of lib/promauth.Config.NewRoundTripper() - now it accepts the base http.Transport
  instead of a callback, which can change the internal http.Transport.
- Reuse the default tls config if lib/promauth.Config doesn't contain tls-specific configs.
  This should reduce memory usage a bit when tls isn't used for scraping big number of targets.
- Do not re-read TLS root CA files on every processed request. Re-read them once per second.
  This should reduce CPU usage when scraping big number of targets over https.
- Do not store cert.pem and key.pem files in TestTLSConfigWithCertificatesFilesUpdate, since they can be loaded
  from byte slices via crypto/tls.X509KeyPair().
- Remove obsolete comparisons of string representations for authConfig and proxyAuthConfig at areEqualScrapeConfigs().

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5725
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5526
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2171
2024-04-04 01:26:38 +03:00
Aliaksandr Valialkin
00f59d6ddf
all: fix golangci-lint(revive) warnings after 0c0ed61ce7
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6001
2024-04-03 03:00:45 +03:00
Artem Navoiev
f668489051
app/{vmagent/insert} fix typo in Firehose
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-04-03 02:51:57 +03:00
Aliaksandr Valialkin
faa2ba828a
app/vmagent: simplify code after 509df44d03
- Simplify the code in order to improve its maintenance
- Properly pass tenant ID when processing multi-tenant opentelemetry request at vmagent

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6016
2024-04-03 02:50:46 +03:00
Aliaksandr Valialkin
7edb5f77f1
app/vmagent: properly shutdown when -maxIngestionRate limit is reached
The remotewrite.Stop() expects that there are no pending calls to TryPush().
This means that the ingestionRateLimiter.Register() must be unblocked inside TryPush() when calling remotewrite.Stop().
Provide remotewrite.StopIngestionRateLimiter() function for unblocking the rate limiter before calling the remotewrite.Stop().

While at it, move the rate limiter into lib/ratelimiter package, since it has two users.
Also move the description of the feature to the correct place at docs/CHANGELOG.md.
Also cross-reference -remoteWrite.rateLimit and -maxIngestionRate command-line flags.

This is a follow-up for 02bccd1eb9
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5900
2024-04-03 02:41:11 +03:00
Aliaksandr Valialkin
3f79e54a51
app/vmagent/remotewrite: follow-up for 166b97b8d0 and b6bd9a97a3
- Make the configuration more clear by accepting the list of ignored labels during sharding
  via a dedicated command-line flag - -remoteWrite.shardByURL.ignoreLabels.
  This prevents from overloading the meaning of -remoteWrite.shardByURL.labels command-line flag.

- Removed superfluous memory allocation per each processed sample if sharding by remote storage is enabled.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5938
2024-04-03 00:50:48 +03:00
Andrii Chubatiuk
914b23f1e8
app/{vmagent,vminsert}: fixed firehose response (#6016) 2024-04-02 18:03:12 +03:00
hagen1778
36c1ca9949
app/vmagent: follow-up 166b97b8d0
* add tests for sharding function
* update flags description
* add changelog note

166b97b8d0
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit b6bd9a97a3)
2024-03-29 14:10:17 +01:00
Eugene Ma
744d513181
vmagent: support sharding by excluded labels (#5938)
To horizontally scale streaming aggregation, you might want to deploy a separate hashing tier
of vmagents that route to a separate aggregation tier. The hashing tier should shard by all labels
except the instance-level labels, to ensure the input metrics are routed correctly to the aggregator
instance responsible for those labels.
For this to achieve we introduce `remoteWrite.shardByURL.inverseLabels` flag to inverse logic of `remoteWrite.shardByURL.labels`

---------

Co-authored-by: Eugene Ma <eugene.ma@airbnb.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
(cherry picked from commit 166b97b8d0)
2024-03-29 14:10:15 +01:00
Alexander Marshalov
0b3744effa
[vmagent] added ingestion rate limiting with new flag -maxIngestionRate (#5900)
* [vmagent] added ingestion rate limiting with new flag `-maxIngestionRate`. This flag can be used to limit the number of samples ingested by vmagent per second. If the limit is exceeded, the ingestion rate will be throttled.

* fix changelog

* fix review comment

(cherry picked from commit 02bccd1eb9)
2024-03-25 15:42:52 +01:00
Aliaksandr Valialkin
111d0aa2bf
app/{vmagent,vminsert}: add an ability to ignore input samples outside the current aggregation interval for stream aggregation
See https://docs.victoriametrics.com/stream-aggregation.html#ignoring-old-samples
2024-03-17 23:30:46 +02:00
Aliaksandr Valialkin
1d7efc4c64
app/vmagent/remotewrite: clarify the reason behind the default value for -remoteWrite.queues in the same way as the reason for -maxConcurrentInserts is defined at 73f5fb0f0c 2024-03-06 13:57:53 +02:00
Aliaksandr Valialkin
27b9e8ed3e
app/{vmagent,vminsert}: add -streamAggr.dropInputSamples command-line flag for dropping the specified labels from input samples before deduplication and streaming aggregation 2024-03-05 02:27:27 +02:00
Aliaksandr Valialkin
c38c45d71f
app/{vminsert,vmagent}: allow using -streamAggr.dedupInterval without -streamAggr.config
This allows performing online de-duplication of incoming samples
2024-03-05 00:47:23 +02:00
Aliaksandr Valialkin
48a425898a
lib/streamaggr: enable time alignment for aggregate flushed to multiples of interval
For example, if `interval: 1m`, then data flush occurs at the end of every minute,
while `interval: 1h` leads to data flush at the end of every hour.

Add `no_align_flush_to_interval` option, which can be used for disabling the alignment.
2024-03-04 06:23:35 +02:00
Aliaksandr Valialkin
6aea1e5093
app/{vmagent,vminsert}: follow-up for 434a5803e7
Document the /opentelemetry/v1/metrics endpoint instead of /opentelemetry/api/v1/push,
since the /v1/metrics suffix is hardcoded in OpenTelemetry protocol specification.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5871
2024-02-29 18:05:12 +02:00
Nikolay
abb773120d
app/{vmagent,vminsert}: adds /v1/metrics suffix for opentelemetry route path (#5871)
* app/{vmagent,vminsert}: adds /v1/metrics suffix for opentelemetry route path
it must fix compatibility with opentemetry-collector [spec](https://opentelemetry.io/docs/specs/otlp/\#otlphttp-request)
this suffix is hard-coded and cannot be changed with collector configuration

* Apply suggestions from code review

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-02-29 17:59:57 +02:00
Aliaksandr Valialkin
7832d0800e
app/{vminsert,vmagent}: follow-up after 67a55b89a4
- Document the ability to read OpenTelemetry data from Amazon Firehose at docs/CHANGELOG.md

- Simplify parsing Firehose data. There is no need in trying to optimize the parsing with fastjson
  and byte slice tricks, since OpenTelemetry protocol is really slooow because of over-engineering.
  It is better to write clear code for better maintanability in the future.

- Move Firehose parser from /lib/protoparser/firehose to lib/protoparser/opentelemetry/firehose,
  since it is used only by opentelemetry parser.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5893
2024-02-29 14:47:20 +02:00
Andrii Chubatiuk
60cf0c9656
{vmagent,vminsert}: added firehose http destination opentelemetry data ingestion support (#5893)
Co-authored-by: Andrii Chubatiuk <wachy@Andriis-MBP-2.lan>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-02-29 14:46:16 +02:00