Commit Graph

614 Commits

Author SHA1 Message Date
Andrii Chubatiuk
c0e4ccb7b5
lib/streamaggr: add option to ignore first N aggregation intervals (#6137)
Stream aggregation may yield inaccurate results if it processes incomplete data. 
This issue can arise when data is sourced from clients that maintain a queue of unsent data, such as Prometheus or vmagent.
 If the queue isn't fully cleared within the aggregation interval, only a portion of the time series may be included in that period, leading to distorted calculations. 
To mitigate this we add an option to ignore first N aggregation intervals. It is expected, that client queues
will be cleared during the time while aggregation ignores first N intervals and all subsequent aggregations
will be correct.
2024-04-22 13:52:04 +02:00
Aliaksandr Valialkin
c22da2f917
app/vmagent/influx: replace hybrid channel-based pool + sync.Pool with plain sync.Pool for pushCtx
Data ingestion benchmark doesn't show memory usage difference between two approaches,
so let's use simpler approach in order to improve code readability and maintainability.

This is a follow-up for 77c597738c
2024-04-20 21:38:11 +02:00
Aliaksandr Valialkin
77c597738c
app/vmagent/common: use plain sync.Pool instead of a mix of sync.Pool with channel-based pool for PushCtx
This scheme was used for reducing memory usage when vmagent runs on a machine with big number of CPU cores
and the ingestion rate isn't too big. The scheme with channel-based pool could reduce memory usage,
since it minimizes the number of PushCtx structs in the pool in this case.

Performance tests didn't reveal significant difference in memory usage under both low and high ingestion rate
between plain sync.Pool and the current hybrid scheme, so replace the scheme with plain sync.Pool in order
to simplify the code.
2024-04-20 21:27:05 +02:00
Aliaksandr Valialkin
7531e9084a
all: use clear() built-in Go function for clearing []prompbmarshal.TimeSeries and []prompbmarshal.Label slices
This makes the code a bit clear.
2024-04-20 21:00:03 +02:00
Aliaksandr Valialkin
8f535f9e76
app/vmagent/remotewrite: add support for replication additionally to sharding when both -remoteWrite.shardByURL and -remoteWrite.shardByURLReplicas=RF command-line flags are set
This allows setting up data replication among failure domains if the replication factor is smaller than the number of failure domains.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6054

See https://docs.victoriametrics.com/vmagent/#sharding-among-remote-storages
2024-04-19 11:27:33 +02:00
Aliaksandr Valialkin
f4b1cbfef0
all: replace old https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html url with the new one - https://docs.victoriametrics.com/cluster-victoriametrics/ 2024-04-18 02:54:20 +02:00
Aliaksandr Valialkin
4d2b9fe6b2
all: replace old https://docs.victoriametrics.com/stream-aggregation.html url with the new one - https://docs.victoriametrics.com/stream-aggregation/ 2024-04-18 02:19:11 +02:00
Aliaksandr Valialkin
c81a633b02
all: replace the outdated url https://docs.victoriametrics.com/vmagent.html with the new one - https://docs.victoriametrics.com/vmagent/ 2024-04-18 01:31:37 +02:00
Aliaksandr Valialkin
dc326f70b4
app/vmagent: support for DNS SRV urls at -remoteWrite.url, scrape target urls and service discovery urls
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6053
2024-04-17 20:54:39 +02:00
Aliaksandr Valialkin
931dd3f320
app/{vmagent,vminsert}: accept Prometheus remote write protocol requests at /prometheus/api/v1/push additionally to /api/v1/push
This is a follow-up for 7ccdb57ea4
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5990
2024-04-04 02:15:52 +03:00
Eugene Ma
7ccdb57ea4
add "/api/v1/push" to request handler (#5990)
Co-authored-by: Eugene Ma <eugene.ma@airbnb.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-04-04 02:10:24 +03:00
Aliaksandr Valialkin
967d5496cf
app/vmagent: follow-up for b3b29ba6ac
- Automatically reload changed TLS root CA pointed by -remoteWrite.tlsCAFile command-line flag
- Automatically reload changed TLS root CA configured via oauth2.tsl_config.ca_file option at -promscrape.config
- Document the change as a feature instead of a bug at docs/CHANGELOG.md
- Simplify the code at lib/promauth, which is responsible for reloading changed TLS root CA files.
- Simplify the usage of lib/promauth.Config.NewRoundTripper() - now it accepts the base http.Transport
  instead of a callback, which can change the internal http.Transport.
- Reuse the default tls config if lib/promauth.Config doesn't contain tls-specific configs.
  This should reduce memory usage a bit when tls isn't used for scraping big number of targets.
- Do not re-read TLS root CA files on every processed request. Re-read them once per second.
  This should reduce CPU usage when scraping big number of targets over https.
- Do not store cert.pem and key.pem files in TestTLSConfigWithCertificatesFilesUpdate, since they can be loaded
  from byte slices via crypto/tls.X509KeyPair().
- Remove obsolete comparisons of string representations for authConfig and proxyAuthConfig at areEqualScrapeConfigs().

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5725
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5526
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2171
2024-04-04 01:27:35 +03:00
Aliaksandr Valialkin
3de8656551
app/vmagent/remotewrite: follow-up for 166b97b8d0 and b6bd9a97a3
- Make the configuration more clear by accepting the list of ignored labels during sharding
  via a dedicated command-line flag - -remoteWrite.shardByURL.ignoreLabels.
  This prevents from overloading the meaning of -remoteWrite.shardByURL.labels command-line flag.

- Removed superfluous memory allocation per each processed sample if sharding by remote storage is enabled.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5938
2024-04-03 00:54:01 +03:00
Aliaksandr Valialkin
918cccaddf
all: fix golangci-lint(revive) warnings after 0c0ed61ce7
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6001
2024-04-02 23:16:29 +03:00
Artem Navoiev
9bd3cadce6
app/{vmagent/insert} fix typo in Firehose
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-04-02 17:41:21 +02:00
Aliaksandr Valialkin
904e95fc69
app/vmagent: simplify code after 509df44d03
- Simplify the code in order to improve its maintenance
- Properly pass tenant ID when processing multi-tenant opentelemetry request at vmagent

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6016
2024-04-02 17:58:13 +03:00
Aliaksandr Valialkin
830b871baf
app/vmagent: properly shutdown when -maxIngestionRate limit is reached
The remotewrite.Stop() expects that there are no pending calls to TryPush().
This means that the ingestionRateLimiter.Register() must be unblocked inside TryPush() when calling remotewrite.Stop().
Provide remotewrite.StopIngestionRateLimiter() function for unblocking the rate limiter before calling the remotewrite.Stop().

While at it, move the rate limiter into lib/ratelimiter package, since it has two users.
Also move the description of the feature to the correct place at docs/CHANGELOG.md.
Also cross-reference -remoteWrite.rateLimit and -maxIngestionRate command-line flags.

This is a follow-up for 02bccd1eb9
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5900
2024-03-30 06:43:48 +02:00
hagen1778
b6bd9a97a3
app/vmagent: follow-up 166b97b8d0
* add tests for sharding function
* update flags description
* add changelog note

166b97b8d0
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-03-29 14:08:08 +01:00
Eugene Ma
166b97b8d0
vmagent: support sharding by excluded labels (#5938)
To horizontally scale streaming aggregation, you might want to deploy a separate hashing tier 
of vmagents that route to a separate aggregation tier. The hashing tier should shard by all labels 
except the instance-level labels, to ensure the input metrics are routed correctly to the aggregator 
instance responsible for those labels.
For this to achieve we introduce `remoteWrite.shardByURL.inverseLabels` flag to inverse logic of `remoteWrite.shardByURL.labels`

---------

Co-authored-by: Eugene Ma <eugene.ma@airbnb.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2024-03-29 13:26:02 +01:00
Andrii Chubatiuk
509df44d03
app/{vmagent,vminsert}: fixed firehose response (#6016) 2024-03-26 13:20:41 +01:00
Alexander Marshalov
02bccd1eb9
[vmagent] added ingestion rate limiting with new flag -maxIngestionRate (#5900)
* [vmagent] added ingestion rate limiting with new flag `-maxIngestionRate`. This flag can be used to limit the number of samples ingested by vmagent per second. If the limit is exceeded, the ingestion rate will be throttled.

* fix changelog

* fix review comment
2024-03-21 17:14:49 +01:00
Aliaksandr Valialkin
1cedaf61cb
app/{vmagent,vminsert}: add an ability to ignore input samples outside the current aggregation interval for stream aggregation
See https://docs.victoriametrics.com/stream-aggregation.html#ignoring-old-samples
2024-03-17 23:03:47 +02:00
Aliaksandr Valialkin
b4b38f782c
app/vmagent/remotewrite: clarify the reason behind the default value for -remoteWrite.queues in the same way as the reason for -maxConcurrentInserts is defined at 73f5fb0f0c 2024-03-06 13:43:08 +02:00
Aliaksandr Valialkin
da611ad628
app/{vmagent,vminsert}: add -streamAggr.dropInputSamples command-line flag for dropping the specified labels from input samples before deduplication and streaming aggregation 2024-03-05 02:15:01 +02:00
Aliaksandr Valialkin
ed523b5bbc
app/{vminsert,vmagent}: allow using -streamAggr.dedupInterval without -streamAggr.config
This allows performing online de-duplication of incoming samples
2024-03-05 00:45:30 +02:00
Aliaksandr Valialkin
ac3cf3f357
lib/streamaggr: enable time alignment for aggregate flushed to multiples of interval
For example, if `interval: 1m`, then data flush occurs at the end of every minute,
while `interval: 1h` leads to data flush at the end of every hour.

Add `no_align_flush_to_interval` option, which can be used for disabling the alignment.
2024-03-04 05:42:58 +02:00
Aliaksandr Valialkin
cfe774ab50
app/{vmagent,vminsert}: follow-up for 434a5803e7
Document the /opentelemetry/v1/metrics endpoint instead of /opentelemetry/api/v1/push,
since the /v1/metrics suffix is hardcoded in OpenTelemetry protocol specification.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5871
2024-02-29 18:02:50 +02:00
Nikolay
434a5803e7
app/{vmagent,vminsert}: adds /v1/metrics suffix for opentelemetry route path (#5871)
* app/{vmagent,vminsert}: adds /v1/metrics suffix for opentelemetry route path
it must fix compatibility with opentemetry-collector [spec](https://opentelemetry.io/docs/specs/otlp/\#otlphttp-request)
this suffix is hard-coded and cannot be changed with collector configuration

* Apply suggestions from code review

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-02-29 17:58:49 +02:00
Aliaksandr Valialkin
04d13f6149
app/{vminsert,vmagent}: follow-up after 67a55b89a4
- Document the ability to read OpenTelemetry data from Amazon Firehose at docs/CHANGELOG.md

- Simplify parsing Firehose data. There is no need in trying to optimize the parsing with fastjson
  and byte slice tricks, since OpenTelemetry protocol is really slooow because of over-engineering.
  It is better to write clear code for better maintanability in the future.

- Move Firehose parser from /lib/protoparser/firehose to lib/protoparser/opentelemetry/firehose,
  since it is used only by opentelemetry parser.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5893
2024-02-29 14:38:23 +02:00
Andrii Chubatiuk
67a55b89a4
{vmagent,vminsert}: added firehose http destination opentelemetry data ingestion support (#5893)
Co-authored-by: Andrii Chubatiuk <wachy@Andriis-MBP-2.lan>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-02-29 14:03:24 +02:00
Aliaksandr Valialkin
6697da73e5
app: consistently use atomic.* types instead of atomic.* functions
See ea9e2b19a5
2024-02-24 02:44:24 +02:00
Aliaksandr Valialkin
7e1dd8ab9d
lib: consistently use atomic.* types instead of atomic.* functions
See ea9e2b19a5
2024-02-24 02:07:53 +02:00
Anton L
d68bb658ce
#5833 Fix Deadlock when using shardByURL of VMAgent (#5834) 2024-02-23 00:59:47 +02:00
Aliaksandr Valialkin
e963d6c789
app/vmagent/remotewrite: add -remoteWrite.tlsHandshakeTimeout command-line flag for tuning tls handshake timeout to -remoteWrite.url
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1699
2024-02-13 02:46:33 +02:00
Aliaksandr Valialkin
ae8a867924
all: add support for specifying multiple -httpListenAddr options 2024-02-09 03:15:04 +02:00
Aliaksandr Valialkin
541b644d3d
app/{vmagent,vminsert}: follow-up after a1d1ccd6f2
- Document the change at docs/CHANGELOG.md
- Copy changes from docs/Single-server-VictoriaMetrics.md to README.md
- Add missing handler for processing multitenant requests ( https://docs.victoriametrics.com/vmagent/#multitenancy )
- Substitute github.com/stretchr/testify dependency with 3 lines of code in the added tests
- Comment unclear code at lib/protoparser/datadogsketches/parser.go , so @AndrewChubatiuk could update it
  and add permalinks to the original source code there.
- Various code cleanups

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5584
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3091
2024-02-07 01:28:05 +02:00
Andrii Chubatiuk
a1d1ccd6f2
support datadog /api/beta/sketches API (#5584)
Co-authored-by: Andrew Chubatiuk <andrew.chubatiuk@motional.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-02-06 20:58:11 +00:00
Aliaksandr Valialkin
583b6fe1e7
app/vmagent/remotewrite: limit the concurrency for marshaling time series before sending them to remote storage
There is no sense in running more than GOMAXPROCS concurrent marshalers,
since they are CPU-bound. More concurrent marshalers do not increase the marshaling bandwidth,
but they may result in more RAM usage.
2024-01-30 12:18:19 +02:00
Aliaksandr Valialkin
3449d563bd
all: add up to 10% random jitter to the interval between periodic tasks performed by various components
This should smooth CPU and RAM usage spikes related to these periodic tasks,
by reducing the probability that multiple concurrent periodic tasks are performed at the same time.
2024-01-22 18:40:32 +02:00
Aliaksandr Valialkin
1f105dde98
all: allow dynamically reading *AuthKey flag values from files and urls
Examples:

1) -metricsAuthKey=file:///abs/path/to/file - reads flag value from the given absolute filepath
2) -metricsAuthKey=file://./relative/path/to/file - reads flag value from the given relative filepath
3) -metricsAuthKey=http://some-host/some/path?query_arg=abc - reads flag value from the given url

The flag value is automatically updated when the file contents changes.
2024-01-21 22:03:38 +02:00
Aliaksandr Valialkin
be509b3995
lib/pushmetrics: wait until the background goroutines, which push metrics, are stopped at pushmetrics.Stop()
Previously the was a race condition when the background goroutine still could try collecting metrics
from already stopped resources after returning from pushmetrics.Stop().
Now the pushmetrics.Stop() waits until the background goroutine is stopped before returning.

This is a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5549
and the commit fe2d9f6646 .

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5548
2024-01-15 13:50:36 +02:00
Aliaksandr Valialkin
d2c94a0663
lib/prompbmarshal: switch to github.com/VictoriaMetrics/easyproto 2024-01-14 23:04:45 +02:00
Aliaksandr Valialkin
f2229c2e42
lib/prompb: change type of Label.Name and Label.Value from []byte to string
This makes it more consistent with lib/prompbmarshal.Label
2024-01-14 22:33:21 +02:00
hagen1778
91ccea236f
app/all: follow-up after 84d710beab
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5548
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-09 13:34:54 +01:00
Aliaksandr Valialkin
7575f5c501
lib/protoparser/datadogv2: take into account source_type_name field, since it contains useful value such as kubernetes, docker, system, etc. 2023-12-21 23:05:41 +02:00
Aliaksandr Valialkin
fb90a56de2
app/{vminsert,vmagent}: preliminary support for /api/v2/series ingestion from new versions of DataDog Agent
This commit adds only JSON support - https://docs.datadoghq.com/api/latest/metrics/#submit-metrics ,
while recent versions of DataDog Agent send data to /api/v2/series in undocumented Protobuf format.
The support for this format will be added later.

Thanks to @AndrewChubatiuk for the initial implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5094

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4451
2023-12-21 20:50:55 +02:00
Aliaksandr Valialkin
160cc9debd
app/{vmagent,vmalert}: add the ability to set OAuth2 endpoint params via the corresponding *.oauth2.endpointParams command-line flags
This is a follow-up for 5ebd5a0d7b

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5427
2023-12-20 21:35:28 +02:00
Morgan
5ebd5a0d7b
Expose OAuth2 Endpoint Parameters to cli (#5427)
The user may which to control the endpoint parameters for instance to
set the audience when requesting an access token. Exposing the
parameters as a map allows for additional use cases without requiring
modification.
2023-12-20 20:16:43 +02:00
Aliaksandr Valialkin
5a88bc973f
all: use Gauge instead of Counter for *_config_last_reload_successful metrics
This allows exposing the correct TYPE metadata for these labels when the app runs with -metrics.exposeMetadata command-line flag.
See https://github.com/VictoriaMetrics/metrics/pull/61#issuecomment-1860085508 for more details.

This is follow-up for 326a77c697
2023-12-20 14:23:42 +02:00
Aliaksandr Valialkin
b1fed78e0b
app: make more clear that -tls enables https at -httpListenAddr 2023-12-10 00:25:01 +02:00