VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-25 03:40:10 +01:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	830b871baf	app/vmagent: properly shutdown when -maxIngestionRate limit is reached The remotewrite.Stop() expects that there are no pending calls to TryPush(). This means that the ingestionRateLimiter.Register() must be unblocked inside TryPush() when calling remotewrite.Stop(). Provide remotewrite.StopIngestionRateLimiter() function for unblocking the rate limiter before calling the remotewrite.Stop(). While at it, move the rate limiter into lib/ratelimiter package, since it has two users. Also move the description of the feature to the correct place at docs/CHANGELOG.md. Also cross-reference -remoteWrite.rateLimit and -maxIngestionRate command-line flags. This is a follow-up for `02bccd1eb9` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5900	2024-03-30 06:43:48 +02:00
hagen1778	b6bd9a97a3	app/vmagent: follow-up `166b97b8d0` * add tests for sharding function * update flags description * add changelog note `166b97b8d0` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-03-29 14:08:08 +01:00
Eugene Ma	166b97b8d0	vmagent: support sharding by excluded labels (#5938) To horizontally scale streaming aggregation, you might want to deploy a separate hashing tier of vmagents that route to a separate aggregation tier. The hashing tier should shard by all labels except the instance-level labels, to ensure the input metrics are routed correctly to the aggregator instance responsible for those labels. To achieve this, we introduce the `remoteWrite.shardByURL.inverseLabels` flag, which inverts the logic of `remoteWrite.shardByURL.labels` --------- Co-authored-by: Eugene Ma <eugene.ma@airbnb.com> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2024-03-29 13:26:02 +01:00
Alexander Marshalov	02bccd1eb9	[vmagent] added ingestion rate limiting with new flag `-maxIngestionRate` (#5900) * [vmagent] added ingestion rate limiting with new flag `-maxIngestionRate`. This flag can be used to limit the number of samples ingested by vmagent per second. If the limit is exceeded, the ingestion rate will be throttled. * fix changelog * fix review comment	2024-03-21 17:14:49 +01:00
Aliaksandr Valialkin	1cedaf61cb	app/{vmagent,vminsert}: add an ability to ignore input samples outside the current aggregation interval for stream aggregation See https://docs.victoriametrics.com/stream-aggregation.html#ignoring-old-samples	2024-03-17 23:03:47 +02:00
Aliaksandr Valialkin	b4b38f782c	app/vmagent/remotewrite: clarify the reason behind the default value for -remoteWrite.queues in the same way as the reason for -maxConcurrentInserts is defined at `73f5fb0f0c`	2024-03-06 13:43:08 +02:00
Aliaksandr Valialkin	da611ad628	app/{vmagent,vminsert}: add `-streamAggr.dropInputSamples` command-line flag for dropping the specified labels from input samples before deduplication and streaming aggregation	2024-03-05 02:15:01 +02:00
Aliaksandr Valialkin	ed523b5bbc	app/{vminsert,vmagent}: allow using -streamAggr.dedupInterval without -streamAggr.config This allows performing online de-duplication of incoming samples	2024-03-05 00:45:30 +02:00
Aliaksandr Valialkin	ac3cf3f357	lib/streamaggr: enable time alignment for aggregate flushed to multiples of interval For example, if `interval: 1m`, then data flush occurs at the end of every minute, while `interval: 1h` leads to data flush at the end of every hour. Add `no_align_flush_to_interval` option, which can be used for disabling the alignment.	2024-03-04 05:42:58 +02:00
Aliaksandr Valialkin	6697da73e5	app: consistently use atomic.* types instead of atomic.* functions See `ea9e2b19a5`	2024-02-24 02:44:24 +02:00
Anton L	d68bb658ce	#5833 Fix Deadlock when using shardByURL of VMAgent (#5834)	2024-02-23 00:59:47 +02:00
Aliaksandr Valialkin	e963d6c789	app/vmagent/remotewrite: add -remoteWrite.tlsHandshakeTimeout command-line flag for tuning tls handshake timeout to -remoteWrite.url Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1699	2024-02-13 02:46:33 +02:00
Aliaksandr Valialkin	583b6fe1e7	app/vmagent/remotewrite: limit the concurrency for marshaling time series before sending them to remote storage There is no sense in running more than GOMAXPROCS concurrent marshalers, since they are CPU-bound. More concurrent marshalers do not increase the marshaling bandwidth, but they may result in more RAM usage.	2024-01-30 12:18:19 +02:00
Aliaksandr Valialkin	3449d563bd	all: add up to 10% random jitter to the interval between periodic tasks performed by various components This should smooth CPU and RAM usage spikes related to these periodic tasks, by reducing the probability that multiple concurrent periodic tasks are performed at the same time.	2024-01-22 18:40:32 +02:00
Aliaksandr Valialkin	d2c94a0663	lib/prompbmarshal: switch to github.com/VictoriaMetrics/easyproto	2024-01-14 23:04:45 +02:00
Aliaksandr Valialkin	160cc9debd	app/{vmagent,vmalert}: add the ability to set OAuth2 endpoint params via the corresponding *.oauth2.endpointParams command-line flags This is a follow-up for `5ebd5a0d7b` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5427	2023-12-20 21:35:28 +02:00
Morgan	5ebd5a0d7b	Expose OAuth2 Endpoint Parameters to cli (#5427) The user may wish to control the endpoint parameters, for instance to set the audience when requesting an access token. Exposing the parameters as a map allows for additional use cases without requiring modification.	2023-12-20 20:16:43 +02:00
Aliaksandr Valialkin	5a88bc973f	all: use Gauge instead of Counter for `*_config_last_reload_successful` metrics This allows exposing the correct TYPE metadata for these labels when the app runs with -metrics.exposeMetadata command-line flag. See https://github.com/VictoriaMetrics/metrics/pull/61#issuecomment-1860085508 for more details. This is follow-up for `326a77c697`	2023-12-20 14:23:42 +02:00
Aliaksandr Valialkin	fdbbbf33ca	app/vmagent: add `-enableMultitenantHandlers` command-line flag This flag allows converting tenant id to (vm_account_id, vm_project_id) labels. This flag deprecates `-remoteWrite.multitenantURL` command-line flag, because `-enableMultitenantHandlers` is easier to use and combine with multitenant url at vminsert - https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy-via-labels See https://docs.victoriametrics.com/vmagent.html#multitenancy Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1505	2023-12-05 01:28:37 +02:00
Aliaksandr Valialkin	fc2e7a30b3	app/vmagent: properly increase vmagent_remotewrite_samples_dropped_total when scraped samples cannot be sent to the remote storage and -remoteWrite.dropSamplesOnOverload is set This is a follow-up for `5034aa0773` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110	2023-11-25 14:44:32 +02:00
Aliaksandr Valialkin	5034aa0773	app/vmagent: follow-up for `090cb2c9de` - Add Try* prefix to functions, which return bool result in order to improve readability and reduce the probability of missing check for the result returned from these functions. - Call the adjustSampleValues() only once on input samples. Previously it was called on every attempt to flush data to persistent queue. - Properly restore the initial state of WriteRequest passed to tryPushWriteRequest() before returning from this function after unsuccessful push to persistent queue. Previously a part of WriteRequest samples may be lost in such case. - Add -remoteWrite.dropSamplesOnOverload command-line flag, which can be used for dropping incoming samples instead of returning 429 Too Many Requests error to the client when -remoteWrite.disableOnDiskQueue is set and the remote storage cannot keep up with the data ingestion rate. - Add vmagent_remotewrite_samples_dropped_total metric, which counts the number of dropped samples. - Add vmagent_remotewrite_push_failures_total metric, which counts the number of unsuccessful attempts to push data to persistent queue when -remoteWrite.disableOnDiskQueue is set. - Remove vmagent_remotewrite_aggregation_metrics_dropped_total and vm_promscrape_push_samples_dropped_total metrics, because they are replaced with vmagent_remotewrite_samples_dropped_total metric. - Update 'Disabling on-disk persistence' docs at docs/vmagent.md - Update stale comments in the code Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5088 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110	2023-11-25 12:09:44 +02:00
Nikolay	090cb2c9de	app/vmagent: allow disabling on-disk persistence (#5088) * app/vmagent: allow disabling on-disk queue Previously, it wasn't possible to build a data processing pipeline with a chain of vmagents. When remoteWrite for the last vmagent in the chain wasn't accessible, it persisted data only while it had enough disk capacity. If the disk queue was full, it started to silently drop ingested metrics. The new flags allow disabling on-disk persistence and immediately returning an error if remoteWrite is not accessible anymore. This blocks any writes and notifies the client that data ingestion isn't possible. Main use case for this feature: use an external queue such as Kafka for data persistence. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110 * adds test, updates readme * apply review suggestions * update docs for vmagent * makes linter happy --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-11-24 13:42:11 +01:00
Aliaksandr Valialkin	1831c731a3	app/vmagent/remotewrite: do not drop persistent queues when -remoteWrite.multitenantURL is set It is unsafe to drop persistent queues when -remoteWrite.multitenantURL command-line flag is set, since these queues are created on demand when a new sample for the given tenant is pushed to the remote storage. This addresses https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5357 The issue appeared in the commit `f3a51e8b1d` when implementing https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4014	2023-11-23 20:40:39 +02:00
Aliaksandr Valialkin	ed70a40669	app/vmagent/remotewrite: add -remoteWrite.shardByURL.labels command-line flag This command-line flag can be used for specifying a list of labels used for sharding among -remoteWrite.url entries when -remoteWrite.shardByURL command-line flag is set. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4942	2023-11-01 23:08:54 +01:00
Aliaksandr Valialkin	d5a599badc	lib/promauth: follow-up for `e16d3f5639` - Make sure that invalid/missing TLS CA file or TLS client certificate files at vmagent startup don't prevent from processing the corresponding scrape targets after the file becomes correct, without the need to restart vmagent. Previously scrape targets with invalid TLS CA file or TLS client certificate files were permanently dropped after the first attempt to initialize them, and they didn't appear until the next vmagent reload or the next change in other places of the loaded scrape configs. - Make sure that TLS CA is properly re-loaded from file after it changes without the need to restart vmagent. Previously the old TLS CA was used until vmagent restart. - Properly handle errors during http request creation for the second attempt to send data to remote system at vmagent and vmalert. Previously failed request creation could result in nil pointer dereferencing, since the returned request is nil on error. - Add more context to the logged error during AWS sigv4 request signing before sending the data to -remoteWrite.url at vmagent. Previously it could miss details on the source of the request. - Do not create a new HTTP client per second when generating OAuth2 token needed to put in Authorization header of every http request issued by vmagent during service discovery or target scraping. Re-use the HTTP client instead until the corresponding scrape config changes. - Cache error at lib/promauth.Config.GetAuthHeader() in the same way as the auth header is cached, e.g. the error is cached for a second now. This should reduce load on CPU and OAuth2 server when auth header cannot be obtained because of temporary error. - Share tls.Config.GetClientCertificate function among multiple scrape targets with the same tls_config. Cache the loaded certificate and the error for one second. This should significantly reduce CPU load when scraping big number of targets with the same tls_config. - Allow loading TLS certificates from HTTP and HTTPS urls by specifying these urls at `tls_config->cert_file` and `tls_config->key_file`. - Improve test coverage at lib/promauth - Skip unreachable or invalid files specified at `scrape_config_files` during vmagent startup, since these files may become valid later. Previously vmagent was exiting in this case. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4959	2023-10-25 23:19:37 +02:00
Hui Wang	e16d3f5639	fix inconsistent behaviors with prometheus when scraping (#5153) * fix inconsistent behaviors with prometheus when scraping 1. address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4959. skip job with wrong syntax in `scrape_configs` with error logs instead of exiting; 2. show error messages on vmagent /targets ui if there are wrong auth configs in `scrape_configs`; previously vmagent printed error logs and scraped without the auth header; 3. don't send requests if there are wrong auth configs in: 1. vmagent remoteWrite; 2. vmalert datasource/remoteRead/remoteWrite/notifier. * add changelogs * address review comments * fix ut	2023-10-17 17:58:19 +08:00
Aliaksandr Valialkin	6c3dd16a16	app/vmagent/remotewrite: move sas var initialization closer to the place where it is used This makes the code slightly easier to understand. This is a follow-up for `1d3d989be5` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5170	2023-10-16 20:52:56 +02:00
hagen1778	1d3d989be5	app/vmagent/remotewrite: follow-up after `4f102ff945` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-16 16:00:24 +02:00
luosjde	4f102ff945	vmagent: fix streamaggr config reload bug https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5170 Authored-by: luoshaojun01 <luoshaojun01@baidu.com>	2023-10-16 15:57:24 +02:00
Aliaksandr Valialkin	0bbc6a5b43	app/vmagent/remotewrite: fix data race when extra labels are added to samples before sending them to multiple remote storage systems See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4972	2023-09-08 23:24:00 +02:00
Aliaksandr Valialkin	edee262ecc	Makefile: update golangci-lint from v1.51.2 to v1.54.2 See https://github.com/golangci/golangci-lint/releases/tag/v1.54.2	2023-09-01 10:16:42 +02:00
Aliaksandr Valialkin	9d2260ed3c	app/vmagent/remotewrite: do not retry request immediately on io.ErrUnexpectedEOF, since this error isn't returned on stale connection Also, mention the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4139 in comments to the code in order to simplify further maintenance of this code. This is a follow-up for `992a1c0a3a`	2023-08-29 09:48:28 +02:00
hagen1778	757ae4275b	app/vmagent: fix comment typo after `992a1c0a3a` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-08-24 09:32:11 +02:00
Roman Khavronenko	992a1c0a3a	vmagent: retry failed write request on the closed connection (#4857) * vmagent: retry failed write request on the closed connection Retry failed write request on the closed connection immediately, without waiting for backoff. This should improve data delivery speed and reduce amount of error logs emitted by vmagent when using idle connections. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4139 Signed-off-by: hagen1778 <roman@victoriametrics.com> * vmagent: retry failed write request on the closed connection Re-instantiate request before retry as body could have been already spoiled. Signed-off-by: hagen1778 <roman@victoriametrics.com> --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2023-08-24 00:08:04 +02:00
Aliaksandr Valialkin	9137703729	app/vmagent/remotewrite: follow-up after `a27c2f3773` - Fix Prometheus-compatible naming after applying the relabeling if -usePromCompatibleNaming command-line flag is set. This should prevent from possible Prometheus-incompatible metric names and label names generated by the relabeling. - Do not return anything from relabelCtx.appendExtraLabels() function, since it cannot change the number of time series passed to it. Append labels for the passed time series in-place. - Remove promrelabel.FinalizeLabels() call after adding extra labels to time series, since this call has been already made at relabelCtx.applyRelabeling(). It is user's responsibility if he passes labels with double underscore prefixes to -remoteWrite.label. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4247	2023-08-17 14:43:50 +02:00
Alexander Marshalov	1e1a30ed7f	vmagent: fixed premature release of the context (after #4247 / #4824) (#4849) Follow-up after `a27c2f3773` https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4247 Signed-off-by: Alexander Marshalov <_@marshalov.org>	2023-08-17 12:15:03 +02:00
Alexander Marshalov	a27c2f3773	fixed applying `remoteWrite.label` for pushed metrics (#4247) (#4824) vmagent: properly add extra labels before sending data to remote storage labels from `remoteWrite.label` are now added to sent metrics just before they are pushed to `remoteWrite.url` after all relabelings, including stream aggregation relabelings (#4247) https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4247 Signed-off-by: Alexander Marshalov <_@marshalov.org> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2023-08-15 13:47:48 +02:00
Aliaksandr Valialkin	fdae53a75b	lib/promrelabel: properly replace `:` char with `_` in metric names when -usePromCompatibleNaming command-line flag is set This addresses https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3113#issuecomment-1275077071 comment from @johnseekins	2023-08-14 16:14:42 +02:00
Aliaksandr Valialkin	d7067c46d0	lib/flagutil: add defaultValue arg to NewArray{Int,Bytes,Duration} functions The defaultValue is printed in the flag description when passing -help to the app. This is a follow-up for `aef31f201a` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4776	2023-08-12 04:19:05 -07:00
Aliaksandr Valialkin	a19a65b3a5	app/vmagent/remotewrite: go fmt	2023-08-11 06:23:00 -07:00
Aliaksandr Valialkin	4c4bcdf0b1	docs/CHANGELOG.md: add a link to stream aggregation for the description of the bugfix at `a4a1884237` This makes the description more clear. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4804	2023-08-11 05:38:30 -07:00
Aliaksandr Valialkin	2328e4cabc	app/vmagent/remotewrite: keep in sync the default value for -remoteWrite.sendTimeout option in the description with the actually used timeout This is a follow-up for `aef31f201a` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4776	2023-08-11 04:52:00 -07:00
Zakhar Bessarab	a4a1884237	{vmagent/remotewrite,vminsert/common}: fix dropInput and keepInput flags inconsistency (#4809) {vmagent/remotewrite,vminsert/common}: fix dropInput and keepInput flags inconsistency Sync behavior for dropInput and keepInput flags between single-node and vmagent. Fix vmagent not respecting dropInput flag and reverse logic for keepInput.	2023-08-10 14:27:21 +02:00
Alexander Marshalov	aef31f201a	add info about `remoteWrite.sendTimeout` default value (#4776) Signed-off-by: Alexander Marshalov <_@marshalov.org>	2023-08-03 16:25:11 +04:00
Aliaksandr Valialkin	6b6b61137f	app/vmagent: add ability to shard outgoing data among multiple remote storage systems Add -remoteWrite.shardByURL command-line flag, which instructs vmagent to spread evenly outgoing time series data among the configured remote storage systems specified via -remoteWrite.url . Samples for the same time series go to the same -remoteWrite.url . This allows building horizontally scalable stream aggregation when samples for counter and histogram series must be aggregated by the same second-level vmagent instance. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4637	2023-07-24 18:15:26 -07:00
Aliaksandr Valialkin	52c13e9515	lib/streamaggr: follow-up for `736197179e` - Use a byte slice instead of a map for tracking indexes for matching series. This improves performance, since access by slice index is faster than access by map key. - Re-use the byte slice for tracking indexes for matching series. This removes unnecessary memory allocations and improves stream aggregation performance a bit. - Add an ability to return to the previous behaviour by specifying -remoteWrite.streamAggr.dropInput command-line flag. In this case all the input samples are dropped when stream aggregation is enabled. - Backport the new stream aggregation behaviour from vmagent to single-node VictoriaMetrics when -streamAggr.config option is set. - Improve docs regarding this change at docs/CHANGELOG.md - Document the new behavior at docs/stream-aggregation.md Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4243 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4575	2023-07-24 17:05:26 -07:00
Zakhar Bessarab	736197179e	{lib/streamaggr,vmagent/remotewrite}: breaking change for keepInput flag (#4575) * {lib/streamaggr,vmagent/remotewrite}: breaking change for keepInput flag Changes default behaviour of keepInput flag to write series which did not match any aggregators to the remote write. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4243 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * Update app/vmagent/remotewrite/remotewrite.go Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-07-24 16:33:30 -07:00
Aliaksandr Valialkin	140e7b6b74	all: replace atomic.Value with atomic.Pointer[T] This eliminates the need in .(*T) casting for results obtained from Load() Leave atomic.Value for map, since atomic.Pointer[map[...]...] makes double pointer to map, because map is already a pointer type.	2023-07-19 17:42:06 -07:00
Zakhar Bessarab	adc07b711e	app/vmagent/remotewrite: fix error message for auth config (#4545) Error message will be present for any auth error, but message claims an error is about OAuth2 configuration which is confusing. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-07-03 13:12:40 +02:00
Zakhar Bessarab	ce7141383d	app/vmagent/remotewrite: fix vmagent panic on shutdown (#4407) app/vmagent/remotewrite: fix vmagent panic on shutdown Currently, when vmagent is stopping it first flushes pending series in remote write context and proceeds to stop streaming aggregation. This leads to streaming aggregation being unable to write results into pending timeseries (since it is already nil) and panic. This can lead to some aggregation results being lost almost silently. The fix is reordering flow to first stop streaming aggregation and flush all pending time series after that. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-06-07 15:45:43 +02:00
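
Commits `6b6b61137f`, `ed70a40669` and `166b97b8d0` describe sharding series among several `-remoteWrite.url` targets, optionally by a subset of labels or by all labels except a listed set. The sketch below illustrates that idea only; it is not vmagent's code. The `Label` type, the `shardIndex` function and the choice of FNV-1a hashing are assumptions made for this example.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// Label is a hypothetical name/value pair; it is not vmagent's own type.
type Label struct {
	Name, Value string
}

// shardIndex picks a shard for a series by hashing a subset of its labels.
// With inverse == false only the labels listed in names are hashed.
// With inverse == true every label except those listed in names is hashed.
// An empty names list means "hash all labels" (an assumption of this sketch).
func shardIndex(labels []Label, names []string, inverse bool, shards int) int {
	listed := make(map[string]bool, len(names))
	for _, n := range names {
		listed[n] = true
	}
	picked := make([]Label, 0, len(labels))
	for _, l := range labels {
		if len(names) == 0 || listed[l.Name] != inverse {
			picked = append(picked, l)
		}
	}
	// Sort so that label order in the input does not change the hash.
	sort.Slice(picked, func(i, j int) bool { return picked[i].Name < picked[j].Name })
	h := fnv.New64a()
	for _, l := range picked {
		h.Write([]byte(l.Name))
		h.Write([]byte{0})
		h.Write([]byte(l.Value))
		h.Write([]byte{0})
	}
	return int(h.Sum64() % uint64(shards))
}

func main() {
	a := []Label{{"__name__", "http_requests_total"}, {"job", "api"}, {"instance", "host-1"}}
	b := []Label{{"__name__", "http_requests_total"}, {"job", "api"}, {"instance", "host-2"}}
	// Excluding the instance-level label sends both series to the same shard.
	fmt.Println(shardIndex(a, []string{"instance"}, true, 4) == shardIndex(b, []string{"instance"}, true, 4))
}
```

Excluding instance-level labels from the hash is what lets a second-tier aggregator see every input series for a given output series.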


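Commit `140e7b6b74` replaces `atomic.Value` with `atomic.Pointer[T]` to avoid the `.(*T)` cast on `Load()`. The difference can be seen in a small sketch; the `config` type and the variable names are hypothetical and exist only for this example (`atomic.Pointer` requires Go 1.19 or later).

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// config is a hypothetical settings struct used only for illustration.
type config struct {
	queues int
}

var (
	oldStyle atomic.Value           // stores any value; the caller must cast on Load
	newStyle atomic.Pointer[config] // typed; Load returns *config directly
)

// loadOld needs a type assertion, because atomic.Value.Load returns `any`.
func loadOld() *config {
	c, _ := oldStyle.Load().(*config)
	return c
}

// loadNew needs no assertion and returns nil until something is stored.
func loadNew() *config {
	return newStyle.Load()
}

func main() {
	oldStyle.Store(&config{queues: 4})
	newStyle.Store(&config{queues: 4})
	fmt.Println(loadOld().queues, loadNew().queues)
}
```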
221 Commits