From 7295f079481b672226bb5af03971aa24536caf34 Mon Sep 17 00:00:00 2001 From: Vika Date: Tue, 5 Mar 2024 00:28:16 +0000 Subject: [PATCH] update wiki pages --- CHANGELOG.md | 1 + README.md | 6 +++++- Single-server-VictoriaMetrics.md | 6 +++++- stream-aggregation.md | 35 ++++++++++++++++++++++++++++++++ vmagent.md | 20 ++++++++---------- 5 files changed, 55 insertions(+), 13 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 0f369d7..9e6eb8e 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -32,6 +32,7 @@ See also [LTS releases](https://docs.victoriametrics.com/lts-releases/). * FEATURE: [stream aggregation](https://docs.victoriametrics.com/stream-aggregation/): reduce memory usage by up to 5x when aggregating over big number of unique [time series](https://docs.victoriametrics.com/keyconcepts/#time-series). The memory usage reduction is most visible when [stream deduplication](https://docs.victoriametrics.com/stream-aggregation/#deduplication) is enabled. The downside is increased CPU usage by up to 30%. * FEATURE: [stream aggregation](https://docs.victoriametrics.com/stream-aggregation/): allow using `-streamAggr.dedupInterval` and `-remoteWrite.streamAggr.dedupInterval` command-line flags without the need to specify `-streamAggr.config` and `-remoteWrite.streamAggr.config`. See [these docs](https://docs.victoriametrics.com/stream-aggregation/#deduplication). +* FEATURE: [stream aggregation](https://docs.victoriametrics.com/stream-aggregation/): add `-streamAggr.dropInputLabels` command-line flag, which can be used for dropping the listed labels from input samples before applying stream [de-duplication](https://docs.victoriametrics.com/stream-aggregation/#deduplication) and aggregation. This is a faster and easier-to-use alternative to [input_relabel_configs](https://docs.victoriametrics.com/stream-aggregation/#relabeling). See [these docs](https://docs.victoriametrics.com/stream-aggregation/#dropping-unneeded-labels).
* FEATURE: [stream aggregation](https://docs.victoriametrics.com/stream-aggregation/): add `dedup_interval` option, which allows configuring individual [deduplication intervals](https://docs.victoriametrics.com/stream-aggregation/#deduplication) per each [stream aggregation config](https://docs.victoriametrics.com/stream-aggregation/#stream-aggregation-config). * FEATURE: [stream aggregation](https://docs.victoriametrics.com/stream-aggregation/): add `keep_metric_names` option, which can be set at [stream aggregation config](https://docs.victoriametrics.com/stream-aggregation/#stream-aggregation-config) in order to keep the original metric names in the output aggregated samples instead of using [the default output metric naming scheme](https://docs.victoriametrics.com/stream-aggregation/#output-metric-names). * FEATURE: [stream aggregation](https://docs.victoriametrics.com/stream-aggregation/): align the time of aggregated data flush to the specified aggregation `interval`. For example, if `interval` is set to `1m`, then the aggregated data will be flushed at the end of every minute. The alginment can be disabled by setting `no_align_flush_to_interval: true` option at [stream aggregation config](https://docs.victoriametrics.com/stream-aggregation/#stream-aggregation-config). See [these docs](https://docs.victoriametrics.com/stream-aggregation/#flush-time-alignment) for details. diff --git a/README.md b/README.md index e5f89e7..73f8a7f 100644 --- a/README.md +++ b/README.md @@ -3110,9 +3110,13 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li -streamAggr.config string Optional path to file with stream aggregation config. See https://docs.victoriametrics.com/stream-aggregation.html . See also -streamAggr.keepInput, -streamAggr.dropInput and -streamAggr.dedupInterval -streamAggr.dedupInterval duration - Input samples are de-duplicated with this interval before optional aggregation with -streamAggr.config . 
See also -dedup.minScrapeInterval and https://docs.victoriametrics.com/stream-aggregation.html#deduplication + Input samples are de-duplicated with this interval before optional aggregation with -streamAggr.config . See also -streamAggr.dropInputLabels and -dedup.minScrapeInterval and https://docs.victoriametrics.com/stream-aggregation.html#deduplication -streamAggr.dropInput Whether to drop all the input samples after the aggregation with -streamAggr.config. By default, only aggregated samples are dropped, while the remaining samples are stored in the database. See also -streamAggr.keepInput and https://docs.victoriametrics.com/stream-aggregation.html + -streamAggr.dropInputLabels array + An optional list of labels to drop from samples before stream de-duplication and aggregation . See https://docs.victoriametrics.com/stream-aggregation.html#dropping-unneeded-labels + Supports an array of values separated by comma or specified via multiple flags. + Value can contain comma inside single-quoted or double-quoted string, {}, [] and () braces. -streamAggr.keepInput Whether to keep all the input samples after the aggregation with -streamAggr.config. By default, only aggregated samples are dropped, while the remaining samples are stored in the database. See also -streamAggr.dropInput and https://docs.victoriametrics.com/stream-aggregation.html -tls array diff --git a/Single-server-VictoriaMetrics.md b/Single-server-VictoriaMetrics.md index 075e805..ee8d39f 100644 --- a/Single-server-VictoriaMetrics.md +++ b/Single-server-VictoriaMetrics.md @@ -3118,9 +3118,13 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li -streamAggr.config string Optional path to file with stream aggregation config. See https://docs.victoriametrics.com/stream-aggregation.html . 
See also -streamAggr.keepInput, -streamAggr.dropInput and -streamAggr.dedupInterval -streamAggr.dedupInterval duration - Input samples are de-duplicated with this interval before optional aggregation with -streamAggr.config . See also -dedup.minScrapeInterval and https://docs.victoriametrics.com/stream-aggregation.html#deduplication + Input samples are de-duplicated with this interval before optional aggregation with -streamAggr.config . See also -streamAggr.dropInputLabels and -dedup.minScrapeInterval and https://docs.victoriametrics.com/stream-aggregation.html#deduplication -streamAggr.dropInput Whether to drop all the input samples after the aggregation with -streamAggr.config. By default, only aggregated samples are dropped, while the remaining samples are stored in the database. See also -streamAggr.keepInput and https://docs.victoriametrics.com/stream-aggregation.html + -streamAggr.dropInputLabels array + An optional list of labels to drop from samples before stream de-duplication and aggregation . See https://docs.victoriametrics.com/stream-aggregation.html#dropping-unneeded-labels + Supports an array of values separated by comma or specified via multiple flags. + Value can contain comma inside single-quoted or double-quoted string, {}, [] and () braces. -streamAggr.keepInput Whether to keep all the input samples after the aggregation with -streamAggr.config. By default, only aggregated samples are dropped, while the remaining samples are stored in the database. See also -streamAggr.dropInput and https://docs.victoriametrics.com/stream-aggregation.html -tls array diff --git a/stream-aggregation.md b/stream-aggregation.md index 2962ec1..aa70aa0 100644 --- a/stream-aggregation.md +++ b/stream-aggregation.md @@ -76,6 +76,8 @@ to the configured `-remoteWrite.url`. The de-duplication can be enabled via the - By specifying `dedup_interval` option individually per each [stream aggregation config](#stream-aggregation-config) at `-streamAggr.config`. 
+It is possible to drop the given labels before applying the de-duplication. See [these docs](#dropping-unneeded-labels). + The online de-duplication doesn't take into account timestamps associated with the de-duplicated samples - it just leaves the last seen sample on the configured deduplication interval. If you need taking into account timestamps during the de-duplication, then use [`-dedup.minScrapeInterval` command-line flag](https://docs.victoriametrics.com/#deduplication). @@ -447,6 +449,32 @@ Another option to remove the suffix, which is added by stream aggregation, is to keep_metric_names: true ``` +See also [dropping unneeded labels](#dropping-unneeded-labels). + + +## Dropping unneeded labels + +If you need to drop some labels from input samples before [input relabeling](#relabeling), [de-duplication](#deduplication) +and [stream aggregation](#aggregation-outputs), then the following options exist: + +- Specify a comma-separated list of label names to drop in the `-streamAggr.dropInputLabels` command-line flag. + For example, `-streamAggr.dropInputLabels=replica,az` drops the `replica` and `az` labels from input samples + before applying de-duplication and stream aggregation. + +- Specify a `drop_input_labels` list with the labels to drop in the [stream aggregation config](#stream-aggregation-config). + For example, the following config drops the `replica` label from input samples with the name `process_resident_memory_bytes` + before calculating the average over one minute: + + ```yaml + - match: process_resident_memory_bytes + interval: 1m + drop_input_labels: [replica] + outputs: [avg] + keep_metric_names: true + ``` + +A typical use case is to drop the `replica` label from samples received from high-availability replicas.
+ ## Aggregation outputs The aggregations are calculated during the `interval` specified in the [config](#stream-aggregation-config) @@ -889,6 +917,13 @@ at [single-node VictoriaMetrics](https://docs.victoriametrics.com/Single-server- # # keep_metric_names: false + # drop_input_labels instructs dropping the given labels from input samples. + # The labels' dropping is performed before input_relabel_configs are applied. + # This also means that the labels are dropped before de-duplication ( https://docs.victoriametrics.com/stream-aggregation.html#deduplication ) + # and stream aggregation. + # + # drop_input_labels: [replica, availability_zone] + # input_relabel_configs is an optional relabeling rules, # which are applied to the incoming samples after they pass the match filter # and before being aggregated. diff --git a/vmagent.md b/vmagent.md index 08eb403..556accc 100644 --- a/vmagent.md +++ b/vmagent.md @@ -255,21 +255,15 @@ There is also support for multitenant writes. See [these docs](#multitenancy). [Deduplication at stream aggregation](https://docs.victoriametrics.com/stream-aggregation/#deduplication) allows setting up arbitrary complex de-duplication schemes for the collected samples. 
Examples: -- The following command instructs `vmagent` to leave only the last sample per each seen [time series](https://docs.victoriametrics.com/keyconcepts/#time-series) per every 60 seconds: +- The following command instructs `vmagent` to send only the last sample for each seen [time series](https://docs.victoriametrics.com/keyconcepts/#time-series) every 60 seconds: ``` ./vmagent -remoteWrite.url=http://remote-storage/api/v1/write -remoteWrite.streamAggr.dedupInterval=60s ``` -- The following [stream aggregation config](https://docs.victoriametrics.com/stream-aggregation/#stream-aggregation-config) instructs `vmagent` to merge - [time series](https://docs.victoriametrics.com/keyconcepts/#time-series) with different `replica` label values and then to leave only the last sample - per each merged series per ever 60 seconds: - ```yml - - input_relabel_configs: - - action: labeldrop - regex: replica - interval: 60s - keep_metric_names: true - outputs: [last] +- The following command instructs `vmagent` to merge [time series](https://docs.victoriametrics.com/keyconcepts/#time-series) with different `replica` label values + and then to send only the last sample for each merged series every 60 seconds: + ``` + ./vmagent -remoteWrite.url=http://remote-storage/api/v1/write -streamAggr.dropInputLabels=replica -remoteWrite.streamAggr.dedupInterval=60s ``` ## VictoriaMetrics remote write protocol @@ -2173,6 +2167,10 @@ See the docs at https://docs.victoriametrics.com/vmagent.html . The compression level for VictoriaMetrics remote write protocol. Higher values reduce network traffic at the cost of higher CPU usage. Negative values reduce CPU usage at the cost of increased network traffic. See https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol -sortLabels Whether to sort labels for incoming samples before writing them to all the configured remote storage systems.
This may be needed for reducing memory usage at remote storage when the order of labels in incoming samples is random. For example, if m{k1="v1",k2="v2"} may be sent as m{k2="v2",k1="v1"}Enabled sorting for labels can slow down ingestion performance a bit + -streamAggr.dropInputLabels array + An optional list of labels to drop from samples before stream de-duplication and aggregation . See https://docs.victoriametrics.com/stream-aggregation.html#dropping-unneeded-labels + Supports an array of values separated by comma or specified via multiple flags. + Value can contain comma inside single-quoted or double-quoted string, {}, [] and () braces. -tls array Whether to enable TLS for incoming HTTP requests at the given -httpListenAddr (aka https). -tlsCertFile and -tlsKeyFile must be set if -tls is set. See also -mtls Supports array of values separated by comma or specified via multiple flags.