- Consistently enumerate stream aggregation outputs in alphabetical order across the source code and docs.
This should simplify future maintenance of the corresponding code and docs.
- Fix the link to `rate_sum()` at `see also` section of `rate_avg()` docs.
- Make more clear the docs for `rate_sum()` and `rate_avg()` outputs.
- Encapsulate output metric suffix inside rateAggrState. This eliminates possible bugs related
to incorrect suffix passing to newRateAggrState().
- Rename rateAggrState.total field to less misleading rateAggrState.increase name, since it calculates
counter increase in the current aggregation window.
- Set rateLastValueState.prevTimestamp on the first sample in time series instead of the second sample.
This makes more clear the code logic.
- Move the code for removing outdated entries at rateAggrState into removeOldEntries() function.
This make the code logic inside rateAggrState.flushState() more clear.
- Do not write output sample with zero value if there are no input series, which could be used
for calculating the rate, e.g. if only a single sample is registered for every input series.
- Do not take into account input series with a single registered sample when calculating rate_avg(),
since this leads to incorrect results.
- Move {rate,total}AggrState.flushState() function to the end of rate.go and total.go files, so they look more similar.
This shuld simplify future mantenance.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6243
We use `vm_streamaggr_flushed_samples_total` to show the number of
produced samples by aggregation rule, previously it was overcounted, and
doesn't account for `output_relabel_configs`.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6462
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 2eb1bc4f81)
- Export streamaggr.LoadFromData() function, so it could be used in tests outside the lib/streamaggr package.
This allows removing a hack with creation of temporary files at TestRemoteWriteContext_TryPush_ImmutableTimeseries.
- Move common code for mustParsePromMetrics() function into lib/prompbmarshal package,
so it could be used in tests for building []prompbmarshal.TimeSeries from string.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6205
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6206
- Clarify docs for `Ignore aggregation intervals on start` feature.
- Make more clear the code dealing with ignoreFirstIntervals at aggregator.runFlusher() functions.
It is better from readability and maintainability PoV using distinct a.flush() calls
for distinct cases instead of merging them into a single a.flush() call.
- Take into account the first incomplete interval when tracking the number of skipped aggregation intervals,
since this behaviour is easier to understand by the end users.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6137
### Describe Your Changes
- added stale metrics counters for input and output samples
- added labels for aggregator metrics =>
`name="{rwctx}:{aggrId}:{aggrSuffix}"`
- rwctx - global or number starting from 1
- aggrid - aggregator id starting from 1
- aggrSuffix - <interval>_(by|without)_label1_label2_labeln
e.g: `name="global:1:1m_without_instance_pod"`
### Checklist
The following checks are **mandatory**:
- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 861852f262)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
- Use bytesutil.InternString() instead of strings.Clone() for inputKey and outputKey in aggregatorpushSamples().
This should reduce string allocation rate, since strings can be re-used between aggrState flushes.
- Reduce memory allocations at dedupAggrShard by storing dedupAggrSample by value in the active series map.
- Remove duplicate call to bytesutil.InternBytes() at Deduplicator, since it is already called inside dedupAggr.pushSamples().
- Add missing string interning at rateAggrState.pushSamples().
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6402
The main change is getting rid of interning of sample key. It was
discovered that for cases with many unique time series aggregated by
vmagent interned keys could grow up to hundreds of millions of objects.
This has negative impact on the following aspects:
1. It slows down garbage collection cycles, as GC has to scan all inuse
objects periodically. The higher is the number of inuse objects, the
longer it takes/the more CPU it takes.
2. It slows down the hot path of samples aggregation where each key
needs to be looked up in the map first.
The change makes code more fragile, but suppose to provide performance
optimization for heavy-loaded vmagents with stream aggregation enabled.
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
### Describe Your Changes
Added streamaggr metrics to:
- `vm_streamaggr_samples_lag_seconds` - samples lag
- `vm_streamaggr_ignored_samples_total{reason="nan"}` - ignored NaN
samples
- `vm_streamaggr_ignored_samples_total{reason="too_old"}` - ignored old
samples
(cherry picked from commit 185fac03b3)
Prevent excessive resource usage when stream aggregation config file
contains no matchers by prevent pushing data into Aggregators object.
Before this change a lot of extra work was invoked without reason.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 7ce052b32d)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Change the return values for these functions - now they return the unmarshaled result plus
the size of the unmarshaled result in bytes, so the caller could re-slice the src for further unmarshaling.
This improves performance of these functions in hot loops of VictoriaLogs a bit.
Added `rate` and `rate_avg` output
Resource usage is the same as for increase output, tested on a benchmark
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 9c3d44c8c9)
Though labels compressor is quite resource intensive, each aggregator
and deduplicator instance has it's own compressor. Made it shared across
all aggregators to consume less resources while using multiple
aggregators.
Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
(cherry picked from commit a9283e06a3)
Stream aggregation may yield inaccurate results if it processes incomplete data.
This issue can arise when data is sourced from clients that maintain a queue of unsent data, such as Prometheus or vmagent.
If the queue isn't fully cleared within the aggregation interval, only a portion of the time series may be included in that period, leading to distorted calculations.
To mitigate this we add an option to ignore first N aggregation intervals. It is expected, that client queues
will be cleared during the time while aggregation ignores first N intervals and all subsequent aggregations
will be correct.
(cherry picked from commit c0e4ccb7b5)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
This reverts commit eb40395a1c.
Reason for revert: it has been appeared that the performance gain on multiple CPU cores
wasn't visible because the benchmark was generating incorrect pushSample.key.
See a207e0bf687d65f5198207477248d70c69284296
Previously samples were dropped on the first incomplete interval and the next complete interval.
Also make sure that the de-duplication is performed just before flushing the aggregate state.
This should help the case then dedup_interval = interval.
For example, if `interval: 1m`, then data flush occurs at the end of every minute,
while `interval: 1h` leads to data flush at the end of every hour.
Add `no_align_flush_to_interval` option, which can be used for disabling the alignment.
- Reduce memory usage by up to 5x when de-duplicating samples across big number of time series.
- Reduce memory usage by up to 5x when aggregating across big number of output time series.
- Add lib/promutils.LabelsCompressor, which is going to be used by other VictoriaMetrics components
for reducing memory usage for marshaled []prompbmarshal.Label.
- Add `dedup_interval` option at aggregation config, which allows setting individual
deduplication intervals per each aggregation.
- Add `keep_metric_names` option at aggregation config, which allows keeping the original
metric names in the output samples.
- Add `unique_samples` output, which counts the number of unique sample values.
- Add `increase_prometheus` and `total_prometheus` outputs, which ignore the first sample
per each newly encountered time series.
- Use 64-bit hashes instead of marshaled labels as map keys when calculating `count_series` output.
This makes obsolete https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5579
- Expose various metrics, which may help debugging stream aggregation:
- vm_streamaggr_dedup_state_size_bytes - the size of data structures responsible for deduplication
- vm_streamaggr_dedup_state_items_count - the number of items in the deduplication data structures
- vm_streamaggr_labels_compressor_size_bytes - the size of labels compressor data structures
- vm_streamaggr_labels_compressor_items_count - the number of entries in the labels compressor
- vm_streamaggr_flush_duration_seconds - a histogram, which shows the duration of stream aggregation flushes
- vm_streamaggr_dedup_flush_duration_seconds - a histogram, which shows the duration of deduplication flushes
- vm_streamaggr_flush_timeouts_total - counter for timed out stream aggregation flushes,
which took longer than the configured interval
- vm_streamaggr_dedup_flush_timeouts_total - counter for timed out deduplication flushes,
which took longer than the configured dedup_interval
- Actualize docs/stream-aggregation.md
The memory usage reduction increases CPU usage during stream aggregation by up to 30%.
This commit is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5850
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5898
Sending unfinished aggregate states tend to produce unexpected anomalies with lower values than expected.
The old behavior can be restored by specifying `flush_on_shutdown: true` setting in streaming aggregation config
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Examples:
1) -metricsAuthKey=file:///abs/path/to/file - reads flag value from the given absolute filepath
2) -metricsAuthKey=file://./relative/path/to/file - reads flag value from the given relative filepath
3) -metricsAuthKey=http://some-host/some/path?query_arg=abc - reads flag value from the given url
The flag value is automatically updated when the file contents changes.
* lib/streamaggr: properly reference slice with labels
by limiting slice capacity. It must fix issues with slice modification, in case of append new slice will be allocated, instead of modifying refrenced slice
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5402
* Reduce memory allocations when output_relabel_configs adds new labels to output samples
---------
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
(cherry picked from commit 41f7940f97)
- Use a byte slice instead of a map for tracking indexes for matching series.
This improves performance, since access by slice index is faster than access by map key.
- Re-use the byte slice for tracking indexes for matching series.
This removes unnecessary memory allocations and improves stream aggregation performance a bit.
- Add an ability to return to the previous behvaiour by specifying -remoteWrite.streamAggr.dropInput command-line flag.
In this case all the input samples are dropped when stream aggregation is enabled.
- Backport the new stream aggregation behaviour from vmagent to single-node VictoriaMetrics when -streamAggr.config
option is set.
- Improve docs regarding this change at docs/CHANGELOG.md
- Document the new behavior at docs/stream-aggregation.md
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4243
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4575
* {lib/streamaggr,vmagent/remotewrite}: breaking change for keepInput flag
Changes default behaviour of keepInput flag to write series which did not match any aggregators to the remote write.
See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4243
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
* Update app/vmagent/remotewrite/remotewrite.go
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
---------
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
Previously all the incoming samples were de-duplicated, even if their series doesn't
match aggregation rule filters. This could result in increased CPU usage.
Now the de-duplication isn't applied to samples for series, which do not match
aggregation rule filters. Such samples are just ignored.