VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-03 16:21:14 +01:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	9c4b0334f2	all: consistently use stringsutil.JSONString() for formatting JSON strings with fmt.* functions instead of using "%q" formatter The %q formatter may result in incorrectly formatted JSON string if the original string contains special chars such as \x1b . They must be encoded as \u001b , otherwise the resulting JSON string cannot be parsed by JSON parsers. This is a follow-up for `c0caa69939` See https://github.com/VictoriaMetrics/victorialogs-datasource/issues/24	2024-07-17 13:52:13 +02:00
Aliaksandr Valialkin	ad367c17bf	lib/streamaggr/streamaggr.go: typo fix after `5e29ef5ed5`: IgnoredNaNSamples -> ignoredNaNSamples	2024-07-15 21:58:56 +02:00
Aliaksandr Valialkin	db557b86ee	app/vmagent/remotewrite: follow-up for `f153f54d11` - Move the remaining code responsible for stream aggregation initialization from remotewrite.go to streamaggr.go . This improves code maintainability a bit. - Properly shut down streamaggr.Aggregators initialized inside remotewrite.CheckStreamAggrConfigs(). This prevents from potential resource leaks. - Use separate functions for initializing and reloading of global stream aggregation and per-remoteWrite.url stream aggregation. This makes the code easier to read and maintain. This also fixes INFO and ERROR logs emitted by these functions. - Add an ability to specify `name` option in every stream aggregation config. This option is used as `name` label in metrics exposed by stream aggregation at /metrics page. This simplifies investigation of the exposed metrics. - Add `path` label additionally to `name`, `url` and `position` labels at metrics exposed by streaming aggregation. This label should simplify investigation of the exposed metrics. - Remove `match` and `group` labels from metrics exposed by streaming aggregation, since they have little practical applicability: it is hard to use these labels in query filters and aggregation functions. - Rename the metric `vm_streamaggr_flushed_samples_total` to less misleading `vm_streamaggr_output_samples_total` . This metric shows the number of samples generated by the corresponding streaming aggregation rule. This metric has been added in the commit `861852f262` . See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6462 - Remove the metric `vm_streamaggr_stale_samples_total`, since it is unclear how it can be used in practice. This metric has been added in the commit `861852f262` . See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6462 - Remove Alias and aggrID fields from streamaggr.Options struct, since these fields aren't related to optional params, which could modify the behaviour of the constructed streaming aggregator. Convert the Alias field to regular argument passed to LoadFromFile() function, since this argument is mandatory. - Pass Options arg to LoadFromFile() function by reference, since this structure is quite big. This also allows passing nil instead of Options when default options are enough. - Add `name`, `path`, `url` and `position` labels to `vm_streamaggr_dedup_state_size_bytes` and `vm_streamaggr_dedup_state_items_count` metrics, so they have consistent set of labels comparing to the rest of streaming aggregation metrics. - Convert aggregator.aggrStates field type from `map[string]aggrState` to `[]aggrOutput`, where `aggrOutput` contains the corresponding `aggrState` plus all the related metrics (currently only `vm_streamaggr_output_samples_total` metric is exposed with the corresponding `output` label per each configured output function). This simplifies and speeds up the code responsible for updating per-output metrics. This is a follow-up for the commit `2eb1bc4f81` . See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6604 - Added missing urls to docs ( https://docs.victoriametrics.com/stream-aggregation/ ) in error messages. These urls help users figuring out why VictoriaMetrics or vmagent generates the corresponding error messages. The urls were removed for unknown reason in the commit `2eb1bc4f81` . - Fix incorrect update for `vm_streamaggr_output_samples_total` metric in flushCtx.appendSeriesWithExtraLabel() function. While at it, reduce memory usage by limiting the maximum number of samples per flush to 10K. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5467 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6268	2024-07-15 20:24:01 +02:00
Aliaksandr Valialkin	202e5704e6	vendor: update github.com/VictoriaMetrics/metrics from v1.34.1 to v1.35.0 Fix potential memory leaks across VictoriaMetrics codebase after metrics.UnregisterSet(s) call because of missing s.UnregisterAllMetrics() call. This is a follow-up for `6a6e34ab8e` . It is OK if some vmauth metrics aren't visible for a few microseconds when the previous metrics are unregistered and new metrics weren't registered yet. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6247 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4690 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6252 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5805	2024-07-15 10:43:37 +02:00
Aliaksandr Valialkin	48ec66883a	lib/streamaggr: consistently use alphabetical order of benchmarked stream aggregation outputs	2024-07-15 09:53:19 +02:00
Aliaksandr Valialkin	5354374b62	lib/streamaggr: follow-up for `9c3d44c8c9` - Consistently enumerate stream aggregation outputs in alphabetical order across the source code and docs. This should simplify future maintenance of the corresponding code and docs. - Fix the link to `rate_sum()` at `see also` section of `rate_avg()` docs. - Make more clear the docs for `rate_sum()` and `rate_avg()` outputs. - Encapsulate output metric suffix inside rateAggrState. This eliminates possible bugs related to incorrect suffix passing to newRateAggrState(). - Rename rateAggrState.total field to less misleading rateAggrState.increase name, since it calculates counter increase in the current aggregation window. - Set rateLastValueState.prevTimestamp on the first sample in time series instead of the second sample. This makes more clear the code logic. - Move the code for removing outdated entries at rateAggrState into removeOldEntries() function. This make the code logic inside rateAggrState.flushState() more clear. - Do not write output sample with zero value if there are no input series, which could be used for calculating the rate, e.g. if only a single sample is registered for every input series. - Do not take into account input series with a single registered sample when calculating rate_avg(), since this leads to incorrect results. - Move {rate,total}AggrState.flushState() function to the end of rate.go and total.go files, so they look more similar. This shuld simplify future mantenance. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6243	2024-07-15 08:40:09 +02:00
hagen1778	2f65956259	lib/streamaggr: add missing test cases Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-07-12 11:06:45 +02:00
Hui Wang	2eb1bc4f81	vmagent: fix `vm_streamaggr_flushed_samples_total` counter (#6604 ) We use `vm_streamaggr_flushed_samples_total` to show the number of produced samples by aggregation rule, previously it was overcounted, and doesn't account for `output_relabel_configs`. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6462 --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-07-12 10:56:07 +02:00
Aliaksandr Valialkin	3c02937a34	all: consistently use 'any' instead of 'interface{}' 'any' type is supported starting from Go1.18. Let's consistently use it instead of 'interface{}' type across the code base, since `any` is easier to read than 'interface{}'.	2024-07-10 00:20:37 +02:00
Aliaksandr Valialkin	cc4d57d650	app/vmagent/remotewrite,lib/streamaggr: re-use common code in tests after `879771808b` - Export streamaggr.LoadFromData() function, so it could be used in tests outside the lib/streamaggr package. This allows removing a hack with creation of temporary files at TestRemoteWriteContext_TryPush_ImmutableTimeseries. - Move common code for mustParsePromMetrics() function into lib/prompbmarshal package, so it could be used in tests for building []prompbmarshal.TimeSeries from string. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6205 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6206	2024-07-03 15:21:36 +02:00
Aliaksandr Valialkin	f17b408643	lib/streamaggr: follow-up for the commit `c0e4ccb7b5` - Clarify docs for `Ignore aggregation intervals on start` feature. - Make more clear the code dealing with ignoreFirstIntervals at aggregator.runFlusher() functions. It is better from readability and maintainability PoV using distinct a.flush() calls for distinct cases instead of merging them into a single a.flush() call. - Take into account the first incomplete interval when tracking the number of skipped aggregation intervals, since this behaviour is easier to understand by the end users. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6137	2024-07-02 21:24:50 +02:00
Andrii Chubatiuk	861852f262	lib/streamaggr: added stale samples metric, added metrics labels (#6462 ) ### Describe Your Changes - added stale metrics counters for input and output samples - added labels for aggregator metrics => `name="{rwctx}:{aggrId}:{aggrSuffix}"` - rwctx - global or number starting from 1 - aggrid - aggregator id starting from 1 - aggrSuffix - <interval>_(by\|without)_label1_label2_labeln e.g: `name="global:1:1m_without_instance_pod"` ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-07-01 14:56:17 +02:00
hagen1778	34771ab293	lib/streamaggr: remove accidentally committed changes Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-06-17 14:24:54 +02:00
Roman Khavronenko	6149adbe10	app/vmselect/promql: check for ranged vectors in aggr funcs if implicit conversions are disabled (#6450 ) Check for ranged vector arguments in aggregate expressions when `-search.disableImplicitConversion` or `-search.logImplicitConversion` are enabled. For example, `sum(up[5m])` will fail to execute if these flags are set. ### Describe Your Changes Please provide a brief description of the changes you made. Be as specific as possible to help others understand the purpose and impact of your modifications. ### Checklist The following checks are mandatory: - [*] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-06-17 14:21:16 +02:00
Roman Khavronenko	51d19485bb	lib/streamaggr: prevent `rate_sum` and `rate_avg` from producing NaNs (#6482 ) ### Describe Your Changes * check if `lastValue` was seen at least twice with different timestamps. Otherwise, the difference between last timestamp and previous timestamp could be `0` and will result into `NaN` calculation * check if there items left in lastValue map after staleness cleanup. Otherwise, `rate_avg` could have produce `NaN` result. ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-06-14 10:06:22 +02:00
Aliaksandr Valialkin	65a97317e4	lib/streamaggr: prevent from data race inside dedupAggrShard when samplesBuf can be updated in pushSamples() while their values are read in the flush() loop without das.mu lock This issue has been introduced in the commit `253c0cffbe`	2024-06-11 17:31:16 +02:00
Aliaksandr Valialkin	bf2d299420	lib/streamaggr: return back string interning to dedupAggr after 78953723200f15ffc417064d1912bdbb7551505c It should reduce memory allocation rate during stream deduplication	2024-06-10 18:05:42 +02:00
Aliaksandr Valialkin	253c0cffbe	lib/streamaggr: reduce memory allocations by using dedupAggrSample buffer per each dedupAggrShard	2024-06-10 16:38:42 +02:00
Aliaksandr Valialkin	a1e8003754	lib/streamaggr: reduce the number of duplicates per each sample in BenchmarkDedupAggr from 100 to 2 This is closer to typical production setups when deduplication is used for de-duplicating of 2 samples per series.	2024-06-10 16:38:41 +02:00
Aliaksandr Valialkin	0b7c47a40c	lib/streamaggr: use strings.Clone() instead of bytesutil.InternString() for creating series key in dedupAggr Our internal testing shows that this reduces GC overhead when deduplicating tens of millions of active series.	2024-06-10 16:08:34 +02:00
Aliaksandr Valialkin	e8bb4359bb	lib/streamaggr: improve performance for dedupAggr.sizeBytes() and dedupAggr.itemsCount() These functions are called every time `/metrics` page is scraped, so it would be great if they could be sped up for the cases when dedupAggr tracks tens of millions of active time series.	2024-06-10 15:59:37 +02:00
Aliaksandr Valialkin	f45d02a243	lib/streamaggr: remove flushState arg at dedupAggr.flush(), since it is always set to true in production	2024-06-10 15:59:33 +02:00
Aliaksandr Valialkin	36be090cd5	lib/streamaggr: follow-up for `7cb894a777` - Use bytesutil.InternString() instead of strings.Clone() for inputKey and outputKey in aggregatorpushSamples(). This should reduce string allocation rate, since strings can be re-used between aggrState flushes. - Reduce memory allocations at dedupAggrShard by storing dedupAggrSample by value in the active series map. - Remove duplicate call to bytesutil.InternBytes() at Deduplicator, since it is already called inside dedupAggr.pushSamples(). - Add missing string interning at rateAggrState.pushSamples(). Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6402	2024-06-07 16:27:26 +02:00
Roman Khavronenko	7cb894a777	lib/streamaggr: reduce number of inuse objects (#6402 ) The main change is getting rid of interning of sample key. It was discovered that for cases with many unique time series aggregated by vmagent interned keys could grow up to hundreds of millions of objects. This has negative impact on the following aspects: 1. It slows down garbage collection cycles, as GC has to scan all inuse objects periodically. The higher is the number of inuse objects, the longer it takes/the more CPU it takes. 2. It slows down the hot path of samples aggregation where each key needs to be looked up in the map first. The change makes code more fragile, but suppose to provide performance optimization for heavy-loaded vmagents with stream aggregation enabled. --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-06-07 15:45:52 +02:00
Andrii Chubatiuk	185fac03b3	lib/streamaggr: metrics to track dropped, nan samples and samples lag (#6358 ) ### Describe Your Changes Added streamaggr metrics to: - `vm_streamaggr_samples_lag_seconds` - samples lag - `vm_streamaggr_ignored_samples_total{reason="nan"}` - ignored NaN samples - `vm_streamaggr_ignored_samples_total{reason="too_old"}` - ignored old samples	2024-06-06 14:06:11 +02:00
Roman Khavronenko	7ce052b32d	lib/streamaggr: skip empty aggregators (#6307 ) Prevent excessive resource usage when stream aggregation config file contains no matchers by prevent pushing data into Aggregators object. Before this change a lot of extra work was invoked without reason. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-05-20 14:03:28 +02:00
Andrii Chubatiuk	f153f54d11	app/vmagent: add global aggregator (#6268 ) Add global stream aggregation for VMAgent https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5467	2024-05-17 14:00:47 +02:00
Aliaksandr Valialkin	b617dc9c0b	lib/streamaggr: properly return output key from getOutputKey The bug has been introduced in `cc2647d212`	2024-05-14 17:47:21 +02:00
Aliaksandr Valialkin	cc2647d212	lib/encoding: optimize UnmarshalVarUint64, UnmarshalVarInt64 and UnmarshalBytes a bit Change the return values for these functions - now they return the unmarshaled result plus the size of the unmarshaled result in bytes, so the caller could re-slice the src for further unmarshaling. This improves performance of these functions in hot loops of VictoriaLogs a bit.	2024-05-14 01:23:54 +02:00
Andrii Chubatiuk	ce25d68b45	lib/streamaggr: added rate_sum and rate_avg to benchmarks, lint fix (#6264 ) fixed lint for rate outputs	2024-05-13 16:40:37 +02:00
Andrii Chubatiuk	9c3d44c8c9	lib/streamaggr: added rate and rate_avg output (#6243 ) Added `rate` and `rate_avg` output Resource usage is the same as for increase output, tested on a benchmark --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-05-13 15:39:49 +02:00
Roman Khavronenko	8a03e987cb	lib/streamaggr: set correct suffix `<output>_prometheus` (#6228 ) Set correct suffix `<output>_prometheus` for aggregation outputs `increase_prometheus` and `total_prometheus` Before, outputs `total` and `total_prometheus` or `increase` and `increase_prometheus` had the same suffix. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-05-08 13:11:30 +02:00
Andrii Chubatiuk	a9283e06a3	streamaggr: made labels compressor shared (#6173 ) Though labels compressor is quite resource intensive, each aggregator and deduplicator instance has it's own compressor. Made it shared across all aggregators to consume less resources while using multiple aggregators. Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>	2024-05-08 13:10:53 +02:00
hagen1778	bae3874e6a	app/streamaggr: follow-up after `c0e4ccb7b5` * rm vmagent mentions from vminsert flags * improve documentation wording, add links to related sections * mention `ignore_first_intervals` in the stream aggr options * update flags description * add basic test for config parsing validation Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-04-22 14:22:59 +02:00
Andrii Chubatiuk	c0e4ccb7b5	lib/streamaggr: add option to ignore first N aggregation intervals (#6137 ) Stream aggregation may yield inaccurate results if it processes incomplete data. This issue can arise when data is sourced from clients that maintain a queue of unsent data, such as Prometheus or vmagent. If the queue isn't fully cleared within the aggregation interval, only a portion of the time series may be included in that period, leading to distorted calculations. To mitigate this we add an option to ignore first N aggregation intervals. It is expected, that client queues will be cleared during the time while aggregation ignores first N intervals and all subsequent aggregations will be correct.	2024-04-22 13:52:04 +02:00
Aliaksandr Valialkin	4d2b9fe6b2	all: replace old https://docs.victoriametrics.com/stream-aggregation.html url with the new one - https://docs.victoriametrics.com/stream-aggregation/	2024-04-18 02:19:11 +02:00
Aliaksandr Valialkin	f8d10a7106	lib/streamaggr: update the minimum allowed timestamp for incoming samples before flushing the samples to the storage This should prevent from dropping samples with old timestamps during long flushes. This is a follow-up for `1cedaf61cb`	2024-04-04 02:25:51 +03:00
Aliaksandr Valialkin	918cccaddf	all: fix golangci-lint(revive) warnings after `0c0ed61ce7` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6001	2024-04-02 23:16:29 +03:00
Aliaksandr Valialkin	4553521f9a	lib/streamaggr: ignore out of order samples for `last` output This is a follow-up for `6a465f6e29` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5931	2024-03-18 01:03:36 +02:00
Aliaksandr Valialkin	1cedaf61cb	app/{vmagent,vminsert}: add an ability to ignore input samples outside the current aggregation interval for stream aggregation See https://docs.victoriametrics.com/stream-aggregation.html#ignoring-old-samples	2024-03-17 23:03:47 +02:00
Aliaksandr Valialkin	6a465f6e29	lib/streamaggr: ignore out of order samples when calculating increase, increase_prometheus, total and total_prometheus outputs Out of order samples may result in unexpected spikes for these outputs. So it is better to ignore such samples. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5931	2024-03-17 22:03:03 +02:00
Aliaksandr Valialkin	cbd80efcc1	lib/streamaggr: follow-up for `15e33d56f1` - Properly set pushSample.timestamp when flushing de-duplicated samples to stream aggregation This is needed for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5931 - Re-classify this change as feature instead of bugfix at docs/CHANGELOG.md - Verify de-duplication logic for samples with different timestamps Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5643 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5939	2024-03-17 21:37:16 +02:00
Andrii Chubatiuk	15e33d56f1	lib/streamaggr: pick sample with bigger timestamp or value on deduplicator (#5939 ) Apply the same deduplication logic as in https://docs.victoriametrics.com/#deduplication This would require more memory for deduplication, since we need to track timestamp for each record. However, deduplication should become more consistent. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5643 --------- Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2024-03-12 22:47:29 +01:00
Aliaksandr Valialkin	5582a24ecf	lib/streamaggr: add tests for keep_metric_names and drop_input_labels options	2024-03-06 18:34:04 +02:00
Aliaksandr Valialkin	da611ad628	app/{vmagent,vminsert}: add `-streamAggr.dropInputSamples` command-line flag for dropping the specified labels from input samples before deduplication and streaming aggregation	2024-03-05 02:15:01 +02:00
Aliaksandr Valialkin	ed523b5bbc	app/{vminsert,vmagent}: allow using -streamAggr.dedupInterval without -streamAggr.config This allows performing online de-duplication of incoming samples	2024-03-05 00:45:30 +02:00
Aliaksandr Valialkin	22d63ac7cd	lib/streamaggr: do not reset aggregation state after the aggregation took longer than the configured interval It is better from user PoV preserving this state until the next flush	2024-03-04 20:03:06 +02:00
Aliaksandr Valialkin	32653db7d5	lib/streamaggr: add missing "s" suffix in the warning message when the de-duplication or aggregation couldnt be finished in a timely manner	2024-03-04 19:37:58 +02:00
Aliaksandr Valialkin	6319d029a8	lib/streamaggr: benchmark only flush routines in BenchmarkDedupAggrFlushSerial and BenchmarkAggregatorsFlushSerial	2024-03-04 19:12:28 +02:00
Aliaksandr Valialkin	074abd5bee	Revert "lib/streamaggr: do not flush dedup shards in parallel" This reverts commit `eb40395a1c`. Reason for revert: it has been appeared that the performance gain on multiple CPU cores wasn't visible because the benchmark was generating incorrect pushSample.key. See a207e0bf687d65f5198207477248d70c69284296	2024-03-04 19:12:28 +02:00

1 2

93 Commits