docs: mention information loss when downsampling gauges (#5204)

Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit 23369321f1) Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-12-15 16:30:55 +01:00 · 2023-10-30 15:29:06 +01:00 · 2023-10-30 15:29:06 +01:00 · afd160f1dc
commit afd160f1dc
parent c6d74108fb
2 changed files with 28 additions and 4 deletions
--- a/docs/README.md
+++ b/docs/README.md
@ -1900,8 +1900,20 @@ See how to request a free trial license [here](https://victoriametrics.com/produ
 * `-downsampling.period=30d:5m,180d:1h` instructs VictoriaMetrics to deduplicate samples older than 30 days with 5 minutes interval and to deduplicate samples older than 180 days with 1 hour interval.
-Downsampling is applied independently per each time series. It can reduce disk space usage and improve query performance if it is applied to time series with big number of samples per each series. The downsampling doesn't improve query performance if the database contains big number of time series with small number of samples per each series (aka [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate)), since downsampling doesn't reduce the number of time series. So the majority of time is spent on searching for the matching time series. 
+Downsampling is applied independently per each time series and leaves a single [raw sample](https://docs.victoriametrics.com/keyConcepts.html#raw-samples)
-It is possible to use [stream aggregation](https://docs.victoriametrics.com/stream-aggregation.html) in vmagent or recording rules in [vmalert](https://docs.victoriametrics.com/vmalert.html) in order to [reduce the number of time series](https://docs.victoriametrics.com/vmalert.html#downsampling-and-aggregation-via-vmalert).
+with the biggest [timestamp](https://en.wikipedia.org/wiki/Unix_time) on the interval, in the same way as [deduplication](#deduplication).
 It works the best for [counters](https://docs.victoriametrics.com/keyConcepts.html#counter) and [histograms](https://docs.victoriametrics.com/keyConcepts.html#histogram),
 as their values are always increasing. But downsampling [gauges](https://docs.victoriametrics.com/keyConcepts.html#gauge)
 would mean losing the changes within the downsampling interval. Please note, you can use [recording rules](https://docs.victoriametrics.com/vmalert.html#rules)
 to apply custom aggregation functions, like min/max/avg etc., in order to make gauges more resilient to downsampling.
 Downsampling can reduce disk space usage and improve query performance if it is applied to time series with big number 
 of samples per each series. The downsampling doesn't improve query performance if the database contains big number 
 of time series with small number of samples per each series (aka [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate)),
 since downsampling doesn't reduce the number of time series. So the majority of time is spent on searching for the matching time series. 
 It is possible to use [stream aggregation](https://docs.victoriametrics.com/stream-aggregation.html) in vmagent or 
 recording rules in [vmalert](https://docs.victoriametrics.com/vmalert.html) in order to 
 [reduce the number of time series](https://docs.victoriametrics.com/vmalert.html#downsampling-and-aggregation-via-vmalert).
 Downsampling happens during [background merges](https://docs.victoriametrics.com/#storage) 
 and can't be performed if there is not enough of free disk space or if vmstorage 
--- a/docs/Single-server-VictoriaMetrics.md
+++ b/docs/Single-server-VictoriaMetrics.md
@ -1908,8 +1908,20 @@ See how to request a free trial license [here](https://victoriametrics.com/produ
 * `-downsampling.period=30d:5m,180d:1h` instructs VictoriaMetrics to deduplicate samples older than 30 days with 5 minutes interval and to deduplicate samples older than 180 days with 1 hour interval.
-Downsampling is applied independently per each time series. It can reduce disk space usage and improve query performance if it is applied to time series with big number of samples per each series. The downsampling doesn't improve query performance if the database contains big number of time series with small number of samples per each series (aka [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate)), since downsampling doesn't reduce the number of time series. So the majority of time is spent on searching for the matching time series. 
+Downsampling is applied independently per each time series and leaves a single [raw sample](https://docs.victoriametrics.com/keyConcepts.html#raw-samples)
-It is possible to use [stream aggregation](https://docs.victoriametrics.com/stream-aggregation.html) in vmagent or recording rules in [vmalert](https://docs.victoriametrics.com/vmalert.html) in order to [reduce the number of time series](https://docs.victoriametrics.com/vmalert.html#downsampling-and-aggregation-via-vmalert).
+with the biggest [timestamp](https://en.wikipedia.org/wiki/Unix_time) on the interval, in the same way as [deduplication](#deduplication).
 It works the best for [counters](https://docs.victoriametrics.com/keyConcepts.html#counter) and [histograms](https://docs.victoriametrics.com/keyConcepts.html#histogram),
 as their values are always increasing. But downsampling [gauges](https://docs.victoriametrics.com/keyConcepts.html#gauge)
 would mean losing the changes within the downsampling interval. Please note, you can use [recording rules](https://docs.victoriametrics.com/vmalert.html#rules)
 to apply custom aggregation functions, like min/max/avg etc., in order to make gauges more resilient to downsampling.
 Downsampling can reduce disk space usage and improve query performance if it is applied to time series with big number 
 of samples per each series. The downsampling doesn't improve query performance if the database contains big number 
 of time series with small number of samples per each series (aka [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate)),
 since downsampling doesn't reduce the number of time series. So the majority of time is spent on searching for the matching time series. 
 It is possible to use [stream aggregation](https://docs.victoriametrics.com/stream-aggregation.html) in vmagent or 
 recording rules in [vmalert](https://docs.victoriametrics.com/vmalert.html) in order to 
 [reduce the number of time series](https://docs.victoriametrics.com/vmalert.html#downsampling-and-aggregation-via-vmalert).
 Downsampling happens during [background merges](https://docs.victoriametrics.com/#storage) 
 and can't be performed if there is not enough of free disk space or if vmstorage