vmagent: updated dashboard and alert for stream aggregation (#6427)

### Describe Your Changes

* Added a streaming aggregation section to the vmagent dashboards.
* Added alerts for streaming aggregation and deduplication flush timeouts.
* Removed the deprecated `version` field from compose files.


Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
Andrii Chubatiuk 2024-06-10 12:49:00 +03:00 committed by GitHub
parent 318e9e9de0
commit 2da45a8368
9 changed files with 1456 additions and 205 deletions

Diffs for two files are suppressed because they are too large.

@@ -135,3 +135,23 @@ groups:
           summary: "Configuration reload failed for vmagent instance {{ $labels.instance }}"
           description: "Configuration hot-reload failed for vmagent on instance {{ $labels.instance }}.
             Check vmagent's logs for detailed error message."
+      - alert: StreamAggrFlushTimeout
+        expr: |
+          increase(vm_streamaggr_flush_timeouts_total[5m]) > 0
+        labels:
+          severity: warning
+        annotations:
+          summary: "Streaming aggregation at \"{{ $labels.job }}\" (instance {{ $labels.instance }}) can't be finished within the configured aggregation interval."
+          description: "Stream aggregation process can't keep up with the load and might produce incorrect aggregation results. Check logs for more details.
+            Possible solutions: increase aggregation interval; aggregate smaller number of series; reduce samples' ingestion rate to stream aggregation."
+      - alert: StreamAggrDedupFlushTimeout
+        expr: |
+          increase(vm_streamaggr_dedup_flush_timeouts_total[5m]) > 0
+        labels:
+          severity: warning
+        annotations:
+          summary: "Deduplication \"{{ $labels.job }}\" (instance {{ $labels.instance }}) can't be finished within configured deduplication interval."
+          description: "Deduplication process can't keep up with the load and might produce incorrect results. Check docs https://docs.victoriametrics.com/stream-aggregation/#deduplication and logs for more details.
+            Possible solutions: increase deduplication interval; deduplicate smaller number of series; reduce samples' ingestion rate."
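
The "possible solutions" named in both alert descriptions map onto the stream aggregation configuration. The fragment below is a minimal sketch, not part of this commit. It assumes a config file passed to vmagent via `-remoteWrite.streamAggr.config`, and the selector and durations are hypothetical values chosen for illustration only.

```yaml
# Hypothetical stream aggregation rule; values are illustrative assumptions.
- match: '{__name__=~"http_requests_.+"}'  # a narrower selector means fewer series to aggregate
  interval: 5m                             # a longer aggregation interval gives each flush more time
  dedup_interval: 1m                       # a longer deduplication interval, if deduplication is in use
  outputs: [total]
```

Whether a longer interval is acceptable depends on how the aggregated series are consumed, so treat these settings as a starting point for experimentation.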

@@ -1,4 +1,3 @@
-version: '3.5'
 services:
   # Metrics collector.
   # It scrapes targets defined in --promscrape.config

@@ -1,4 +1,3 @@
-version: "3.5"
 services:
   # Grafana instance configured with VictoriaLogs as datasource
   grafana:

@@ -1,4 +1,3 @@
-version: "3.5"
 services:
   # Metrics collector.
   # It scrapes targets defined in --promscrape.config

@@ -1,4 +1,3 @@
-version: "3.5"
 services:
   grafana:
     container_name: grafana

@@ -1,4 +1,3 @@
-version: "3.5"
 services:
   grafana:
     container_name: grafana

@@ -30,6 +30,10 @@ See also [LTS releases](https://docs.victoriametrics.com/lts-releases/).
 ## tip
+* FEATURE: [alerts-vmagent](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/alerts-vmagent.yml): add new alerting rules `StreamAggrFlushTimeout` and `StreamAggrDedupFlushTimeout` to notify about issues during stream aggregation.
+* FEATURE: [dashboards/vmagent](https://grafana.com/grafana/dashboards/12683): add row `Streaming aggregation` with panels related to [streaming aggregation](https://docs.victoriametrics.com/stream-aggregation/) process.
 ## [v1.102.0-rc1](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.102.0-rc1)
 Released at 2024-06-07