18 Commits

Author SHA1 Message Date
Roman Khavronenko
deb2f87074
deployment: add panel and alerts for displaying Go scheduler latency (#7078)
The panel and alerting rule should help to understand whether a VM
component doesn't have enough CPU resources or gets throttled. The alert
is applicable to all VM components.
The panel was added to the vmalert, vmagent, vmsingle, vm cluster and
victorialogs dashboards.

-------------------

This alerting rule would have helped us identify the resource shortage for
the sandbox vmagent - see [this
link](https://play.victoriametrics.com/select/accounting/1/6a716b0f-38bc-4856-90ce-448fd713e3fe/prometheus/graph/#/?g0.range_input=23d13h25m25s424ms&g0.end_input=2024-09-23T14%3A11%3A00&g0.relative_time=none&g0.tab=0&g0.expr=histogram_quantile%280.99%2C+sum%28rate%28go_sched_latencies_seconds_bucket%7Bjob%3D%22vmagent-monitoring-vmagent%22%7D%5B5m%5D%29%29+by+%28le%2C+job%2C+instance%29%29+%3E+0.1)
for example. We weren't aware of the resource shortage because VM metrics
assumed this vmagent had 1 vCPU, while in fact its limit was 0.2 vCPU.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 4d0b41e63b)
2024-09-24 16:58:14 +02:00
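
The condition described in commit deb2f87074 can be illustrated with a small sketch. It assumes the expression from the query linked in the commit message (the 0.99 quantile of `go_sched_latencies_seconds_bucket`, compared against `0.1` seconds); the rule shipped in the repository may use a different threshold. The quantile is estimated with simplified Prometheus-style linear interpolation, and the function names and bucket values below are hypothetical.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    Simplified Prometheus-style linear interpolation. The bucket list
    must contain a +Inf bucket holding the total count.
    """
    buckets = sorted(buckets)
    if not buckets or not math.isinf(buckets[-1][0]):
        raise ValueError("buckets must end with a +Inf bucket")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The quantile falls into the +Inf bucket:
                # report the highest finite upper bound instead.
                return prev_bound
            if count == prev_count:
                return bound
            # Interpolate linearly inside the matching bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


def scheduler_latency_alert(buckets, threshold=0.1):
    """Return True when the p99 scheduler latency exceeds the threshold (seconds).

    The 0.1 default is an assumption taken from the linked example query.
    """
    return histogram_quantile(0.99, buckets) > threshold


# Hypothetical per-bucket rates: (upper bound in seconds, cumulative count).
throttled = [(0.001, 50), (0.01, 90), (0.1, 98), (1.0, 100), (math.inf, 100)]
healthy = [(0.001, 90), (0.01, 100), (0.1, 100), (1.0, 100), (math.inf, 100)]

print(scheduler_latency_alert(throttled))  # p99 is about 0.55 s
print(scheduler_latency_alert(healthy))    # p99 is about 0.009 s
```

In the throttled sample, 2% of goroutine scheduling events wait longer than 0.1 seconds, which is enough to push the 0.99 quantile over the threshold.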
Hui Wang
7a21e6cb6b
vmalert-dashboard: replace variable query metric (#6505)
`vmalert_iteration_total` has 4 times fewer series than
`vmalert_iteration_duration_seconds`, so queries will be lighter.

(cherry picked from commit 75ad6c1b49)
2024-06-19 10:37:10 +02:00
hagen1778
89819f2054
dashboards: use $__interval variable for offsets and look-behind windows in annotations
This should improve the precision of `restarts` and `version change` annotations when
zooming in or out on the dashboards.

The change also makes the `restarts` annotation visible on the panels, so users can disable it
if needed. This could be useful when restarts overlap with version change events.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 9dd9b4442f)
2024-05-22 16:40:08 +02:00
Hui Wang
5b8c3fc9d0
app/vmalert: support DNS SRV record in -remoteWrite.url (#6299)
Part of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6053.
Adds support for [DNS SRV](https://en.wikipedia.org/wiki/SRV_record) addresses in
the `-remoteWrite.url` command-line option.

(cherry picked from commit d7b5062917)
2024-05-22 10:53:22 +02:00
hagen1778
0dd3fec2b7
deployment/dashboards: fix AnnotationQueryRunner error in Grafana
The error appears when executing an annotations query against a Prometheus backend
because the query itself doesn't specify a look-behind window (which is allowed
by the VictoriaMetrics query engine).

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6309
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit c746ba154d)
2024-05-21 16:37:23 +02:00
hagen1778
0f72ab8ef6
deployment: bump Grafana version to 10.4.2
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 9256df17fa)
2024-04-30 10:30:00 +02:00
Aliaksandr Valialkin
87641fa7e7
all: replace old https://docs.victoriametrics.com/Troubleshooting.html url with the new one - https://docs.victoriametrics.com/troubleshooting/
2024-04-18 03:27:18 +02:00
Aliaksandr Valialkin
a99005eff6
all: replace old https://docs.victoriametrics.com/vmalert.html url with the new one - https://docs.victoriametrics.com/vmalert/
2024-04-18 01:44:54 +02:00
Aliaksandr Valialkin
c0457ac11a
all: replace remaining https://docs.victoriametrics.com/vmagent.html urls with the new one - https://docs.victoriametrics.com/vmagent/
2024-04-18 01:36:20 +02:00
hagen1778
51745ec5ff
dashboards: update links in various panels
* use docs.victoriametrics.com instead of github docs
* add links to common terms used in VictoriaMetrics

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-03-04 17:00:54 +02:00
hagen1778
bdbab7bed5
dashboards/all: add new panel CPU spent on GC
It should help identify cases when too much CPU is spent on garbage collection,
and advise users on how this can be addressed.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-02-05 11:42:28 +02:00
hagen1778
3dab94a6c1
dashboards: update to grafana/grafana:10.3.1
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-02-05 10:50:36 +02:00
Dmytro Kozlov
6a41e1ec0c
app/vmalert: replace error metrics for gauges with counter metrics (#5217)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5160

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 935bec447b)
2023-12-06 19:41:34 +01:00
hagen1778
9debdb497c
dashboards/vmalert: add new panel Missed evaluations
The new panel is supposed to indicate alerting groups that miss their evaluations.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit aaf9e3d526)
2023-10-31 10:35:57 +01:00
hagen1778
497c708aaa
dashboards: fix Errors rate to Alertmanager filter
The panel `Errors rate to Alertmanager` had a `group` label filter
applied to the expression, while the metric `vmalert_alerts_send_errors_total`
doesn't have that label. This always resulted in empty results.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 8874b525b7)
2023-10-31 10:35:57 +01:00
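
The empty-result behaviour fixed in commit 497c708aaa follows from how Prometheus-style label matchers treat a missing label: it is compared as an empty string. Below is a minimal sketch of that rule, assuming the dashboard filter expands to a non-empty group name. The `matches` helper, the series labels other than the metric name, and the group name `TestGroup` are hypothetical.

```python
import re


def matches(series_labels, matchers):
    """Return True if a series satisfies every (label, op, value) matcher.

    A label that is absent from the series is compared as an empty string,
    mirroring Prometheus-style selector semantics.
    """
    for name, op, value in matchers:
        actual = series_labels.get(name, "")
        if op == "=":
            ok = actual == value
        elif op == "!=":
            ok = actual != value
        elif op == "=~":
            # Regex matchers are fully anchored.
            ok = re.fullmatch(value, actual) is not None
        elif op == "!~":
            ok = re.fullmatch(value, actual) is None
        else:
            raise ValueError(f"unknown matcher operator: {op}")
        if not ok:
            return False
    return True


# Hypothetical series: the metric carries no `group` label.
series = {
    "__name__": "vmalert_alerts_send_errors_total",
    "job": "vmalert",
    "instance": "vmalert:8880",
}

with_group = [("job", "=~", "vmalert"), ("group", "=~", "TestGroup")]
without_group = [("job", "=~", "vmalert")]

print(matches(series, with_group))     # False: the missing label is "" and cannot match
print(matches(series, without_group))  # True: only existing labels are filtered
```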
hagen1778
46770409d9
dashboards/vmalert: respect job and instance filters in No data errors
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit c2d252c045)
2023-10-17 10:26:32 +02:00
hagen1778
d7bae2b78f
dashboards/vmalert: use desc sorting for tooltips on panels
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit edba9f6266)
2023-10-17 10:26:32 +02:00
Roman Khavronenko
c81e90223c
dashboards: provide copies of Grafana dashboards alternated with Vict… (#4905)
dashboards: provide copies of Grafana dashboards alternated with VictoriaMetrics datasource

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-08-29 11:20:16 +02:00