137 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
1c09e71f5b app/vminsert: add -disableRerouting command-line flag for disabling re-routing if some vmstorage nodes have lower performance than the others
Refactor the rerouting mechanism and make it more resilient to cases when some of the vmstorage nodes are temporarily unavailable.

Reduce the probability of a rerouting storm.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/791
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1054
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1165
2021-06-04 04:33:52 +03:00
Roman Khavronenko
78c388b246
dashboard: update descriptions for panel (#1275)
This commit fixes the panel descriptions for `Concurrent flushes on disk` (vmstorage)
and `Concurrent inserts` (vminsert).
2021-05-07 11:25:00 +03:00
Aliaksandr Valialkin
6dc5d3b357 all: rename https://victoriametrics.github.io to https://docs.victoriametrics.com 2021-04-20 20:20:01 +03:00
Roman Khavronenko
2b1f6b2373
dashboard: plot avg GC duration instead of quantile 1 for better perception (#1229) 2021-04-19 13:29:31 +03:00
Artem Navoiev
39b5de9f24 [draft] per tenant statistic (#121)
* [draft] per tenant statistic

* updates metric name
update graph
adds link and example config

* quick fix

* adds grafana dashboard
adds example alert

Co-authored-by: f41gh7 <nik@victoriametrics.com>
2021-04-14 11:23:41 +03:00
Roman Khavronenko
7fc9239536
Cluster dashboard update (#1185)
* dashboard: change FreeDiskSpace panel to show percentage of used space instead

* dashboard: disable area fill for Cache hit ratio

* dashboard: minor display updates

* dashboard: add panel `Concurrent flushes on disk`

* dashboard: add `Rows ignored` panel

* dashboard: update ChurnRate panel with proper description and additional query over 24h time window
2021-04-05 22:28:02 +03:00
Roman Khavronenko
b736c40053 Dashboards update (#1153)
* dashboard: update single node dashboard

* add number of new series created over last 24h;
* bump version requirements.

* dashboard: update vmagent dashboard

* add panel for open file descriptors;
* add panel for disk I/O;
* add panel for `vmagent_remotewrite_packets_dropped_total` metric;
* bump version requirements.
2021-03-29 12:41:28 +03:00
Roman Khavronenko
540c00f2a0
dashboard: update cluster node dashboard (#1136)
* add panel `Open FDs` for file descriptors metrics;
* add panel `Disk writes/reads` to show the real read/write
load on storage layer;
* add stats panel to show available CPUs, memory and disk space.
2021-03-18 12:04:12 +02:00
Roman Khavronenko
14323af8c9
Add Labels limit exceeded panel to dashboard (#1076)
The new panel is supposed to display events when VM drops extra labels
after exceeding the `maxLabelsPerTimeseries` limit.
2021-02-19 00:50:21 +02:00
Roman Khavronenko
22baf8fe25
dashboard: release to grafana.com (#941) 2020-12-06 13:33:52 +02:00
Roman Khavronenko
b53cf5d083
dashboard: Prometheus compatibility fix for Storage full ETA panel (#939) 2020-12-06 01:19:20 +02:00
Roman Khavronenko
d5ba66c303
Cluster dashboard (#931)
* dashboard: add `Storage full ETA` panel

Backport of https://github.com/VictoriaMetrics/VictoriaMetrics/pull/858

* dashboard: add `Storage reachability` panel
2020-11-30 11:30:11 +02:00
Aleksey Shirokih
a7e17f14f5
dashboard: add account id to datapoints ingestion rate (#772) 2020-09-19 22:00:16 +01:00
Roman Khavronenko
432c0383db Single dashboards update (#736)
* dashboard: rename var `datasource` to `ds` for consistency

Dashboards for the cluster version and vmagent operate with a datasource variable
named `ds`. For consistency's sake we rename this variable in the single-node version
as well.

* dashboard: add instance variable picker

See dashboard reviews here https://grafana.com/grafana/dashboards/10229/reviews

* dashboard: limit number of buckets in histogram to 12 for vmagent dashboard

* dashboard: bump version requirement in description for single version

* dashboard: drop extra series override for single version

* dashboard: set Y-min to zero for most of panels in vmagent dashboard
2020-09-02 15:18:29 +03:00
Roman Khavronenko
801a26340f
dashboard: set Y-min to zero for most of panels in cluster dashboard (#737)
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-09-02 15:15:13 +03:00
John Belmonte
5ea6f86dd8 use Y-min 0 on Grafana dashboard graphs (#732) 2020-09-02 15:06:17 +03:00
Roman Khavronenko
9460bf782e
dashboards/victoriametrics: update Concurrent inserts panel #632 (#645)
Panel `Concurrent inserts` was moved to the `vminsert` row. Its metrics and description
were updated according to issue #632.
2020-07-22 12:43:23 +03:00
Roman Khavronenko
21cf6a1ec4
deployment/docker: replace Prometheus with vmagent (#635)
* replace Prometheus with vmagent in docker compose env;
* cluster dashboard: exclude vmagent from job list;
* cluster dashboard: reference datasource var instead of datasource name.
2020-07-17 02:18:03 +03:00
Roman Khavronenko
87946dcc53 vmagent: update grafana dashboard (#634)
* reference datasource variable instead of datasource name;
* change unit from `bytes` to `bits/s` for Network panel.
2020-07-17 02:12:20 +03:00
Roman Khavronenko
cb4c433260 vmagent: add grafana dashboard (#629)
The `vmagent` Grafana dashboard is supposed to provide basic observability over multiple
`vmagent` instances. The dashboard is saved in Grafana export format so it can be easily
imported. It was also integrated into the docker-compose environment.
2020-07-15 13:58:30 +03:00
Roman Khavronenko
a171f9b03e
dashboard: update cluster-version dashboard. (#558)
Fix "Bytes per point" panel query #551.
2020-06-12 22:07:28 +03:00
Roman Khavronenko
00a7eab43d
dashboards: update troubleshooting row (#506)
* The Slow metrics load panel was removed since it is hard to interpret without
additional metrics and stats;
* The Slow inserts panel was updated to display the percentage of slow inserts compared
to the total number of inserts to show the real impact.
2020-05-20 00:51:12 +03:00
Roman Khavronenko
8e29b4a716
dashboards: updates and fixes for cluster version (#500)
* The update introduces a new row "Troubleshooting" that
contains panels for churn rate and slow-queries/inserts/loads metrics. This row is supposed to reveal the cause of low performance or other issues;
* CPU panel got `short` units instead of `seconds`;
* Overview row was updated with panel showing bytes-per-datapoint stat;
* Overview row was updated with panel showing free disk space.
2020-05-19 11:57:20 +03:00
Aleksey Shirokih
137e371219
Avoid ugly y-label for rows inserted (#457) 2020-05-02 19:06:37 +01:00
Roman Khavronenko
9373a62f8a
Update dashboard according to new Grafana version and some metric renames. (#392)
The list of changes is as follows:
* fix Uptime panel column styles according to changes introduced in 6.7 Grafana version
* fix panel `vminsert/Rows per insert` due to metric rename - see #336
* change default datasource to VictoriaMetrics since dashboard now uses MetricsQL for `vminsert/Rows per insert` panel
2020-03-28 01:17:38 +02:00
Edouard Hur
2ec248453b
do not fill max lines (#307) 2020-02-03 21:21:04 +00:00
Roman Khavronenko
ce8eb8a207
improve description for Pending datapoints panel; (#301)
Use bits/s for network usage panels;
2020-02-03 02:07:07 +02:00
Alexander Danilov
ced989c966
Fix current/max graphs (#298) 2020-01-29 23:48:36 +00:00
Roman Khavronenko
924af22ced #251 - add Logging rate panel (#255) 2019-12-09 13:07:09 +02:00
Aliaksandr Valialkin
b1c3284fd0 dashboards: remove deprecated dashboards - now only victoriametrics.json is officially supported 2019-11-23 12:43:38 +02:00
Roman Khavronenko
ce8cc76a42 add links and fix cache metric name (#233) 2019-11-12 15:06:56 +02:00
Roman Khavronenko
828f0a2a4b prepare dashboard for external sharing (#231) 2019-11-12 00:23:24 +02:00
Aliaksandr Valialkin
01801e9e03 dashboards: there will be no 1.28.4 release. It will be 1.29.0 2019-11-10 22:05:10 +02:00
Roman Khavronenko
7247a7862d add description, churn rate panel, storage.ingestion rate panel (#228) 2019-11-10 20:32:10 +02:00
Roman Khavronenko
4e7a2a41a4 Cluster dashboard (#222)
* add dashboard for cluster version

* fix queries and panels

* review fixes

* use resident memory for memory usage panel

* fix job selectors
2019-11-07 12:09:27 +02:00
Aliaksandr Valialkin
dd7bba94a3 dashboards: use rate instead of irate, because irate doesn't capture spikes
See https://medium.com/@valyala/why-irate-from-prometheus-doesnt-capture-spikes-45f9896d7832 for details
2019-07-20 15:55:48 +03:00
Jiri Tyr
0aed0e0b5d Adding Grafana dashboards for VM cluster (#105) 2019-07-20 10:25:09 +03:00