The change list is the following:
* bump Grafana version to 9.2.6;
* replace old "Graph" panel with "TimeSeries" panel;
* show % usage of memory and CPU in addition to absolute values;
* `Caches` row was removed. All needed info for caches is now part of `Troubleshooting`;
* add Annotations for Alert triggers. Only alerts with the label
`show_at: dashboard` are displayed on the dashboard, not all alerts.
See the `alerts.yml` change.
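The label-based selection can be illustrated with a minimal sketch. The alert names and the `severity` label below are hypothetical; only the `show_at: dashboard` label comes from this change, and the real filtering is done by the dashboard annotation query, not by Python code.

```python
# Hypothetical alert records; only the `show_at: dashboard` label is taken
# from the change description above.
alerts = [
    {"name": "AlertA", "labels": {"severity": "critical", "show_at": "dashboard"}},
    {"name": "AlertB", "labels": {"severity": "warning"}},
]


def dashboard_alerts(alerts):
    """Return names of alerts that carry the label show_at=dashboard."""
    return [a["name"] for a in alerts if a["labels"].get("show_at") == "dashboard"]


print(dashboard_alerts(alerts))
```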
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* some unexpected DS UIDs were removed;
* replace `$instance.*` filter with `$instance` since we respect
the instance port anyway;
* remove predefined datasource for `clusterbytenant`
in favour of datasource variable `ds`.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/cluster: few updates
* apply consistent formatting across panels;
* make resource usage panels per component more detailed;
* add extra panels to vmselect for displaying
`vm_rows_read_per_query`, `vm_rows_scanned_per_query`,
`vm_rows_read_per_series` and `vm_series_read_per_query` metrics.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/single: few updates
* apply consistent formatting across panels;
* add extra panels to Performance for displaying
`vm_rows_read_per_query`, `vm_rows_scanned_per_query`,
`vm_rows_read_per_series` and `vm_series_read_per_query` metrics.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/vmagent: few updates
* apply consistent formatting across panels;
* add panels for showing number of samples ingested
or scraped;
* adapt resource usage panels for multiple selected jobs/instances;
* add adhoc variable;
* display vmagent's version in Stats.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/vmalert: few updates
* apply consistent formatting across panels;
* adapt resource usage panels for multiple selected jobs/instances;
* show vmalert version in Stats section.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Using metric `vm_concurrent_queries` in relation to `vm_concurrent_select_capacity`
is incorrect. Switching to `vm_concurrent_select_current` in `Concurrent selects` panel.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Previously we used a fixed `5m` interval for expressions with the `rate` function.
Unfortunately, this interval did not fit all cases, so we
switch to `$__rate_interval` instead.
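A rough sketch of why `$__rate_interval` adapts better than a fixed window, assuming Grafana's documented definition for Prometheus-compatible datasources: `max($__interval + scrape interval, 4 * scrape interval)`. The function name and the 30s scrape interval are illustrative assumptions.

```python
def rate_interval(interval_s: float, scrape_interval_s: float) -> float:
    """Approximate Grafana's $__rate_interval, in seconds.

    Assumes the documented formula:
    max($__interval + scrape_interval, 4 * scrape_interval).
    """
    return max(interval_s + scrape_interval_s, 4 * scrape_interval_s)


# With a (hypothetical) 30s scrape interval the window never drops below
# four scrapes and grows with the panel step when zooming out.
for step in (15, 60, 600):
    print(step, rate_interval(step, scrape_interval_s=30))
```

With a fixed `5m` window the same expression would over-smooth short time ranges and could be narrower than the panel step on long ones.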
Signed-off-by: hagen1778 <roman@victoriametrics.com>
The new panel is supposed to reflect the pressure on indexDB
caused by churn rate or new series registration.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
The diskUsage stats panel was showing disk usage without including
size of the index, which is not correct. The filter was removed
to reflect the total disk usage.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2368
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards: add `CPU percentage` panel for cluster dashboards
The new panel `CPU percentage` was added instead of adding a limit
to the existing `CPU` panel because the dashboard may display a large number
of components, each with its own limits. The separate panel should provide
a clear display of CPU load.
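The idea behind a percentage view can be sketched as follows: usage is normalized by each component's own limit, so components with different limits become comparable on one panel. The component figures below are made-up numbers, not measurements, and the helper name is hypothetical.

```python
def cpu_percentage(cores_used: float, cores_available: float) -> float:
    """Return CPU usage as a percentage of the component's own limit."""
    if cores_available <= 0:
        raise ValueError("cores_available must be positive")
    return 100.0 * cores_used / cores_available


# Hypothetical (cores used, cores available) per component.
components = {
    "vminsert": (0.5, 2),
    "vmselect": (3.0, 4),
    "vmstorage": (6.0, 16),
}

for name, (used, available) in components.items():
    print(f"{name}: {cpu_percentage(used, available):.1f}%")
```

In absolute terms vmstorage uses the most CPU here, while the percentage view shows vmselect is closest to its limit.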
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards: sync vmagent and vmalert changes from single version
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* docker: remove unsupported param from vmagent config
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* alerts: add `TooHighCPUUsage` alert for all VM components
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/cluster: add panels for vmstorage in read-only mode
vmstorage readonly status panel was added to the "vmstorage" row.
One more panel showing vminsert->vmstorage readonly status
was added to the troubleshooting row.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/cluster: add "Cache usage" panel
The new panel is supposed to show the % of cache used
compared to the allowed size, by type.
It should help to determine underutilized types of caches.
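A minimal sketch of how such a panel could be read, assuming each cache type reports a used size and an allowed size. The cache names, sizes, and the 10% threshold are hypothetical.

```python
def cache_usage_percent(used_bytes: float, max_bytes: float) -> float:
    """Return the share of the allowed cache size that is in use, in percent."""
    return 100.0 * used_bytes / max_bytes


def underutilized(caches: dict, threshold_percent: float = 10.0) -> list:
    """Return cache types whose usage is below the given threshold."""
    return sorted(
        name
        for name, (used, limit) in caches.items()
        if cache_usage_percent(used, limit) < threshold_percent
    )


# Hypothetical (used bytes, allowed bytes) per cache type.
caches = {"cache_a": (50, 100), "cache_b": (5, 200), "cache_c": (190, 200)}
print(underutilized(caches))
```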
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/cluster: add "Merges deferred" panel
The new panel is supposed to show whether there were deferred merges
due to insufficient disk space.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/cluster: update Network panel for vminsert
* delete bytes_written query, since in most cases it is insignificant
* change display type to Stack
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/cluster: bump version requirement
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* remove cumulative visualisation for panel `Disk space used`.
It uses % threshold and cumulative display breaks it.
* remove area filling for resource usage row;
* add job name for panels in resource usage row.
* dashboard: update vmagent dash
The update contains the following changes:
* display anonymous memory usage metric. This metric is supposed to reflect
memory usage of the process which can't be freed by the OS;
* add legends to all panels. This is important for cases when users share
the screenshots;
* modify panels for Grafana v8.0.0
* dashboard: update cluster dash
The update contains the following changes:
* move stats panels to Configuration row, so it can be collapsed;
* display anonymous memory usage metric. This metric is supposed to reflect
memory usage of the process which can't be freed by the OS;
* add legends to all panels. This is important for cases when users share
the screenshots;
* modify panels for Grafana v8.0.0
* dashboard: change FreeDiskSpace panel to show percentage of used space instead
* dashboard: disable area fill for Cache hit ratio
* dashboard: minor display updates
* dashboard: add panel `Concurrent flushes on disk`
* dashboard: add `Rows ignored` panel
* dashboard: update ChurnRate panel with proper description and additional query over 24h time window
* add panel `Open FDs` for file descriptors metrics;
* add panel `Disk writes/reads` to show the real read/write
load on storage layer;
* add stats panel to show available CPUs, memory and disk space.
* Slow metrics load panel was removed since it is hard to interpret without
additional metrics and stats;
* Slow inserts panel was updated to display the percentage of slow inserts compared
to the total number of inserts to show the real impact.
* The new update introduces a new row "Troubleshooting" that
contains panels for churn rate and slow-queries/inserts/loads metrics. This row is supposed to reveal the cause of low performance or other issues;
* CPU panel got `short` units instead of `seconds`;
* Overview row was updated with panel showing bytes-per-datapoint stat;
* Overview row was updated with panel showing free disk space.
The list of changes is the following:
* fix Uptime panel column styles according to changes introduced in Grafana 6.7
* fix panel `vminsert/Rows per insert` due to metric rename - see #336
* change default datasource to VictoriaMetrics since dashboard now uses MetricsQL for `vminsert/Rows per insert` panel