VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-18 14:40:26 +01:00

Author	SHA1	Message	Date
Roman Khavronenko	cae148d5c6	dashboards: cluster dashboard update (#3380 ) The purpose of the update is to make the dash more usable for large installations with many instances. Panels which showed metrics per-instance (Mem, CPU) now are showing metrics per-job or min/max/avg aggregations in % instead. This is supposed to help immediately identify resource shortage and remain usable for small and big installations. For cases when detailed info is needed, a new row `Drilldown` was added to the bottom of the dashboard. Panels like Mem or CPU now contain a `data-link` named `Drilldown` (shown on line click) which takes the user to a more detailed panel. The change list is the following: * bump Grafana version to 9.1.0; * replace old "Graph" panel with "TimeSeries" panel; * improve Uptime panel to show number of instances per job; * show % usage of Mem and CPU instead of absolute values; * `Caches` row was removed. All needed info for caches is now part of `Troubleshooting`; * add `Drilldown` section for detailed resource usage; * add Annotations for Alert triggers. Not all alerts are supposed to be displayed on the dashboard, but only those with label `show_at: dashboard`. See `alerts-cluster.yml` change. Signed-off-by: hagen1778 <roman@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-11-24 13:20:10 -08:00
Roman Khavronenko	0efc20d7b8	dashboards: replace `Index size` panel with `Active series` (#3157 ) Panel `Index size` showed itself impractical for users. So replacing it with `Active series` panel. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/776#issuecomment-1255823734 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-09-26 08:48:25 +03:00
Roman Khavronenko	5dfe63e102	Dashboards (#3120 ) * dashboards/cluster: few updates * apply consistent formatting across panels; * make resource usage panels per component more detailed; * add extra panels to vmselect for displaying `vm_rows_read_per_query`, `vm_rows_scanned_per_query`, `vm_rows_read_per_series` and `vm_series_read_per_query` metrics. Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards/single: few updates * apply consistent formatting across panels; * add extra panels to Performance for displaying `vm_rows_read_per_query`, `vm_rows_scanned_per_query`, `vm_rows_read_per_series` and `vm_series_read_per_query` metrics. Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards/vmagent: few updates * apply consistent formatting across panels; * add panels for showing number of samples ingested or scraped; * adapt resource usage panels for multiple selected jobs/instances; * add adhoc variable; * display vmagent's version in Stats. Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards/vmalert: few updates * apply consistent formatting across panels; * adapt resource usage panels for multiple selected jobs/instances; * show vmalert version in Stats section. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-09-19 15:04:37 +03:00
Max Golionko	e07f23a1b9	moved cluster dashboard to master (#3074 ) dashboards: move cluster dashboard to master branch This change should simplify dashboards management.	2022-09-08 11:47:25 +03:00
Roman Khavronenko	3c583c16a1	dashboards: add `Cache usage %` panel to Caches row (#2960 ) https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2941 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-08-08 11:45:17 +02:00
Roman Khavronenko	23e85e0fc5	vmagent: expose metric `vmagent_remotewrite_queues` (#2871 ) The new metric `vmagent_remotewrite_queues` exports a static value of the number of configured remote write queues. This metric is useful to calculate total saturation per each configured URL with given number of queues. See corresponding changes to vmagent alerts and dashboard. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-07-18 14:41:04 +03:00
Roman Khavronenko	018782c24e	dashboards: small visual tweaks for vmagent's dashboard (#2828 ) * remove lines filling * filter series with zero values * update descriptions Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-07-05 13:20:41 +03:00
Roman Khavronenko	fc03950efa	dashboards: update cluster dashboard (#2773 ) * dashboards: update cluster dashboard * add assisted merges panel https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2754 * add mem panel per each component * remove lines filling for some panels for clarity Signed-off-by: hagen1778 <roman@victoriametrics.com> * Update dashboards/victoriametrics.json	2022-06-23 09:46:28 +02:00
Roman Khavronenko	246d2df361	dashboards: add cpu usage panels per each component type (#2723 ) The change adds extra panel per each component, showing the amount of used CPU cores and the limit (summary of all instances). https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2696 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-06-16 20:49:55 +03:00
Artem Navoiev	5b4b922433	dashboards: update cluster by tenant dashboard (#2695 ) Signed-off-by: Artem Navoiev <tenmozes@gmail.com>	2022-06-09 13:15:40 +03:00
Roman Khavronenko	d956f6f68e	Dashboard cluster update (#2674 ) * dashboard: fix query for `CPU percentage` panel Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboard: replace Uptime panel with Version panel Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-06-02 16:03:48 +02:00
Roman Khavronenko	46c06334ee	dashboards: use `vm_concurrent_select_current` instead of `vm_concurrent_queries` (#2655 ) Using metric `vm_concurrent_queries` in relation to `vm_concurrent_select_capacity` is incorrect. Switching to `vm_concurrent_select_current` in `Concurrent selects` panel. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-05-30 12:16:24 +03:00
Nikolay	514843be77	dashboards: adds dashboard for operator (#2621 ) Apply suggestions from code review Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> Adds proper interval to rate functions	2022-05-23 11:49:03 +03:00
Roman Khavronenko	8c30640828	dashboards: bump version requirement for cluster dashboard (#2537 ) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-05-05 13:40:07 +03:00
hagen1778	e856d74b7b	dashboards: replace fixed interval of `5m` for `rate` expressions Before we used fixed `5m` interval for expressions with `rate` func. Unfortunately, this interval wasn't a fit for all the cases. So we switch to `$__rate_interval` instead. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-04-24 23:25:32 +03:00
hagen1778	16b3374874	dashboards: add new panel `IndexDB items rate` The new panel supposed to reflect the pressure on indexDB caused by churn rate or new series registration. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-04-24 23:25:32 +03:00
hagen1778	1762256c7e	dashboards: mention that `Rows.Sent` can be affected by replication Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-04-24 23:25:32 +03:00
hagen1778	4255cb7559	dashboards: rm "Deferred merges" panel since it could be misleading See more context here https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1682#issuecomment-938608067 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-04-24 23:25:32 +03:00
hagen1778	0bbc7221f3	dashboards: add adhoc filter to dashboard variables The adhoc filter allows quickly applying global filters without modifying the panels. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-04-24 23:25:32 +03:00
hagen1778	80e8413f3a	dashboards: remove index filter from stats panel for DiskUsage The diskUsage stats panel was showing disk usage without including size of the index, which is not correct. The filter was removed to reflect the total disk usage. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2368 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-04-24 23:25:32 +03:00
Roman Khavronenko	3569352fe0	dashboards: update the threshold for slow inserts % on the dashboard (#2198 ) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-02-15 21:57:21 +02:00
Roman Khavronenko	3458a3d593	Monitoring cluster (#2191 ) * dashboards: add `CPU percentage` panel for cluster dashboards The new panel `CPU percentage` was added instead of adding a limit to the existing `CPU` panel because the dashboard may display a big number of components, each with its own limits. The separate panel should provide a clear display of CPU load. Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards: sync vmagent and vmalert changes from single version Signed-off-by: hagen1778 <roman@victoriametrics.com> * docker: remove unsupported param from vmagent config Signed-off-by: hagen1778 <roman@victoriametrics.com> * alerts: add `TooHighCPUUsage` alert for all VM components Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-02-15 11:57:58 +02:00
Roman Khavronenko	4010f548b5	dashboards: migrate from old table panel in cluster dashboard (#1993 ) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2021-12-22 11:21:06 +02:00
Roman Khavronenko	89facbc5c4	dashboards/vmagent: fix cached datasource uid (#1984 ) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2021-12-20 17:34:55 +02:00
Roman Khavronenko	0311d3cc89	Dashboards cluster (#1983 ) * dashboards/cluster: add panels for vmstorage in read-only mode vmstorage readonly status panel was added to "vmstorage" row. One more panel for showing vminsert->vmstorage readonly status was added to troubleshooting row. Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards/cluster: add "Cache usage" panel The new panel is supposed to show the % of the used cache compared to allowed size by type. It should help to determine underutilized types of caches. Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards/cluster: add "Merges deferred" panel The new panel is supposed to show if there were deferred merges due to insufficient disk space. Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards/cluster: update Network panel for vminsert * delete bytes_written query, since in most cases it is insignificant * change display type to Stack Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards/cluster: bump version requirement Signed-off-by: hagen1778 <roman@victoriametrics.com>	2021-12-20 17:32:05 +02:00
Roman Khavronenko	ada18cd963	Dashboards vmagent updates (#1973 ) * dashboards/vmagent: shuffle panels for better visibility More important error/dropped panels were moved higher on the main row. Network usage panel moved to Resource usage row. Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards/vmagent: add Troubleshooting row to show top 5 instances/jobs by churn rate New panels are supposed to show top 5 jobs or targets which generate the most of the churn rate. They were placed into a new row "Troubleshooting". Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards/vmagent: add panels for showing persistent queue saturation New panels were added to Troubleshooting row to show the persistent queue saturation. The corresponding alerts were added and linked to these panels as well. Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards/vmagent: add alert "RejectedRemoteWriteDataBlocksAreDropped" New alert is supposed to send a notification when vmagent starts to drop data blocks rejected by the configured remote write destination. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2021-12-20 12:19:17 +02:00
Aliaksandr Valialkin	6346b78fa8	dashboards: consistently use regexp filters for template vars (#1799 ) Template vars may contain regexp when `all` is selected (.*) or when multiple values are selected (foo\|bar). So they must be passed to regexp filters.	2021-11-09 16:50:08 +02:00
Roman Khavronenko	d763837130	dashboards: add cardinality limiter panels for vmagent (#1720 ) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2021-10-19 09:00:05 +03:00
Roman Khavronenko	18313f3f8e	Cluster dashboard update (#1594 ) * dashboards: sync `vmagent` updates from master branch * dashboards: add new `Storage connection saturation` panel for cluster dashboard * dashboards: add new cluster alert for corresponding `Storage connection saturation` panel	2021-09-01 17:05:17 +03:00
Roman Khavronenko	be3e31a574	dashboards: bump vmagent version requirement	2021-09-01 16:08:17 +03:00
Roman Khavronenko	af8c1feddb	Single dashboards upd (#1593 ) * dashboard: replace `null` datasources A null datasource value may confuse Grafana and make it drop panel query in some versions. * docker: bump grafana image version * dashboards: add URL variable selector to vmagent dashboard * dashboards: add new panel `Remote write connection saturation` to vmagent dashboard * alerts: add new alert for `Remote write connection saturation` panel of vmagent dashboard * dashboards: add "Logging rate" panel to vmagent dashboard	2021-09-01 12:24:55 +03:00
Roman Khavronenko	434f33d04d	Cluster sync master changes (#1592 ) * docker: add README for docker compose env * docker: add vmalert Grafana dashboard	2021-09-01 10:25:07 +03:00
Roman Khavronenko	c6cf821600	dashboard: several minor fixes (#1418 ) * move panel `Disk writes/reads` to `Resource usage` row * rename row `storage` to `vmstorage` * remove cumulative display for `Storage ETA` panel	2021-07-01 05:45:35 +03:00
Roman Khavronenko	5cb378f5b5	dashboard: display tweaks (#1387 ) * rm cumulative visualisation for panel `Disk space used`. It uses % threshold and cumulative display breaks it. * remove area filling for resource usage row; * add job name for panels in resource usage row.	2021-06-18 10:48:40 +03:00
Roman Khavronenko	db39c4a7d1	dashboard: bump version requirements (#1379 )	2021-06-14 13:32:32 +03:00
Roman Khavronenko	1053d3e5a9	Dashboard cluster (#1375 ) * dashboard: update vmagent dash The update contains the following changes: * display anonymous memory usage metric. This metric is supposed to reflect memory usage of the process which can't be freed by OS; * add legends to all panels. This is important for cases when users share the screenshots; * modify panels for Grafana v8.0.0 * dashboard: update cluster dash The update contains the following changes: * move stats panels to Configuration row, so it can be collapsed; * display anonymous memory usage metric. This metric is supposed to reflect memory usage of the process which can't be freed by OS; * add legends to all panels. This is important for cases when users share the screenshots; * modify panels for Grafana v8.0.0	2021-06-14 13:03:54 +03:00
Aliaksandr Valialkin	1c09e71f5b	app/vminsert: add `-disableRerouting` command-line flag for disabling re-routing if some vmstorage nodes have lower performance than the others Refactor the rerouting mechanism and make it more resilient to cases when some of vmstorage nodes are temporarily unavailable. Reduce the probability of rerouting storm. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/791 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1054 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1165	2021-06-04 04:33:52 +03:00
Roman Khavronenko	78c388b246	dashboard: update descriptions for panel (#1275 ) This commit fixes panels descriptions for `Concurrent flushes on disk` (vmstorage) and `Concurrent inserts` (vminsert).	2021-05-07 11:25:00 +03:00
Aliaksandr Valialkin	6dc5d3b357	all: rename https://victoriametrics.github.io to https://docs.victoriametrics.com	2021-04-20 20:20:01 +03:00
Roman Khavronenko	2b1f6b2373	dashboard: plot avg GC duration instead of quantile `1` for better perception (#1229 )	2021-04-19 13:29:31 +03:00
Artem Navoiev	39b5de9f24	[draft] per tenant statistic (#121 ) * [draft] per tenant statistic * updates metric name update graph adds link and example config * quick fix * adds grafana dashboard adds example alert Co-authored-by: f41gh7 <nik@victoriametrics.com>	2021-04-14 11:23:41 +03:00
Roman Khavronenko	7fc9239536	Cluster dashboard update (#1185 ) * dashboard: change FreeDiskSpace panel to show percentage of used space instead * dashboard: disable area fill for Cache hit ratio * dashboard: minor display updates * dashboard: add panel `Concurrent flushes on disk` * dashboard: add `Rows ignored` panel * dashboard: update ChurnRate panel with proper description and additional query over 24h time window	2021-04-05 22:28:02 +03:00
Roman Khavronenko	b736c40053	Dashboards update (#1153 ) * dashboard: update single node dashboard * add number of new series created over last 24h; * bump version requirements. * dashboard: update vmagent dashboard * add panel for open file descriptors; * add panel for disk I/O; * add panel for `vmagent_remotewrite_packets_dropped_total` metric; * bump version requirements.	2021-03-29 12:41:28 +03:00
Roman Khavronenko	540c00f2a0	dashboard: update cluster node dashboard (#1136 ) * add panel `Open FDs` for file descriptors metrics; * add panel `Disk writes/reads` to show the real read/write load on storage layer; * add stats panel to show available CPUs, memory and disk space.	2021-03-18 12:04:12 +02:00
Roman Khavronenko	14323af8c9	Add `Labels limit exceeded` panel to dashboard (#1076 ) New panel supposed to display events when VM drops extra label on exceeding `maxLabelsPerTimeseries` limit.	2021-02-19 00:50:21 +02:00
Roman Khavronenko	22baf8fe25	dashboard: release to grafana.com (#941 )	2020-12-06 13:33:52 +02:00
Roman Khavronenko	b53cf5d083	dashboard: Prometheus compatibility fix for `Storage full ETA` panel (#939 )	2020-12-06 01:19:20 +02:00
Roman Khavronenko	d5ba66c303	Cluster dashboard (#931 ) * dashboard: add `Storage full ETA` panel Backport of https://github.com/VictoriaMetrics/VictoriaMetrics/pull/858 * dashboard: add `Storage reachability` panel	2020-11-30 11:30:11 +02:00
Aleksey Shirokih	a7e17f14f5	dashboard: add account id to datapoints ingestion rate (#772 )	2020-09-19 22:00:16 +01:00
Roman Khavronenko	432c0383db	Single dashboards update (#736 ) * dashboard: rename var `datasource` to `ds` for consistency reason Dashboards for cluster version or vmagent operate with datasource variable named `ds`. For consistency's sake we rename this variable in single node version as well. * dashboard: add instance variable picker See dashboard reviews here https://grafana.com/grafana/dashboards/10229/reviews * dashboard: limit number of buckets in histogram to 12 for vmagent dashboard * dashboard: bump version requirement in description for single version * dashboard: drop extra series override for single version * dashboard: set Y-min to zero for most of panels in vmagent dashboard	2020-09-02 15:18:29 +03:00
Roman Khavronenko	801a26340f	dashboard: set Y-min to zero for most of panels in cluster dashboard (#737 ) Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>	2020-09-02 15:15:13 +03:00
John Belmonte	5ea6f86dd8	use Y-min 0 on Grafana dashboard graphs (#732 )	2020-09-02 15:06:17 +03:00
Roman Khavronenko	9460bf782e	dashboards/victoriametrics: update `Concurrent inserts` panel #632 (#645 ) Panel `Concurrent inserts` was moved to `vminsert` row. Its metrics and description were updated according to issue #632.	2020-07-22 12:43:23 +03:00
Roman Khavronenko	21cf6a1ec4	deployment/docker: replace Prometheus with vmagent (#635 ) * replace Prometheus with vmagent in docker compose env; * cluster dashboard: exclude vmagent from job list; * cluster dashboard: reference datasource var instead of datasource name.	2020-07-17 02:18:03 +03:00
Roman Khavronenko	87946dcc53	vmagent: update grafana dashboard (#634 ) * reference datasource variable instead of datasource name; * change unit from `bytes` to `bits/s` for Network panel.	2020-07-17 02:12:20 +03:00
Roman Khavronenko	cb4c433260	vmagent: add grafana dashboard (#629 ) `vmagent` Grafana dashboard is supposed to provide basic observability over multiple `vmagent` instances. Dashboard is saved in Grafana export format so it can be easily imported. It was also integrated into docker-compose environment.	2020-07-15 13:58:30 +03:00
Roman Khavronenko	a171f9b03e	dashboard: update cluster-version dashboard. (#558 ) Fix "Bytes per point" panel query #551.	2020-06-12 22:07:28 +03:00
Roman Khavronenko	00a7eab43d	dashboards: update troubleshooting row (#506 ) * Slow metrics load panel was removed since it is hard to interpret without additional metrics and stats; * Slow inserts panel was updated to display percentage of slow inserts comparing to total number of inserts to show the real impact.	2020-05-20 00:51:12 +03:00
Roman Khavronenko	8e29b4a716	dashboards: updates and fixes for cluster version (#500 ) * The new update introduces new row "Troubleshooting" that contains panels for churn rate and slow-queries/inserts/loads metrics. This row is supposed to reveal the cause of low performance or other issues; * CPU panel got `short` units instead of `seconds`; * Overview row was updated with panel showing bytes-per-datapoint stat; * Overview row was updated with panel showing free disk space.	2020-05-19 11:57:20 +03:00
Aleksey Shirokih	137e371219	Avoid ugly y-label for rows inserted (#457 )	2020-05-02 19:06:37 +01:00
Roman Khavronenko	9373a62f8a	Update dashboard according to new Grafana version and some metric renames. (#392 ) The list of changes is following: * fix Uptime panel column styles according to changes introduced in 6.7 Grafana version * fix panel `vminsert/Rows per insert` due to metric rename - see #336 * change default datasource to VictoriaMetrics since dashboard now uses MetricsQL for `vminsert/Rows per insert` panel	2020-03-28 01:17:38 +02:00
Edouard Hur	2ec248453b	do not fill max lines (#307 )	2020-02-03 21:21:04 +00:00
Roman Khavronenko	ce8eb8a207	improve description for `Pending datapoints` panel; (#301 ) Use bits/s for network usage panels;	2020-02-03 02:07:07 +02:00
Alexander Danilov	ced989c966	Fix current/max graphs (#298 )	2020-01-29 23:48:36 +00:00
Roman Khavronenko	924af22ced	#251 - add Logging rate panel (#255 )	2019-12-09 13:07:09 +02:00
Aliaksandr Valialkin	b1c3284fd0	dashboards: remove deprecated dashboards - now only victoriametrics.json is officially supported	2019-11-23 12:43:38 +02:00
Roman Khavronenko	ce8cc76a42	add links and fix cache metric name (#233 )	2019-11-12 15:06:56 +02:00
Roman Khavronenko	828f0a2a4b	prepare dashboard for external sharing (#231 )	2019-11-12 00:23:24 +02:00
Aliaksandr Valialkin	01801e9e03	dashboards: there will be no 1.28.4 release. It will be 1.29.0	2019-11-10 22:05:10 +02:00
Roman Khavronenko	7247a7862d	add description, churn rate panel, storage.ingestion rate panel (#228 )	2019-11-10 20:32:10 +02:00
Roman Khavronenko	4e7a2a41a4	Cluster dashboard (#222 ) * add dashboard for cluster version * fix queries and panels * review fixes * use resident memory for memory usage panel * fix job selectors	2019-11-07 12:09:27 +02:00
Aliaksandr Valialkin	dd7bba94a3	dashboards: use `rate` instead of `irate`, because `irate` doesn't capture spikes See https://medium.com/@valyala/why-irate-from-prometheus-doesnt-capture-spikes-45f9896d7832 for details	2019-07-20 15:55:48 +03:00
Jiri Tyr	0aed0e0b5d	Adding Grafana dashboards for VM cluster (#105 )	2019-07-20 10:25:09 +03:00

173 Commits