VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-11-27 02:46:47 +01:00

Author	SHA1	Message	Date
hagen1778	ec81deb7e8	dashboards: fix `Ingestion` row for vmagent dashboard Previously, clicking on Ingestion row could result in a visual blip. Re-ordering panels within the row seems to fix it. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-11-25 12:40:56 +01:00
hagen1778	6b903d79a9	dashboards: rename `datapoints` to `logs` for vlogs dashboard `logs` has a clearer meaning than `datapoints` in this case. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-11-14 07:14:39 -07:00
hagen1778	02e5fb81c5	dashboards: make dashboards-sync after `683f8c2780` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-11-12 17:58:05 -07:00
hagen1778	6fdc111fd2	dashboards: set Y-min to 0 for stats panel with range queries Y-min set to 0 gives better understanding of changes, as it shows absolute change. Otherwise, the panel will show relative change and could give a false impression of the changes. Other panels in dashboards are either instant (no historical data displayed), or already set to Y-min: 0. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-11-12 17:56:17 -07:00
Hui Wang	32b89447ae	dashboards: add `file` label filter to vmalert dashboard panels (#7515 ) Previously, metrics from groups with the same name but in different files could be mixed in the results. e.g. the evaluation time [here](https://grafana.maas.victoriametrics.com/d/LzldHAVnz/victoriametrics-vmalert?orgId=1&var-ds=PE8D8DB4BEE4E4B22&var-job=All&var-instance=All&var-file=%2Fetc%2Fvmalert%2Fconfig%2Fvm-per-tenant-rulefiles-0%2Fmaas-tenant-1011-maas-1011-vm-health.yaml&var-group=All&var-topk=5&editPanel=23) is the total for multiple groups from different tenants.	2024-11-12 09:00:39 -07:00
Zakhar Bessarab	d73e5bdb8b	dashboards: add dashboards with victoria-logs datasource (#7424 ) ### Describe Your Changes Sync list of dashboards to be provided with Prometheus and VictoriaMetrics' datasources. ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2024-11-05 16:53:14 +01:00
Artem Fetishev	683f8c2780	dashboards: add Restarts panel (#7394 ) Reopening PR #7373 from a branch in VictoriaMetrics repo in order to enable edits and rebase. - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: Artem Fetishev <rtm@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-10-30 16:44:08 +01:00
Zhu Jiekun	cd2222aa95	dashboards: fix query for full ETA vm_free_disk_space_bytes - vm_free_disk_space_limit_bytes (#7355 ) ### Describe Your Changes Fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7334 available disk space should be ``` (vm_free_disk_space_bytes{job=~...} - vm_free_disk_space_limit_bytes{job=~...}) ``` instead of ``` vm_free_disk_space_bytes{job=~...} ``` ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).	2024-10-25 15:09:14 +02:00
Artem Fetishev	ca787c70d1	dashboards: fix vmagent monitoring chart descriptions (#7283 ) ### Describe Your Changes Fix vmagent monitoring chart descriptions ### Checklist The following checks are mandatory: - [x ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>	2024-10-17 12:12:47 +02:00
Hui Wang	d3f110373c	dashboards: fix description about pending datapoints (#7235 ) See [our playground](https://play-grafana.victoriametrics.com/d/oS7Bi_0Wz_vm/victoriametrics-cluster-vm?orgId=1&var-ds=P996FABE17B5F6D1E&var-job=All&var-job_insert=All&var-job_select=All&var-job_storage=All&var-instance=All) for reference.	2024-10-11 13:47:14 +02:00
Nikolay	fbaa026ae6	dashboards: updates operator dashboard (#7139 ) * Replaces deprecated graphs with Timeseries panels * Adds new latency dashboards for rest client and golang scheduler * Adds new overview panels * Adds VM Datasource version of dashboard --------- Signed-off-by: f41gh7 <nik@victoriametrics.com> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2024-09-30 15:35:39 +02:00
Roman Khavronenko	4d0b41e63b	deployment: add panel and alerts for displaying go scheduler latency (#7078 ) The panel and alerting rule should help to understand whether a VM component doesn't have enough CPU resources or gets throttled. The alert is applicable for all VM components. The panel was added to vmalert, vmagent, vmsingle, vm cluster and victorialogs dashes. ------------------- This alerting rule should have helped us identify resource shortage for sandbox vmagent - see [this link](https://play.victoriametrics.com/select/accounting/1/6a716b0f-38bc-4856-90ce-448fd713e3fe/prometheus/graph/#/?g0.range_input=23d13h25m25s424ms&g0.end_input=2024-09-23T14%3A11%3A00&g0.relative_time=none&g0.tab=0&g0.expr=histogram_quantile%280.99%2C+sum%28rate%28go_sched_latencies_seconds_bucket%7Bjob%3D%22vmagent-monitoring-vmagent%22%7D%5B5m%5D%29%29+by+%28le%2C+job%2C+instance%29%29+%3E+0.1) for example. We weren't aware of resource shortage, because VM metrics assumed this vmagent had 1vCPU while in fact its limit was 0.2vCPU. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-09-23 16:54:42 +02:00
hagen1778	4dcb6a3719	dashboards/vmagent: fix legend captions for stream aggregation related panels. Before they were displaying wrong label names. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-09-03 14:23:35 +02:00
zjbztianya	1b1e61030b	dashboards: typo fix (#6920 ) ### Describe Your Changes Correct the spelling error of 'vminsert' in the dashboards. ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).	2024-09-03 10:27:01 +02:00
hagen1778	d225a2eb56	dashboards: add `Scrape duration 0.99 quantile` panel The new panel will show the 99th quantile of scrape duration in seconds. This should help identify vmagent instances that experience excessively high scrape durations. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-07-30 14:57:17 +02:00
Aliaksandr Valialkin	db557b86ee	app/vmagent/remotewrite: follow-up for `f153f54d11` - Move the remaining code responsible for stream aggregation initialization from remotewrite.go to streamaggr.go . This improves code maintainability a bit. - Properly shut down streamaggr.Aggregators initialized inside remotewrite.CheckStreamAggrConfigs(). This prevents from potential resource leaks. - Use separate functions for initializing and reloading of global stream aggregation and per-remoteWrite.url stream aggregation. This makes the code easier to read and maintain. This also fixes INFO and ERROR logs emitted by these functions. - Add an ability to specify `name` option in every stream aggregation config. This option is used as `name` label in metrics exposed by stream aggregation at /metrics page. This simplifies investigation of the exposed metrics. - Add `path` label additionally to `name`, `url` and `position` labels at metrics exposed by streaming aggregation. This label should simplify investigation of the exposed metrics. - Remove `match` and `group` labels from metrics exposed by streaming aggregation, since they have little practical applicability: it is hard to use these labels in query filters and aggregation functions. - Rename the metric `vm_streamaggr_flushed_samples_total` to less misleading `vm_streamaggr_output_samples_total` . This metric shows the number of samples generated by the corresponding streaming aggregation rule. This metric has been added in the commit `861852f262` . See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6462 - Remove the metric `vm_streamaggr_stale_samples_total`, since it is unclear how it can be used in practice. This metric has been added in the commit `861852f262` . See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6462 - Remove Alias and aggrID fields from streamaggr.Options struct, since these fields aren't related to optional params, which could modify the behaviour of the constructed streaming aggregator. Convert the Alias field to regular argument passed to LoadFromFile() function, since this argument is mandatory. - Pass Options arg to LoadFromFile() function by reference, since this structure is quite big. This also allows passing nil instead of Options when default options are enough. - Add `name`, `path`, `url` and `position` labels to `vm_streamaggr_dedup_state_size_bytes` and `vm_streamaggr_dedup_state_items_count` metrics, so they have consistent set of labels comparing to the rest of streaming aggregation metrics. - Convert aggregator.aggrStates field type from `map[string]aggrState` to `[]aggrOutput`, where `aggrOutput` contains the corresponding `aggrState` plus all the related metrics (currently only `vm_streamaggr_output_samples_total` metric is exposed with the corresponding `output` label per each configured output function). This simplifies and speeds up the code responsible for updating per-output metrics. This is a follow-up for the commit `2eb1bc4f81` . See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6604 - Added missing urls to docs ( https://docs.victoriametrics.com/stream-aggregation/ ) in error messages. These urls help users figuring out why VictoriaMetrics or vmagent generates the corresponding error messages. The urls were removed for unknown reason in the commit `2eb1bc4f81` . - Fix incorrect update for `vm_streamaggr_output_samples_total` metric in flushCtx.appendSeriesWithExtraLabel() function. While at it, reduce memory usage by limiting the maximum number of samples per flush to 10K. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5467 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6268	2024-07-15 20:24:01 +02:00
Hui Wang	c1c2286e09	vmagent-dashboard: update streaming aggregation panels (#6588 ) ### Describe Your Changes Please provide a brief description of the changes you made. Be as specific as possible to help others understand the purpose and impact of your modifications. ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).	2024-07-05 15:10:37 +02:00
hagen1778	e45d80cd79	dashboards: fix wrong templating for vmauth Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-07-02 13:08:11 +02:00
Andrii Chubatiuk	861852f262	lib/streamaggr: added stale samples metric, added metrics labels (#6462 ) ### Describe Your Changes - added stale metrics counters for input and output samples - added labels for aggregator metrics => `name="{rwctx}:{aggrId}:{aggrSuffix}"` - rwctx - global or number starting from 1 - aggrid - aggregator id starting from 1 - aggrSuffix - <interval>_(by\|without)_label1_label2_labeln e.g: `name="global:1:1m_without_instance_pod"` ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-07-01 14:56:17 +02:00
Nikolay	14b9ef1e4d	dashboards: add dashboard and alerts for vmauth (#6491 ) Signed-off-by: f41gh7 <nik@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2024-06-25 11:15:29 +02:00
hagen1778	b201d1722d	dashboards: fix typo in panel descriptions for vmagent Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-06-21 11:42:38 +02:00
Hui Wang	75ad6c1b49	vmalert-dashboard: replace variable query metric (#6505 ) `vmalert_iteration_total` has 4 times fewer series than `vmalert_iteration_duration_seconds`, so queries will be lighter.	2024-06-19 09:40:34 +02:00
Andrii Chubatiuk	2da45a8368	vmagent: updated dashboard and alert for stream aggregation (#6427 ) ### Describe Your Changes Added streaming aggregation section to vmagent dashboards Added alert for streaming aggregation and deduplication flush timeouts Removed deprecated compose versions from compose files Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-06-10 11:49:00 +02:00
hagen1778	9dd9b4442f	dashboards: use `$__interval` variable for offsets and look-behind windows in annotations This should improve precision of `restarts` and `version change` annotations when zooming-in/zooming-out on the dashboards. The change also makes `restarts` dashboard visible on the panels, so user can disable it from displaying if needed. This could be useful when restarts overlap with version change events. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-05-22 16:32:51 +02:00
Hui Wang	d7b5062917	app/vmalert: support DNS SRV record in `-remoteWrite.url` (#6299 ) part of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6053, supports [DNS SRV](https://en.wikipedia.org/wiki/SRV_record) address in `-remoteWrite.url` command-line option.	2024-05-22 10:52:51 +02:00
hagen1778	c746ba154d	deployment/dashboards: fix `AnnotationQueryRunner` error in Grafana The error appears when executing annotations query against Prometheus backend because the query itself hasn't specified look-behind window (which is allowed in VictoriaMetrics query engine). https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6309 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-05-21 11:39:02 +02:00
hagen1778	d386a68b59	dashboards: add new panel `Concurrent selects` to `vmstorage` row The panel will show how many ongoing select queries are processed by vmstorage and should help to identify resource bottlenecks. See panel description for more details. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-04-29 13:55:19 +02:00
hagen1778	9256df17fa	deployment: bump Grafana version to 10.4.2 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-04-29 12:10:24 +02:00
hagen1778	8606b48ce5	dashboards: add `Network Usage` panel to `Resource Usage` row https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4478 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-04-29 11:54:17 +02:00
Dima Lazerka	564463259a	deployment/dashboards: properly show version for non-stable docker images (#6150 ) re: `.*-(?:tags|heads)-(.*)-(?:0|dirty)-.*` cases: victoria-metrics-20240419-160209-heads-enterprise-single-node-0-g08f933ab0c -> enterprise-single-node; victoria-metrics-20240201-133950-tags-v1.97.1-enterprise-0-g760a8733b -> v1.97.1-enterprise; victoria-metrics-20240419-160209-heads-rotation-part-2-0-ge2367b6d1-dirty-848b54cd -> rotation-part-2-0-ge2367b6d1; victoria-metrics-20240419-160209-heads-lts-1.93-enterprise-search-contention-0-g30ef4aad21-amd64 -> lts-1.93-enterprise-search-contention; victoria-metrics-20240425-150852-tags-v1.101.0-enterprise-0-g718138c64 -> v1.101.0-enterprise Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Dzmitry Lazerka <dlazerka@gmail.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-04-29 11:11:28 +02:00
Zakhar Bessarab	6b493582da	dashboards/victoria-metrics-single: allow selecting multiple instance values (#5870 ) Allowing to select multiple instance IPs makes it much easier to view metrics for longer periods of time in dynamic environments such as Kubernetes. In k8s, an update will also cause the IP to change, making it harder to use the dashboard to check the status. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5869 --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-04-29 10:34:36 +02:00
hagen1778	035de57e5e	dashboards: show max number of active merges instead of cumulative The cumulative number of active merges could be a red herring, as its value depends on the number of vmstorages. For example, vmstorage could be added or removed and this will affect the panel. Or, each vmstorage could start a merging process (i.e. for downsampling) and visually it could look like a massive change. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-04-23 16:41:48 +02:00
Aliaksandr Valialkin	59d495d469	all: replace old https://docs.victoriametrics.com/Troubleshooting.html url with the new one - https://docs.victoriametrics.com/troubleshooting/	2024-04-18 03:26:36 +02:00
Aliaksandr Valialkin	f4b1cbfef0	all: replace old https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html url with the new one - https://docs.victoriametrics.com/cluster-victoriametrics/	2024-04-18 02:54:20 +02:00
Aliaksandr Valialkin	8eeb045d3f	all: replace old https://docs.victoriametrics.com/MetricsQL.html url with the new one - https://docs.victoriametrics.com/metricsql/	2024-04-18 02:14:53 +02:00
Aliaksandr Valialkin	b4fac26360	all: replace old https://docs.victoriametrics.com/vmalert.html url with the new one - https://docs.victoriametrics.com/vmalert/	2024-04-18 01:44:12 +02:00
Aliaksandr Valialkin	4927e64700	all: replace remaining https://docs.victoriametrics.com/vmagent.html urls with the new one - https://docs.victoriametrics.com/vmagent/	2024-04-18 01:36:13 +02:00
hagen1778	f781c42ea4	dashboards: add more context to cluster dashboard panels Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-03-05 15:00:49 +01:00
hagen1778	0ab1069363	dashboards: update links in various panels * use docs.victoriametrics.com instead of github docs * add links to common terms used in VictoriaMetrics Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-03-04 15:43:31 +01:00
hagen1778	ecccd2a1cc	dashboards: add legend details to network panels in cluster dash Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-02-16 10:20:38 +01:00
hagen1778	3380043424	dashboards: follow-up `4369bc1df2` * add more details to changelog * simplify panels description * remove capacity planning recommendation, as it proved ineffective Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-02-08 09:51:43 +01:00
Hui Wang	4369bc1df2	deployment/dashboards: fix `Storage full ETA` panels (#5747 ) During background downsampling, rate(vm_deduplicated_samples_total{type="merge"}) could be much bigger than rate(vm_rows_added_to_storage_total) and it could last quite some time, which causes negative values of Storage full ETA and confuses users, see playground. Instead of trying to get more accurate results during downsampling, I think it's ok to ignore vm_deduplicated_samples_total altogether; it's more reasonable to see Storage full ETA increase after downsampling. --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-02-08 09:43:39 +01:00
hagen1778	487a94565b	dashboards/all: add new panel `CPU spent on GC` It should help identify cases when too much CPU is spent on garbage collection, and advise users on how this can be addressed. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-02-02 16:21:21 +01:00
hagen1778	29a9b31584	dashboards: add `Targets scraped/s` A new stat panel shows the number of targets scraped by the vmagent per-second. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-02-02 15:48:26 +01:00
hagen1778	db11b94e30	dashboards: update to grafana/grafana:10.3.1 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-02-02 15:41:08 +01:00
hagen1778	02492bc1a4	dashboards/single: fix typo in query for `version` annotation The typo falsely produced many version change events. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-31 09:13:46 +01:00
hagen1778	c23e8bee89	dashboards: specify where to see details about dropped labels Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-29 07:37:51 +01:00
hagen1778	b25ef138ce	dashboards: reflect dashboard rename in copy script This is a follow-up for `ff33e60a3d` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-22 16:51:24 +01:00
hagen1778	b0287867fe	deployment/dashboards: change title `VictoriaMetrics` to `VictoriaMetrics - single-node` The new title should provide better understanding of this dashboard purpose. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-16 20:39:52 +01:00
hagen1778	463455665b	dashboards: update cluster dashboard * add panels for detailed visualization of traffic usage between vmstorage, vminsert, vmselect components and their clients. New panels are available in the rows dedicated to specific components. * update "Slow Queries" panel to show percentage of the slow queries to the total number of read queries served by vmselect. The percentage value should make it more clear for users whether there is a service degradation. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-08 11:58:31 +01:00


67 Commits
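
Commit `861852f262` describes the aggregator `name` label as `{rwctx}:{aggrId}:{aggrSuffix}`, where the suffix is `<interval>_(by|without)_label1_label2_labeln`. A minimal sketch of that format follows; `stream_aggr_name` and its parameters are hypothetical names chosen for illustration, and the follow-up commit `db557b86ee` reworks the stream aggregation labels, so this reflects only the format as stated in `861852f262`.

```python
def stream_aggr_name(rwctx: str, aggr_id: int, interval: str,
                     mode: str, labels: list[str]) -> str:
    """Build a name label in the {rwctx}:{aggrId}:{aggrSuffix} format.

    rwctx   - "global" or a number starting from 1 (passed as a string here)
    aggr_id - aggregator id starting from 1
    mode    - "by" or "without"
    """
    if mode not in ("by", "without"):
        # Assumption: the commit message only mentions these two modes.
        raise ValueError(f"unsupported mode: {mode!r}")
    suffix = "_".join([interval, mode, *labels])
    return f"{rwctx}:{aggr_id}:{suffix}"


# Reproduces the example given in the commit message.
assert stream_aggr_name("global", 1, "1m", "without", ["instance", "pod"]) \
    == "global:1:1m_without_instance_pod"
```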