Commit Graph

9 Commits

Author SHA1 Message Date
hagen1778
f2b06484f2
alerts: move ConcurrentFlushesHitTheLimit alert to health alerts
The `ConcurrentFlushesHitTheLimit` could be related to components like
vminsert, vmstorage, vm-single-node and vmagent. Moving this alert
to the `health` section of alerts will be benefitial for all components
and will remove the duplicates from single/cluster alerts.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-08-11 04:39:28 -07:00
Aliaksandr Valialkin
531b35b6c0
docs/Troubleshooting.md: document an additional case, which could result in slow inserts
If `-cacheExpireDuration` is lower than the interval between ingested samples for the same time series,
then vm_slow_row_inserts_total` metric is increased.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3976#issuecomment-1476883183
2023-03-20 14:33:27 -07:00
Aliaksandr Valialkin
b275983403
lib/writeconcurrencylimiter: improve the logic behind -maxConcurrentInserts limit
Previously the -maxConcurrentInserts was limiting the number of established client connections,
which write data to VictoriaMetrics. Some of these connections could be idle.
Such connections do not consume big amounts of CPU and RAM, so there is a little sense in limiting
the number of such connections. So now the -maxConcurrentInserts command-line option
limits the number of concurrently executed insert requests, not including idle connections.

It is recommended removing -maxConcurrentInserts command-line option, since the default value
for this option should work good for most cases.
2023-01-06 22:07:16 -08:00
Artem Navoiev
393f4ab86f
update links to grafana dashboards (#3534)
docs: update links to grafana dashboards

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2022-12-28 11:22:02 -08:00
Aliaksandr Valialkin
d2e34b8052
{dashboards,alerts}: subtitute {type="indexdb"} with {type=~"indexdb.*"} inside queries after 8189770c50
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337
2022-12-05 16:00:42 -08:00
Roman Khavronenko
73340dcb01
dashboards: add Disk space usage % and Disk space usage % by type panels (#3436)
The new panels have been added to the vmstorage and drilldown rows.

`Disk space usage %` is supposed to show disk space usage percentage.
This panel is now also referred by `DiskRunsOutOfSpace` alerting rule.
This panel has Drilldown option to show absolute values.

`Disk space usage % by type` shows the relation between datapoints
and indexdb size. It supposed to help identify cases when indexdb
starts to take too much disk space.
This panel has Drilldown option to show absolute values.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-05 00:19:36 -08:00
Roman Khavronenko
cae148d5c6
dashboards: cluster dashboard update (#3380)
The purpose of the update is to make the dash more usable
for large installations with many instances. Panels which showed
metrics per-instance (Mem, CPU) now are showing metrics per-job or min/max/avg
aggregations in % instead. This supposed to help immediately to identify
resource shortage and remain usable for small and big installations.

For cases when detailed info is needed, to the bottom of the dashboard
a new row `Drilldown` was added. Panels like Mem or CPU now contain
a `data-link` named `Drilldown` (cis shown on line click) which takes
user to more detailed panel.

The change list is the following:
* bump Grafana version to 9.1.0;
* replace old "Graph" panel with "TimeSeries" panel;
* improve Uptime panel to show number of instances per job;
* show % usage of Mem and CPU instead of absolute values;
* `Caches` row was removed. All needed info for caches is now part of `Troubleshooting`;
* add `Drilldown` section for detailed resource usage;
* add Annotations for Alert triggers. Not all alerts are supposed to be displayed
on the dashboard, but only those with label `show_at: dashboard`.
See `alerts-cluster.yml` change.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-24 13:20:10 -08:00
Zakhar Bessarab
434b00cee8
docker-compose: move TooManyLogs into vm-health alerts set (#3199) 2022-10-05 22:42:31 +03:00
Roman Khavronenko
f772ee8326
deployment/docker: move cluster compose env to master branch (#3130)
* deployment/docker: move cluster compose env to master branch

The change supposed to simplify the process of maintaining for
single/cluster docker-compose envs, alerts, dashboards. It also
supposes to reduce confusion for users when looking for cluster
related alerts/configs.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* deployment/docker: move cluster compose env to master branch

Review updates.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-21 12:03:10 +03:00