Commit Graph

12 Commits

Author SHA1 Message Date
hagen1778
8fb68152e6
alerts: simplify aggregation of alerting rules
This is follow-up after
75196d7234

It updates some of the alerting rules to remove unnecessary aggregations.
It keeps aggregations for expressions which are using multiple time series
filters to make sure their label will match.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-11 15:17:30 +01:00
hagen1778
4e0a779efe
deployment/alerts: update TooHighMemoryUsage annotation
The memory usage isn't measured on 5m interval anymore.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-10-24 09:53:44 +02:00
hagen1778
003ef3a518
deployment/alerts: make TooHighMemoryUsage more tolerable to spikes
Using `min_over_time` should reduce the amount of false positives when
component is running in near-the-threshold state. Now it should trigger
only if all collected samples were above the threshold on 10m interval.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-10-24 09:39:46 +02:00
hagen1778
de651165bd
alerting: account for vmauth component for alerts ServiceDown and TooManyRestarts
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-10-03 16:45:33 +02:00
hagen1778
2e4d0d0e41
alerts: move ConcurrentFlushesHitTheLimit alert to health alerts
The `ConcurrentFlushesHitTheLimit` could be related to components like
vminsert, vmstorage, vm-single-node and vmagent. Moving this alert
to the `health` section of alerts will be benefitial for all components
and will remove the duplicates from single/cluster alerts.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-08-03 10:46:26 +02:00
Haleygo
ee933541b2
add vmalertmanager filter for health alerts (#4665) 2023-07-19 20:29:45 +02:00
Roman Khavronenko
01520d3e5d
alerts: update TooHighMemoryUsage threshold (#4256)
It appears that 90% usage for anonymous mem usage
is already concerning. So we lowering the threshold to 80%.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-07 22:18:56 +02:00
Max Golionko
f3b829125e
add vmsingle filter for health alerts (#4238) 2023-05-02 20:54:42 +08:00
Max Golionko
5c955dd876
alerts: relax job filter to support job names created by VMOperator (#4203) 2023-04-26 15:32:25 +02:00
Roman Khavronenko
d3608be313
alerts: add TooManyTSIDMisses alerting rule (#3959)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3502#issuecomment-1358374954

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-17 09:46:51 +01:00
Zakhar Bessarab
6711eec109
docker-compose: move TooManyLogs into vm-health alerts set (#3199) 2022-10-05 19:23:36 +02:00
Roman Khavronenko
5714a68ac6
deployment/docker: move cluster compose env to master branch (#3130)
* deployment/docker: move cluster compose env to master branch

The change supposed to simplify the process of maintaining for
single/cluster docker-compose envs, alerts, dashboards. It also
supposes to reduce confusion for users when looking for cluster
related alerts/configs.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* deployment/docker: move cluster compose env to master branch

Review updates.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-21 11:48:38 +03:00