diff --git a/deployment/docker/alerts-vmalert.yml b/deployment/docker/alerts-vmalert.yml
index 35f5c1a8b..267ada71c 100644
--- a/deployment/docker/alerts-vmalert.yml
+++ b/deployment/docker/alerts-vmalert.yml
@@ -51,6 +51,19 @@ groups:
             produces 0 samples over the last 30min. It might be caused by a misconfiguration
             or incorrect query expression."
 
+      - alert: TooManyMissedIterations
+        expr: sum(increase(vmalert_iteration_missed_total[5m])) by(job, instance, group) > 0
+        for: 15m
+        labels:
+          severity: warning
+        annotations:
+          summary: "vmalert instance {{ $labels.instance }} is missing rules evaluations"
+          description: "vmalert instance {{ $labels.instance }} is missing rules evaluations for group \"{{ $labels.group }}\".
+            The group evaluation time takes longer than the configured evaluation interval. This may result in missed
+            alerting notifications or recording rules samples. Try increasing evaluation interval or concurrency of
+            group \"{{ $labels.group }}\". See https://docs.victoriametrics.com/vmalert.html#groups.
+            If rule expressions are taking longer than expected, please see https://docs.victoriametrics.com/Troubleshooting.html#slow-queries."
+
       - alert: RemoteWriteErrors
         expr: sum(increase(vmalert_remotewrite_errors_total[5m])) by(job, instance) > 0
         for: 15m
diff --git a/docs/CHANGELOG.md b/docs/CHANGELOG.md
index ec1bf8e65..8ae5eaf70 100644
--- a/docs/CHANGELOG.md
+++ b/docs/CHANGELOG.md
@@ -49,6 +49,7 @@ The sandbox cluster installation is running under the constant load generated by
 * FEATURE: [vmbackup](https://docs.victoriametrics.com/vmbackup.html): add `-deleteAllObjectVersions` command-line flag, which can be used for forcing removal of all object versions in remote object storage. See [this](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5121) issue and [these docs](https://docs.victoriametrics.com/vmbackup.html#permanent-deletion-of-objects-in-s3-compatible-storages) for the details.
 * FEATURE: [Alerting rules for VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/deployment/docker#alerts): account for `vmauth` component for alerts `ServiceDown` and `TooManyRestarts`.
 * FEATURE: [Alerting rules for VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/deployment/docker#alerts): make `TooHighMemoryUsage` more tolerable to spikes or near-the-threshold states. The change should reduce number of false positives.
+* FEATURE: [Alerting rules for VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/deployment/docker#alerts): add `TooManyMissedIterations` alerting rule for vmalert to detect groups that miss their evaluations due to slow queries.
 * FEATURE: [vmui](https://docs.victoriametrics.com/#vmui): add support for functions, labels, values in autocomplete. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3006).
 * FEATURE: [vmui](https://docs.victoriametrics.com/#vmui): retain specified time interval when executing a query from `Top Queries`. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5097).
 * FEATURE: [vmui](https://docs.victoriametrics.com/#vmui): improve repeated VMUI page load times by enabling caching of static js and css at web browser side according to [these recommendations](https://developer.chrome.com/docs/lighthouse/performance/uses-long-cache-ttl/).