VictoriaMetrics/docs/anomaly-detection/FAQ.md
Artem Navoiev 0c06934a59
vmanonaly docs add .html for the section document models
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-17 15:25:30 +01:00



FAQ - VictoriaMetrics Anomaly Detection

What is VictoriaMetrics Anomaly Detection (vmanomaly)?

VictoriaMetrics Anomaly Detection, also known as vmanomaly, is a service for detecting unexpected changes in time series data. Utilizing machine learning models, it computes and pushes back an "anomaly score" for user-specified metrics. This hands-off approach to anomaly detection reduces the need for manual alert setup and can adapt to various metrics, improving your observability experience.

Please refer to our guide section to find out more.

How does vmanomaly work?

vmanomaly applies built-in (or custom) anomaly detection algorithms, specified in a config file. Although a single config file supports one model, running multiple instances of vmanomaly with different configs is possible and encouraged for parallel processing or to better fit your use case (e.g. a simpler model for simple metrics and a more sophisticated one for metrics with trends and seasonality).

Please refer to the about section to find out more.

What data does vmanomaly operate on?

vmanomaly operates on data fetched from VictoriaMetrics, where you can leverage the full power of MetricsQL for data selection, sampling, and processing. Users can also apply global filters for more targeted data analysis, which helps limit the scope of the queries and control tenant visibility.

The corresponding configuration is defined in the reader section.
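To make "data fetched with MetricsQL" more concrete, here is a minimal sketch of what a range query against VictoriaMetrics looks like over its Prometheus-compatible HTTP API. It only builds the request URL and sends nothing. The address `http://localhost:8428`, the metric `http_requests_total`, and the helper name `build_query_range_url` are assumptions for illustration; vmanomaly's reader does the fetching itself, based on its own config.

```python
from urllib.parse import urlencode


def build_query_range_url(base_url, query, start, end, step):
    """Build a URL for the Prometheus-compatible /api/v1/query_range endpoint."""
    params = urlencode({"query": query, "start": start, "end": end, "step": step})
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


url = build_query_range_url(
    "http://localhost:8428",  # assumed address of a single-node VictoriaMetrics
    "sum(rate(http_requests_total[5m])) by (job)",  # hypothetical metric name
    start=1705000000,  # Unix timestamps, in seconds
    end=1705003600,
    step="1m",
)
print(url)
```

Aggregating in the query itself, as `sum(...) by (job)` does here, reduces both the number of series returned and the noise in each of them.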

Handling noisy input data

vmanomaly operates on data fetched from VictoriaMetrics using MetricsQL queries, so the initial data quality can be fine-tuned with aggregation, grouping, and filtering to reduce noise and improve anomaly detection accuracy.

Output produced by vmanomaly

vmanomaly models generate metrics like anomaly_score, yhat, yhat_lower, yhat_upper, and y. These metrics provide a comprehensive view of the detected anomalies. The service also produces health check metrics for monitoring its performance.

Choosing the right model for vmanomaly

Selecting the best model for vmanomaly depends on the data's nature and the types of anomalies to detect. For instance, Z-score is suitable for data without trends or seasonality, while more complex patterns might require models like Prophet.
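As an illustration of why Z-score suits data without trends or seasonality, the sketch below fits a mean and standard deviation on a training window and scores new points by their distance from that mean. Dividing by `z_threshold` so that 1.0 marks the anomaly boundary is an assumption of this sketch, and the function name and default threshold are hypothetical. This is not vmanomaly's built-in implementation.

```python
from statistics import mean, stdev


def zscore_anomaly_scores(train, new_points, z_threshold=3.0):
    """Score points by |z| / z_threshold, so scores above 1.0 exceed the threshold."""
    mu = mean(train)
    sigma = stdev(train)
    if sigma == 0:
        raise ValueError("training data has zero variance; z-score is undefined")
    return [abs(x - mu) / (sigma * z_threshold) for x in new_points]


# A flat, noisy series: the model only knows its mean and spread.
train = [10, 12, 11, 9, 10, 11, 10, 9]
scores = zscore_anomaly_scores(train, [10.25, 30])
print(scores)  # first point sits on the mean, second is far outside
```

Because the model has only a single mean and spread, a series with a daily cycle or a steady upward trend would produce high scores at every peak, which is the case where a model like Prophet is the better fit.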

Please refer to the respective blog post on anomaly types and alerting heuristics for more details.

Still not 100% sure what to use? We are here to help.

Alert generation in vmanomaly

While vmanomaly detects anomalies and produces scores, it does not generate alerts directly. The anomaly scores are written back to VictoriaMetrics, where an external alerting tool, such as vmalert, can create alerts based on these scores and integrate them with your alert management system.

Preventing alert fatigue

Anomaly scores are designed so that values from 0.0 to 1.0 indicate non-anomalous data, while values greater than 1.0 are generally classified as anomalies. However, no anomaly detection model is perfect, so reasonable default expressions like anomaly_score > 1 may not work 100% of the time. Because the anomaly scores produced by vmanomaly are written back to VictoriaMetrics as metrics, tools like vmalert can use MetricsQL expressions to fine-tune alerting thresholds and conditions, balancing the risk of false negatives against the number of false positives.
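One common way to trade a little detection delay for fewer false positives is to alert only when the score stays above the threshold for several consecutive points, similar in spirit to the `for` clause of an alerting rule. The sketch below shows the idea on a plain list of scores. The function name, the default threshold, and the `min_consecutive` parameter are hypothetical and not part of vmanomaly or vmalert.

```python
def sustained_anomalies(scores, threshold=1.0, min_consecutive=3):
    """Return indexes where the score has exceeded `threshold`
    for at least `min_consecutive` points in a row."""
    firing = []
    run = 0  # length of the current streak of above-threshold points
    for i, score in enumerate(scores):
        run = run + 1 if score > threshold else 0
        if run >= min_consecutive:
            firing.append(i)
    return firing


scores = [0.2, 1.4, 0.3, 1.2, 1.5, 1.8, 2.0, 0.4]
print(sustained_anomalies(scores))                     # [5, 6]
print(sustained_anomalies(scores, min_consecutive=1))  # [1, 3, 4, 5, 6]
```

With `min_consecutive=1` the isolated spike at index 1 fires. Requiring three points in a row suppresses it, but the sustained anomaly starting at index 3 is reported two points later.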

Resource consumption of vmanomaly

vmanomaly itself is a lightweight service. Its resource usage depends primarily on scheduling (how often, and on what data, your models are fitted and run for inference), the number and size of the time series returned by your queries, and the complexity of the employed models. Because usage scales with these factors, vmanomaly can be adapted to various operational scales.
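For a rough sense of how these factors multiply, the sketch below estimates how many data points a single fit has to load. It is back-of-the-envelope arithmetic only: the function and parameter names are hypothetical, it assumes every series has one sample per step with no gaps, and it ignores model complexity entirely.

```python
def points_per_fit(num_series, fit_window_seconds, step_seconds):
    """Estimate the data points loaded for one fit: series count x points per series."""
    if step_seconds <= 0:
        raise ValueError("step_seconds must be positive")
    return num_series * (fit_window_seconds // step_seconds)


# Example: 100 series, a 14-day training window, one sample per minute.
fourteen_days = 14 * 24 * 60 * 60
print(points_per_fit(100, fourteen_days, 60))  # 2016000
```

Halving the number of series (for example by aggregating in the query) or doubling the step halves this estimate.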

Scaling vmanomaly

vmanomaly can be scaled horizontally by launching multiple independent instances, each with its own MetricsQL queries and configurations. This flexibility allows it to handle varying data volumes and throughput demands efficiently.
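One simple way to spread work across independent instances is to split the set of queries into roughly equal groups and give each instance its own group. The sketch below does this round-robin. The query aliases, expressions, and the `partition_queries` helper are hypothetical; turning each group into an actual vmanomaly config file is left out.

```python
def partition_queries(queries, num_instances):
    """Split a {alias: MetricsQL expression} mapping round-robin into num_instances groups."""
    if num_instances <= 0:
        raise ValueError("num_instances must be positive")
    shards = [{} for _ in range(num_instances)]
    # Sort by alias so the assignment is stable between runs.
    for i, (alias, expr) in enumerate(sorted(queries.items())):
        shards[i % num_instances][alias] = expr
    return shards


queries = {
    "cpu": "avg(rate(node_cpu_seconds_total[5m])) by (instance)",
    "mem": "avg(node_memory_MemAvailable_bytes) by (instance)",
    "disk": "avg(node_filesystem_avail_bytes) by (instance)",
}
for n, shard in enumerate(partition_queries(queries, 2)):
    print(n, sorted(shard))
```

Round-robin balances the number of queries rather than their cost. If some queries return far more series than others, grouping by expected series count would balance load better.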