update wiki pages

Vika 2024-01-08 09:31:57 +00:00
parent e7eacb6234
commit 6d6d684a8f
22 changed files with 2467 additions and 17 deletions

@ -0,0 +1,124 @@
---
# sort: 4
weight: 4
title: CHANGELOG
menu:
  docs:
    identifier: "vmanomaly-changelog"
    parent: "anomaly-detection"
    sort: 4
    weight: 4
aliases:
- /anomaly-detection/CHANGELOG.html
---
# CHANGELOG
Please find the changelog for VictoriaMetrics Anomaly Detection below.
The following `tip` changes can be tested by building from the `latest` tag:
```bash
docker pull us-docker.pkg.dev/victoriametrics-test/public/vmanomaly-trial:latest
```
Please find [launch instructions here](/vmanomaly.html#run-vmanomaly-docker-container).
# tip
## v1.7.2
Released: 2023-12-21
- FIX: fit/infer calls are now skipped if we have insufficient *valid* data to run on.
- FIX: proper handling of `inf` and `NaN` in fit/infer calls.
- FEATURE: add counter of skipped model runs `vmanomaly_model_runs_skipped` to healthcheck metrics.
- FEATURE: add exponential retries wrapper to VmReader's `read_metrics()`.
- FEATURE: add `BacktestingScheduler` for consecutive retrospective fit/infer calls.
- FEATURE: add improved & numerically stable anomaly scores.
- IMPROVEMENT: add full config validation. The probability of getting errors in later stages (say, model fit) is greatly reduced now. All config validation errors that need to be fixed are now logged.
> **note**: this is a backward-incompatible change, as the `model` config section now expects key-value args for the internal model to be defined in nested `args`.
- IMPROVEMENT: add explicit support of `gzip`-ed responses from vmselect in VmReader.
## v1.6.0
Released: 2023-10-30
- IMPROVEMENT:
  - now all the produced healthcheck metrics have `vmanomaly_` prefix for easier access.
  - updated docs for monitoring.
> **note**: this is a backward-incompatible change, as metric names will change, resulting in new metrics being created, e.g. `model_datapoints_produced` will become `vmanomaly_model_datapoints_produced`.
- IMPROVEMENT: Changed default value of `--log_level` from `DEBUG` to `INFO` to reduce log verbosity.
- IMPROVEMENT: Add alias `--log-level` to `--log_level`.
- FEATURE: Added `extra_filters` parameter to reader. It allows applying global filters to all queries.
- FEATURE: Added `verify_tls` parameter to reader and writer. It allows disabling TLS verification for the remote endpoint.
- FEATURE: Added `bearer_token` parameter to reader and writer. It allows passing a bearer token to the remote endpoint for authentication.
- BUGFIX: Fixed passing `workers` parameter for reader. Previously it would throw a runtime error if `workers` was specified.
## v1.5.1
Released: 2023-09-18
- IMPROVEMENT: Infer from the latest seen datapoint for each query. Handles the case where datapoints arrive late.
## v1.5.0
Released: 2023-08-11
- FEATURE: add `--license` and `--license-file` command-line flags for license code verification.
- IMPROVEMENT: Updated Python to 3.11.4 and updated dependencies.
- IMPROVEMENT: Guide documentation for Custom Model usage.
## v1.4.2
Released: 2023-06-09
- FIX: Fix case with received metric labels overriding generated.
## v1.4.1
Released: 2023-06-09
- IMPROVEMENT: Update dependencies.
## v1.4.0
Released: 2023-05-06
- FEATURE: Reworked self-monitoring grafana dashboard for vmanomaly.
- IMPROVEMENT: Update python version and dependencies.
## v1.3.0
Released: 2023-03-21
- FEATURE: Parallelized queries. See `reader.workers` param to control parallelism. By default its value is equal to the number of queries (sends all the queries at once).
- IMPROVEMENT: Updated self-monitoring dashboard.
- IMPROVEMENT: Reverted back default bind address for /metrics server to 0.0.0.0, as vmanomaly is distributed in Docker images.
- IMPROVEMENT: Silenced Prophet INFO logs about yearly seasonality.
## v1.2.2
Released: 2023-03-19
- FIX: Fix `for` metric label to pass QUERY_KEY.
- FEATURE: Added `timeout` config param to reader, writer, monitoring.push.
- FIX: Don't hang if scheduler-model thread exits.
- FEATURE: Now reader, writer and monitoring.push will not halt the process if endpoint is inaccessible or times out, instead they will increment metrics `*_response_count{code=~"timeout|connection_error"}`.
## v1.2.1
Released: 2023-02-18
- FIX: Fixed scheduler thread starting.
- FIX: Fix rolling model fit+infer.
- BREAKING CHANGE: monitoring.pull server now binds by default on 127.0.0.1 instead of 0.0.0.0. Please specify explicitly in monitoring.pull.addr what IP address it should bind to for serving /metrics.
## v1.2.0
Released: 2023-02-04
- FEATURE: With arg `--watch` watches for config(s) changes and reloads the service automatically.
- IMPROVEMENT: Remove "provide_series" from HoltWinters model. Only Prophet model now has it, because it may produce a lot of series if "holidays" is on.
- IMPROVEMENT: if Prophet's "provide_series" is omitted, then all series are returned.
- DEPRECATION: Config monitoring.endpount_url is deprecated in favor of monitoring.url.
- DEPRECATION: Remove 'enable' param from config monitoring.pull. Now /metrics server is started whenever monitoring.pull is present.
- IMPROVEMENT: include example configs into the docker image at /vmanomaly/config/*
- IMPROVEMENT: include self-monitoring grafana dashboard into the docker image under /vmanomaly/dashboard/vmanomaly_grafana_dashboard.json
## v1.1.0
Released: 2023-01-23
- IMPROVEMENT: update Python dependencies
- FEATURE: Add _multivariate_ IsolationForest model.
## v1.0.1
Released: 2023-01-06
- FIX: prophet model incorrectly predicted two points in case of only one
## v1.0.0-beta
Released: 2022-12-08
- First public release is available

55
anomaly-detection/FAQ.md Normal file

@ -0,0 +1,55 @@
---
# sort: 3
weight: 3
title: FAQ
menu:
  docs:
    identifier: "vmanomaly-faq"
    parent: "anomaly-detection"
    weight: 3
    sort: 3
aliases:
- /anomaly-detection/FAQ.html
---
# FAQ - VictoriaMetrics Anomaly Detection
## What is VictoriaMetrics Anomaly Detection (vmanomaly)?
VictoriaMetrics Anomaly Detection, also known as `vmanomaly`, is a service for detecting unexpected changes in time series data. Utilizing machine learning models, it computes and pushes back an ["anomaly score"](/anomaly-detection/components/models/models.html#vmanomaly-output) for user-specified metrics. This hands-off approach to anomaly detection reduces the need for manual alert setup and can adapt to various metrics, improving your observability experience.
Please refer to [our guide section](/anomaly-detection/#practical-guides-and-installation) to find out more.
## How Does vmanomaly Work?
`vmanomaly` applies built-in (or custom) [anomaly detection algorithms](/anomaly-detection/components/models), specified in a config file. Although a single config file supports one model, running multiple instances of `vmanomaly` with different configs is possible and encouraged for parallel processing or better support for your use case (e.g. a simpler model for simple metrics, a more sophisticated one for metrics with trends and seasonalities).
Please refer to the [about](/vmanomaly.html#about) section to find out more.
## What Data Does vmanomaly Operate On?
`vmanomaly` operates on data fetched from VictoriaMetrics, where you can leverage full power of [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) for data selection, sampling, and processing. Users can also [apply global filters](https://docs.victoriametrics.com/#prometheus-querying-api-enhancements) for more targeted data analysis, enhancing scope limitation and tenant visibility.
Respective config is defined in a [`reader`](/anomaly-detection/components/reader.html#vm-reader) section.
## Handling Noisy Input Data
`vmanomaly` operates on data fetched from VictoriaMetrics using [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) queries, so the initial data quality can be fine-tuned with aggregation, grouping, and filtering to reduce noise and improve anomaly detection accuracy.
## Output Produced by vmanomaly
`vmanomaly` models generate [metrics](/anomaly-detection/components/models/models.html#vmanomaly-output) like `anomaly_score`, `yhat`, `yhat_lower`, `yhat_upper`, and `y`. These metrics provide a comprehensive view of the detected anomalies. The service also produces [health check metrics](/anomaly-detection/components/monitoring.html#metrics-generated-by-vmanomaly) for monitoring its performance.
## Choosing the Right Model for vmanomaly
Selecting the best model for `vmanomaly` depends on the data's nature and the types of anomalies to detect. For instance, [Z-score](/anomaly-detection/components/models/models.html#z-score) is suitable for data without trends or seasonality, while more complex patterns might require models like [Prophet](/anomaly-detection/components/models/models.html#prophet).
Please refer to [respective blogpost on anomaly types and alerting heuristics](https://victoriametrics.com/blog/victoriametrics-anomaly-detection-handbook-chapter-2/) for more details.
Still not 100% sure what to use? We are [here to help](/anomaly-detection/#get-in-touch).
## Alert Generation in vmanomaly
While `vmanomaly` detects anomalies and produces scores, it *does not directly generate alerts*. The anomaly scores are written back to VictoriaMetrics, where an external alerting tool, like [`vmalert`](/vmalert.html), can be used to create alerts based on these scores for integrating it with your alerting management system.
## Preventing Alert Fatigue
Anomaly scores are designed so that values from 0.0 to 1.0 indicate non-anomalous data, while a value greater than 1.0 is generally classified as an anomaly. However, no anomaly detection model is perfect, so reasonable default expressions like `anomaly_score > 1` may not work 100% of the time. Because the anomaly scores produced by `vmanomaly` are written back as metrics to VictoriaMetrics, tools like [`vmalert`](/vmalert.html) can use [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) expressions to fine-tune alerting thresholds and conditions, balancing between avoiding [false negatives](https://victoriametrics.com/blog/victoriametrics-anomaly-detection-handbook-chapter-1/#false-negative) and reducing [false positives](https://victoriametrics.com/blog/victoriametrics-anomaly-detection-handbook-chapter-1/#false-positive).
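As a rough illustration of fine-tuning beyond a plain `anomaly_score > 1` check, the sketch below flags an anomaly only when the score stays above a threshold for several consecutive points. The function name, the `min_consecutive` parameter, and the sample values are hypothetical; in practice this kind of condition would be expressed in a `vmalert` rule rather than in Python.

```python
def sustained_anomalies(scores, threshold=1.0, min_consecutive=3):
    """Return indices where the score has exceeded `threshold`
    for at least `min_consecutive` points in a row."""
    flagged = []
    streak = 0
    for i, score in enumerate(scores):
        # Extend the streak on a breach, reset it otherwise.
        streak = streak + 1 if score > threshold else 0
        if streak >= min_consecutive:
            flagged.append(i)
    return flagged


scores = [0.2, 1.4, 0.3, 1.2, 1.5, 1.8, 2.1, 0.4]
print(sustained_anomalies(scores))
```

Requiring a sustained breach trades a slightly later alert for fewer false positives caused by isolated spikes.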
## Resource Consumption of vmanomaly
`vmanomaly` itself is a lightweight service. Its resource usage depends primarily on [scheduling](/anomaly-detection/components/scheduler.html) (how often and on what data to fit/infer your models), the [number and size of time series returned by your queries](/anomaly-detection/components/reader.html#vm-reader), and the complexity of the employed [models](/anomaly-detection/components/models). This makes it adaptable to various operational scales.
## Scaling vmanomaly
`vmanomaly` can be scaled horizontally by launching multiple independent instances, each with its own [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) queries and [configurations](/anomaly-detection/components/). This flexibility allows it to handle varying data volumes and throughput demands efficiently.

@ -0,0 +1,60 @@
---
# sort: 14
title: VictoriaMetrics Anomaly Detection
weight: 0
disableToc: true
menu:
  docs:
    parent: 'victoriametrics'
    sort: 0
    weight: 0
aliases:
- /anomaly-detection.html
---
# VictoriaMetrics Anomaly Detection
In the dynamic and complex world of system monitoring, VictoriaMetrics Anomaly Detection, being a part of our [Enterprise offering](https://victoriametrics.com/products/enterprise/), stands as a pivotal tool for achieving advanced observability. It empowers SREs and DevOps teams by automating the intricate task of identifying abnormal behavior in time-series data. It goes beyond traditional threshold-based alerting, utilizing machine learning techniques to not only detect anomalies but also minimize false positives, thus reducing alert fatigue. By providing simplified alerting mechanisms on top of [unified anomaly scores](/anomaly-detection/components/models/models.html#vmanomaly-output), it enables teams to spot and address potential issues faster, ensuring system reliability and operational efficiency.
## Key Components
Explore the integral components that configure VictoriaMetrics Anomaly Detection:
* [Get familiar with components](/anomaly-detection/components)
  - [Models](/anomaly-detection/components/models)
  - [Reader](/anomaly-detection/components/reader.html)
  - [Scheduler](/anomaly-detection/components/scheduler.html)
  - [Writer](/anomaly-detection/components/writer.html)
  - [Monitoring](/anomaly-detection/components/monitoring.html)
## Practical Guides and Installation
Begin your VictoriaMetrics Anomaly Detection journey with ease using our guides and installation instructions:
- **Quick Start Guide**: Jumpstart your anomaly detection setup to simplify the process of integrating anomaly detection into your observability ecosystem. Get started [**here**](/anomaly-detection/guides/guide-vmanomaly-vmalert.html).
- **Installation Options**: Choose the method that best fits your environment:
  - **Docker Installation**: Ideal for containerized environments. Follow our [Docker guide](../vmanomaly.md#run-vmanomaly-docker-container) for a smooth setup.
  - **Helm Chart Installation**: Perfect for Kubernetes users. Deploy using our [Helm charts](https://github.com/VictoriaMetrics/helm-charts/tree/master/charts/victoria-metrics-anomaly) for an efficient integration.
> Note: starting from [v1.5.0](./CHANGELOG.md#v150) `vmanomaly` requires a [license key](/vmanomaly.html#licensing) to run. You can obtain a trial license key [**here**](https://victoriametrics.com/products/enterprise/trial/index.html).
## Deep Dive into Anomaly Detection
Enhance your knowledge with our handbook on Anomaly Detection & Root Cause Analysis and stay updated:
* Anomaly Detection Handbook
  - [Introduction to Time Series Anomaly Detection](https://victoriametrics.com/blog/victoriametrics-anomaly-detection-handbook-chapter-1/)
  - [Types of Anomalies in Time Series Data](https://victoriametrics.com/blog/victoriametrics-anomaly-detection-handbook-chapter-2/)
  - [Techniques and Models for Anomaly Detection](https://victoriametrics.com/blog/victoriametrics-anomaly-detection-handbook-chapter-3/)
* Follow the [`#anomaly-detection`](https://victoriametrics.com/blog/tags/anomaly-detection/) tag in our blog
## Frequently Asked Questions (FAQ)
Got questions about VictoriaMetrics Anomaly Detection? Chances are, we've got the answers ready for you.
Dive into [our FAQ section](/anomaly-detection/FAQ.html) to find responses to common questions.
## Get in Touch
We're eager to connect with you and tailor our solutions to your specific needs. Here's how you can engage with us:
* [Book a Demo](https://calendly.com/fred-navruzov/) to discover what our product can do.
* Interested in exploring our [Enterprise features](https://new.victoriametrics.com/products/enterprise), including Anomaly Detection? [Request your trial license](https://new.victoriametrics.com/products/enterprise/trial/) today and take the first step towards advanced system observability.
---
Our [CHANGELOG is just a click away](./CHANGELOG.md), keeping you informed about the latest updates and enhancements.

@ -0,0 +1,27 @@
---
# sort: 1
title: Components
weight: 0
menu:
  docs:
    identifier: "vmanomaly-components"
    parent: "anomaly-detection"
    weight: 0
    sort: 1
aliases:
- /anomaly-detection/components/
- /anomaly-detection/components/index.html
---
# Components
This chapter describes the components that correspond to the respective sections of the config used to launch the VictoriaMetrics Anomaly Detection (or simply [`vmanomaly`](/vmanomaly.html)) service:
- [Model(s) section](models/README.md) - Required
- [Reader section](reader.html) - Required
- [Scheduler section](scheduler.html) - Required
- [Writer section](writer.html) - Required
- [Monitoring section](monitoring.html) - Optional
> **Note**: starting from [v1.7.2](../CHANGELOG.md#v172), automated config validation is performed once the service starts. Please check the container logs for errors that need to be fixed to create a fully valid config, and visit the sections above for examples and documentation.

@ -0,0 +1,20 @@
---
title: Models
weight: 1
# sort: 1
menu:
  docs:
    identifier: "vmanomaly-models"
    parent: "vmanomaly-components"
    weight: 1
    # sort: 1
aliases:
- /anomaly-detection/components/models.html
---
# Models
This section describes the `Model` component of VictoriaMetrics Anomaly Detection (or simply [`vmanomaly`](/vmanomaly.html)) and how to define the respective section of the config to launch the service.
Please find a guide on how to use [built-in models](/anomaly-detection/components/models/models.html) for anomaly detection, as well as how to define and use your own [custom model](/anomaly-detection/components/models/custom_model.html).

@ -0,0 +1,174 @@
---
# sort: 2
weight: 2
title: Custom Model Guide
# disableToc: true
menu:
  docs:
    parent: "vmanomaly-models"
    weight: 2
    # sort: 2
aliases:
- /anomaly-detection/components/models/custom_model.html
---
# Custom Model Guide
**Note**: vmanomaly is a part of [enterprise package](https://docs.victoriametrics.com/enterprise.html). Please [contact us](https://victoriametrics.com/contact-us/) to find out more.
Apart from vmanomaly predefined models, users can create their own custom models for anomaly detection.
Here in this guide, we will
- Make a file containing our custom model definition
- Define VictoriaMetrics Anomaly Detection config file to use our custom model
- Run service
**Note**: The file containing the model should be written in [Python language](https://www.python.org/) (3.11+)
## 1. Custom model
We'll create `custom_model.py` file with `CustomModel` class that will inherit from vmanomaly `Model` base class.
In the `CustomModel` class there should be three required methods - `__init__`, `fit` and `infer`:
* `__init__` method should initiate parameters for the model.
**Note**: if your model relies on configs that have an `args` [key-value pair argument](./models.md#section-overview), do not forget to use Python's `**kwargs` in the method's signature and to explicitly call
```python
super().__init__(**kwargs)
```
to initialize the base class each model derives from
* `fit` method should contain the model training process.
* `infer` should return a `pandas.DataFrame` object with the model's inferences.
For the sake of simplicity, the model in this example will return one of two values of `anomaly_score` - 0 or 1, depending on the input parameter `percentage`.
<div class="with-copy" markdown="1">
```python
import numpy as np
import pandas as pd
import scipy.stats as st
import logging

from model.model import Model

logger = logging.getLogger(__name__)


class CustomModel(Model):
    """
    Custom model implementation.
    """

    def __init__(self, percentage: float = 0.95, **kwargs):
        super().__init__(**kwargs)
        self.percentage = percentage
        self._mean = np.nan
        self._std = np.nan

    def fit(self, df: pd.DataFrame):
        # Model fit process:
        y = df['y']
        self._mean = np.mean(y)
        self._std = np.std(y)
        # Avoid division by zero for constant series
        if self._std == 0.0:
            self._std = 1 / 65536

    def infer(self, df: pd.DataFrame) -> pd.DataFrame:
        # Inference process:
        y = df['y']
        zscores = (y - self._mean) / self._std
        anomaly_score_cdf = st.norm.cdf(np.abs(zscores))
        df_pred = df[['timestamp', 'y']].copy()
        # 1 if the point is beyond the `percentage` quantile, 0 otherwise
        df_pred['anomaly_score'] = anomaly_score_cdf > self.percentage
        df_pred['anomaly_score'] = df_pred['anomaly_score'].astype('int32', errors='ignore')
        return df_pred
```
</div>
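The `model.model.Model` base class is only available inside the vmanomaly image, so the class above cannot be run on its own. To experiment with the scoring logic locally, here is a minimal standard-library sketch of the same idea (z-score, normal CDF, threshold). The free functions `fit` and `infer` are hypothetical stand-ins for the methods above, not part of vmanomaly.

```python
from statistics import NormalDist, fmean, pstdev


def fit(values):
    """Estimate mean and population standard deviation (like np.mean / np.std)."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0.0:
        std = 1 / 65536  # same guard against division by zero as in CustomModel
    return mean, std


def infer(values, mean, std, percentage=0.95):
    """Return 1 where the normal CDF of |z-score| exceeds `percentage`, else 0."""
    norm = NormalDist()
    return [int(norm.cdf(abs((v - mean) / std)) > percentage) for v in values]


train = [10.0, 11.0, 9.0, 10.5, 9.5]
mean, std = fit(train)
print(infer([10.2, 14.0, 6.0], mean, std))
```

Because the CDF is applied to the absolute z-score, it ranges from 0.5 to 1.0, so `percentage` values below 0.5 would flag every point.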
## 2. Configuration file
Next, we need to create `config.yaml` file with VM Anomaly Detection configuration and model input parameters.
In the config file `model` section we need to put our model class `model.custom.CustomModel` and all parameters used in `__init__` method.
You can find out more about configuration parameters in vmanomaly docs.
<div class="with-copy" markdown="1">
```yaml
scheduler:
  infer_every: "1m"
  fit_every: "1m"
  fit_window: "1d"
model:
  # note: every custom model should implement this exact path, specified in `class` field
  class: "model.custom.CustomModel"
  # custom model params are defined here
  percentage: 0.9
reader:
  datasource_url: "http://localhost:8428/"
  queries:
    ingestion_rate: 'sum(rate(vm_rows_inserted_total)) by (type)'
    churn_rate: 'sum(rate(vm_new_timeseries_created_total[5m]))'
writer:
  datasource_url: "http://localhost:8428/"
  metric_format:
    __name__: "custom_$VAR"
    for: "$QUERY_KEY"
    model: "custom"
    run: "test-format"
monitoring:
  # /metrics server.
  pull:
    port: 8080
  push:
    url: "http://localhost:8428/"
    extra_labels:
      job: "vmanomaly-develop"
      config: "custom.yaml"
```
</div>
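The `class` value is a dotted path: the last segment names the class and the prefix names the Python module. The sketch below shows how such a path can be resolved in Python in general. It is not vmanomaly's actual loader, `load_class` is a hypothetical helper, and a standard-library class is used as a stand-in so the snippet runs anywhere.

```python
import importlib


def load_class(dotted_path):
    """Split 'package.module.ClassName' and import the class it names."""
    module_name, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Stand-in for "model.custom.CustomModel", which only exists inside the container.
cls = load_class("collections.OrderedDict")
print(cls.__name__)
```

By the same logic, a model file placed at `model/custom.py` corresponds to the module `model.custom`, which is why the file location and the `class` value have to agree.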
## 3. Running model
Let's pull the docker image for vmanomaly:
<div class="with-copy" markdown="1">
```sh
docker pull us-docker.pkg.dev/victoriametrics-test/public/vmanomaly-trial:latest
```
</div>
Now we can run the docker container putting as volumes both config and model file:
**Note**: place the model file to `/model/custom.py` path when copying
<div class="with-copy" markdown="1">
```sh
docker run -it \
--net [YOUR_NETWORK] \
-v [YOUR_LICENSE_FILE_PATH]:/license.txt \
-v $(pwd)/custom_model.py:/vmanomaly/src/model/custom.py \
-v $(pwd)/custom.yaml:/config.yaml \
us-docker.pkg.dev/victoriametrics-test/public/vmanomaly-trial:latest /config.yaml \
--license-file=/license.txt
```
</div>
Please find more detailed instructions (license, etc.) [here](/vmanomaly.html#run-vmanomaly-docker-container)
## Output
As a result, this model will return metrics with the labels configured previously in `config.yaml`.
In this particular example, 2 metrics will be produced. Other metrics from the input query result will also be added.
```
{__name__="custom_anomaly_score", for="ingestion_rate", model="custom", run="test-format"}
{__name__="custom_anomaly_score", for="churn_rate", model="custom", run="test-format"}
```
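To make the label mapping concrete, here is a sketch of how the `metric_format` template from the config could expand into the label sets above. It assumes that `$VAR` is replaced by the output series name and `$QUERY_KEY` by the query alias; `render_labels` is a hypothetical helper, not part of vmanomaly.

```python
from string import Template

metric_format = {
    "__name__": "custom_$VAR",
    "for": "$QUERY_KEY",
    "model": "custom",
    "run": "test-format",
}


def render_labels(fmt, var, query_key):
    """Substitute $VAR and $QUERY_KEY in every label value."""
    return {
        label: Template(value).substitute(VAR=var, QUERY_KEY=query_key)
        for label, value in fmt.items()
    }


for query_key in ("ingestion_rate", "churn_rate"):
    print(render_labels(metric_format, "anomaly_score", query_key))
```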

@ -0,0 +1,323 @@
---
# sort: 1
weight: 1
title: Built-in Models
# disableToc: true
menu:
  docs:
    parent: "vmanomaly-models"
    # sort: 1
    weight: 1
aliases:
- /anomaly-detection/components/models/models.html
---
# Models config parameters
## Section Overview
VM Anomaly Detection (`vmanomaly` hereinafter) models support 2 groups of parameters:
- **`vmanomaly`-specific** arguments - please refer to *Parameters specific for vmanomaly* and *Default model parameters* subsections for each of the models below.
- Arguments to the **inner model** (say, [Facebook's Prophet](https://facebook.github.io/prophet/docs/quick_start.html#python-api)), passed in an `args` argument as key-value pairs, that will be directly given to the model during initialization to allow granular control. Optional.
**Note**: For users who may not be familiar with Python data types such as `list[dict]`, a [dictionary](https://www.w3schools.com/python/python_dictionaries.asp) in Python is a data structure that stores data values in key-value pairs. This structure allows for efficient data retrieval and management.
**Models**:
* [ARIMA](#arima)
* [Holt-Winters](#holt-winters)
* [Prophet](#prophet)
* [Rolling Quantile](#rolling-quantile)
* [Seasonal Trend Decomposition](#seasonal-trend-decomposition)
* [Z-score](#z-score)
* [MAD (Median Absolute Deviation)](#mad-median-absolute-deviation)
* [Isolation forest (Multivariate)](#isolation-forest-multivariate)
* [Custom model](#custom-model)
---
## [ARIMA](https://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average)
Here we use ARIMA implementation from `statsmodels` [library](https://www.statsmodels.org/dev/generated/statsmodels.tsa.arima.model.ARIMA.html)
*Parameters specific for vmanomaly*:
\* - mandatory parameters.
* `class`\* (string) - model class name `"model.arima.ArimaModel"`
* `z_threshold` (float) - [standard score](https://en.wikipedia.org/wiki/Standard_score) for calculating boundaries to define anomaly score. Defaults to 2.5.
* `provide_series` (list[string]) - List of columns to be produced and returned by the model. Defaults to `["anomaly_score", "yhat", "yhat_lower", "yhat_upper", "y"]`. Output can be **only a subset** of the given column list.
* `resample_freq` (string) - Frequency to resample input data into. For example, if data comes at 15-second resolution and `resample_freq` is '1m', the fitting data will be downsampled to '1m' and the internal model is trained at '1m' intervals. During inference, prediction data is produced at '1m' intervals and then interpolated to '15s' to match the expected output, as output data must have the same timestamps.
*Default model parameters*:
* `order`\* (list[int]) - ARIMA's (p,d,q) order of the model for the autoregressive, differences, and moving average components, respectively.
* `args`: (dict) - Inner model args (key-value pairs). See accepted params in [model documentation](https://www.statsmodels.org/dev/generated/statsmodels.tsa.arima.model.ARIMA.html). Defaults to empty (not provided). Example: {"trend": "c"}
*Config Example*
<div class="with-copy" markdown="1">
```yaml
model:
  class: "model.arima.ArimaModel"
  # ARIMA's (p,d,q) order
  order:
  - 1
  - 1
  - 0
  z_threshold: 2.7
  resample_freq: '1m'
  # Inner model args (key-value pairs) accepted by statsmodels.tsa.arima.model.ARIMA
  args:
    trend: 'c'
</div>
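The `resample_freq` behaviour described above (downsample for fitting, then interpolate predictions back to the original timestamps) can be sketched with the standard library. Mean aggregation and linear interpolation are assumptions made for illustration; the section does not state which methods vmanomaly uses, and both helper functions are hypothetical.

```python
def downsample_mean(points, step):
    """Average (timestamp, value) points into buckets of `step` seconds."""
    buckets = {}
    for ts, value in points:
        buckets.setdefault(ts - ts % step, []).append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


def interpolate(coarse, timestamps):
    """Linearly interpolate coarse (timestamp, value) points at `timestamps`."""
    result = []
    for ts in timestamps:
        # Clamp to the first/last coarse point outside the covered range.
        if ts <= coarse[0][0]:
            result.append(coarse[0][1])
            continue
        if ts >= coarse[-1][0]:
            result.append(coarse[-1][1])
            continue
        for (t0, v0), (t1, v1) in zip(coarse, coarse[1:]):
            if t0 <= ts <= t1:
                result.append(v0 + (v1 - v0) * (ts - t0) / (t1 - t0))
                break
    return result


# Eight points at 15s resolution, i.e. two full 1m buckets.
raw = [(t, float(v)) for t, v in zip(range(0, 120, 15), [1, 3, 5, 7, 9, 11, 13, 15])]
coarse = downsample_mean(raw, 60)                # what a model fitted at '1m' would see
fine = interpolate(coarse, [t for t, _ in raw])  # back on the original 15s grid
print(coarse)
print(fine)
```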
---
## [Holt-Winters](https://en.wikipedia.org/wiki/Exponential_smoothing)
Here we use Holt-Winters Exponential Smoothing implementation from `statsmodels` [library](https://www.statsmodels.org/dev/generated/statsmodels.tsa.holtwinters.ExponentialSmoothing.html). All parameters from this library can be passed to the model.
*Parameters specific for vmanomaly*:
\* - mandatory parameters.
* `class`\* (string) - model class name `"model.holtwinters.HoltWinters"`
* `frequency`\* (string) - Must be set equal to sampling_period. Model needs to know expected data-points frequency (e.g. '10m').
If omitted, frequency is guessed during fitting as **the median of intervals between fitting data timestamps**. During inference, if incoming data doesn't have the same frequency, then it will be interpolated.
E.g. data comes at 15 seconds resolution, and our resample_freq is '1m'. Then fitting data will be downsampled to '1m' and internal model is trained at '1m' intervals. So, during inference, prediction data would be produced at '1m' intervals, but interpolated to "15s" to match with expected output, as output data must have the same timestamps.
As accepted by pandas.Timedelta (e.g. '5m').
* `seasonality` (string) - As accepted by pandas.Timedelta.
If `seasonal_periods` is not specified, it is calculated as `seasonality` / `frequency`
Used to compute "seasonal_periods" param for the model (e.g. '1D' or '1W').
* `z_threshold` (float) - [standard score](https://en.wikipedia.org/wiki/Standard_score) for calculating boundaries to define anomaly score. Defaults to 2.5.
*Default model parameters*:
* If [parameter](https://www.statsmodels.org/dev/generated/statsmodels.tsa.holtwinters.ExponentialSmoothing.html#statsmodels.tsa.holtwinters.ExponentialSmoothing-parameters) `seasonal` is not specified, default value will be `add`.
* If [parameter](https://www.statsmodels.org/dev/generated/statsmodels.tsa.holtwinters.ExponentialSmoothing.html#statsmodels.tsa.holtwinters.ExponentialSmoothing-parameters) `initialization_method` is not specified, default value will be `estimated`.
* `args`: (dict) - Inner model args (key-value pairs). See accepted params in [model documentation](https://www.statsmodels.org/dev/generated/statsmodels.tsa.holtwinters.ExponentialSmoothing.html#statsmodels.tsa.holtwinters.ExponentialSmoothing-parameters). Defaults to empty (not provided). Example: {"seasonal": "add", "initialization_method": "estimated"}
*Config Example*
<div class="with-copy" markdown="1">
```yaml
model:
  class: "model.holtwinters.HoltWinters"
  seasonality: '1d'
  frequency: '1h'
  # Inner model args (key-value pairs) accepted by statsmodels.tsa.holtwinters.ExponentialSmoothing
  args:
    seasonal: 'add'
    initialization_method: 'estimated'
```
</div>
Resulting metrics of the model are described [here](#vmanomaly-output).
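The relation `seasonal_periods` = `seasonality` / `frequency` mentioned above can be checked with a short calculation. The parser below is a simplified, hypothetical stand-in for `pandas.Timedelta` that only handles a number followed by a single-letter unit.

```python
from datetime import timedelta

# Hypothetical unit table covering only the suffixes used in this section.
UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days", "w": "weeks"}


def parse_duration(text):
    """Convert strings such as '1h', '10m' or '1D' into a timedelta."""
    number, unit = text[:-1], text[-1].lower()
    return timedelta(**{UNITS[unit]: float(number)})


def seasonal_periods(seasonality, frequency):
    """Number of datapoints in one season: seasonality / frequency."""
    return round(parse_duration(seasonality) / parse_duration(frequency))


print(seasonal_periods("1d", "1h"))
print(seasonal_periods("1W", "10m"))
```

For the config example above (`seasonality: '1d'`, `frequency: '1h'`) this gives 24 datapoints per season.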
---
## [Prophet](https://facebook.github.io/prophet/)
Here we utilize the Facebook Prophet implementation, as detailed in their [library documentation](https://facebook.github.io/prophet/docs/quick_start.html#python-api). All parameters from this library are compatible and can be passed to the model.
*Parameters specific for vmanomaly*:
\* - mandatory parameters.
* `class`\* (string) - model class name `"model.prophet.ProphetModel"`
* `seasonalities` (list[dict]) - Extra seasonalities to pass to Prophet. See [`add_seasonality()`](https://facebook.github.io/prophet/docs/seasonality,_holiday_effects,_and_regressors.html#modeling-holidays-and-special-events:~:text=modeling%20the%20cycle-,Specifying,-Custom%20Seasonalities) Prophet param.
* `provide_series` - model resulting metrics. If not specified [standard metrics](#vmanomaly-output) will be provided.
**Note**: Apart from standard vmanomaly output Prophet model can provide [additional metrics](#additional-output-metrics-produced-by-fb-prophet).
**Additional output metrics produced by FB Prophet**
Depending on chosen `seasonality` parameter FB Prophet can return additional metrics such as:
- `trend`, `trend_lower`, `trend_upper`
- `additive_terms`, `additive_terms_lower`, `additive_terms_upper`,
- `multiplicative_terms`, `multiplicative_terms_lower`, `multiplicative_terms_upper`,
- `daily`, `daily_lower`, `daily_upper`,
- `hourly`, `hourly_lower`, `hourly_upper`,
- `holidays`, `holidays_lower`, `holidays_upper`,
- and a number of columns for each holiday if `holidays` param is set
*Config Example*
<div class="with-copy" markdown="1">
```yaml
model:
  class: "model.prophet.ProphetModel"
  seasonalities:
    - name: 'hourly'
      period: 0.04166666666
      fourier_order: 30
  # Inner model args (key-value pairs) accepted by
  # https://facebook.github.io/prophet/docs/quick_start.html#python-api
  args:
    # See https://facebook.github.io/prophet/docs/uncertainty_intervals.html
    interval_width: 0.98
    country_holidays: 'US'
```
</div>
Resulting metrics of the model are described [here](#vmanomaly-output)
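The `period` value in the example looks like a magic number, but Prophet's `add_seasonality()` expects the period in days, so an hourly cycle is 1/24 of a day. A tiny sketch of the conversion (`period_in_days` is a hypothetical helper):

```python
def period_in_days(hours):
    """Express a seasonality length given in hours as a fraction of a day."""
    return hours / 24


print(period_in_days(1))       # hourly seasonality
print(period_in_days(24 * 7))  # weekly seasonality
```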
---
## [Rolling Quantile](https://en.wikipedia.org/wiki/Quantile)
*Parameters specific for vmanomaly*:
\* - mandatory parameters.
* `class`\* (string) - model class name `"model.rolling_quantile.RollingQuantileModel"`
* `quantile`\* (float) - quantile value, from 0.5 to 1.0. This constraint is implied by 2-sided confidence interval.
* `window_steps`\* (integer) - size of the moving window. (see 'sampling_period')
*Config Example*
<div class="with-copy" markdown="1">
```yaml
model:
  class: "model.rolling_quantile.RollingQuantileModel"
  quantile: 0.9
  window_steps: 96
```
</div>
Resulting metrics of the model are described [here](#vmanomaly-output).
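To illustrate how `quantile` and `window_steps` interact, the sketch below computes two-sided bounds from the preceding window of points. Using the `1 - quantile` value as the lower bound and excluding the current point from the window are assumptions made for illustration, and both helpers are hypothetical rather than vmanomaly's implementation.

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * (pos - lower)


def rolling_bounds(values, q, window_steps):
    """Return (lower, upper) bounds for each point that has a full preceding window."""
    bounds = []
    for i in range(window_steps, len(values)):
        window = sorted(values[i - window_steps:i])
        bounds.append((quantile(window, 1 - q), quantile(window, q)))
    return bounds


values = [1.0, 2.0, 3.0, 4.0, 5.0, 50.0]
lower, upper = rolling_bounds(values, q=0.9, window_steps=5)[0]
print(lower, upper, values[5] > upper)
```

This also shows why `quantile` is restricted to the 0.5 to 1.0 range: below 0.5 the lower and upper bounds would swap.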
---
## [Seasonal Trend Decomposition](https://en.wikipedia.org/wiki/Seasonal_adjustment)
Here we use the Seasonal Decompose implementation from the `statsmodels` [library](https://www.statsmodels.org/dev/generated/statsmodels.tsa.seasonal.seasonal_decompose.html). Parameters from this library can be passed to the model. Some parameters are predefined in vmanomaly and can't be changed by the user (`model`='additive', `two_sided`=False).
*Parameters specific for vmanomaly*:
\* - mandatory parameters.
* `class`\* (string) - model class name `"model.std.StdModel"`
* `period`\* (integer) - Number of datapoints in one season.
* `z_threshold` (float) - [standard score](https://en.wikipedia.org/wiki/Standard_score) for calculating boundaries to define anomaly score. Defaults to 2.5.
*Config Example*
<div class="with-copy" markdown="1">
```yaml
model:
  class: "model.std.StdModel"
  period: 2
```
</div>
Resulting metrics of the model are described [here](#vmanomaly-output).
**Additional output metrics produced by Seasonal Trend Decomposition model**
* `resid` - The residual component of the data series.
* `trend` - The trend component of the data series.
* `seasonal` - The seasonal component of the data series.
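The three components can be illustrated with a simplified additive decomposition. This sketch uses a plain trailing moving average as the one-sided trend and per-phase averages as the seasonal part; it is not the exact filter used by `statsmodels`, and `decompose_additive` is a hypothetical helper that needs at least `2 * period - 1` points.

```python
def decompose_additive(y, period):
    """Simplified additive decomposition with a trailing (one-sided) moving average."""
    n = len(y)
    # Trend: mean of the last `period` points; None until a full window exists.
    trend = [None] * n
    for i in range(period - 1, n):
        trend[i] = sum(y[i - period + 1:i + 1]) / period
    # Seasonal: average detrended value per position in the season, centred on zero.
    phase_sums = [0.0] * period
    phase_counts = [0] * period
    for i in range(n):
        if trend[i] is not None:
            phase_sums[i % period] += y[i] - trend[i]
            phase_counts[i % period] += 1
    phase_means = [s / c for s, c in zip(phase_sums, phase_counts)]
    offset = sum(phase_means) / period
    seasonal = [phase_means[i % period] - offset for i in range(n)]
    # Residual: whatever trend and seasonality do not explain.
    resid = [None if t is None else y[i] - t - seasonal[i] for i, t in enumerate(trend)]
    return trend, seasonal, resid


y = [10.0, 20.0, 10.0, 20.0, 10.0, 20.0]
trend, seasonal, resid = decompose_additive(y, period=2)
print(trend, seasonal, resid)
```

For a perfectly periodic series like this one the residual is zero wherever the trend is defined, so any sizeable `resid` value points at behaviour the trend and season do not explain.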
---
## [MAD (Median Absolute Deviation)](https://en.wikipedia.org/wiki/Median_absolute_deviation)
The MAD model is a robust method for anomaly detection that is *less sensitive* to outliers in data compared to standard deviation-based models. It considers a point as an anomaly if the absolute deviation from the median is significantly large.
*Parameters specific for vmanomaly*:
\* - mandatory parameters.
* `class`\* (string) - model class name `"model.mad.MADModel"`
* `threshold` (float) - The threshold multiplier for the MAD to determine anomalies. Defaults to 2.5. Higher values will identify fewer points as anomalies.
*Config Example*
<div class="with-copy" markdown="1">
```yaml
model:
  class: "model.mad.MADModel"
  threshold: 2.5
```
</div>
Resulting metrics of the model are described [here](#vmanomaly-output).
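
The idea behind the `threshold` multiplier can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the vmanomaly code: the function name `mad_bounds` is hypothetical, and the MAD is used unscaled (some implementations multiply it by a consistency constant such as 1.4826).

```python
from statistics import median


def mad_bounds(values, threshold=2.5):
    """Return (lower, upper) boundaries as median -/+ threshold * MAD."""
    med = median(values)
    # Median of absolute deviations from the median.
    mad = median([abs(v - med) for v in values])
    return med - threshold * mad, med + threshold * mad


if __name__ == "__main__":
    history = [10, 11, 9, 10, 12, 10, 11]
    lower, upper = mad_bounds(history, threshold=2.5)
    for point in (11, 40):
        print(point, "anomaly" if not lower <= point <= upper else "ok")
```

Because both the center and the spread are medians, a single extreme value in `history` barely moves the boundaries, which is the robustness property described above.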
---
## [Z-score](https://en.wikipedia.org/wiki/Standard_score)
*Parameters specific for vmanomaly*:
\* - mandatory parameters.
* `class`\* (string) - model class name `"model.zscore.ZscoreModel"`
* `z_threshold` (float) - [standard score](https://en.wikipedia.org/wiki/Standard_score) for calculating boundaries and anomaly score. Defaults to 2.5.
*Config Example*
<div class="with-copy" markdown="1">
```yaml
model:
  class: "model.zscore.ZscoreModel"
  z_threshold: 2.5
```
</div>
Resulting metrics of the model are described [here](#vmanomaly-output).
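
As a rough illustration of how `z_threshold` could translate into boundaries, the following Python sketch derives a lower bound, an expected value, and an upper bound from the mean and standard deviation of the training data. The function name `zscore_bounds` and the use of the population standard deviation are assumptions made for the example, not confirmed details of vmanomaly.

```python
from statistics import mean, pstdev


def zscore_bounds(values, z_threshold=2.5):
    """Return (lower, expected, upper) as mean -/+ z_threshold * standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return mu - z_threshold * sigma, mu, mu + z_threshold * sigma


if __name__ == "__main__":
    lower, expected, upper = zscore_bounds([2, 4, 4, 4, 5, 5, 7, 9])
    print(lower, expected, upper)
```

Raising `z_threshold` widens the band, so fewer points fall outside it.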
## [Isolation forest](https://en.wikipedia.org/wiki/Isolation_forest) (Multivariate)
Detects anomalies using binary trees. The algorithm has linear time complexity and a low memory requirement, which works well with high-volume data. It can be used on both univariate and multivariate data, but it is more effective in the multivariate case.
**Important**: Be aware of [the curse of dimensionality](https://en.wikipedia.org/wiki/Curse_of_dimensionality). Don't use a single multivariate model if you expect your queries to return many time series with fewer datapoints than the number of metrics. In such a case it is hard for a model to learn meaningful dependencies from too sparse a data hypercube.
Here we use Isolation Forest implementation from `scikit-learn` [library](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.IsolationForest.html). All parameters from this library can be passed to the model.
*Parameters specific for vmanomaly*:
\* - mandatory parameters.
* `class`\* (string) - model class name `"model.isolation_forest.IsolationForestMultivariateModel"`
* `contamination` - The amount of contamination of the data set, i.e. the proportion of outliers in the data set. Used when fitting to define the threshold on the scores of the samples. Default value - "auto". Should be either `"auto"` or be in the range (0.0, 0.5].
* `args`: (dict) - Inner model args (key-value pairs). See accepted params in [model documentation](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.IsolationForest.html). Defaults to empty (not provided). Example: {"random_state": 42, "n_estimators": 100}
*Config Example*
<div class="with-copy" markdown="1">
```yaml
model:
  # To use univariate model, substitute class argument with "model.isolation_forest.IsolationForestModel".
  class: "model.isolation_forest.IsolationForestMultivariateModel"
  contamination: "auto"
  args:
    n_estimators: 100
    # i.e. to assure reproducibility of produced results each time model is fit on the same input
    random_state: 42
```
</div>
Resulting metrics of the model are described [here](#vmanomaly-output).
---
## Custom model
You can find a guide on setting up a custom model [here](./custom_model.md).
## vmanomaly output
When vmanomaly is executed, it generates various metrics, the specifics of which depend on the model employed.
These metrics can be renamed in the writer's section.
The default metrics produced by vmanomaly include:
- `anomaly_score`: This is the *primary* metric.
    - It is designed in such a way that values from 0.0 to 1.0 indicate non-anomalous data.
    - A value greater than 1.0 is generally classified as an anomaly, although this threshold can be adjusted in the alerting configuration.
    - The decision to set the changepoint at 1 was made to ensure consistency across various models and alerting configurations, such that a score above 1 consistently signifies an anomaly.
- `yhat`: This represents the predicted expected value.
- `yhat_lower`: This indicates the predicted lower boundary.
- `yhat_upper`: This refers to the predicted upper boundary.
- `y`: This is the original value obtained from the query result.
**Important**: Be aware that if `NaN` (Not a Number) or `Inf` (Infinity) values are present in the input data during `infer` model calls, the model will produce `NaN` as the `anomaly_score` for these particular instances.
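
A consumer of these metrics has to treat three cases: a score at or below the threshold, a score above it, and a `NaN` score. The Python sketch below shows one way to do that; the function name `is_anomaly` and the choice to return `None` for `NaN` are illustrative assumptions, not part of vmanomaly.

```python
import math


def is_anomaly(anomaly_score, threshold=1.0):
    """Return True/False for a valid score, or None when the score is NaN."""
    if math.isnan(anomaly_score):
        # NaN is produced for NaN/Inf inputs, so no verdict can be given.
        return None
    return anomaly_score > threshold


if __name__ == "__main__":
    for score in (0.4, 1.7, float("nan")):
        print(score, is_anomaly(score))
```

Note that a plain comparison such as `score > 1.0` evaluates to `False` for `NaN`, which would silently classify such points as normal.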
## Healthcheck metrics
Each model exposes [several healthcheck metrics](./../monitoring.html#models-behaviour-metrics) to its `health_path` endpoint.

@ -0,0 +1,297 @@
---
# sort: 5
title: Monitoring
weight: 5
menu:
  docs:
    parent: "vmanomaly-components"
    weight: 5
    # sort: 5
aliases:
  - /anomaly-detection/components/monitoring.html
---
# Monitoring
There are 2 models to monitor VictoriaMetrics Anomaly Detection behavior - [push](https://docs.victoriametrics.com/keyConcepts.html#push-model) and [pull](https://docs.victoriametrics.com/keyConcepts.html#pull-model). Parameters for each of them should be specified in the config file, `monitoring` section.
## Pull Model Config parameters
<table>
<thead>
<tr>
<th>Parameter</th>
<th>Default</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>addr</code></td>
<td><code>"0.0.0.0"</code></td>
<td>Server IP Address</td>
</tr>
<tr>
<td><code>port</code></td>
<td><code>8080</code></td>
<td>Port</td>
</tr>
</tbody>
</table>
## Push Config parameters
<table>
<thead>
<tr>
<th>Parameter</th>
<th>Default</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>url</code></td>
<td></td>
<td>Link where to push metrics to. Example: <code>"http://localhost:8480/"</code></td>
</tr>
<tr>
<td><code>tenant_id</code></td>
<td></td>
<td>Tenant ID for cluster version. Example: <code>"0:0"</code></td>
</tr>
<tr>
<td><code>health_path</code></td>
<td><code>"health"</code></td>
<td>Absolute, to override <code>/health</code> path</td>
</tr>
<tr>
<td><code>user</code></td>
<td></td>
<td>BasicAuth username</td>
</tr>
<tr>
<td><code>password</code></td>
<td></td>
<td>BasicAuth password</td>
</tr>
<tr>
<td><code>timeout</code></td>
<td><code>"5s"</code></td>
<td>Stop waiting for a response after a given number of seconds.</td>
</tr>
<tr>
<td><code>extra_labels</code></td>
<td></td>
<td>Section for custom labels specified by user.</td>
</tr>
</tbody>
</table>
## Monitoring section config example
<div class="with-copy" markdown="1">
```yaml
monitoring:
  pull: # Enable /metrics endpoint.
    addr: "0.0.0.0"
    port: 8080
  push:
    url: "http://localhost:8480/"
    tenant_id: "0:0" # For cluster version only
    health_path: "health"
    user: "USERNAME"
    password: "PASSWORD"
    timeout: "5s"
    extra_labels:
      job: "vmanomaly-push"
      test: "test-1"
```
</div>
## Metrics generated by vmanomaly
<table>
<thead>
<tr>
<th>Metric</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>vmanomaly_start_time_seconds</code></td>
<td>Gauge</td>
<td>vmanomaly start time in UNIX time</td>
</tr>
</tbody>
</table>
### Models Behaviour Metrics
Label names [description](#labelnames)
<table>
<thead>
<tr>
<th>Metric</th>
<th>Type</th>
<th>Description</th>
<th>Labelnames</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>vmanomaly_model_runs</code></td>
<td>Counter</td>
<td>How many times models ran (per model)</td>
<td><code>stage, query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_model_run_duration_seconds</code></td>
<td>Summary</td>
<td>How much time (in seconds) model invocations took</td>
<td><code>stage, query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_model_datapoints_accepted</code></td>
<td>Counter</td>
<td>How many datapoints did models accept</td>
<td><code>stage, query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_model_datapoints_produced</code></td>
<td>Counter</td>
<td>How many datapoints were generated by models</td>
<td><code>stage, query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_models_active</code></td>
<td>Gauge</td>
<td>How many models are currently inferring</td>
<td><code>query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_model_runs_skipped</code></td>
<td>Counter</td>
<td>How many times a run was skipped (per model)</td>
<td><code>stage, query_key</code></td>
</tr>
</tbody>
</table>
### Writer Behaviour Metrics
Label names [description](#labelnames)
<table>
<thead>
<tr>
<th>Metric</th>
<th>Type</th>
<th>Description</th>
<th>Labelnames</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>vmanomaly_writer_request_duration_seconds</code></td>
<td>Summary</td>
<td>How much time (in seconds) did requests to VictoriaMetrics take</td>
<td><code>url, query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_writer_response_count</code></td>
<td>Counter</td>
<td>Response code counts we got from VictoriaMetrics</td>
<td><code>url, query_key, code</code></td>
</tr>
<tr>
<td><code>vmanomaly_writer_sent_bytes</code></td>
<td>Counter</td>
<td>How many bytes were sent to VictoriaMetrics</td>
<td><code>url, query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_writer_request_serialize_seconds</code></td>
<td>Summary</td>
<td>How much time (in seconds) did serializing take</td>
<td><code>query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_writer_datapoints_sent</code></td>
<td>Counter</td>
<td>How many datapoints were sent to VictoriaMetrics</td>
<td><code>query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_writer_timeseries_sent</code></td>
<td>Counter</td>
<td>How many timeseries were sent to VictoriaMetrics</td>
<td><code>query_key</code></td>
</tr>
</tbody>
</table>
### Reader Behaviour Metrics
Label names [description](#labelnames)
<table>
<thead>
<tr>
<th>Metric</th>
<th>Type</th>
<th>Description</th>
<th>Labelnames</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>vmanomaly_reader_request_duration_seconds</code></td>
<td>Summary</td>
<td>How much time (in seconds) did queries to VictoriaMetrics take</td>
<td><code>url, query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_reader_response_count</code></td>
<td>Counter</td>
<td>Response code counts we got from VictoriaMetrics</td>
<td><code>url, query_key, code</code></td>
</tr>
<tr>
<td><code>vmanomaly_reader_received_bytes</code></td>
<td>Counter</td>
<td>How many bytes were received in responses</td>
<td><code>query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_reader_response_parsing_seconds</code></td>
<td>Summary</td>
<td>How much time (in seconds) did parsing take for each step</td>
<td><code>step</code></td>
</tr>
<tr>
<td><code>vmanomaly_reader_timeseries_received</code></td>
<td>Counter</td>
<td>How many timeseries were received from VictoriaMetrics</td>
<td><code>query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_reader_datapoints_received</code></td>
<td>Counter</td>
<td>How many rows were received from VictoriaMetrics</td>
<td><code>query_key</code></td>
</tr>
</tbody>
</table>
### Labelnames
<code>stage</code> - stage of model - 'fit', 'infer' or 'fit_infer' for models that do it simultaneously.
<code>query_key</code> - query alias from [`reader`](/anomaly-detection/components/reader.html) config section.
<code>url</code> - writer or reader url endpoint.
<code>code</code> - response status code or `connection_error`, `timeout`.
<code>step</code> - json or dataframe reading step.
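
When the pull model is enabled, these metrics can be inspected with any client that understands the Prometheus text exposition format. The Python sketch below parses such text and sums a metric across label sets. The `SAMPLE` text is hypothetical (real output may differ in names, suffixes, and values), the helpers `parse_samples` and `total` are illustrative, and the parser is deliberately simplified (it does not handle escaped quotes in label values).

```python
import re

# Hypothetical scrape result, used only to exercise the parser.
SAMPLE = """\
vmanomaly_model_runs{stage="fit",query_key="ingestion_rate"} 3
vmanomaly_model_runs{stage="infer",query_key="ingestion_rate"} 180
vmanomaly_model_runs_skipped{stage="infer",query_key="ingestion_rate"} 2
"""

# metric_name{labels} value [timestamp]
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Return (name, labels, value) tuples, ignoring comments and blank lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match:
            name, labels, value = match.groups()
            samples.append((name, dict(LABEL_RE.findall(labels or "")), float(value)))
    return samples


def total(samples, name, **labels):
    """Sum values of `name` whose labels include all given key/value pairs."""
    return sum(
        value
        for sample_name, sample_labels, value in samples
        if sample_name == name and labels.items() <= sample_labels.items()
    )


if __name__ == "__main__":
    parsed = parse_samples(SAMPLE)
    print("skipped infer runs:", total(parsed, "vmanomaly_model_runs_skipped", stage="infer"))
```

Filtering by `stage` and `query_key` in this way mirrors how the label names above are intended to be used for slicing the metrics.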

@ -0,0 +1,262 @@
---
# sort: 2
title: Reader
weight: 2
menu:
  docs:
    parent: "vmanomaly-components"
    # sort: 2
    weight: 2
aliases:
  - /anomaly-detection/components/reader.html
---
# Reader
<!--
There are 4 sources available to read data into VM Anomaly Detection from: VictoriaMetrics, (ND)JSON file, QueryRange, or CSV file. Depending on the data source, different parameters should be specified in the config file in the `reader` section.
-->
VictoriaMetrics Anomaly Detection (`vmanomaly`) primarily uses [VmReader](#vm-reader) to ingest data. This reader focuses on fetching time-series data directly from VictoriaMetrics with the help of powerful [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) expressions for aggregating, filtering and grouping your data, ensuring seamless integration and efficient data handling.
Future updates will introduce additional readers, expanding the range of data sources `vmanomaly` can work with.
## VM reader
### Config parameters
<table>
<thead>
<tr>
<th>Parameter</th>
<th>Example</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>class</code></td>
<td><code>"reader.vm.VmReader"</code></td>
<td>Name of the class needed to enable reading from VictoriaMetrics or Prometheus. VmReader is the default option, if not specified.</td>
</tr>
<tr>
<td><code>queries</code></td>
<td><code>"ingestion_rate: 'sum(rate(vm_rows_inserted_total[5m])) by (type) > 0'"</code></td>
<td>PromQL/MetricsQL query to select data in format: <code>QUERY_ALIAS: "QUERY"</code>. As accepted by <code>"/query_range?query=%s"</code>.</td>
</tr>
<tr>
<td><code>datasource_url</code></td>
<td><code>"http://localhost:8481/"</code></td>
<td>Datasource URL address</td>
</tr>
<tr>
<td><code>tenant_id</code></td>
<td><code>"0:0"</code></td>
<td>For cluster version only, tenants are identified by accountID or accountID:projectID</td>
</tr>
<tr>
<td><code>sampling_period</code></td>
<td><code>"1h"</code></td>
<td>Optional. Frequency of the points returned. Will be converted to <code>"/query_range?step=%s"</code> param (in seconds).</td>
</tr>
<tr>
<td><code>query_range_path</code></td>
<td><code>"api/v1/query_range"</code></td>
<td>Performs PromQL/MetricsQL range query. Default <code>"api/v1/query_range"</code></td>
</tr>
<tr>
<td><code>health_path</code></td>
<td><code>"health"</code></td>
<td>Absolute or relative URL address where to check availability of the datasource. Default is <code>"health"</code>.</td>
</tr>
<tr>
<td><code>user</code></td>
<td><code>"USERNAME"</code></td>
<td>BasicAuth username</td>
</tr>
<tr>
<td><code>password</code></td>
<td><code>"PASSWORD"</code></td>
<td>BasicAuth password</td>
</tr>
<tr>
<td><code>timeout</code></td>
<td><code>"30s"</code></td>
<td>Timeout for the requests, passed as a string. Defaults to "30s"</td>
</tr>
<tr>
<td><code>verify_tls</code></td>
<td><code>"false"</code></td>
<td>Allows disabling TLS verification of the remote certificate.</td>
</tr>
<tr>
<td><code>bearer_token</code></td>
<td><code>"token"</code></td>
<td>Token is passed in the standard format with header: "Authorization: bearer {token}"</td>
</tr>
<tr>
<td><code>extra_filters</code></td>
<td><code>"[]"</code></td>
<td>List of strings with series selector. See: <a href="https://docs.victoriametrics.com/#prometheus-querying-api-enhancements">Prometheus querying API enhancements</a></td>
</tr>
</tbody>
</table>
Config file example:
```yaml
reader:
  class: "reader.vm.VmReader"
  datasource_url: "http://localhost:8428/"
  tenant_id: "0:0"
  queries:
    ingestion_rate: 'sum(rate(vm_rows_inserted_total[5m])) by (type) > 0'
  sampling_period: '1m'
```
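
To make the relationship between `datasource_url`, `query_range_path`, `queries` and `sampling_period` more concrete, here is a minimal Python sketch that assembles a Prometheus-style range query URL. It is an assumption about how the parameters fit together, not the `VmReader` source: the function name `query_range_url` and the `start`/`end` values are illustrative, and tenant handling for the cluster version is left out.

```python
from urllib.parse import urlencode, urljoin


def query_range_url(datasource_url, query, start_s, end_s, step_s,
                    query_range_path="api/v1/query_range"):
    """Build a range query URL; `step_s` is `sampling_period` expressed in seconds."""
    params = urlencode({"query": query, "start": start_s, "end": end_s, "step": step_s})
    return urljoin(datasource_url, query_range_path) + "?" + params


if __name__ == "__main__":
    url = query_range_url(
        "http://localhost:8428/",
        "sum(rate(vm_rows_inserted_total[5m])) by (type) > 0",
        start_s=1648771200,
        end_s=1649548800,
        step_s=60,  # sampling_period: '1m'
    )
    print(url)
```

`urlencode` takes care of escaping brackets, parentheses and spaces in the MetricsQL expression.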
### Healthcheck metrics
`VmReader` exposes [several healthcheck metrics](./monitoring.html#reader-behaviour-metrics).
<!--
# TODO: uncomment and maintain after multimodel config refactor, 2nd priority
## NDJSON reader
Accepts data in the same format as <code>/export</code>.
File content example:
```
{"metric":{"__name__":"metric1","job":"vm"},"values":[745487.56,96334.13,277822.84,159596.94],"timestamps":[1640908800000,1640908802000,1640908803000,1640908804000]}
{"metric":{"__name__":"metric2","job":"vm"},"values":[217822.84,159596.94,745487.56,96334.13],"timestamps":[1640908800000,1640908802000,1640908803000,1640908804000]}
```
### Config parameters
<table>
<thead>
<tr>
<th>Parameter</th>
<th>Example</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>class</code></td>
<td><code>"reader.ndjson.NdjsonReader"</code></td>
<td>Name of the class needed to enable reading from JSON line format file.</td>
</tr>
<tr>
<td><code>path</code></td>
<td><code>"tests/reader/export.ndjson"</code></td>
<td>Path to file in JSON line format</td>
</tr>
</tbody>
</table>
Config file example:
```yaml
reader:
class: "reader.ndjson.NdjsonReader"
path: "tests/reader/export.ndjson"
```
## QueryRange
This datasource is VictoriaMetrics handler for [Prometheus querying API](https://prometheus.io/docs/prometheus/latest/querying/api/).
[Range query](https://docs.victoriametrics.com/keyConcepts.html#range-query) executes the query expression at the given time range with the given step.
### Config parameters
<table>
<thead>
<tr>
<th>Parameter</th>
<th>Example</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>class</code></td>
<td><code>"reader.query_range.QueryRangeReader"</code></td>
<td>Name of the class enabling Query Range reader.</td>
</tr>
<tr>
<td><code>path</code></td>
<td><code>"http://localhost:8428/api/v1/query_range?query=sum(rate(vm_rows_inserted_total[30])) by (type)"</code></td>
<td>URL with query</td>
</tr>
</tbody>
</table>
Config file example:
```yaml
reader:
class: "reader.query_range.QueryRangeReader"
path: "http://localhost:8428/api/v1/query_range?query=sum(rate(vm_rows_inserted_total[30])) by (type)"
```
## CSV reader
### Data format
File should be in `.csv` format and must contain 2 columns with the names `timestamp` and `y` - the metric's datetimes and values, respectively. The order of the columns doesn't matter.
* `timestamp` can be represented either in explicit datetime format like `2021-04-21 05:18:19` or in UNIX time in seconds like `1618982299`.
* `y` should be a numeric value.
File content example:
```
timestamp,y
2020-07-12 23:09:05,61.0
2020-07-13 23:09:05,63.0
2020-07-14 23:09:05,63.0
2020-07-15 23:09:05,66.0
2020-07-20 23:09:05,68.0
2020-07-21 23:09:05,69.0
2020-07-22 23:09:05,69.0
```
### Config parameters
<table>
<thead>
<tr>
<th>Parameter</th>
<th>Type</th>
<th>Example</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>class</code></td>
<td>str</td>
<td><code>"reader.csv.CsvReader"</code></td>
<td>Name of the class enabling CSV reader</td>
</tr>
<tr>
<td><code>path</code></td>
<td>str</td>
<td><code>"data/v1/jumpsup.csv"</code></td>
<td>.csv file location (local path). <b>The file existence is checked during config validation</b></td>
</tr>
<tr>
<td><code>metric_name</code></td>
<td>str</td>
<td><code>"value"</code></td>
<td>Optional. Alias for metric. If not specified, filename without extension will be used. In this example, `jumpsup`.</td>
</tr>
</tbody>
</table>
Config file example:
```yaml
reader:
class: "reader.csv.CsvReader"
path: "data/v1/jumpsup.csv"
metric_name: "value"
```
-->

@ -0,0 +1,354 @@
---
# sort: 3
title: Scheduler
weight: 3
menu:
  docs:
    parent: "vmanomaly-components"
    weight: 3
    # sort: 3
aliases:
  - /anomaly-detection/components/scheduler.html
---
# Scheduler
Scheduler defines how often to run and make inferences, as well as what timerange to use to train the model.
It is specified in the `scheduler` section of the config for VictoriaMetrics Anomaly Detection.
## Parameters
`class`: str, default=`"scheduler.periodic.PeriodicScheduler"`,
options={`"scheduler.periodic.PeriodicScheduler"`, `"scheduler.oneoff.OneoffScheduler"`, `"scheduler.backtesting.BacktestingScheduler"`}
- `"scheduler.periodic.PeriodicScheduler"`: periodically runs the models on new data. Useful for consecutive re-trainings to counter [data drift](https://www.datacamp.com/tutorial/understanding-data-drift-model-drift) and model degradation over time.
- `"scheduler.oneoff.OneoffScheduler"`: runs the process once and exits. Useful for testing.
- `"scheduler.backtesting.BacktestingScheduler"`: imitates consecutive backtesting runs of OneoffScheduler. Runs the process once and exits. Use to get more granular control over testing on historical data.
**Depending on selected class, different parameters should be used**
## Periodic scheduler
### Parameters
For the periodic scheduler, parameters are defined as time differences, expressed in different units, e.g. days, hours, minutes, seconds.
Examples: `"50s"`, `"4m"`, `"3h"`, `"2d"`, `"1w"`.
<table>
<thead>
<tr>
<th></th>
<th>Time granularity</th>
</tr>
</thead>
<tbody>
<tr>
<td>s</td>
<td>seconds</td>
</tr>
<tr>
<td>m</td>
<td>minutes</td>
</tr>
<tr>
<td>h</td>
<td>hours</td>
</tr>
<tr>
<td>d</td>
<td>days</td>
</tr>
<tr>
<td>w</td>
<td>weeks</td>
</tr>
</tbody>
</table>
<table>
<thead>
<tr>
<th>Parameter</th>
<th>Type</th>
<th>Example</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>fit_window</code></td>
<td>str</td>
<td><code>"14d"</code></td>
<td>What time range to use for training the models. Must be at least 1 second.</td>
</tr>
<tr>
<td><code>infer_every</code></td>
<td>str</td>
<td><code>"1m"</code></td>
<td>How often a model will write its conclusions on newly added data. Must be at least 1 second.</td>
</tr>
<tr>
<td><code>fit_every</code></td>
<td>str, Optional</td>
<td><code>"1h"</code></td>
<td>How often to completely retrain the models. If missing, the value of <code>infer_every</code> is used, and the models are retrained on every inference run.</td>
</tr>
</tbody>
</table>
### Periodic scheduler config example
```yaml
scheduler:
  class: "scheduler.periodic.PeriodicScheduler"
  fit_window: "14d"
  infer_every: "1m"
  fit_every: "1h"
```
This part of the config means that `vmanomaly` will calculate the time window of the previous 14 days and use it to train a model. Every hour, the model will be retrained on the most recent 14 days of data, which will include 1 hour of new data. The time window always stays exactly 14 days long and does not grow with subsequent retrains. Every minute, `vmanomaly` will produce model inferences for newly added data points using the model that is kept in memory at that time.
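
The arithmetic behind this example can be sketched in Python. The helpers `parse_duration` and `fit_range` are hypothetical and only cover the simple `<number><unit>` form from the table above; they are not the parser used by vmanomaly.

```python
import re

# Units taken from the time granularity table above.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def parse_duration(text):
    """Convert a string such as "14d" or "1m" into seconds."""
    match = re.fullmatch(r"(\d+)([smhdw])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def fit_range(now_s, fit_window):
    """Return the (start, end) UNIX timestamps of the training window ending at `now_s`."""
    return now_s - parse_duration(fit_window), now_s


if __name__ == "__main__":
    print(fit_range(1649548800, "14d"))
    # Number of inference runs served by one trained model before the next retrain.
    print(parse_duration("1h") // parse_duration("1m"))
```

Each retrain shifts both ends of the window forward by `fit_every`, so its length stays equal to `fit_window`.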
## Oneoff scheduler
### Parameters
For Oneoff scheduler timeframes can be defined in Unix time in seconds or ISO 8601 string format.
ISO format supported time zone offset formats are:
* Z (UTC)
* ±HH:MM
* ±HHMM
* ±HH
If a time zone is omitted, a timezone-naive datetime is used.
### Defining fitting timeframe
<table>
<thead>
<tr>
<th>Format</th>
<th>Parameter</th>
<th>Type</th>
<th>Example</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>ISO 8601</td>
<td><code>fit_start_iso</code></td>
<td>str</td>
<td><code>"2022-04-01T00:00:00Z", "2022-04-01T00:00:00+01:00", "2022-04-01T00:00:00+0100", "2022-04-01T00:00:00+01"</code></td>
<td rowspan=2>Start datetime to use for training a model. ISO string or UNIX time in seconds.</td>
</tr>
<tr>
<td>UNIX time</td>
<td><code>fit_start_s</code></td>
<td>float</td>
<td>1648771200</td>
</tr>
<tr>
<td>ISO 8601</td>
<td><code>fit_end_iso</code></td>
<td>str</td>
<td><code>"2022-04-10T00:00:00Z", "2022-04-10T00:00:00+01:00", "2022-04-10T00:00:00+0100", "2022-04-10T00:00:00+01"</code></td>
<td rowspan=2>End datetime to use for training a model. Must be greater than <code>fit_start_*</code>. ISO string or UNIX time in seconds.</td>
</tr>
<tr>
<td>UNIX time</td>
<td><code>fit_end_s</code></td>
<td>float</td>
<td>1649548800</td>
</tr>
</tbody>
</table>
### Defining inference timeframe
<table>
<thead>
<tr>
<th>Format</th>
<th>Parameter</th>
<th>Type</th>
<th>Example</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>ISO 8601</td>
<td><code>infer_start_iso</code></td>
<td>str</td>
<td><code>"2022-04-11T00:00:00Z", "2022-04-11T00:00:00+01:00", "2022-04-11T00:00:00+0100", "2022-04-11T00:00:00+01"</code></td>
<td rowspan=2>Start datetime to use for a model inference. ISO string or UNIX time in seconds.</td>
</tr>
<tr>
<td>UNIX time</td>
<td><code>infer_start_s</code></td>
<td>float</td>
<td>1649635200</td>
</tr>
<tr>
<td>ISO 8601</td>
<td><code>infer_end_iso</code></td>
<td>str</td>
<td><code>"2022-04-14T00:00:00Z", "2022-04-14T00:00:00+01:00", "2022-04-14T00:00:00+0100", "2022-04-14T00:00:00+01"</code></td>
<td rowspan=2>End datetime to use for a model inference. Must be greater than <code>infer_start_*</code>. ISO string or UNIX time in seconds.</td>
</tr>
<tr>
<td>UNIX time</td>
<td><code>infer_end_s</code></td>
<td>float</td>
<td>1649894400</td>
</tr>
</tbody>
</table>
### ISO format scheduler config example
```yaml
scheduler:
  class: "scheduler.oneoff.OneoffScheduler"
  fit_start_iso: "2022-04-01T00:00:00Z"
  fit_end_iso: "2022-04-10T00:00:00Z"
  infer_start_iso: "2022-04-11T00:00:00Z"
  infer_end_iso: "2022-04-14T00:00:00Z"
```
### UNIX time format scheduler config example
```yaml
scheduler:
  class: "scheduler.oneoff.OneoffScheduler"
  fit_start_s: 1648771200
  fit_end_s: 1649548800
  infer_start_s: 1649635200
  infer_end_s: 1649894400
```
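
The ISO 8601 and UNIX time examples in this section describe the same instants. A short Python sketch to convert between them is shown below; the function name `iso_to_unix` is hypothetical, and for portability across Python versions the sketch only handles the `Z` and `±HH:MM` offset forms.

```python
from datetime import datetime


def iso_to_unix(iso):
    """Convert an ISO 8601 string with a "Z" or "+HH:MM" offset to UNIX seconds."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat().
    return datetime.fromisoformat(iso.replace("Z", "+00:00")).timestamp()


if __name__ == "__main__":
    for name, iso in [
        ("fit_start_s", "2022-04-01T00:00:00Z"),
        ("fit_end_s", "2022-04-10T00:00:00Z"),
        ("infer_start_s", "2022-04-11T00:00:00Z"),
        ("infer_end_s", "2022-04-14T00:00:00Z"),
    ]:
        print(name, int(iso_to_unix(iso)))
```

In this sketch a string without an offset would be interpreted in the local time zone of the machine running it, so an explicit offset is the safer choice when reproducibility matters.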
## Backtesting scheduler
### Parameters
As for [Oneoff scheduler](#oneoff-scheduler), timeframes can be defined in Unix time in seconds or ISO 8601 string format.
ISO format supported time zone offset formats are:
* Z (UTC)
* ±HH:MM
* ±HHMM
* ±HH
If a time zone is omitted, a timezone-naive datetime is used.
### Defining overall timeframe
This timeframe will be sliced into intervals `(fit_window, infer_window == fit_every)`, starting from the *latest available* time point, which is `to_*`, and going back until no full `fit_window + infer_window` interval exists within the provided timeframe.
<table>
<thead>
<tr>
<th>Format</th>
<th>Parameter</th>
<th>Type</th>
<th>Example</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>ISO 8601</td>
<td><code>from_iso</code></td>
<td>str</td>
<td><code>"2022-04-01T00:00:00Z", "2022-04-01T00:00:00+01:00", "2022-04-01T00:00:00+0100", "2022-04-01T00:00:00+01"</code></td>
<td rowspan=2>Start datetime to use for backtesting.</td>
</tr>
<tr>
<td>UNIX time</td>
<td><code>from_s</code></td>
<td>float</td>
<td>1648771200</td>
</tr>
<tr>
<td>ISO 8601</td>
<td><code>to_iso</code></td>
<td>str</td>
<td><code>"2022-04-10T00:00:00Z", "2022-04-10T00:00:00+01:00", "2022-04-10T00:00:00+0100", "2022-04-10T00:00:00+01"</code></td>
<td rowspan=2>End datetime to use for backtesting. Must be greater than <code>from_*</code>.</td>
</tr>
<tr>
<td>UNIX time</td>
<td><code>to_s</code></td>
<td>float</td>
<td>1649548800</td>
</tr>
</tbody>
</table>
### Defining training timeframe
The same *explicit* logic as in [Periodic scheduler](#periodic-scheduler)
<table>
<thead>
<tr>
<th>Format</th>
<th>Parameter</th>
<th>Type</th>
<th>Example</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>ISO 8601</td>
<td rowspan=2><code>fit_window</code></td>
<td rowspan=2>str</td>
<td><code>"PT1M", "PT1H"</code></td>
<td rowspan=2>What time range to use for training the models. Must be at least 1 second.</td>
</tr>
<tr>
<td>Prometheus-compatible</td>
<td><code>"1m", "1h"</code></td>
</tr>
</tbody>
</table>
### Defining inference timeframe
In `BacktestingScheduler`, the inference window is *implicitly* defined as the period between 2 consecutive model `fit_every` runs. The *latest* inference window starts at `to_s` - `fit_every` and ends at the *latest available* time point, which is `to_s`. The previous fit/infer periods are defined the same way, by shifting `fit_every` seconds backwards until we reach the last full fit period of `fit_window` size whose start is >= `from_s`.
<table>
<thead>
<tr>
<th>Format</th>
<th>Parameter</th>
<th>Type</th>
<th>Example</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>ISO 8601</td>
<td rowspan=2><code>fit_every</code></td>
<td rowspan=2>str</td>
<td><code>"PT1M", "PT1H"</code></td>
<td rowspan=2>For how long a previously trained model is used to infer on new data until the next retrain happens.</td>
</tr>
<tr>
<td>Prometheus-compatible</td>
<td><code>"1m", "1h"</code></td>
</tr>
</tbody>
</table>
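
The backward slicing described in this section can be sketched as follows. This is an interpretation of the text above rather than the `BacktestingScheduler` source; the function name `backtesting_windows` is hypothetical and all arguments are plain seconds.

```python
def backtesting_windows(from_s, to_s, fit_window_s, fit_every_s):
    """Return chronological (fit_start, fit_end, infer_start, infer_end) tuples."""
    if fit_every_s <= 0 or fit_window_s <= 0:
        raise ValueError("fit_window_s and fit_every_s must be positive")
    windows = []
    infer_end = to_s
    while True:
        infer_start = infer_end - fit_every_s
        fit_start = infer_start - fit_window_s
        if fit_start < from_s:
            # No full fit_window + infer_window interval is left.
            break
        windows.append((fit_start, infer_start, infer_start, infer_end))
        infer_end = infer_start  # shift back by fit_every
    return windows[::-1]


if __name__ == "__main__":
    day = 86400
    for window in backtesting_windows(0, 10 * day, fit_window_s=7 * day, fit_every_s=day):
        print([value // day for value in window])
```

In this reading, an overall timeframe shorter than `fit_window + fit_every` yields no windows at all.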
### ISO format scheduler config example
```yaml
scheduler:
  class: "scheduler.backtesting.BacktestingScheduler"
  from_iso: '2021-01-01T00:00:00Z'
  to_iso: '2021-01-14T00:00:00Z'
  fit_window: 'P14D'
  fit_every: 'PT1H'
```
### UNIX time format scheduler config example
```yaml
scheduler:
  class: "scheduler.backtesting.BacktestingScheduler"
  from_s: 167253120
  to_s: 167443200
  fit_window: '14d'
  fit_every: '1h'
```

@ -0,0 +1,270 @@
---
# sort: 4
title: Writer
weight: 4
menu:
  docs:
    parent: "vmanomaly-components"
    weight: 4
    # sort: 4
aliases:
  - /anomaly-detection/components/writer.html
---
# Writer
<!--
There are 3 ways to export data from VictoriaMetrics Anomaly Detection: VictoriaMetrics, JSON file, or CSV file. Depending on the chosen option, different parameters should be specified in the config file in the `writer` section.
-->
For exporting data, VictoriaMetrics Anomaly Detection (`vmanomaly`) primarily employs the [VmWriter](#vm-writer), which writes the produced anomaly scores (preserving the initial labelset and optionally applying additional labels) back to VictoriaMetrics. This writer is tailored for smooth data export within the VictoriaMetrics ecosystem.
Future updates will introduce additional export methods, offering users more flexibility in data handling and integration.
## VM writer
### Config parameters
<table>
<thead>
<tr>
<th>Parameter</th>
<th>Example</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>class</code></td>
<td><code>"writer.vm.VmWriter"</code></td>
<td>Name of the class needed to enable writing to VictoriaMetrics or Prometheus. VmWriter is the default option, if not specified.</td>
</tr>
<tr>
<td><code>datasource_url</code></td>
<td><code>"http://localhost:8481/"</code></td>
<td>Datasource URL address</td>
</tr>
<tr>
<td><code>tenant_id</code></td>
<td><code>"0:0"</code></td>
<td>For cluster version only, tenants are identified by accountID or accountID:projectID</td>
</tr>
<!-- Additional rows for metric_format -->
<tr>
<td rowspan="4"><code>metric_format</code></td>
<td><code>__name__: "vmanomaly_$VAR"</code></td>
<td rowspan="4">Metrics to save the output (in metric names or labels). Must have <code>__name__</code> key. Must have a value with <code>$VAR</code> placeholder in it to distinguish between resulting metrics. Supported placeholders:
<ul>
<li><code>$VAR</code> -- Variables that model provides, all models provide the following set: {"anomaly_score", "y", "yhat", "yhat_lower", "yhat_upper"}. Description of standard output is <a href="/anomaly-detection/components/models/models.html#vmanomaly-output">here</a>. Depending on <a href="/anomaly-detection/components/models/models.html">model type</a> it can provide more metrics, like "trend", "seasonality" etc.</li>
<li><code>$QUERY_KEY</code> -- E.g. "ingestion_rate".</li>
</ul>
Other keys are supposed to be configured by the user to help identify generated metrics, e.g., specific config file name etc.
More details on metric formatting are <a href="#metrics-formatting">here</a>.
</td>
</tr>
<tr><td><code>for: "$QUERY_KEY"</code></td></tr>
<tr><td><code>run: "test_metric_format"</code></td></tr>
<tr><td><code>config: "io_vm_single.yaml"</code></td></tr>
<!-- End of additional rows -->
<tr>
<td><code>import_json_path</code></td>
<td><code>"/api/v1/import"</code></td>
<td>Optional, to override the default import path</td>
</tr>
<tr>
<td><code>health_path</code></td>
<td><code>"health"</code></td>
<td>Absolute or relative URL address where to check the availability of the datasource. Optional, to override the default <code>"/health"</code> path.</td>
</tr>
<tr>
<td><code>user</code></td>
<td><code>"USERNAME"</code></td>
<td>BasicAuth username</td>
</tr>
<tr>
<td><code>password</code></td>
<td><code>"PASSWORD"</code></td>
<td>BasicAuth password</td>
</tr>
<tr>
<td><code>timeout</code></td>
<td><code>"5s"</code></td>
<td>Timeout for the requests, passed as a string. Defaults to "5s"</td>
</tr>
<tr>
<td><code>verify_tls</code></td>
<td><code>"false"</code></td>
<td>Allows disabling TLS verification of the remote certificate.</td>
</tr>
<tr>
<td><code>bearer_token</code></td>
<td><code>"token"</code></td>
<td>Token is passed in the standard format with the header: "Authorization: bearer {token}"</td>
</tr>
</tbody>
</table>
Config example:
```yaml
writer:
  class: "writer.vm.VmWriter"
  datasource_url: "http://localhost:8428/"
  tenant_id: "0:0"
  metric_format:
    __name__: "vmanomaly_$VAR"
    for: "$QUERY_KEY"
    run: "test_metric_format"
    config: "io_vm_single.yaml"
  import_json_path: "/api/v1/import"
  health_path: "health"
  user: "foo"
  password: "bar"
```
### Healthcheck metrics
`VmWriter` exposes [several healthcheck metrics](./monitoring.html#writer-behaviour-metrics).
### Metrics formatting
Two parameters are mandatory in `metric_format`: `__name__` and `for`.
```yaml
__name__: PREFIX1_$VAR
for: PREFIX2_$QUERY_KEY
```
* The `__name__` parameter names the metrics returned by models as `PREFIX1_anomaly_score`, `PREFIX1_yhat_lower`, etc. The vmanomaly output metric names are described [here](anomaly-detection/components/models/models.html#vmanomaly-output).
* The `for` parameter adds a `for` label with values `PREFIX2_query_name_1`, `PREFIX2_query_name_2`, etc. Query names are set as aliases in the [`queries`](anomaly-detection/components/reader.html#config-parameters) parameter of the `reader` config section.
You can also specify any other custom labels you need.
For example:
```yaml
custom_label_1: label_name_1
custom_label_2: label_name_2
```
Apart from specified labels, output metrics will return labels inherited from input metrics returned by [queries](/anomaly-detection/components/reader.html#config-parameters).
For example, if the input data contains labels such as `cpu=1, device=eth0, instance=node-exporter:9100`, all these labels will be present in the vmanomaly output metrics.
So if the `metric_format` section is set up like this:
```yaml
metric_format:
  __name__: "PREFIX1_$VAR"
  for: "PREFIX2_$QUERY_KEY"
  custom_label_1: label_name_1
  custom_label_2: label_name_2
```
It will return metrics that will look like:
```yaml
{__name__="PREFIX1_anomaly_score", for="PREFIX2_query_name_1", custom_label_1="label_name_1", custom_label_2="label_name_2", cpu=1, device="eth0", instance="node-exporter:9100"}
{__name__="PREFIX1_yhat_lower", for="PREFIX2_query_name_1", custom_label_1="label_name_1", custom_label_2="label_name_2", cpu=1, device="eth0", instance="node-exporter:9100"}
{__name__="PREFIX1_anomaly_score", for="PREFIX2_query_name_2", custom_label_1="label_name_1", custom_label_2="label_name_2", cpu=1, device="eth0", instance="node-exporter:9100"}
{__name__="PREFIX1_yhat_lower", for="PREFIX2_query_name_2", custom_label_1="label_name_1", custom_label_2="label_name_2", cpu=1, device="eth0", instance="node-exporter:9100"}
```
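The placeholder expansion described above can be illustrated with a minimal Python sketch. It is not the `VmWriter` implementation: the helper name `format_labels` is hypothetical, and the choice that `metric_format` labels override inherited labels with the same name is an assumption made for the example.

```python
from string import Template


def format_labels(metric_format, var, query_key, input_labels):
    """Expand $VAR / $QUERY_KEY in each metric_format value and merge with input labels.

    Hypothetical helper for illustration only, not part of vmanomaly.
    """
    substitutions = {"VAR": var, "QUERY_KEY": query_key}
    labels = dict(input_labels)  # labels inherited from the input time series
    for name, template in metric_format.items():
        # safe_substitute leaves unknown placeholders untouched instead of raising
        labels[name] = Template(template).safe_substitute(substitutions)
    return labels


metric_format = {
    "__name__": "PREFIX1_$VAR",
    "for": "PREFIX2_$QUERY_KEY",
    "custom_label_1": "label_name_1",
}
input_labels = {"cpu": "1", "device": "eth0", "instance": "node-exporter:9100"}

labels = format_labels(metric_format, "anomaly_score", "query_name_1", input_labels)
print(labels["__name__"])  # PREFIX1_anomaly_score
print(labels["for"])       # PREFIX2_query_name_1
```

Calling the helper once per model output variable and per query alias yields one label set per resulting time series, which mirrors the list of metrics shown above.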
<!--
# TODO: uncomment and maintain after multimodel config refactor, 2nd priority
## NDJSON writer
Generates data in the same format as <code>/export</code>.
### Config parameters
<table>
<thead>
<tr>
<th>Parameter</th>
<th>Example</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>class</code></td>
<td><code>"writer.ndjson.NdjsonWriter"</code></td>
<td>Name of the class needed to enable writing into JSON line format file.</td>
</tr>
<tr>
<td><code>path</code></td>
<td><code>"data/output.ndjson"</code></td>
<td>Path to file in JSON line format</td>
</tr>
<tr>
<td><code>metric_format</code></td>
<td><code>__name__: "vmanomaly_$VAR"</code></td>
<td>Metrics to save the output (in metric names or labels). Must have <code>__name__</code> key. Must have a value with <code>$VAR</code> placeholder in it to distinguish between resulting metrics. Supported placeholders: <code>$VAR</code>, <code>$QUERY_KEY</code> and others as configured by the user.</td>
</tr>
<tr>
<td><code>override</code></td>
<td><code>True</code></td>
<td>Override file flag. Default True</td>
</tr>
</tbody>
</table>
Config example:
```yaml
writer:
class: "writer.ndjson.NdjsonWriter"
path: 'data/output.ndjson'
metric_format:
__name__: "$VAR"
for: "$QUERY_KEY"
config: "io_ndjson.yaml"
```
## CSV writer
### Config parameters
<table>
<thead>
<tr>
<th>Parameter</th>
<th>Type</th>
<th>Example</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>class</code></td>
<td>str</td>
<td><code>"writer.csv.CsvWriter"</code></td>
<td>Name of the class enabling CSV writer</td>
</tr>
<tr>
<td><code>header</code></td>
<td>bool</td>
<td><code>True</code></td>
<td>Whether to write header (column names). Default True</td>
</tr>
<tr>
<td><code>path</code></td>
<td>str</td>
<td><code>"data/jumpsup.csv"</code></td>
<td>Where to save the results</td>
</tr>
<tr>
<td><code>override</code></td>
<td>bool</td>
<td><code>True</code></td>
<td>Override file flag. Default True</td>
</tr>
<tr>
<td><code>tz</code></td>
<td>str</td>
<td><code>None</code></td>
<td>Optional. Convert default timestamps in UTC to desired timezone, e.g. 'US/Pacific'. By default local timezone is used</td>
</tr>
</tbody>
</table>
Config example:
```yaml
writer:
class: "writer.csv.CsvWriter"
path: "data/io_csv_out.csv"
header: True
override: True
```
-->

@ -0,0 +1,17 @@
---
title: Guides
weight: 2
# sort: 2
menu:
docs:
identifier: "anomaly-detection-guides"
parent: "anomaly-detection"
weight: 2
sort: 2
aliases:
- /anomaly-detection/guides.html
---
# Guides
This section holds guides on how to set up and use the VictoriaMetrics Anomaly Detection (or simply [`vmanomaly`](/vmanomaly.html)) service and integrate it with different observability components.

@ -0,0 +1,456 @@
---
weight: 1
# sort: 1
title: Getting started with vmanomaly
menu:
docs:
parent: "anomaly-detection-guides"
weight: 1
aliases:
- /anomaly-detection/guides/guide-vmanomaly-vmalert.html
---
# Getting started with vmanomaly
**Prerequisites**
- *vmanomaly* is part of the enterprise package. You can get a license key [here](https://victoriametrics.com/products/enterprise/trial) to try this tutorial.
- In the tutorial, we'll be using the following VictoriaMetrics components:
- [VictoriaMetrics](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html) (v.1.96.0)
- [vmalert](https://docs.victoriametrics.com/vmalert.html) (v.1.96.0)
- [vmagent](https://docs.victoriametrics.com/vmagent.html) (v.1.96.0)
If you're unfamiliar with the listed components, please read [QuickStart](https://docs.victoriametrics.com/Quick-Start.html) first.
- It is assumed that you are familiar with [Grafana](https://grafana.com/) (v.10.2.1), [Docker](https://docs.docker.com/get-docker/) and [Docker Compose](https://docs.docker.com/compose/).
## 1. What is vmanomaly?
*VictoriaMetrics Anomaly Detection* ([vmanomaly](https://docs.victoriametrics.com/vmanomaly.html)) is a service that continuously scans time series stored in VictoriaMetrics and detects unexpected changes within data patterns in real-time. It does so by utilizing user-configurable machine learning models.
All the service parameters are defined in a config file.
A single config file supports only one model. It is ok to run multiple vmanomaly processes, each using its own config.
**vmanomaly** does the following:
- periodically queries user-specified metrics
- computes an **anomaly score** for them
- pushes back the computed **anomaly score** to VictoriaMetrics.
### What is anomaly score?
**Anomaly score** is a calculated non-negative (in interval [0, +inf)) numeric value. It takes into account how well data fit a predicted distribution, periodical patterns, trends, seasonality, etc.
The value is designed to:
- *fall between 0 and 1* if the model considers that the datapoint follows the usual pattern,
- *exceed 1* if the datapoint is abnormal.
Then, users can enable alerting rules based on the **anomaly score** with [vmalert](#what-is-vmalert).
## 2. What is vmalert?
[vmalert](https://docs.victoriametrics.com/vmalert.html) is an alerting tool for VictoriaMetrics. It executes a list of the given alerting or recording rules against configured `-datasource.url`.
[Alerting rules](https://docs.victoriametrics.com/vmalert.html#alerting-rules) allow you to define conditions that, when met, will notify the user. The alerting condition is defined in a form of a query expression via [MetricsQL query language](https://docs.victoriametrics.com/MetricsQL.html). For example, in our case, the expression `anomaly_score > 1.0` will notify a user when the calculated anomaly score exceeds a threshold of 1.
## 3. How does vmanomaly work with vmalert?
Compared to classical alerting rules, anomaly detection is more "hands-off" and data-aware. Instead of thinking of critical conditions to define, a user can rely on catching anomalies that were not expected to happen. In other words, to set up alerting rules a user must know ahead of time what to look for, while anomaly detection looks for any deviations from past behavior.
A practical use case is to use the anomaly score generated by vmanomaly in alerting rules with some threshold.
**In this tutorial we are going to:**
- Configure docker-compose file with all needed services ([VictoriaMetrics](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html), [vmalert](https://docs.victoriametrics.com/vmalert.html), [vmagent](https://docs.victoriametrics.com/vmagent.html), [Grafana](https://grafana.com/), [Node Exporter](https://prometheus.io/docs/guides/node-exporter/) and [vmanomaly](https://docs.victoriametrics.com/vmanomaly.html) ).
- Explore configuration files for [vmanomaly](https://docs.victoriametrics.com/vmanomaly.html) and [vmalert](https://docs.victoriametrics.com/vmalert.html).
- Run our own [VictoriaMetrics](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html) database with data scraped from [Node Exporter](https://prometheus.io/docs/guides/node-exporter/).
- Explore data for analysis in [Grafana](https://grafana.com/).
- Explore vmanomaly results.
- Explore vmalert alerts
_____________________________
## 4. Data to analyze
Let's talk about data used for anomaly detection in this tutorial.
We are going to collect our own CPU usage data with [Node Exporter](https://prometheus.io/docs/guides/node-exporter/) into the VictoriaMetrics database.
On a Node Exporter's metrics page, part of the output looks like this:
```
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 94965.14
node_cpu_seconds_total{cpu="0",mode="iowait"} 51.25
node_cpu_seconds_total{cpu="0",mode="irq"} 0
node_cpu_seconds_total{cpu="0",mode="nice"} 0
node_cpu_seconds_total{cpu="0",mode="softirq"} 1682.18
node_cpu_seconds_total{cpu="0",mode="steal"} 0
node_cpu_seconds_total{cpu="0",mode="system"} 995.37
node_cpu_seconds_total{cpu="0",mode="user"} 12378.05
node_cpu_seconds_total{cpu="1",mode="idle"} 94386.53
node_cpu_seconds_total{cpu="1",mode="iowait"} 51.22
...
```
Here, metric `node_cpu_seconds_total` tells us how many seconds each CPU spent in different modes: _user_, _system_, _iowait_, _idle_, _irq&softirq_, _guest_, or _steal_.
These modes are mutually exclusive. A high _iowait_ means that you are disk or network bound, high _user_ or _system_ means that you are CPU bound.
The metric `node_cpu_seconds_total` is a [counter](https://docs.victoriametrics.com/keyConcepts.html#counter) type of metric. If we'd like to see how much time the CPU spent in each of the modes, we need to calculate the per-second rate of change via the [rate function](https://docs.victoriametrics.com/MetricsQL.html#rate): `rate(node_cpu_seconds_total)`.
Here is how this query may look in Grafana:
<img alt="node_cpu_rate_graph" src="guide-vmanomaly-vmalert_node-cpu-rate-graph.webp">
This query will generate 8 time series for each CPU, and we will use them as input for our VM Anomaly Detection. vmanomaly will train the configured model type separately for each of the time series.
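To make the counter-to-rate step more concrete, here is a simplified Python sketch. It is not the MetricsQL `rate` algorithm: the function name `simple_rate` is hypothetical, counter resets are ignored, and the sample values are made up for illustration.

```python
def simple_rate(samples):
    """Average per-second increase between the first and last (timestamp, value) sample.

    Simplified illustration: unlike MetricsQL rate(), counter resets are not handled.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    if t_last <= t_first:
        raise ValueError("timestamps must be increasing")
    return (v_last - v_first) / (t_last - t_first)


# Made-up samples of node_cpu_seconds_total{cpu="0",mode="idle"}, scraped every 10 seconds
idle_samples = [(0, 94965.14), (10, 94974.64), (20, 94984.14)]
print(round(simple_rate(idle_samples), 3))  # 0.95
```

A value of 0.95 would mean that this CPU spent about 95% of each second in the _idle_ mode over the sampled window.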
______________________________
## 5. vmanomaly configuration and parameter description
**Parameter description**:
There are 4 required sections in the config file:
* `scheduler` - defines how often to run and make inferences, as well as what time range to use to train the model,
* `model` - specific model parameters and configurations,
* `reader` - how to read data and where it is located,
* `writer` - where and how to write the generated output.

Let's look into the parameters in each section:
* `scheduler`
    * `infer_every` - how often trained models will make inferences on new data. Basically, how often to generate new datapoints for anomaly_score. Format examples: 30s, 4m, 2h, 1d. Time granularity ('s' - seconds, 'm' - minutes, 'h' - hours, 'd' - days).
      You can look at this as how often a model will write its conclusions on newly added data. In this example we are asking every 1 minute: based on the previous data, do these new datapoints look abnormal?
    * `fit_every` - how often to retrain the models. The higher the frequency, the fresher the model, but the more CPU it consumes. If omitted, the models will be retrained on each infer_every cycle. Format examples: 30s, 4m, 2h, 1d. Time granularity ('s' - seconds, 'm' - minutes, 'h' - hours, 'd' - days).
    * `fit_window` - what data interval to use for model training. Longer intervals capture longer historical behavior and detect seasonalities better, but are slower to adapt to permanent changes in metric behavior. The recommended value is at least two full seasons. Format examples: 30s, 4m, 2h, 1d. Time granularity ('s' - seconds, 'm' - minutes, 'h' - hours, 'd' - days).
      In this example, the previous 14 days of data are used for model training.
* `model`
    * `class` - what model to run. You can use your own model or choose from built-in models: Seasonal Trend Decomposition, Facebook Prophet, ZScore, Rolling Quantile, Holt-Winters, Isolation Forest and ARIMA. Here we use Facebook Prophet (`model.prophet.ProphetModel`).
    * `args` - model-specific parameters, represented as a YAML dictionary in a simple `key: value` form. For example, you can use parameters that are available in [FB Prophet](https://facebook.github.io/prophet/docs/quick_start.html).
* `reader`
    * `datasource_url` - data source. An HTTP endpoint that serves `/api/v1/query_range`.
    * `queries` - MetricsQL (extension of PromQL) expressions in which you want to find anomalies.
      You can put several queries in the form `<QUERY_ALIAS>: "QUERY"`. QUERY_ALIAS will be used as a `for` label in generated metrics and anomaly scores.
* `writer`
    * `datasource_url` - output destination. An HTTP endpoint that serves `/api/v1/import`.

Here is an example of the config file `vmanomaly_config.yml`.
<div class="with-copy" markdown="1">
``` yaml
scheduler:
  infer_every: "1m"
  fit_every: "2h"
  fit_window: "14d"
model:
  class: "model.prophet.ProphetModel"
  args:
    interval_width: 0.98
reader:
  datasource_url: "http://victoriametrics:8428/"
  queries:
    node_cpu_rate: "rate(node_cpu_seconds_total)"
writer:
  datasource_url: "http://victoriametrics:8428/"
```
```
</div>
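To get a feel for how the scheduler values relate to each other, the duration strings can be converted to seconds. The following Python sketch is an illustration only: the helper `duration_to_seconds` is hypothetical and supports just the `s`, `m`, `h`, `d` suffixes listed above, while vmanomaly's own parser may accept other forms.

```python
import re

# Seconds per supported time unit ('s', 'm', 'h', 'd')
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def duration_to_seconds(value):
    """Convert strings like '30s', '4m', '2h', '1d' to seconds (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)([smhd])", value.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]


scheduler = {"infer_every": "1m", "fit_every": "2h", "fit_window": "14d"}
seconds = {key: duration_to_seconds(value) for key, value in scheduler.items()}

# Number of inference runs between two consecutive retrainings
print(seconds["fit_every"] // seconds["infer_every"])  # 120
# Number of retraining periods covered by one training window
print(seconds["fit_window"] // seconds["fit_every"])   # 168
```

With the example config, each trained model serves 120 inference runs before it is refitted on the most recent 14 days of data.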
_____________________________________________
## 6. vmanomaly output
When running, vmanomaly produces the following metrics:
- `anomaly_score` - the main one. Ideally, if it is between 0.0 and 1.0 it is considered to be a non-anomalous value. If it is greater than 1.0, it is considered an anomaly (but you can reconfigure that in alerting config, of course),
- `yhat` - predicted expected value,
- `yhat_lower` - predicted lower boundary,
- `yhat_upper` - predicted upper boundary,
- `y` - initial query result value.
Here is an example of how an output metric will be written into VictoriaMetrics:
`anomaly_score{for="node_cpu_rate", cpu="0", instance="node-exporter:9100", job="node-exporter", mode="idle"} 0.85`
____________________________________________
## 7. vmalert configuration
Here we provide an example of the config for vmalert `vmalert_config.yml`.
<div class="with-copy" markdown="1">
``` yaml
groups:
  - name: AnomalyExample
    rules:
      - alert: HighAnomalyScore
        expr: 'anomaly_score > 1.0'
        labels:
          severity: warning
        annotations:
          summary: Anomaly Score exceeded 1.0. `rate(node_cpu_seconds_total)` is showing abnormal behavior.
```
</div>
In the query expression we need to put a condition on the generated anomaly scores. Usually, if the anomaly score is between 0.0 and 1.0, the analyzed value is not abnormal. The further the anomaly score exceeds 1, the more confident the model is that the value is an anomaly.
You can choose a threshold value that you consider reasonable based on the anomaly score metric generated by vmanomaly. One of the best ways is to estimate it visually, by plotting the `anomaly_score` metric along with the predicted "expected" range of `yhat_lower` and `yhat_upper`. Later in this tutorial we will show an example.
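Besides visual inspection, a quick numeric check can help compare candidate thresholds. The Python sketch below is only an illustration: the helper `share_above` is hypothetical and the scores are made-up values, not real vmanomaly output.

```python
def share_above(scores, threshold):
    """Fraction of anomaly scores strictly above the given threshold (hypothetical helper)."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(1 for score in scores if score > threshold) / len(scores)


# Made-up anomaly_score samples for one time series
scores = [0.2, 0.85, 1.1, 0.4, 2.5, 0.9, 1.05, 0.3]

for threshold in (1.0, 1.5, 2.0):
    print(threshold, share_above(scores, threshold))
# 1.0 0.375
# 1.5 0.125
# 2.0 0.125
```

If a threshold flags a much larger share of datapoints than you would expect to be anomalous, it is probably too low for that series.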
____________________________________________
## 8. Docker Compose configuration
Now we are going to configure the `docker-compose.yml` file to run all needed services.
Here are all services we are going to run:
<p align="center">
<img src="guide-vmanomaly-vmalert_docker-compose.webp" width="800" alt="Docker compose services">
</p>
* victoriametrics - VictoriaMetrics Time Series Database
* vmagent - is an agent which helps you collect metrics from various sources, relabel and filter the collected metrics and store them in VictoriaMetrics or any other storage systems via Prometheus remote_write protocol.
* [grafana](https://grafana.com/) - visualization tool.
* node-exporter - Prometheus [Node Exporter](https://prometheus.io/docs/guides/node-exporter/) exposes a wide variety of hardware- and kernel-related metrics.
* vmalert - VictoriaMetrics Alerting service.
* vmanomaly - VictoriaMetrics Anomaly Detection service.
### Grafana setup
To enable VictoriaMetrics datasource as the default in Grafana we need to create a file `datasource.yml`
<div class="with-copy" markdown="1">
``` yaml
apiVersion: 1
datasources:
  - name: VictoriaMetrics
    type: prometheus
    access: proxy
    url: http://victoriametrics:8428
    isDefault: true
```
</div>
### Prometheus config
Let's create `prometheus.yml` file for `vmagent` configuration.
<div class="with-copy" markdown="1">
``` yaml
global:
  scrape_interval: 10s
scrape_configs:
  - job_name: 'vmagent'
    static_configs:
      - targets: ['vmagent:8429']
  - job_name: 'vmalert'
    static_configs:
      - targets: ['vmalert:8880']
  - job_name: 'victoriametrics'
    static_configs:
      - targets: ['victoriametrics:8428']
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']
  - job_name: 'vmanomaly'
    static_configs:
      - targets: [ 'vmanomaly:8500' ]
```
</div>
### vmanomaly licensing
We are going to use a license stored locally in the file `vmanomaly_license.txt`, which contains the key.
You can explore other license options [here](https://docs.victoriametrics.com/vmanomaly.html#licensing).
### Docker-compose
Let's wrap it all up together into the `docker-compose.yml` file.
<div class="with-copy" markdown="1">
``` yaml
services:
  vmagent:
    container_name: vmagent
    image: victoriametrics/vmagent:latest
    depends_on:
      - "victoriametrics"
    ports:
      - 8429:8429
    volumes:
      - vmagentdata:/vmagentdata
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    command:
      - "--promscrape.config=/etc/prometheus/prometheus.yml"
      - "--remoteWrite.url=http://victoriametrics:8428/api/v1/write"
    networks:
      - vm_net
    restart: always
  victoriametrics:
    container_name: victoriametrics
    image: victoriametrics/victoria-metrics:v1.96.0
    ports:
      - 8428:8428
      - 8089:8089
      - 8089:8089/udp
      - 2003:2003
      - 2003:2003/udp
      - 4242:4242
    volumes:
      - vmdata:/storage
    command:
      - "--storageDataPath=/storage"
      - "--graphiteListenAddr=:2003"
      - "--opentsdbListenAddr=:4242"
      - "--httpListenAddr=:8428"
      - "--influxListenAddr=:8089"
      - "--vmalert.proxyURL=http://vmalert:8880"
    networks:
      - vm_net
    restart: always
  grafana:
    container_name: grafana
    image: grafana/grafana-oss:10.2.1
    depends_on:
      - "victoriametrics"
    ports:
      - 3000:3000
    volumes:
      - grafanadata:/var/lib/grafana
      - ./datasource.yml:/etc/grafana/provisioning/datasources/datasource.yml
    networks:
      - vm_net
    restart: always
  vmalert:
    container_name: vmalert
    image: victoriametrics/vmalert:latest
    depends_on:
      - "victoriametrics"
    ports:
      - 8880:8880
    volumes:
      - ./vmalert_config.yml:/etc/alerts/alerts.yml
    command:
      - "--datasource.url=http://victoriametrics:8428/"
      - "--remoteRead.url=http://victoriametrics:8428/"
      - "--remoteWrite.url=http://victoriametrics:8428/"
      - "--notifier.url=http://alertmanager:9093/"
      - "--rule=/etc/alerts/*.yml"
      # display source of alerts in grafana
      - "--external.url=http://127.0.0.1:3000" #grafana outside container
      # when copypaste the line be aware of '$$' for escaping in '$expr'
      - '--external.alert.source=explore?orgId=1&left=["now-1h","now","VictoriaMetrics",{"expr":{{$$expr|jsonEscape|queryEscape}} },{"mode":"Metrics"},{"ui":[true,true,true,"none"]}]'
    networks:
      - vm_net
    restart: always
  vmanomaly:
    container_name: vmanomaly
    image: us-docker.pkg.dev/victoriametrics-test/public/vmanomaly-trial:v1.7.2
    depends_on:
      - "victoriametrics"
    ports:
      - "8500:8500"
    networks:
      - vm_net
    restart: always
    volumes:
      - ./vmanomaly_config.yml:/config.yaml
      - ./vmanomaly_license.txt:/license.txt
    platform: "linux/amd64"
    command:
      - "/config.yaml"
      - "--license-file=/license.txt"
  node-exporter:
    image: quay.io/prometheus/node-exporter:latest
    container_name: node-exporter
    ports:
      - 9100:9100
    pid: host
    restart: unless-stopped
    networks:
      - vm_net
volumes:
  vmagentdata: {}
  vmdata: {}
  grafanadata: {}
networks:
  vm_net:
```
</div>
Before running our docker-compose make sure that your directory contains all required files:
<p align="center">
<img src="guide-vmanomaly-vmalert_files.webp" width="400" alt="all files">
</p>
This docker-compose file will pull docker images, set up each service and run them all together with the command:
<div class="with-copy" markdown="1">
```
docker-compose up -d
```
</div>
To check if vmanomaly is up and running you can check docker logs:
<div class="with-copy" markdown="1">
```
docker logs vmanomaly
```
</div>
___________________________________________________________
## 9. Model results
To look at the model results, open Grafana at `localhost:3000`.
vmanomaly needs some time to generate enough data to visualize.
Let's investigate model output visualization in Grafana.
In the Grafana Explore tab enter queries:
* `anomaly_score`
* `yhat`
* `yhat_lower`
* `yhat_upper`
Each of these metrics will contain the same labels that our query `rate(node_cpu_seconds_total)` returns.
### Anomaly scores for each metric with its according labels.
Query: `anomaly_score`
<img alt="Anomaly score graph" src="guide-vmanomaly-vmalert_anomaly-score.webp">
<br>Check whether the anomaly score is high for the datapoints you think are anomalies. If not, you can try other parameters in the config file or try another model type.
As you may notice, a lot of data shows an anomaly score greater than 1. This is expected, as we have just started to scrape and store data and there are not enough datapoints to train on. Wait for more data to be gathered to see how well this particular model can find anomalies. In our config, the training window (`fit_window`) is set to 14 days of data.
### Actual value from input query with predicted `yhat` metric.
Query: `yhat`
<img alt="yhat" src="guide-vmanomaly-vmalert_yhat.webp">
<br>Here we are using one particular set of metrics for visualization. Check out the difference between model prediction and actual values. If values are very different from prediction, it can be considered as anomalous.
### Lower and upper boundaries that model predicted.
Queries: `yhat_lower` and `yhat_upper`
<img alt="yhat lower and yhat upper" src="guide-vmanomaly-vmalert_yhat-lower-upper.webp">
Boundaries of 'normal' metric values according to model inference.
### Alerting
On the page `http://localhost:8880/vmalert/groups` you can find our configured Alerting rule:
<img alt="alert rule" src="guide-vmanomaly-vmalert_alert-rule.webp">
According to the rule configured for vmalert, an alert fires when the anomaly score exceeds 1. You will see the alert on the Alerts tab: `http://localhost:8880/vmalert/alerts`
<img alt="alerts firing" src="guide-vmanomaly-vmalert_alerts-firing.webp">
## 10. Conclusion
Now we know how to set up the VictoriaMetrics Anomaly Detection tool and use it together with vmalert. We also explored the core metrics generated by vmanomaly and their behaviour.

Binary file not shown.

@ -12,12 +12,13 @@ aliases:
# vmanomaly
**_vmanomaly is a part of [enterprise package](https://docs.victoriametrics.com/enterprise.html). You need to request a [free trial license](https://victoriametrics.com/products/enterprise/trial/) for evaluation.
Please [contact us](https://victoriametrics.com/contact-us/) to find out more._**
**_vmanomaly_ is a part of [enterprise package](https://docs.victoriametrics.com/enterprise.html). You need to request a [free trial license](https://victoriametrics.com/products/enterprise/trial/) for evaluation.**
Please head to the [Anomaly Detection section](/anomaly-detection) to find out more.
## About
**VictoriaMetrics Anomaly Detection** is a service that continuously scans VictoriaMetrics time
**VictoriaMetrics Anomaly Detection** (or simply `vmanomaly`) is a service that continuously scans VictoriaMetrics time
series and detects unexpected changes within data patterns in real-time. It does so by utilizing
user-configurable machine learning models.
@ -48,9 +49,10 @@ processes in parallel, each using its own config.
## Models
Currently, vmanomaly ships with a few common models:
Currently, vmanomaly ships with a set of built-in models:
> For a detailed description, see [model section](/anomaly-detection/components/models)
1. **ZScore**
1. [**ZScore**](/anomaly-detection/components/models/models.html#z-score)
_(useful for testing)_
@ -58,7 +60,7 @@ Currently, vmanomaly ships with a few common models:
from time-series mean (straight line). Keeps only two model parameters internally:
`mean` and `std` (standard deviation).
1. **Prophet**
1. [**Prophet**](/anomaly-detection/components/models/models.html#prophet)
_(simplest in configuration, recommended for getting started)_
@ -72,35 +74,40 @@ Currently, vmanomaly ships with a few common models:
See [Prophet documentation](https://facebook.github.io/prophet/)
1. **Holt-Winters**
1. [**Holt-Winters**](/anomaly-detection/components/models/models.html#holt-winters)
Very popular forecasting algorithm. See [statsmodels.org documentation](
https://www.statsmodels.org/stable/generated/statsmodels.tsa.holtwinters.ExponentialSmoothing.html)
for Holt-Winters exponential smoothing.
1. **Seasonal-Trend Decomposition**
1. [**Seasonal-Trend Decomposition**](/anomaly-detection/components/models/models.html#seasonal-trend-decomposition)
Extracts three components: season, trend, and residual, that can be plotted individually for
easier debugging. Uses LOESS (locally estimated scatterplot smoothing).
See [statsmodels.org documentation](https://www.statsmodels.org/dev/examples/notebooks/generated/stl_decomposition.html)
for LOESS STL.
1. **ARIMA**
1. [**ARIMA**](/anomaly-detection/components/models/models.html#arima)
Commonly used forecasting model. See [statsmodels.org documentation](https://www.statsmodels.org/stable/generated/statsmodels.tsa.arima.model.ARIMA.html) for ARIMA.
1. **Rolling Quantile**
1. [**Rolling Quantile**](/anomaly-detection/components/models/models.html#rolling-quantile)
A simple moving window of quantiles. Easy to use, easy to understand, but not as powerful as
other models.
1. **Isolation Forest**
1. [**Isolation Forest**](/anomaly-detection/components/models/models.html#isolation-forest-multivariate)
Detects anomalies using binary trees. It works for both univariate and multivariate data. Be aware of [the curse of dimensionality](https://en.wikipedia.org/wiki/Curse_of_dimensionality) in the case of multivariate data - we advise against using a single model when handling multiple time series *if the number of these series significantly exceeds their average length (# of data points)*.
The algorithm has a linear time complexity and a low memory requirement, which works well with high-volume data. See [scikit-learn.org documentation](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.IsolationForest.html) for Isolation Forest.
1. [**MAD (Median Absolute Deviation)**](anomaly-detection/components/models/models.html#mad-median-absolute-deviation)
A robust method for anomaly detection that is less sensitive to outliers in data compared to standard deviation-based models. It considers a point as an anomaly if the absolute deviation from the median is significantly large.
### Examples
For example, here's how Prophet predictions could look on a real-data example
(Prophet auto-detected seasonality interval):
@ -126,20 +133,22 @@ optionally preserving labels).
## Usage
> Starting from v1.5.0, vmanomaly requires a license key to run. You can obtain a trial license key [here](https://victoriametrics.com/products/enterprise/trial/).
> Starting from [v1.5.0](/anomaly-detection/CHANGELOG.html#v150), vmanomaly requires a license key to run. You can obtain a trial license key [here](https://victoriametrics.com/products/enterprise/trial/).
> See [Getting started guide](https://docs.victoriametrics.com/guides/guide-vmanomaly-vmalert.html).
> See [Getting started guide](anomaly-detection/guides/guide-vmanomaly-vmalert.html).
### Config file
There are 4 required sections in config file:
* `scheduler` - defines how often to run and make inferences, as well as what timerange to use to train the model.
* `model` - specific model parameters and configurations,
* `reader` - how to read data and where it is located
* `writer` - where and how to write the generated output.
* [`scheduler`](/anomaly-detection/components/scheduler.html) - defines how often to run and make inferences, as well as what timerange to use to train the model.
* [`model`](/anomaly-detection/components/models) - specific model parameters and configurations,
* [`reader`](/anomaly-detection/components/reader.html) - how to read data and where it is located
* [`writer`](/anomaly-detection/components/writer.html) - where and how to write the generated output.
[`monitoring`](#monitoring) - defines how to monitor work of *vmanomaly* service. This config section is *optional*.
> For a detailed description, see [config sections](/anomaly-detection/docs)
#### Config example
Here is an example of config file that will run FB Prophet model, that will be retrained every 2 hours on 14 days of previous data. It will generate inference (including `anomaly_score` metric) every 1 minute.
@ -171,6 +180,8 @@ writer:
*vmanomaly* can be monitored by using push or pull approach.
It can push metrics to VictoriaMetrics or expose metrics in Prometheus exposition format.
> For a detailed description, see [monitoring section](/anomaly-detection/components/monitoring.html)
#### Push approach
*vmanomaly* can push metrics to VictoriaMetrics single-node or cluster version.