### Describe Your Changes
docs for `vmanomaly`, updated after release 1.16.2
### Checklist
The following checks are **mandatory**:
- [x] My change adheres to [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).
(cherry picked from commit b2e7b05918)
title | weight | menu | aliases |
---|---|---|---|
Reader | 2 | | |
VictoriaMetrics Anomaly Detection (`vmanomaly`) primarily uses `VmReader` to ingest data. This reader fetches time-series data directly from VictoriaMetrics, using MetricsQL expressions to aggregate, filter and group your data.
Future updates will introduce additional readers, expanding the range of data sources `vmanomaly` can work with.
VM reader
Note: Starting from v1.13.0, the `queries` arg of `VmReader` has a new, backward-compatible format. The new format allows specifying per-query parameters, like `step`, to reduce the amount of data read from VictoriaMetrics TSDB and to make the config more flexible. See the per-query parameters section for details.
A config in the old format, like
# other config sections ...
reader:
  class: 'vm'
  datasource_url: 'http://localhost:8428'  # source victoriametrics/prometheus
  sampling_period: "10s"  # set it <= min(infer_every) in schedulers section
  queries:
    # old format {query_alias: query_expr}, prior to 1.13, will be converted to a new format automatically
    vmb: 'avg(vm_blocks)'
is converted to the new format, with a warning raised in the logs:
# other config sections ...
reader:
  class: 'vm'
  datasource_url: 'http://localhost:8428'  # source victoriametrics/prometheus
  sampling_period: '10s'
  queries:
    # new format, produced automatically from the old {query_alias: query_expr} form used prior to 1.13
    vmb:
      expr: 'avg(vm_blocks)'  # initial MetricsQL expression
      step: '10s'  # individual step for this query, will be filled with `sampling_period` from the root level
      data_range: ['-inf', 'inf']  # by default, no constraints applied on data range
      # new query-level arguments will be added in backward-compatible way in future releases
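The conversion described above can be sketched in Python. This is only an illustration of the mapping between the two formats, not vmanomaly's actual code: the function name `normalize_queries` and its signature are hypothetical, and the defaults are taken from the converted config shown above.

```python
from typing import Any, Dict, Union

# A query is either a bare expression (old format) or a dict of sub-fields (new format).
QuerySpec = Union[str, Dict[str, Any]]


def normalize_queries(
    queries: Dict[str, QuerySpec], sampling_period: str
) -> Dict[str, Dict[str, Any]]:
    """Hypothetical helper: bring every query to the per-query (v1.13.0+) shape."""
    normalized: Dict[str, Dict[str, Any]] = {}
    for alias, spec in queries.items():
        if isinstance(spec, str):
            # Old format: the value is the MetricsQL expression itself.
            spec = {"expr": spec}
        entry = dict(spec)  # copy, so the caller's config is not mutated
        # Missing sub-fields fall back to the defaults from the example above.
        entry.setdefault("step", sampling_period)
        entry.setdefault("data_range", ["-inf", "inf"])
        normalized[alias] = entry
    return normalized


if __name__ == "__main__":
    old_style = {"vmb": "avg(vm_blocks)"}
    print(normalize_queries(old_style, sampling_period="10s"))
```

Queries already written in the new format pass through unchanged, apart from having any missing sub-fields filled in.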
Per-query parameters
Starting from v1.13.0, the format of the `queries` arg has changed. Each query alias now supports the following (sub)fields:
- `expr` (string): MetricsQL/PromQL expression that defines an input for `VmReader`, as accepted by `/query_range?query=%s`, e.g. `avg(vm_blocks)`.
- `step` (string): query-level frequency of the points returned, e.g. `30s`. It is converted to the `/query_range?step=%s` param (in seconds). Useful for reducing the total amount of data read from VictoriaMetrics when different queries feed different machine learning models at different frequencies.
  Note: if not set explicitly (or if the older config style, prior to v1.13.0, is used), it is set to the reader-level `sampling_period` arg.
  Note: having different individual `step` args for queries (e.g. `30s` for `q1` and `2m` for `q2`) is not yet supported for a multivariate model if you want to run it on several queries simultaneously (i.e. setting the `queries` arg of a model to [`q1`, `q2`]).
- `data_range` (list[float | string]): Introduced in v1.15.1, it allows defining valid data ranges for input per individual query in `queries`, resulting in:
  - High anomaly scores (>1) when the data falls outside the expected range, indicating a data constraint violation.
  - Lowest anomaly scores (=0) when the model's predictions (`yhat`) fall outside the expected range, meaning uncertain predictions.
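To illustrate how a `step` string such as `30s` or `2m` maps to the `/query_range?step=%s` param in seconds, here is a minimal Python sketch. The helper names (`duration_to_seconds`, `query_range_params`) are hypothetical, only simple single-unit durations are handled, and the `start`/`end` values are placeholder Unix timestamps; how vmanomaly performs this conversion internally is not shown in this document.

```python
import re
from urllib.parse import urlencode

# Seconds per supported unit (assumption: single-unit durations only, e.g. "30s", "2m").
_UNIT_SECONDS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def duration_to_seconds(duration: str) -> float:
    """Convert a duration string like '30s' or '2m' to seconds."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s|m|h|d|w)", duration.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {duration!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_SECONDS[unit]


def query_range_params(expr: str, step: str, start: int, end: int) -> str:
    """Build the URL-encoded query string for a range query."""
    return urlencode(
        {
            "query": expr,
            "step": f"{duration_to_seconds(step):g}",  # step is sent in seconds
            "start": start,
            "end": end,
        }
    )


if __name__ == "__main__":
    print(query_range_params("avg(vm_blocks)", "2m", 1700000000, 1700003600))
```

A larger `step` returns fewer points per series over the same time range, which is where the reduction in data read comes from.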
Per-query config example
reader:
  class: 'vm'
  sampling_period: '1m'
  # other reader params ...
  queries:
    ingestion_rate:
      expr: 'sum(rate(vm_rows_inserted_total[5m])) by (type) > 0'
      step: '2m'  # overrides global `sampling_period` of 1m
      data_range: [10, 'inf']  # meaning only positive values > 10 are expected, i.e. a value `y` < 10 will trigger anomaly score > 1
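The range check implied by `data_range` can be sketched as follows. This covers only the membership test, not how a model turns a violation into an anomaly score; the function name `in_data_range` is hypothetical, and treating the bounds as inclusive is an assumption.

```python
from typing import Sequence, Union

# A bound is a number or one of the strings '-inf' / 'inf'.
Bound = Union[float, int, str]


def in_data_range(y: float, data_range: Sequence[Bound]) -> bool:
    """Return True if `y` lies within [lower, upper] (inclusive bounds assumed)."""
    # float() accepts both numbers and the strings 'inf' / '-inf'.
    lower, upper = (float(bound) for bound in data_range)
    return lower <= y <= upper


if __name__ == "__main__":
    data_range = [10, "inf"]  # same range as in the config example above
    for y in (5, 10, 250):
        status = "expected" if in_data_range(y, data_range) else "constraint violation"
        print(y, status)
```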
Config parameters
Parameter | Example | Description |
---|---|---|
`class` | `reader.vm.VmReader` (or `vm` starting from v1.13.0) | Name of the class needed to enable reading from VictoriaMetrics or Prometheus. VmReader is the default option if not specified. |
`queries` | See the per-query config example above | See the per-query parameters section above |
`datasource_url` | `http://localhost:8481/` | Datasource URL address |
`tenant_id` | `0:0`, `multitenant` | For VictoriaMetrics Cluster version only. Tenants are identified by `accountID` or `accountID:projectID`. Starting from v1.16.2, the `multitenant` endpoint is supported, to execute queries over multiple tenants. See VictoriaMetrics Cluster multitenancy docs |
`sampling_period` | `1h` | Frequency of the points returned. It is converted to the `/query_range?step=%s` param (in seconds). Required since v1.9.0. |
`query_range_path` | `/api/v1/query_range` | Performs PromQL/MetricsQL range query |
`health_path` | `health` | Absolute or relative URL address where to check availability of the datasource. |
`user` | `USERNAME` | BasicAuth username |
`password` | `PASSWORD` | BasicAuth password |
`timeout` | `30s` | Timeout for the requests, passed as a string |
`verify_tls` | `false` | Allows disabling TLS verification of the remote certificate. |
`bearer_token` | `token` | Token, passed in the standard format with the header `Authorization: bearer {token}` |
`bearer_token_file` | `path_to_file` | Path to a file containing the token, which is passed in the standard format with the header `Authorization: bearer {token}`. Available since v1.15.9 |
`extra_filters` | `[]` | List of strings with series selectors. See: Prometheus querying API enhancements |
`query_from_last_seen_timestamp` | `False` | If True, the query is performed from the last seen timestamp for a given series. If False, the query is performed from the start timestamp, based on the schedule period. Defaults to `False`. Useful for infer stages in case some earlier infer calls were skipped. |
`latency_offset` | `1ms` | Introduced in v1.15.1, it allows overriding the default `-search.latencyOffset` flag of VictoriaMetrics (30s). The default value is 1ms, which should help in cases where `sampling_period` is short (10-60s) and equals `infer_every` in the PeriodicScheduler. This prevents `service - WARNING - [Scheduler [scheduler_alias]] No data available for inference.` warnings in logs and allows for consecutive infer calls without gaps. To restore the old behavior, set it equal to your `-search.latencyOffset` flag value. |
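As a rough illustration of how `datasource_url`, `tenant_id`, `query_range_path` and the bearer-token parameters could combine into a request, consider the Python sketch below. The `/select/<tenant>/prometheus` path layout follows the VictoriaMetrics Cluster URL format; the helper names and the exact way vmanomaly composes URLs and headers are assumptions made for this example.

```python
from pathlib import Path
from typing import Dict, Optional


def build_query_range_url(
    datasource_url: str,
    tenant_id: Optional[str] = None,
    query_range_path: str = "/api/v1/query_range",
) -> str:
    """Hypothetical helper: compose the range-query URL for single-node or cluster."""
    base = datasource_url.rstrip("/")
    path = "/" + query_range_path.lstrip("/")
    if tenant_id is None:
        # Single-node VictoriaMetrics or Prometheus: no tenant prefix.
        return base + path
    # Cluster version: tenant is "accountID", "accountID:projectID" or "multitenant".
    return f"{base}/select/{tenant_id}/prometheus{path}"


def build_auth_headers(
    bearer_token: Optional[str] = None,
    bearer_token_file: Optional[str] = None,
) -> Dict[str, str]:
    """Hypothetical helper: build the Authorization header from a token or a token file."""
    if bearer_token_file is not None:
        # Assumption in this sketch: the file, when given, takes precedence.
        bearer_token = Path(bearer_token_file).read_text().strip()
    if bearer_token is None:
        return {}
    return {"Authorization": f"bearer {bearer_token}"}


if __name__ == "__main__":
    print(build_query_range_url("http://localhost:8481/", tenant_id="0:0"))
    print(build_query_range_url("http://localhost:8481/", tenant_id="multitenant"))
    print(build_auth_headers(bearer_token="token"))
```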
Config file example:
reader:
  class: "vm"  # or "reader.vm.VmReader" until v1.13.0
  datasource_url: "https://play.victoriametrics.com/"
  tenant_id: "0:0"
  queries:
    ingestion_rate:
      expr: 'sum(rate(vm_rows_inserted_total[5m])) by (type) > 0'
      step: '1m'  # can override global `sampling_period` on per-query level
      data_range: [0, 'inf']
  sampling_period: '1m'
  query_from_last_seen_timestamp: True  # false by default
  latency_offset: '1ms'
Healthcheck metrics
`VmReader` exposes several healthcheck metrics.