VictoriaMetrics/docs/anomaly-detection/components/writer.md
Fred Navruzov 2ba72aecf7
docs/vmanomaly: remove duplicate header in VmWriter docs ()
2024-10-07 14:46:20 +02:00

---
title: Writer
weight: 4
menu:
  docs:
    parent: vmanomaly-components
    weight: 4
aliases:
  - /anomaly-detection/components/writer.html
---

To export data, VictoriaMetrics Anomaly Detection (vmanomaly) primarily uses VmWriter, which writes the produced anomaly scores back to VictoriaMetrics, preserving the initial label set and optionally adding extra labels.

Future updates will introduce additional export methods, offering users more flexibility in data handling and integration.

VM writer

Config parameters


| Parameter | Example | Description |
|---|---|---|
| `class` | `writer.vm.VmWriter` or `vm` (starting from v1.13.0) | Name of the class needed to enable writing to VictoriaMetrics or Prometheus. VmWriter is the default option if not specified. |
| `datasource_url` | `http://localhost:8481/` | Datasource URL address. |
| `tenant_id` | `0:0`, `multitenant` (starting from v1.16.2) | For VictoriaMetrics Cluster version only. Tenants are identified by accountID or accountID:projectID. Starting from v1.16.2, the multitenant endpoint is supported for writing data to multiple tenants. See the VictoriaMetrics Cluster multitenancy docs. |
| `metric_format` | `__name__: "vmanomaly_$VAR"`, `for: "$QUERY_KEY"`, `run: "test_metric_format"`, `config: "io_vm_single.yaml"` | Metrics to save the output (in metric names or labels). Must have a `__name__` key. Must have a value with the `$VAR` placeholder in it to distinguish between resulting metrics. Supported placeholders: `$VAR`, the variables that the model provides (all models provide `anomaly_score`, `y`, `yhat`, `yhat_lower`, `yhat_upper`; depending on the model type, more can be provided, such as `trend` or `seasonality`; see the description of the standard vmanomaly output), and `$QUERY_KEY`, e.g. `ingestion_rate`. Other keys are meant to be configured by the user to help identify generated metrics, e.g. a specific config file name. More details are in the Metrics formatting section. |
| `import_json_path` | `/api/v1/import` | Optional, to override the default import path. |
| `health_path` | `/health` | Absolute or relative URL address where to check the availability of the datasource. Optional, to override the default `/health` path. |
| `user` | `USERNAME` | BasicAuth username. |
| `password` | `PASSWORD` | BasicAuth password. |
| `timeout` | `5s` | Timeout for the requests, passed as a string. |
| `verify_tls` | `false` | Allows disabling TLS verification of the remote certificate. |
| `bearer_token` | `token` | Token is passed in the standard format with the header `Authorization: bearer {token}`. |
| `bearer_token_file` | `path_to_file` | Path to a file containing the token, which is passed in the standard format with the header `Authorization: bearer {token}`. Available since v1.15.9. |
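
To illustrate the two bearer-token options, here is a minimal Python sketch that builds the `Authorization` header from either an inline token or a token file. The helper name `build_auth_header`, the whitespace stripping, and the precedence of the inline token over the file are assumptions made for illustration. This is not vmanomaly's actual implementation.

```python
from pathlib import Path
from typing import Dict, Optional


def build_auth_header(bearer_token: Optional[str] = None,
                      bearer_token_file: Optional[str] = None) -> Dict[str, str]:
    """Hypothetical helper: return HTTP headers for bearer authentication."""
    token = bearer_token
    if token is None and bearer_token_file is not None:
        # Assumption: the file contains only the token, possibly followed
        # by a trailing newline, which is stripped here.
        token = Path(bearer_token_file).read_text().strip()
    if not token:
        # No token configured: send no Authorization header.
        return {}
    # Header format as described in the parameter table.
    return {"Authorization": f"bearer {token}"}
```

The returned dictionary can be passed as the `headers` argument of any HTTP client.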

Config example:

writer:
  class: "vm"  # or "writer.vm.VmWriter" until v1.13.0
  datasource_url: "http://localhost:8428/"
  tenant_id: "0:0"
  metric_format:
    __name__: "vmanomaly_$VAR"
    for: "$QUERY_KEY"
    run: "test_metric_format"
    config: "io_vm_single.yaml"
  import_json_path: "/api/v1/import"
  health_path: "health"
  user: "foo"
  password: "bar"

Multitenancy support

This feature applies to the VictoriaMetrics Cluster version only. Tenants are identified by either accountID or accountID:projectID. Starting with v1.16.2, the multitenant endpoint is supported for writing data across multiple tenants. For more details, refer to the VictoriaMetrics Cluster multitenancy documentation.

Please note the different behaviors depending on the tenant_id value:

  1. When writer.tenant_id != 'multitenant' (e.g., "0:0") and reader.tenant_id != 'multitenant' (it can be a different but valid value, such as "0:1"):

    • The vm_account_id label is not created in the reader, not persisted to the writer, and is not expected in the output.
    • Result: Data is written successfully with no logs or errors.
  2. When writer.tenant_id = 'multitenant' and vm_account_id is present in the label set:

    • This typically happens when reader.tenant_id is also set to multitenant, meaning the vm_account_id label is stored in the results returned from the queries.
    • Result: Everything functions as expected. Data is written successfully with no logs or errors.
  3. When writer.tenant_id = 'multitenant' but vm_account_id is missing (e.g., due to aggregation in the reader or missing keep_metric_names in the query):

    • Result: The data is still written to "0:0", but a warning is raised:
    The label `vm_account_id` was not found in the label set of {query_result.key}, 
    but tenant_id='multitenant' is set in writer. The data will be written to the default tenant 0:0. 
    Ensure that the query retains the necessary multi-tenant labels, 
    or adjust the aggregation settings to preserve `vm_account_id` key in the label set.
    
  4. When writer.tenant_id != 'multitenant' (e.g., "0:0") and vm_account_id exists in the label set:

    • Result: Writing is allowed, but a warning is raised:
    The label set for the metric {query_result.key} contains multi-tenancy labels, 
    but the write endpoint is configured for single-tenant mode (tenant_id != 'multitenant'). 
    Either adjust the query in the reader to avoid multi-tenancy labels 
    or ensure that reserved key `vm_account_id` is not explicitly set for single-tenant environments.
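
The four cases above can be summarized as a small decision function. The following Python sketch is only an illustration of the described behavior: the function name `resolve_write_tenant`, its return shape, and the shortened warning texts are hypothetical and are not taken from vmanomaly's code.

```python
from typing import Dict, Optional, Tuple

MULTITENANT = "multitenant"
DEFAULT_TENANT = "0:0"


def resolve_write_tenant(writer_tenant_id: str,
                         labels: Dict[str, str]) -> Tuple[str, Optional[str]]:
    """Return (write target, warning or None) for one series (hypothetical)."""
    has_account = "vm_account_id" in labels
    if writer_tenant_id == MULTITENANT:
        if has_account:
            # Case 2: tenant labels are present, so the series goes through
            # the multitenant endpoint without any warning.
            return MULTITENANT, None
        # Case 3: the label is missing, so data falls back to the default tenant.
        return DEFAULT_TENANT, "vm_account_id not found; writing to default tenant 0:0"
    if has_account:
        # Case 4: single-tenant writer, but the series carries tenant labels.
        return writer_tenant_id, "multi-tenancy labels found, but writer is single-tenant"
    # Case 1: plain single-tenant write.
    return writer_tenant_id, None
```

For example, `resolve_write_tenant("multitenant", {"instance": "a"})` falls into case 3 and returns the default tenant together with a warning.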
    

Healthcheck metrics

VmWriter exposes several healthcheck metrics.

Metrics formatting

Two parameters are mandatory in metric_format: __name__ and for.

__name__: PREFIX1_$VAR
for: PREFIX2_$QUERY_KEY
  • The __name__ parameter names the metrics returned by models, e.g. PREFIX1_anomaly_score, PREFIX1_yhat_lower, etc. The output metric names are listed in the vmanomaly output description.
  • The for parameter adds the label for with values PREFIX2_query_name_1, PREFIX2_query_name_2, etc. Query names are the aliases set in the queries parameter of the reader section of the config.

It is possible to specify other custom label names needed. For example:

custom_label_1: label_name_1
custom_label_2: label_name_2

Apart from the specified labels, output metrics carry the labels inherited from the input metrics returned by queries. For example, if the input data contains labels such as cpu=1, device=eth0, and instance=node-exporter:9100, all of these labels will be present in the vmanomaly output metrics.

For example, with the following metric_format section:

metric_format:
    __name__: "PREFIX1_$VAR"
    for: "PREFIX2_$QUERY_KEY"
    custom_label_1: label_name_1
    custom_label_2: label_name_2

the output metrics look like this:

{__name__="PREFIX1_anomaly_score", for="PREFIX2_query_name_1", custom_label_1="label_name_1", custom_label_2="label_name_2", cpu="1", device="eth0", instance="node-exporter:9100"}
{__name__="PREFIX1_yhat_lower", for="PREFIX2_query_name_1", custom_label_1="label_name_1", custom_label_2="label_name_2", cpu="1", device="eth0", instance="node-exporter:9100"}
{__name__="PREFIX1_anomaly_score", for="PREFIX2_query_name_2", custom_label_1="label_name_1", custom_label_2="label_name_2", cpu="1", device="eth0", instance="node-exporter:9100"}
{__name__="PREFIX1_yhat_lower", for="PREFIX2_query_name_2", custom_label_1="label_name_1", custom_label_2="label_name_2", cpu="1", device="eth0", instance="node-exporter:9100"}
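
The placeholder expansion described in this section can be sketched in a few lines of Python. The function name `format_labels` is hypothetical, and the rule that metric_format keys override inherited labels with the same name is an assumption made for illustration. The sketch only mirrors the documented behavior and is not vmanomaly's implementation.

```python
from typing import Dict


def format_labels(metric_format: Dict[str, str], var: str, query_key: str,
                  input_labels: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical sketch: build the label set of one output series."""
    for required in ("__name__", "for"):
        if required not in metric_format:
            raise ValueError(f"metric_format must contain the {required!r} key")
    if "$VAR" not in metric_format["__name__"]:
        raise ValueError("__name__ must contain the $VAR placeholder")
    # Start from the labels inherited from the input series.
    labels = dict(input_labels)
    # Assumption: configured keys win over inherited labels on a name clash.
    for key, template in metric_format.items():
        labels[key] = template.replace("$VAR", var).replace("$QUERY_KEY", query_key)
    return labels


metric_format = {
    "__name__": "PREFIX1_$VAR",
    "for": "PREFIX2_$QUERY_KEY",
    "custom_label_1": "label_name_1",
}
series = format_labels(metric_format, "anomaly_score", "query_name_1",
                       {"cpu": "1", "device": "eth0"})
print(series["__name__"], series["for"])
```

Calling the function once per model output variable and per query alias yields one label set per output series, as in the listing above.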