---
# sort: 5
title: Monitoring
weight: 5
menu:
  docs:
    parent: "vmanomaly-components"
    weight: 5
    # sort: 5
aliases:
  - /anomaly-detection/components/monitoring.html
---

# Monitoring

There are two models for monitoring VictoriaMetrics Anomaly Detection behavior: [push](https://docs.victoriametrics.com/keyConcepts.html#push-model) and [pull](https://docs.victoriametrics.com/keyConcepts.html#pull-model). Parameters for each of them are specified in the `monitoring` section of the config file.

## Pull Model Config parameters

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Default</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>addr</code></td>
<td><code>"0.0.0.0"</code></td>
<td>Server IP Address</td>
</tr>
<tr>
<td><code>port</code></td>
<td><code>8080</code></td>
<td>Port</td>
</tr>
</tbody>
</table>

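As a rough illustration of the pull model, the sketch below builds the scrape URL from the `addr`/`port` parameters and fetches the `/metrics` endpoint with Python's standard library. The helper names `metrics_url` and `fetch_metrics` are hypothetical, and `localhost` is an assumed host: it presumes that the default `addr` of `"0.0.0.0"` means "listen on all interfaces", so the service is reachable locally.

```python
from urllib.request import urlopen


def metrics_url(host: str, port: int = 8080) -> str:
    """Build the scrape URL for the pull endpoint (8080 is the documented default port)."""
    return f"http://{host}:{port}/metrics"


def fetch_metrics(host: str = "localhost", port: int = 8080, timeout: float = 5.0) -> str:
    """Fetch the raw metrics text.

    Requires a running vmanomaly instance with the `pull` section enabled;
    the host is an assumption, adjust it to your deployment.
    """
    with urlopen(metrics_url(host, port), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

In practice a scraper such as vmagent or Prometheus would poll this URL; the function is only meant for a quick manual check.
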
## Push Config parameters

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Default</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>url</code></td>
<td></td>
<td>URL to push metrics to. Example: <code>"http://localhost:8480/"</code></td>
</tr>
<tr>
<td><code>tenant_id</code></td>
<td></td>
<td>Tenant ID for the cluster version. Example: <code>"0:0"</code></td>
</tr>
<tr>
<td><code>health_path</code></td>
<td><code>"health"</code></td>
<td>Absolute path that overrides the <code>/health</code> path</td>
</tr>
<tr>
<td><code>user</code></td>
<td></td>
<td>BasicAuth username</td>
</tr>
<tr>
<td><code>password</code></td>
<td></td>
<td>BasicAuth password</td>
</tr>
<tr>
<td><code>timeout</code></td>
<td><code>"5s"</code></td>
<td>Stop waiting for a response after the given number of seconds.</td>
</tr>
<tr>
<td><code>extra_labels</code></td>
<td></td>
<td>Section for user-defined custom labels.</td>
</tr>
</tbody>
</table>

## Monitoring section config example

<div class="with-copy" markdown="1">

``` yaml
monitoring:
  pull: # Enable /metrics endpoint.
    addr: "0.0.0.0"
    port: 8080
  push:
    url: "http://localhost:8480/"
    tenant_id: "0:0" # For cluster version only
    health_path: "health"
    user: "USERNAME"
    password: "PASSWORD"
    timeout: "5s"
    extra_labels:
      job: "vmanomaly-push"
      test: "test-1"
```
</div>

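To make the documented defaults concrete, here is a minimal sketch that fills them into an already-parsed `monitoring` section (a plain Python dict, as a YAML loader would produce). The helper `with_defaults` and the two constant names are hypothetical and not part of vmanomaly. The sketch also assumes that a sub-section takes effect only when it is present in the config.

```python
from copy import deepcopy

# Defaults copied from the parameter tables above; parameters without a
# documented default are simply left out.
PULL_DEFAULTS = {"addr": "0.0.0.0", "port": 8080}
PUSH_DEFAULTS = {"health_path": "health", "timeout": "5s"}


def with_defaults(monitoring: dict) -> dict:
    """Return a copy of a `monitoring` section with documented defaults filled in.

    Only sub-sections that are present (`pull`, `push`) are touched, so a
    config that uses one model does not silently gain the other.
    """
    result = deepcopy(monitoring)
    for section, defaults in (("pull", PULL_DEFAULTS), ("push", PUSH_DEFAULTS)):
        if section in result:
            # A bare `pull:` key parses to None in YAML; treat it as empty.
            result[section] = {**defaults, **(result[section] or {})}
    return result


example = with_defaults({"pull": None, "push": {"url": "http://localhost:8480/"}})
print(example)
```

Explicitly configured values always win over the defaults, because the user-supplied mapping is merged last.
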
## Metrics generated by vmanomaly

<table>
<thead>
<tr>
<th>Metric</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>vmanomaly_start_time_seconds</code></td>
<td>Gauge</td>
<td>vmanomaly start time in UNIX time</td>
</tr>
</tbody>
</table>

### Models Behaviour Metrics
See the [label name descriptions](#labelnames).

<table>
<thead>
<tr>
<th>Metric</th>
<th>Type</th>
<th>Description</th>
<th>Labelnames</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>vmanomaly_model_runs</code></td>
<td>Counter</td>
<td>How many times models ran (per model)</td>
<td><code>stage, query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_model_run_duration_seconds</code></td>
<td>Summary</td>
<td>How much time (in seconds) model invocations took</td>
<td><code>stage, query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_model_datapoints_accepted</code></td>
<td>Counter</td>
<td>How many datapoints models accepted</td>
<td><code>stage, query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_model_datapoints_produced</code></td>
<td>Counter</td>
<td>How many datapoints were generated by models</td>
<td><code>stage, query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_models_active</code></td>
<td>Gauge</td>
<td>How many models are currently inferring</td>
<td><code>query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_model_runs_skipped</code></td>
<td>Counter</td>
<td>How many times a run was skipped (per model)</td>
<td><code>stage, query_key</code></td>
</tr>
</tbody>
</table>

### Writer Behaviour Metrics
See the [label name descriptions](#labelnames).

<table>
<thead>
<tr>
<th>Metric</th>
<th>Type</th>
<th>Description</th>
<th>Labelnames</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>vmanomaly_writer_request_duration_seconds</code></td>
<td>Summary</td>
<td>How much time (in seconds) requests to VictoriaMetrics took</td>
<td><code>url, query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_writer_response_count</code></td>
<td>Counter</td>
<td>Counts of response codes received from VictoriaMetrics</td>
<td><code>url, query_key, code</code></td>
</tr>
<tr>
<td><code>vmanomaly_writer_sent_bytes</code></td>
<td>Counter</td>
<td>How many bytes were sent to VictoriaMetrics</td>
<td><code>url, query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_writer_request_serialize_seconds</code></td>
<td>Summary</td>
<td>How much time (in seconds) serializing took</td>
<td><code>query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_writer_datapoints_sent</code></td>
<td>Counter</td>
<td>How many datapoints were sent to VictoriaMetrics</td>
<td><code>query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_writer_timeseries_sent</code></td>
<td>Counter</td>
<td>How many timeseries were sent to VictoriaMetrics</td>
<td><code>query_key</code></td>
</tr>
</tbody>
</table>

### Reader Behaviour Metrics
See the [label name descriptions](#labelnames).

<table>
<thead>
<tr>
<th>Metric</th>
<th>Type</th>
<th>Description</th>
<th>Labelnames</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>vmanomaly_reader_request_duration_seconds</code></td>
<td>Summary</td>
<td>How much time (in seconds) queries to VictoriaMetrics took</td>
<td><code>url, query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_reader_response_count</code></td>
<td>Counter</td>
<td>Counts of response codes received from VictoriaMetrics</td>
<td><code>url, query_key, code</code></td>
</tr>
<tr>
<td><code>vmanomaly_reader_received_bytes</code></td>
<td>Counter</td>
<td>How many bytes were received in responses</td>
<td><code>query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_reader_response_parsing_seconds</code></td>
<td>Summary</td>
<td>How much time (in seconds) parsing took for each step</td>
<td><code>step</code></td>
</tr>
<tr>
<td><code>vmanomaly_reader_timeseries_received</code></td>
<td>Counter</td>
<td>How many timeseries were received from VictoriaMetrics</td>
<td><code>query_key</code></td>
</tr>
<tr>
<td><code>vmanomaly_reader_datapoints_received</code></td>
<td>Counter</td>
<td>How many datapoints (rows) were received from VictoriaMetrics</td>
<td><code>query_key</code></td>
</tr>
</tbody>
</table>

### Labelnames
<code>stage</code> - model stage: `fit`, `infer`, or `fit_infer` for models that do both simultaneously.

<code>query_key</code> - query alias from the [`reader`](/anomaly-detection/components/reader.html) config section.

<code>url</code> - writer or reader URL endpoint.

<code>code</code> - response status code, `connection_error`, or `timeout`.

<code>step</code> - JSON or dataframe reading step.
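
To show how these labels can be used, the sketch below sums a metric per label value from text in the Prometheus exposition format. The sample text, its label values, and the helper `sum_by_label` are all invented for illustration. The parser is deliberately simplified (it skips lines with timestamps and does not handle escaped quotes), and the exact metric names exposed by a real instance should be checked against its actual `/metrics` output.

```python
import re
from collections import defaultdict

# Hypothetical sample in the Prometheus text format; all values and
# label values are invented for this example.
SAMPLE = '''\
# TYPE vmanomaly_model_runs counter
vmanomaly_model_runs{stage="fit",query_key="cpu"} 3
vmanomaly_model_runs{stage="infer",query_key="cpu"} 40
vmanomaly_model_runs{stage="infer",query_key="ram"} 38
vmanomaly_models_active{query_key="cpu"} 1
'''

# name, optional {labels}, then a single value at the end of the line.
LINE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def sum_by_label(text: str, metric: str, label: str) -> dict:
    """Sum the samples of `metric`, grouped by the value of `label`."""
    totals = defaultdict(float)
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        m = LINE_RE.match(line)
        if m is None or m.group("name") != metric:
            continue
        labels = dict(LABEL_RE.findall(m.group("labels") or ""))
        totals[labels.get(label, "")] += float(m.group("value"))
    return dict(totals)


print(sum_by_label(SAMPLE, "vmanomaly_model_runs", "stage"))
# → {'fit': 3.0, 'infer': 78.0}
```

Grouping the same metric by `query_key` instead of `stage` gives per-query totals, which is the kind of breakdown these labels are meant to support.
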