docs: vmagent info about p queue disk size (#5399)

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
(cherry picked from commit 760a530305)
This commit is contained in:
Artem Navoiev 2023-12-01 03:19:56 -08:00 committed by hagen1778
parent 3507e1e27b
commit fbd04e3437
No known key found for this signature in database
GPG Key ID: 3BF75F3741CA9640

View File

@ -1225,6 +1225,25 @@ which can be downloaded for evaluation from [releases](https://github.com/Victor
The following optional suffixes are supported: s (second), m (minute), h (hour), d (day), w (week), y (year). If suffix isn't set, then the duration is counted in months (default 60s)
```
## How much disk space should you provide for the vmagent persistence queue?
vmagent buffers scraped or received data at the `-remoteWrite.tmpDataPath` file system directory (aka persistent queue) until it is sent to `-remoteWrite.url`. The directory can grow large when remote storage is unavailable for extended periods of time and if the maximum directory size isn't limited with `-remoteWrite.maxDiskUsagePerURL` command-line flag. The buffered metrics are sent to remote storage as soon as the connection to the remote storage is repaired.
To estimate the disk size allocated or `-remoteWrite.maxDiskUsagePerURL` for a persistent queue take into account the following attributes:
1. the size in bytes of data stream sent by vmagent:
Run query `sum(rate(vmagent_remotewrite_bytes_sent_total[1h])) by(instance)` in [vmui](https://docs.victoriametrics.com/#vmui) or Grafana to get the amount of **bytes** sent by each vmagent instance (in k8s you might want to `sum by(pod)` instead) per second.
2. the amount of **time** a persistent queue should keep the data before starting to drop it.
If the persistent queue should be able to retain the data for at least 6h, then multiply the amount of bytes sent by vmagent per-second by 6*3600s to get the approximate queue size on disk.
For example, expression [sum(rate(vmagent_remotewrite_bytes_sent_total[1h])) by(instance) * 6 * 3600 / 1Gi](https://play.victoriametrics.com/select/accounting/1/6a716b0f-38bc-4856-90ce-448fd713e3fe/prometheus/graph/#/?g0.expr=sum%28rate%28vmagent_remotewrite_bytes_sent_total%5B1h%5D%29%29+by%28instance%29+*+6+*+3600+%2F+1Gi&g0.range_input=30m&g0.end_input=2023-11-30T14%3A55%3A14&g0.tab=1&g0.relative_time=last_30_minutes) suggests that there are two vmagents: one needs 2Gi and second 21Gi of disk space available for persistent queue to retain data for 6h.
Additional notes:
1. For `vmagent_.*` metrics to be available for querying ensure that [monitoring](#monitoring) is configured.
1. Re-evaluate the estimation each time when:
* there is an increase in the amount of metrics to process
* there is a change in [relabeling rules](https://docs.victoriametrics.com/vmagent.html#relabeling) which could increase the amount of sent metrics
* you add a new `-remoteWrite.url`
1. The minimum disk size to allocate for the persistent queue is 500Mi per each `-remoteWrite.url` .
## Kafka integration
[Enterprise version](https://docs.victoriametrics.com/enterprise.html) of `vmagent` can read and write metrics from / to Kafka: