mirror of
https://github.com/VictoriaMetrics/VictoriaMetrics.git
synced 2025-01-20 15:29:24 +01:00
update wiki pages
parent
ba60314639
commit
0f082f4249
@@ -122,7 +122,7 @@ ROOT_IMAGE=scratch make package

 ## Operation

-### Cluster setup
+## Cluster setup

 A minimal cluster must contain the following nodes:

@@ -141,7 +141,7 @@ Ports may be altered by setting `-httpListenAddr` on the corresponding nodes.

 It is recommended setting up [monitoring](#monitoring) for the cluster.

-#### Environment variables
+### Environment variables

 Each flag value can be set through environment variables by following these rules:

@@ -151,7 +151,7 @@ Each flag value can be set through environment variables by following these rules:
 - It is possible to set a prefix for environment vars with `-envflag.prefix`. For instance, if `-envflag.prefix=VM_`, then env vars must be prepended with `VM_`


-### Monitoring
+## Monitoring

 All the cluster components expose various metrics in Prometheus-compatible format at `/metrics` page on the TCP port set in `-httpListenAddr` command-line flag.
 By default the following TCP ports are used:
@@ -165,7 +165,7 @@ with [the official Grafana dashboard for VictoriaMetrics cluster](https://grafan
 or [an alternative dashboard for VictoriaMetrics cluster](https://grafana.com/grafana/dashboards/11831).


-### URL format
+## URL format

 * URLs for data ingestion: `http://<vminsert>:8480/insert/<accountID>/<suffix>`, where:
 - `<accountID>` is an arbitrary 32-bit integer identifying namespace for data ingestion (aka tenant). It is possible to set it as `accountID:projectID`,
@@ -231,7 +231,7 @@ or [an alternative dashboard for VictoriaMetrics cluster](https://grafana.com/gr
 across `vmstorage` nodes.


-### Cluster resizing and scalability
+## Cluster resizing and scalability

 Cluster performance and capacity scale with adding new nodes.

@@ -250,7 +250,7 @@ Steps to add `vmstorage` node:
 3. Gradually restart all the `vminsert` nodes with new `-storageNode` arg containing `<new_vmstorage_host>:8400`.


-### Updating / reconfiguring cluster nodes
+## Updating / reconfiguring cluster nodes

 All the node types - `vminsert`, `vmselect` and `vmstorage` - may be updated via graceful shutdown.
 Send `SIGINT` signal to the corresponding process, wait until it finishes and then start new version
@@ -260,7 +260,7 @@ Cluster should remain in working state if at least a single node of each type re
 the update process. See [cluster availability](#cluster-availability) section for details.


-### Cluster availability
+## Cluster availability

 * HTTP load balancer must stop routing requests to unavailable `vminsert` and `vmselect` nodes.
 * The cluster remains available if at least a single `vmstorage` node exists:
@@ -271,11 +271,11 @@ the update process. See [cluster availability](#cluster-availability) section fo
 Data replication can be used for increasing storage durability. See [these docs](#replication-and-data-safety) for details.


-### Capacity planning
+## Capacity planning

 Each instance type - `vminsert`, `vmselect` and `vmstorage` - can run on the most suitable hardware.

-#### vminsert
+### vminsert

 * The recommended total number of vCPU cores for all the `vminsert` instances can be calculated from the ingestion rate: `vCPUs = ingestion_rate / 150K`.
 * The recommended number of vCPU cores per each `vminsert` instance should equal the number of `vmstorage` instances in the cluster.
@@ -285,7 +285,7 @@ Each instance type - `vminsert`, `vmselect` and `vmstorage` - can run on the mos
 * Sometimes `-rpc.disableCompression` command-line flag on `vminsert` instances could increase ingestion capacity at the cost
 of higher network bandwidth usage between `vminsert` and `vmstorage`.

-#### vmstorage
+### vmstorage

 * The recommended total number of vCPU cores for all the `vmstorage` instances can be calculated from the ingestion rate: `vCPUs = ingestion_rate / 150K`.
 * The recommended total amount of RAM for all the `vmstorage` instances can be calculated from the number of active time series: `RAM = 2 * active_time_series * 1KB`.
@@ -299,7 +299,7 @@ Each instance type - `vminsert`, `vmselect` and `vmstorage` - can run on the mos
 * The recommended total amount of storage space for all the `vmstorage` instances can be calculated
 from the ingestion rate and retention: `storage_space = ingestion_rate * retention_seconds`.

-#### vmselect
+### vmselect

 The recommended hardware for `vmselect` instances highly depends on the type of queries. Lightweight queries over small number of time series usually require
 small number of vCPU cores and small amount of RAM on `vmselect`, while heavy queries over big number of time series (>10K) usually require
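The three sizing formulas quoted in the capacity-planning hunks above (`vCPUs = ingestion_rate / 150K`, `RAM = 2 * active_time_series * 1KB`, `storage_space = ingestion_rate * retention_seconds`) can be tried out with a small script. This is only a sketch: the function names and input values are hypothetical, `1KB` is assumed to mean 1024 bytes, the vCPU count is rounded up, and the storage formula is evaluated at face value because the quoted lines do not state its unit.

```bash
#!/usr/bin/env bash
# Rough capacity estimates from the formulas quoted in the diff.
# Function names and sample inputs are illustrative assumptions.

# ceil_div A B: integer division rounded up.
ceil_div() { echo $(( ($1 + $2 - 1) / $2 )); }

# vCPUs = ingestion_rate / 150K (the same formula is given for vminsert and vmstorage).
vcpus_for_ingestion() { ceil_div "$1" 150000; }

# RAM = 2 * active_time_series * 1KB; 1KB is assumed to be 1024 bytes.
ram_bytes_for_series() { echo $(( 2 * $1 * 1024 )); }

# storage_space = ingestion_rate * retention_seconds, taken at face value.
storage_for_retention() { echo $(( $1 * $2 )); }

ingestion_rate=1000000                   # samples per second (hypothetical)
active_time_series=10000000              # hypothetical
retention_seconds=$(( 30 * 24 * 3600 ))  # 30 days

echo "vCPUs:         $(vcpus_for_ingestion "$ingestion_rate")"
echo "RAM (bytes):   $(ram_bytes_for_series "$active_time_series")"
echo "storage_space: $(storage_for_retention "$ingestion_rate" "$retention_seconds")"
```

With these sample inputs the script reports 7 vCPUs, 20480000000 bytes of RAM and a `storage_space` value of 2592000000000. These are totals across all instances of a node type, so they still need to be divided among the individual nodes.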
@@ -309,7 +309,7 @@ In general it is recommended increasing the number of vCPU cores and RAM per `vm
 while adding new `vmselect` nodes only when old nodes are overloaded with incoming query stream.


-### High availability
+## High availability

 It is recommended to run all the components for a single cluster in the same subnetwork with high bandwidth, low latency and low error rates.
 This improves cluster performance and availability.
@@ -321,18 +321,18 @@ If you need multi-AZ setup, then it is recommended running independent clusters i
 into all the clusters. Then [promxy](https://github.com/jacksontj/promxy) could be used for querying the data from multiple clusters.


-### Helm
+## Helm

 Helm chart simplifies managing cluster version of VictoriaMetrics in Kubernetes.
 It is available in the [helm-charts](https://github.com/VictoriaMetrics/helm-charts) repository.


-### Kubernetes operator
+## Kubernetes operator

 [K8s operator](https://github.com/VictoriaMetrics/operator) simplifies managing VictoriaMetrics components in Kubernetes.


-### Replication and data safety
+## Replication and data safety

 In order to enable application-level replication, `-replicationFactor=N` command-line flag must be passed to `vminsert`.
 This guarantees that all the data remains available for querying if up to `N-1` `vmstorage` nodes are unavailable.
@@ -355,7 +355,7 @@ HDD-based persistent disks should be enough for the majority of use cases.
 It is recommended using durable replicated persistent volumes in Kubernetes.


-### Backups
+## Backups

 It is recommended performing periodical backups from [instant snapshots](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282)
 for protecting from user errors such as accidental data deletion.
@@ -376,6 +376,27 @@ Restoring from backup:
 3. Start `vmstorage` node.


+## Profiling
+
+All the cluster components provide the following handlers for [profiling](https://blog.golang.org/profiling-go-programs):
+
+* `http://vminsert:8480/debug/pprof/heap` for memory profile and `http://vminsert:8480/debug/pprof/profile` for CPU profile
+* `http://vmselect:8481/debug/pprof/heap` for memory profile and `http://vmselect:8481/debug/pprof/profile` for CPU profile
+* `http://vmstorage:8482/debug/pprof/heap` for memory profile and `http://vmstorage:8482/debug/pprof/profile` for CPU profile
+
+Example command for collecting CPU profile from `vmstorage`:
+
+```bash
+curl -s http://<vmstorage-host>:8482/debug/pprof/profile > cpu.pprof
+```
+
+Example command for collecting memory profile from `vminsert`:
+
+```bash
+curl -s http://<vminsert-host>:8480/debug/pprof/heap > mem.pprof
+```
+
+
 ## Community and contributions

 We are open to third-party pull requests provided they follow [KISS design principle](https://en.wikipedia.org/wiki/KISS_principle):
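The handler list in the Profiling section maps each component to a port (`vminsert` 8480, `vmselect` 8481, `vmstorage` 8482), so collecting every profile can be scripted. The following is a minimal sketch, not part of the commit: the function names `pprof_url` and `collect_profiles`, the output file naming, and the assumption that the components are reachable under the hostnames `vminsert`, `vmselect` and `vmstorage` are all illustrative.

```bash
#!/usr/bin/env bash
# Sketch: build pprof URLs from the component/port pairs listed in the
# Profiling section. Hostnames and function names are assumptions.

# pprof_url COMPONENT KIND: print the profiling URL for a cluster component.
pprof_url() {
  local component=$1 kind=$2 port
  case "$component" in
    vminsert)  port=8480 ;;
    vmselect)  port=8481 ;;
    vmstorage) port=8482 ;;
    *) echo "unknown component: $component" >&2; return 1 ;;
  esac
  case "$kind" in
    heap|profile) ;;
    *) echo "unknown profile kind: $kind" >&2; return 1 ;;
  esac
  echo "http://${component}:${port}/debug/pprof/${kind}"
}

# collect_profiles OUT_DIR: fetch heap and CPU profiles from every component.
collect_profiles() {
  local out_dir=$1 component kind
  mkdir -p "$out_dir"
  for component in vminsert vmselect vmstorage; do
    for kind in heap profile; do
      curl -s --fail -o "${out_dir}/${component}-${kind}.pprof" \
        "$(pprof_url "$component" "$kind")" \
        || echo "failed to fetch ${kind} from ${component}" >&2
    done
  done
}
```

Calling `collect_profiles ./profiles` writes one file per component and profile type, which can then be inspected with `go tool pprof`. CPU profiles are sampled over a period of time (30 seconds by default in Go's `net/http/pprof`), so the loop takes a while to finish.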