mirror of
https://github.com/VictoriaMetrics/VictoriaMetrics.git
synced 2025-01-20 07:19:17 +01:00
docs/Cluster-VictoriaMetrics.md: add Profiling
section
This commit is contained in:
parent
34622f7f9b
commit
69b3ca37d0
53
README.md
53
README.md
@ -122,7 +122,7 @@ ROOT_IMAGE=scratch make package
|
||||
|
||||
## Operation
|
||||
|
||||
### Cluster setup
|
||||
## Cluster setup
|
||||
|
||||
A minimal cluster must contain the following nodes:
|
||||
|
||||
@ -141,7 +141,7 @@ Ports may be altered by setting `-httpListenAddr` on the corresponding nodes.
|
||||
|
||||
It is recommended setting up [monitoring](#monitoring) for the cluster.
|
||||
|
||||
#### Environment variables
|
||||
### Environment variables
|
||||
|
||||
Each flag values can be set thru environment variables by following these rules:
|
||||
|
||||
@ -151,7 +151,7 @@ Each flag values can be set thru environment variables by following these rules:
|
||||
- It is possible setting prefix for environment vars with `-envflag.prefix`. For instance, if `-envflag.prefix=VM_`, then env vars must be prepended with `VM_`
|
||||
|
||||
|
||||
### Monitoring
|
||||
## Monitoring
|
||||
|
||||
All the cluster components expose various metrics in Prometheus-compatible format at `/metrics` page on the TCP port set in `-httpListenAddr` command-line flag.
|
||||
By default the following TCP ports are used:
|
||||
@ -165,7 +165,7 @@ with [the official Grafana dashboard for VictoriaMetrics cluster](https://grafan
|
||||
or [an alternative dashboard for VictoriaMetrics cluster](https://grafana.com/grafana/dashboards/11831).
|
||||
|
||||
|
||||
### URL format
|
||||
## URL format
|
||||
|
||||
* URLs for data ingestion: `http://<vminsert>:8480/insert/<accountID>/<suffix>`, where:
|
||||
- `<accountID>` is an arbitrary 32-bit integer identifying namespace for data ingestion (aka tenant). It is possible to set it as `accountID:projectID`,
|
||||
@ -231,7 +231,7 @@ or [an alternative dashboard for VictoriaMetrics cluster](https://grafana.com/gr
|
||||
across `vmstorage` nodes.
|
||||
|
||||
|
||||
### Cluster resizing and scalability
|
||||
## Cluster resizing and scalability
|
||||
|
||||
Cluster performance and capacity scales with adding new nodes.
|
||||
|
||||
@ -250,7 +250,7 @@ Steps to add `vmstorage` node:
|
||||
3. Gradually restart all the `vminsert` nodes with new `-storageNode` arg containing `<new_vmstorage_host>:8400`.
|
||||
|
||||
|
||||
### Updating / reconfiguring cluster nodes
|
||||
## Updating / reconfiguring cluster nodes
|
||||
|
||||
All the node types - `vminsert`, `vmselect` and `vmstorage` - may be updated via graceful shutdown.
|
||||
Send `SIGINT` signal to the corresponding process, wait until it finishes and then start new version
|
||||
@ -260,7 +260,7 @@ Cluster should remain in working state if at least a single node of each type re
|
||||
the update process. See [cluster availability](#cluster-availability) section for details.
|
||||
|
||||
|
||||
### Cluster availability
|
||||
## Cluster availability
|
||||
|
||||
* HTTP load balancer must stop routing requests to unavailable `vminsert` and `vmselect` nodes.
|
||||
* The cluster remains available if at least a single `vmstorage` node exists:
|
||||
@ -271,11 +271,11 @@ the update process. See [cluster availability](#cluster-availability) section fo
|
||||
Data replication can be used for increasing storage durability. See [these docs](#replication-and-data-safety) for details.
|
||||
|
||||
|
||||
### Capacity planning
|
||||
## Capacity planning
|
||||
|
||||
Each instance type - `vminsert`, `vmselect` and `vmstorage` - can run on the most suitable hardware.
|
||||
|
||||
#### vminsert
|
||||
### vminsert
|
||||
|
||||
* The recommended total number of vCPU cores for all the `vminsert` instances can be calculated from the ingestion rate: `vCPUs = ingestion_rate / 150K`.
|
||||
* The recommended number of vCPU cores per each `vminsert` instance should equal to the number of `vmstorage` instances in the cluster.
|
||||
@ -285,7 +285,7 @@ Each instance type - `vminsert`, `vmselect` and `vmstorage` - can run on the mos
|
||||
* Sometimes `-rpc.disableCompression` command-line flag on `vminsert` instances could increase ingestion capacity at the cost
|
||||
of higher network bandwidth usage between `vminsert` and `vmstorage`.
|
||||
|
||||
#### vmstorage
|
||||
### vmstorage
|
||||
|
||||
* The recommended total number of vCPU cores for all the `vmstorage` instances can be calculated from the ingestion rate: `vCPUs = ingestion_rate / 150K`.
|
||||
* The recommended total amount of RAM for all the `vmstorage` instances can be calculated from the number of active time series: `RAM = 2 * active_time_series * 1KB`.
|
||||
@ -299,7 +299,7 @@ Each instance type - `vminsert`, `vmselect` and `vmstorage` - can run on the mos
|
||||
* The recommended total amount of storage space for all the `vmstorage` instances can be calculated
|
||||
from the ingestion rate and retention: `storage_space = ingestion_rate * retention_seconds`.
|
||||
|
||||
#### vmselect
|
||||
### vmselect
|
||||
|
||||
The recommended hardware for `vmselect` instances highly depends on the type of queries. Lightweight queries over small number of time series usually require
|
||||
small number of vCPU cores and small amount of RAM on `vmselect`, while heavy queries over big number of time series (>10K) usually require
|
||||
@ -309,7 +309,7 @@ In general it is recommended increasing the number of vCPU cores and RAM per `vm
|
||||
while adding new `vmselect` nodes only when old nodes are overloaded with incoming query stream.
|
||||
|
||||
|
||||
### High availability
|
||||
## High availability
|
||||
|
||||
It is recommended to run all the components for a single cluster in the same subnetwork with high bandwidth, low latency and low error rates.
|
||||
This improves cluster performance and availability.
|
||||
@ -321,18 +321,18 @@ If you need multi-AZ setup, then it is recommended running independed clusters i
|
||||
into all the cluster. Then [promxy](https://github.com/jacksontj/promxy) could be used for querying the data from multiple clusters.
|
||||
|
||||
|
||||
### Helm
|
||||
## Helm
|
||||
|
||||
Helm chart simplifies managing cluster version of VictoriaMetrics in Kubernetes.
|
||||
It is available in the [helm-charts](https://github.com/VictoriaMetrics/helm-charts) repository.
|
||||
|
||||
|
||||
### Kubernetes operator
|
||||
## Kubernetes operator
|
||||
|
||||
[K8s operator](https://github.com/VictoriaMetrics/operator) simplifies managing VictoriaMetrics components in Kubernetes.
|
||||
|
||||
|
||||
### Replication and data safety
|
||||
## Replication and data safety
|
||||
|
||||
In order to enable application-level replication, `-replicationFactor=N` command-line flag must be passed to `vminsert`.
|
||||
This guarantees that all the data remains available for querying if up to `N-1` `vmstorage` nodes are unavailable.
|
||||
@ -355,7 +355,7 @@ HDD-based persistent disks should be enough for the majority of use cases.
|
||||
It is recommended using durable replicated persistent volumes in Kubernetes.
|
||||
|
||||
|
||||
### Backups
|
||||
## Backups
|
||||
|
||||
It is recommended performing periodical backups from [instant snapshots](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282)
|
||||
for protecting from user errors such as accidental data deletion.
|
||||
@ -376,6 +376,27 @@ Restoring from backup:
|
||||
3. Start `vmstorage` node.
|
||||
|
||||
|
||||
## Profiling
|
||||
|
||||
All the cluster components provide the following handlers for [profiling](https://blog.golang.org/profiling-go-programs):
|
||||
|
||||
* `http://vminsert:8480/debug/pprof/heap` for memory profile and `http://vminsert:8480/debug/pprof/profile` for CPU profile
|
||||
* `http://vmselect:8481/debug/pprof/heap` for memory profile and `http://vmselect:8481/debug/pprof/profile` for CPU profile
|
||||
* `http://vmstorage:8482/debug/pprof/heap` for memory profile and `http://vmstorage:8482/debug/pprof/profile` for CPU profile
|
||||
|
||||
Example command for collecting cpu profile from `vmstorage`:
|
||||
|
||||
```bash
|
||||
curl -s http://<victoria-metrics-host>:8428/debug/pprof/profile > cpu.pprof
|
||||
```
|
||||
|
||||
Example command for collecting memory profile from `vminsert`:
|
||||
|
||||
```bash
|
||||
curl -s http://<victoria-metrics-host>:8428/debug/pprof/heap > mem.pprof
|
||||
```
|
||||
|
||||
|
||||
## Community and contributions
|
||||
|
||||
We are open to third-party pull requests provided they follow [KISS design principle](https://en.wikipedia.org/wiki/KISS_principle):
|
||||
|
@ -122,7 +122,7 @@ ROOT_IMAGE=scratch make package
|
||||
|
||||
## Operation
|
||||
|
||||
### Cluster setup
|
||||
## Cluster setup
|
||||
|
||||
A minimal cluster must contain the following nodes:
|
||||
|
||||
@ -141,7 +141,7 @@ Ports may be altered by setting `-httpListenAddr` on the corresponding nodes.
|
||||
|
||||
It is recommended setting up [monitoring](#monitoring) for the cluster.
|
||||
|
||||
#### Environment variables
|
||||
### Environment variables
|
||||
|
||||
Each flag values can be set thru environment variables by following these rules:
|
||||
|
||||
@ -151,7 +151,7 @@ Each flag values can be set thru environment variables by following these rules:
|
||||
- It is possible setting prefix for environment vars with `-envflag.prefix`. For instance, if `-envflag.prefix=VM_`, then env vars must be prepended with `VM_`
|
||||
|
||||
|
||||
### Monitoring
|
||||
## Monitoring
|
||||
|
||||
All the cluster components expose various metrics in Prometheus-compatible format at `/metrics` page on the TCP port set in `-httpListenAddr` command-line flag.
|
||||
By default the following TCP ports are used:
|
||||
@ -165,7 +165,7 @@ with [the official Grafana dashboard for VictoriaMetrics cluster](https://grafan
|
||||
or [an alternative dashboard for VictoriaMetrics cluster](https://grafana.com/grafana/dashboards/11831).
|
||||
|
||||
|
||||
### URL format
|
||||
## URL format
|
||||
|
||||
* URLs for data ingestion: `http://<vminsert>:8480/insert/<accountID>/<suffix>`, where:
|
||||
- `<accountID>` is an arbitrary 32-bit integer identifying namespace for data ingestion (aka tenant). It is possible to set it as `accountID:projectID`,
|
||||
@ -231,7 +231,7 @@ or [an alternative dashboard for VictoriaMetrics cluster](https://grafana.com/gr
|
||||
across `vmstorage` nodes.
|
||||
|
||||
|
||||
### Cluster resizing and scalability
|
||||
## Cluster resizing and scalability
|
||||
|
||||
Cluster performance and capacity scales with adding new nodes.
|
||||
|
||||
@ -250,7 +250,7 @@ Steps to add `vmstorage` node:
|
||||
3. Gradually restart all the `vminsert` nodes with new `-storageNode` arg containing `<new_vmstorage_host>:8400`.
|
||||
|
||||
|
||||
### Updating / reconfiguring cluster nodes
|
||||
## Updating / reconfiguring cluster nodes
|
||||
|
||||
All the node types - `vminsert`, `vmselect` and `vmstorage` - may be updated via graceful shutdown.
|
||||
Send `SIGINT` signal to the corresponding process, wait until it finishes and then start new version
|
||||
@ -260,7 +260,7 @@ Cluster should remain in working state if at least a single node of each type re
|
||||
the update process. See [cluster availability](#cluster-availability) section for details.
|
||||
|
||||
|
||||
### Cluster availability
|
||||
## Cluster availability
|
||||
|
||||
* HTTP load balancer must stop routing requests to unavailable `vminsert` and `vmselect` nodes.
|
||||
* The cluster remains available if at least a single `vmstorage` node exists:
|
||||
@ -271,11 +271,11 @@ the update process. See [cluster availability](#cluster-availability) section fo
|
||||
Data replication can be used for increasing storage durability. See [these docs](#replication-and-data-safety) for details.
|
||||
|
||||
|
||||
### Capacity planning
|
||||
## Capacity planning
|
||||
|
||||
Each instance type - `vminsert`, `vmselect` and `vmstorage` - can run on the most suitable hardware.
|
||||
|
||||
#### vminsert
|
||||
### vminsert
|
||||
|
||||
* The recommended total number of vCPU cores for all the `vminsert` instances can be calculated from the ingestion rate: `vCPUs = ingestion_rate / 150K`.
|
||||
* The recommended number of vCPU cores per each `vminsert` instance should equal to the number of `vmstorage` instances in the cluster.
|
||||
@ -285,7 +285,7 @@ Each instance type - `vminsert`, `vmselect` and `vmstorage` - can run on the mos
|
||||
* Sometimes `-rpc.disableCompression` command-line flag on `vminsert` instances could increase ingestion capacity at the cost
|
||||
of higher network bandwidth usage between `vminsert` and `vmstorage`.
|
||||
|
||||
#### vmstorage
|
||||
### vmstorage
|
||||
|
||||
* The recommended total number of vCPU cores for all the `vmstorage` instances can be calculated from the ingestion rate: `vCPUs = ingestion_rate / 150K`.
|
||||
* The recommended total amount of RAM for all the `vmstorage` instances can be calculated from the number of active time series: `RAM = 2 * active_time_series * 1KB`.
|
||||
@ -299,7 +299,7 @@ Each instance type - `vminsert`, `vmselect` and `vmstorage` - can run on the mos
|
||||
* The recommended total amount of storage space for all the `vmstorage` instances can be calculated
|
||||
from the ingestion rate and retention: `storage_space = ingestion_rate * retention_seconds`.
|
||||
|
||||
#### vmselect
|
||||
### vmselect
|
||||
|
||||
The recommended hardware for `vmselect` instances highly depends on the type of queries. Lightweight queries over small number of time series usually require
|
||||
small number of vCPU cores and small amount of RAM on `vmselect`, while heavy queries over big number of time series (>10K) usually require
|
||||
@ -309,7 +309,7 @@ In general it is recommended increasing the number of vCPU cores and RAM per `vm
|
||||
while adding new `vmselect` nodes only when old nodes are overloaded with incoming query stream.
|
||||
|
||||
|
||||
### High availability
|
||||
## High availability
|
||||
|
||||
It is recommended to run all the components for a single cluster in the same subnetwork with high bandwidth, low latency and low error rates.
|
||||
This improves cluster performance and availability.
|
||||
@ -321,18 +321,18 @@ If you need multi-AZ setup, then it is recommended running independed clusters i
|
||||
into all the cluster. Then [promxy](https://github.com/jacksontj/promxy) could be used for querying the data from multiple clusters.
|
||||
|
||||
|
||||
### Helm
|
||||
## Helm
|
||||
|
||||
Helm chart simplifies managing cluster version of VictoriaMetrics in Kubernetes.
|
||||
It is available in the [helm-charts](https://github.com/VictoriaMetrics/helm-charts) repository.
|
||||
|
||||
|
||||
### Kubernetes operator
|
||||
## Kubernetes operator
|
||||
|
||||
[K8s operator](https://github.com/VictoriaMetrics/operator) simplifies managing VictoriaMetrics components in Kubernetes.
|
||||
|
||||
|
||||
### Replication and data safety
|
||||
## Replication and data safety
|
||||
|
||||
In order to enable application-level replication, `-replicationFactor=N` command-line flag must be passed to `vminsert`.
|
||||
This guarantees that all the data remains available for querying if up to `N-1` `vmstorage` nodes are unavailable.
|
||||
@ -355,7 +355,7 @@ HDD-based persistent disks should be enough for the majority of use cases.
|
||||
It is recommended using durable replicated persistent volumes in Kubernetes.
|
||||
|
||||
|
||||
### Backups
|
||||
## Backups
|
||||
|
||||
It is recommended performing periodical backups from [instant snapshots](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282)
|
||||
for protecting from user errors such as accidental data deletion.
|
||||
@ -376,6 +376,27 @@ Restoring from backup:
|
||||
3. Start `vmstorage` node.
|
||||
|
||||
|
||||
## Profiling
|
||||
|
||||
All the cluster components provide the following handlers for [profiling](https://blog.golang.org/profiling-go-programs):
|
||||
|
||||
* `http://vminsert:8480/debug/pprof/heap` for memory profile and `http://vminsert:8480/debug/pprof/profile` for CPU profile
|
||||
* `http://vmselect:8481/debug/pprof/heap` for memory profile and `http://vmselect:8481/debug/pprof/profile` for CPU profile
|
||||
* `http://vmstorage:8482/debug/pprof/heap` for memory profile and `http://vmstorage:8482/debug/pprof/profile` for CPU profile
|
||||
|
||||
Example command for collecting cpu profile from `vmstorage`:
|
||||
|
||||
```bash
|
||||
curl -s http://<victoria-metrics-host>:8428/debug/pprof/profile > cpu.pprof
|
||||
```
|
||||
|
||||
Example command for collecting memory profile from `vminsert`:
|
||||
|
||||
```bash
|
||||
curl -s http://<victoria-metrics-host>:8428/debug/pprof/heap > mem.pprof
|
||||
```
|
||||
|
||||
|
||||
## Community and contributions
|
||||
|
||||
We are open to third-party pull requests provided they follow [KISS design principle](https://en.wikipedia.org/wiki/KISS_principle):
|
||||
|
Loading…
Reference in New Issue
Block a user