diff --git a/README.md b/README.md
index 74357a637..05df80b2c 100644
--- a/README.md
+++ b/README.md
@@ -1822,6 +1822,7 @@ curl http://0.0.0.0:8428/debug/pprof/profile > cpu.pprof
 The command for collecting CPU profile waits for 30 seconds before returning.
 
 The collected profiles may be analyzed with [go tool pprof](https://github.com/google/pprof).
+It is safe to share the collected profiles from a security point of view, since they do not contain sensitive information.
 
 ## Integrations
 
diff --git a/app/vmagent/README.md b/app/vmagent/README.md
index dd746a87d..b7123d957 100644
--- a/app/vmagent/README.md
+++ b/app/vmagent/README.md
@@ -893,6 +893,7 @@ curl http://0.0.0.0:8429/debug/pprof/profile > cpu.pprof
 The command for collecting CPU profile waits for 30 seconds before returning.
 
 The collected profiles may be analyzed with [go tool pprof](https://github.com/google/pprof).
+It is safe to share the collected profiles from a security point of view, since they do not contain sensitive information.
 
 ## Advanced usage
 
diff --git a/app/vmalert/README.md b/app/vmalert/README.md
index bd28a4cf5..c9cbb39bb 100644
--- a/app/vmalert/README.md
+++ b/app/vmalert/README.md
@@ -630,6 +630,35 @@ Use the official [Grafana dashboard](https://grafana.com/grafana/dashboards/1495
 If you have suggestions for improvements or have found a bug - please open an issue on github or add a review to the dashboard.
 
+## Profiling
+
+`vmalert` provides handlers for collecting the following [Go profiles](https://blog.golang.org/profiling-go-programs):
+
+* Memory profile. It can be collected with the following command (replace `0.0.0.0` with hostname if needed):
+
+
+
+```console
+curl http://0.0.0.0:8880/debug/pprof/heap > mem.pprof
+```
+
+
+
+* CPU profile. It can be collected with the following command (replace `0.0.0.0` with hostname if needed):
+
+
+
+```console
+curl http://0.0.0.0:8880/debug/pprof/profile > cpu.pprof
+```
+
+
+
+The command for collecting CPU profile waits for 30 seconds before returning.
+
+The collected profiles may be analyzed with [go tool pprof](https://github.com/google/pprof).
+It is safe to share the collected profiles from a security point of view, since they do not contain sensitive information.
+
 ## Configuration
 
 ### Flags
diff --git a/app/vmauth/README.md b/app/vmauth/README.md
index 94683509b..379b43dd1 100644
--- a/app/vmauth/README.md
+++ b/app/vmauth/README.md
@@ -217,6 +217,7 @@ curl http://0.0.0.0:8427/debug/pprof/profile > cpu.pprof
 The command for collecting CPU profile waits for 30 seconds before returning.
 
 The collected profiles may be analyzed with [go tool pprof](https://github.com/google/pprof).
+It is safe to share the collected profiles from a security point of view, since they do not contain sensitive information.
 
 ## Advanced usage
 
diff --git a/docs/Cluster-VictoriaMetrics.md b/docs/Cluster-VictoriaMetrics.md
index 0c143eefd..64c942b8b 100644
--- a/docs/Cluster-VictoriaMetrics.md
+++ b/docs/Cluster-VictoriaMetrics.md
@@ -313,45 +313,47 @@ with new configs.
 
 There are the following cluster update / upgrade approaches exist:
 
-* `No downtime` strategy. Gracefully restart every node in the cluster one-by-one with the updated config / upgraded binary.
+### No downtime strategy
 
-  It is recommended restarting the nodes in the following order:
+Gracefully restart every node in the cluster one-by-one with the updated config / upgraded binary.
 
-  1. Restart `vmstorage` nodes.
-  2. Restart `vminsert` nodes.
-  3. Restart `vmselect` nodes.
+It is recommended to restart the nodes in the following order:
 
-  This strategy allows upgrading the cluster without downtime if the following conditions are met:
+1. Restart `vmstorage` nodes.
+2. Restart `vminsert` nodes.
+3. Restart `vmselect` nodes.
 
-  - The cluster has at least a pair of nodes of each type - `vminsert`, `vmselect` and `vmstorage`,
-    so it can continue accept new data and serve incoming requests when a single node is temporary unavailable
-    during its restart. See [cluster availability docs](#cluster-availability) for details.
-  - The cluster has enough compute resources (CPU, RAM, network bandwidth, disk IO) for processing
-    the current workload when a single node of any type (`vminsert`, `vmselect` or `vmstorage`)
-    is temporarily unavailable during its restart.
-  - The updated config / upgraded binary is compatible with the remaining components in the cluster.
-    See the [CHANGELOG](https://docs.victoriametrics.com/CHANGELOG.html) for compatibility notes between different releases.
+This strategy allows upgrading the cluster without downtime if the following conditions are met:
+
+- The cluster has at least a pair of nodes of each type - `vminsert`, `vmselect` and `vmstorage`,
+  so it can continue to accept new data and serve incoming requests when a single node is temporarily unavailable
+  during its restart. See [cluster availability docs](#cluster-availability) for details.
+- The cluster has enough compute resources (CPU, RAM, network bandwidth, disk IO) for processing
+  the current workload when a single node of any type (`vminsert`, `vmselect` or `vmstorage`)
+  is temporarily unavailable during its restart.
+- The updated config / upgraded binary is compatible with the remaining components in the cluster.
+  See the [CHANGELOG](https://docs.victoriametrics.com/CHANGELOG.html) for compatibility notes between different releases.
 
 If at least a single condition isn't met, then the rolling restart may result in cluster unavailability
 during the config update / version upgrade. In this case the following strategy is recommended.
 
-* `Minimum downtime` strategy:
+### Minimum downtime strategy
 
-  1. Gracefully stop all the `vminsert` and `vmselect` nodes in parallel.
-  2. Gracefully restart all the `vmstorage` nodes in parallel.
-  3. Start all the `vminsert` and `vmselect` nodes in parallel.
+1. Gracefully stop all the `vminsert` and `vmselect` nodes in parallel.
+2. Gracefully restart all the `vmstorage` nodes in parallel.
+3. Start all the `vminsert` and `vmselect` nodes in parallel.
 
-  The cluster is unavailable for data ingestion and querying when performing the steps above.
-  The downtime is minimized by restarting cluster nodes in parallel at every step above.
-  The `minimum downtime` strategy has the following benefits comparing to `no downtime` startegy:
+The cluster is unavailable for data ingestion and querying when performing the steps above.
+The downtime is minimized by restarting cluster nodes in parallel at every step above.
+The `minimum downtime` strategy has the following benefits compared to the `no downtime` strategy:
 
-  - It allows performing config update / version upgrade with minimum disruption
-    when the previous config / version is incompatible with the new config / version.
-  - It allows perorming config update / version upgrade with minimum disruption
-    when the cluster has no enough compute resources (CPU, RAM, disk IO, network bandwidth)
-    for rolling upgrade.
-  - It allows minimizing the duration of config update / version ugprade for clusters with big number of nodes
-    of for clusters with big `vmstorage` nodes, which may take long time for graceful restart.
+- It allows performing config update / version upgrade with minimum disruption
+  when the previous config / version is incompatible with the new config / version.
+- It allows performing config update / version upgrade with minimum disruption
+  when the cluster does not have enough compute resources (CPU, RAM, disk IO, network bandwidth)
+  for a rolling upgrade.
+- It allows minimizing the duration of config update / version upgrade for clusters with a large number of nodes
+  or for clusters with large `vmstorage` nodes, which may take a long time to restart gracefully.
 
 ## Cluster availability
 
@@ -563,6 +565,8 @@ Example command for collecting memory profile from `vminsert` (replace `0.0.0.0`
 curl http://0.0.0.0:8480/debug/pprof/heap > mem.pprof
 ```
 
+It is safe to share the collected profiles from a security point of view, since they do not contain sensitive information.
+
 ## vmalert
 
diff --git a/docs/README.md b/docs/README.md
index 74357a637..05df80b2c 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -1822,6 +1822,7 @@ curl http://0.0.0.0:8428/debug/pprof/profile > cpu.pprof
 The command for collecting CPU profile waits for 30 seconds before returning.
 
 The collected profiles may be analyzed with [go tool pprof](https://github.com/google/pprof).
+It is safe to share the collected profiles from a security point of view, since they do not contain sensitive information.
 
 ## Integrations
 
diff --git a/docs/Single-server-VictoriaMetrics.md b/docs/Single-server-VictoriaMetrics.md
index 75eb02845..2c16aec63 100644
--- a/docs/Single-server-VictoriaMetrics.md
+++ b/docs/Single-server-VictoriaMetrics.md
@@ -1826,6 +1826,7 @@ curl http://0.0.0.0:8428/debug/pprof/profile > cpu.pprof
 The command for collecting CPU profile waits for 30 seconds before returning.
 
 The collected profiles may be analyzed with [go tool pprof](https://github.com/google/pprof).
+It is safe to share the collected profiles from a security point of view, since they do not contain sensitive information.
 
 ## Integrations
 
diff --git a/docs/vmagent.md b/docs/vmagent.md
index 7c708fb93..00aed3f58 100644
--- a/docs/vmagent.md
+++ b/docs/vmagent.md
@@ -897,6 +897,7 @@ curl http://0.0.0.0:8429/debug/pprof/profile > cpu.pprof
 The command for collecting CPU profile waits for 30 seconds before returning.
 
 The collected profiles may be analyzed with [go tool pprof](https://github.com/google/pprof).
+It is safe to share the collected profiles from a security point of view, since they do not contain sensitive information.
 
 ## Advanced usage
 
diff --git a/docs/vmalert.md b/docs/vmalert.md
index c008e07b7..48df5d9c1 100644
--- a/docs/vmalert.md
+++ b/docs/vmalert.md
@@ -634,6 +634,35 @@ Use the official [Grafana dashboard](https://grafana.com/grafana/dashboards/1495
 If you have suggestions for improvements or have found a bug - please open an issue on github or add a review to the dashboard.
 
+## Profiling
+
+`vmalert` provides handlers for collecting the following [Go profiles](https://blog.golang.org/profiling-go-programs):
+
+* Memory profile. It can be collected with the following command (replace `0.0.0.0` with hostname if needed):
+
+
+
+```console
+curl http://0.0.0.0:8880/debug/pprof/heap > mem.pprof
+```
+
+
+
+* CPU profile. It can be collected with the following command (replace `0.0.0.0` with hostname if needed):
+
+
+
+```console
+curl http://0.0.0.0:8880/debug/pprof/profile > cpu.pprof
+```
+
+
+
+The command for collecting CPU profile waits for 30 seconds before returning.
+
+The collected profiles may be analyzed with [go tool pprof](https://github.com/google/pprof).
+It is safe to share the collected profiles from a security point of view, since they do not contain sensitive information.
+
 ## Configuration
 
 ### Flags
diff --git a/docs/vmauth.md b/docs/vmauth.md
index 0aab5f153..63f1de9d3 100644
--- a/docs/vmauth.md
+++ b/docs/vmauth.md
@@ -221,6 +221,7 @@ curl http://0.0.0.0:8427/debug/pprof/profile > cpu.pprof
 The command for collecting CPU profile waits for 30 seconds before returning.
 
 The collected profiles may be analyzed with [go tool pprof](https://github.com/google/pprof).
+It is safe to share the collected profiles from a security point of view, since they do not contain sensitive information.
 
 ## Advanced usage