mirror of
https://github.com/VictoriaMetrics/VictoriaMetrics.git
synced 2025-01-20 07:19:17 +01:00
docs/Cluster-VictoriaMetrics.md: document the best strategies for cluster update / upgrade
This commit is contained in:
parent
1c7f402598
commit
7f46c17e4e
43
README.md
43
README.md
@ -296,8 +296,47 @@ All the node types - `vminsert`, `vmselect` and `vmstorage` - may be updated via
|
|||||||
Send `SIGINT` signal to the corresponding process, wait until it finishes and then start new version
|
Send `SIGINT` signal to the corresponding process, wait until it finishes and then start new version
|
||||||
with new configs.
|
with new configs.
|
||||||
|
|
||||||
Cluster should remain in working state if at least a single node of each type remains available during
|
There are the following cluster update / upgrade approaches exist:
|
||||||
the update process. See [cluster availability](#cluster-availability) section for details.
|
|
||||||
|
* `No downtime` strategy. Gracefully restart every node in the cluster one-by-one with the updated config / upgraded binary.
|
||||||
|
|
||||||
|
It is recommended restarting the nodes in the following order:
|
||||||
|
|
||||||
|
1. Restart `vmstorage` nodes.
|
||||||
|
2. Restart `vminsert` nodes.
|
||||||
|
3. Restart `vmselect` nodes.
|
||||||
|
|
||||||
|
This strategy allows upgrading the cluster without downtime if the following conditions are met:
|
||||||
|
|
||||||
|
- The cluster has at least a pair of nodes of each type - `vminsert`, `vmselect` and `vmstorage`,
|
||||||
|
so it can continue accept new data and serve incoming requests when a single node is temporary unavailable
|
||||||
|
during its restart. See [cluster availability docs](#cluster-availability) for details.
|
||||||
|
- The cluster has enough compute resources (CPU, RAM, network bandwidth, disk IO) for processing
|
||||||
|
the current workload when a single node of any type (`vminsert`, `vmselect` or `vmstorage`)
|
||||||
|
is temporarily unavailable during its restart.
|
||||||
|
- The updated config / upgraded binary is compatible with the remaining components in the cluster.
|
||||||
|
See the [CHANGELOG](https://docs.victoriametrics.com/CHANGELOG.html) for compatibility notes between different releases.
|
||||||
|
|
||||||
|
If at least a single condition isn't met, then the rolling restart may result in cluster unavailability
|
||||||
|
during the config update / version upgrade. In this case the following strategy is recommended.
|
||||||
|
|
||||||
|
* `Minimum downtime` strategy:
|
||||||
|
|
||||||
|
1. Gracefully stop all the `vminsert` and `vmselect` nodes in parallel.
|
||||||
|
2. Gracefully restart all the `vmstorage` nodes in parallel.
|
||||||
|
3. Start all the `vminsert` and `vmselect` nodes in parallel.
|
||||||
|
|
||||||
|
The cluster is unavailable for data ingestion and querying when performing the steps above.
|
||||||
|
The downtime is minimized by restarting cluster nodes in parallel at every step above.
|
||||||
|
The `minimum downtime` strategy has the following benefits comparing to `no downtime` startegy:
|
||||||
|
|
||||||
|
- It allows performing config update / version upgrade with minimum disruption
|
||||||
|
when the previous config / version is incompatible with the new config / version.
|
||||||
|
- It allows perorming config update / version upgrade with minimum disruption
|
||||||
|
when the cluster has no enough compute resources (CPU, RAM, disk IO, network bandwidth)
|
||||||
|
for rolling upgrade.
|
||||||
|
- It allows minimizing the duration of config update / version ugprade for clusters with big number of nodes
|
||||||
|
of for clusters with big `vmstorage` nodes, which may take long time for graceful restart.
|
||||||
|
|
||||||
## Cluster availability
|
## Cluster availability
|
||||||
|
|
||||||
|
@ -300,8 +300,47 @@ All the node types - `vminsert`, `vmselect` and `vmstorage` - may be updated via
|
|||||||
Send `SIGINT` signal to the corresponding process, wait until it finishes and then start new version
|
Send `SIGINT` signal to the corresponding process, wait until it finishes and then start new version
|
||||||
with new configs.
|
with new configs.
|
||||||
|
|
||||||
Cluster should remain in working state if at least a single node of each type remains available during
|
There are the following cluster update / upgrade approaches exist:
|
||||||
the update process. See [cluster availability](#cluster-availability) section for details.
|
|
||||||
|
* `No downtime` strategy. Gracefully restart every node in the cluster one-by-one with the updated config / upgraded binary.
|
||||||
|
|
||||||
|
It is recommended restarting the nodes in the following order:
|
||||||
|
|
||||||
|
1. Restart `vmstorage` nodes.
|
||||||
|
2. Restart `vminsert` nodes.
|
||||||
|
3. Restart `vmselect` nodes.
|
||||||
|
|
||||||
|
This strategy allows upgrading the cluster without downtime if the following conditions are met:
|
||||||
|
|
||||||
|
- The cluster has at least a pair of nodes of each type - `vminsert`, `vmselect` and `vmstorage`,
|
||||||
|
so it can continue accept new data and serve incoming requests when a single node is temporary unavailable
|
||||||
|
during its restart. See [cluster availability docs](#cluster-availability) for details.
|
||||||
|
- The cluster has enough compute resources (CPU, RAM, network bandwidth, disk IO) for processing
|
||||||
|
the current workload when a single node of any type (`vminsert`, `vmselect` or `vmstorage`)
|
||||||
|
is temporarily unavailable during its restart.
|
||||||
|
- The updated config / upgraded binary is compatible with the remaining components in the cluster.
|
||||||
|
See the [CHANGELOG](https://docs.victoriametrics.com/CHANGELOG.html) for compatibility notes between different releases.
|
||||||
|
|
||||||
|
If at least a single condition isn't met, then the rolling restart may result in cluster unavailability
|
||||||
|
during the config update / version upgrade. In this case the following strategy is recommended.
|
||||||
|
|
||||||
|
* `Minimum downtime` strategy:
|
||||||
|
|
||||||
|
1. Gracefully stop all the `vminsert` and `vmselect` nodes in parallel.
|
||||||
|
2. Gracefully restart all the `vmstorage` nodes in parallel.
|
||||||
|
3. Start all the `vminsert` and `vmselect` nodes in parallel.
|
||||||
|
|
||||||
|
The cluster is unavailable for data ingestion and querying when performing the steps above.
|
||||||
|
The downtime is minimized by restarting cluster nodes in parallel at every step above.
|
||||||
|
The `minimum downtime` strategy has the following benefits comparing to `no downtime` startegy:
|
||||||
|
|
||||||
|
- It allows performing config update / version upgrade with minimum disruption
|
||||||
|
when the previous config / version is incompatible with the new config / version.
|
||||||
|
- It allows perorming config update / version upgrade with minimum disruption
|
||||||
|
when the cluster has no enough compute resources (CPU, RAM, disk IO, network bandwidth)
|
||||||
|
for rolling upgrade.
|
||||||
|
- It allows minimizing the duration of config update / version ugprade for clusters with big number of nodes
|
||||||
|
of for clusters with big `vmstorage` nodes, which may take long time for graceful restart.
|
||||||
|
|
||||||
## Cluster availability
|
## Cluster availability
|
||||||
|
|
||||||
|
@ -335,10 +335,4 @@ The query engine may behave differently for some functions. Please see [this art
|
|||||||
|
|
||||||
Single-node VictoriaMetrics cannot be restarted / upgraded or downgraded without downtime, since it needs to be gracefully shut down and then started again. See [how to upgrade VictoriaMetrics](https://docs.victoriametrics.com/#how-to-upgrade-victoriametrics).
|
Single-node VictoriaMetrics cannot be restarted / upgraded or downgraded without downtime, since it needs to be gracefully shut down and then started again. See [how to upgrade VictoriaMetrics](https://docs.victoriametrics.com/#how-to-upgrade-victoriametrics).
|
||||||
|
|
||||||
[Cluster version of VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html) can be restarted / upgraded / downgraded without downtime if the following conditions are met:
|
[Cluster version of VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html) can be restarted / upgraded / downgraded without downtime according to [these instructions](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#updating--reconfiguring-cluster-nodes).
|
||||||
|
|
||||||
* If every component of the cluster - `vminsert`, `vmselect` and `vmstorage` - has at least 2 instances.
|
|
||||||
* If the cluster has enough compute resources (CPU, RAM, disk IO, network bandwidth) to perform rolling restart of all its components.
|
|
||||||
* If the version used for upgrade / downgrade is compatible with the currently running version. The [CHANGELOG](https://docs.victoriametrics.com/CHANGELOG.html) contains compatibility notes for the published releases.
|
|
||||||
|
|
||||||
See [updating / reconfiguring cluster nodes](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#updating--reconfiguring-cluster-nodes) for details on cluster upgrade.
|
|
||||||
|
Loading…
Reference in New Issue
Block a user