mirror of
https://github.com/VictoriaMetrics/VictoriaMetrics.git
synced 2025-01-20 07:19:17 +01:00
vmselect: introduce search.skipSlowReplicas cmd-line flag (#4538)
* vmselect: introduce `search.skipSlowReplicas` cmd-line flag

vmselect has two logical conditions during request processing when the `-replicationFactor` cmd-line flag is set:

1. If at least `len(storageNodes) - replicationFactor` nodes responded, it could skip waiting for the rest of the nodes to respond. This could lead to the problems described in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1207.
2. Mark the response as partial if fewer than `len(storageNodes) - replicationFactor` nodes responded without an error.

P1 proved error-prone and became the main reason why using `-replicationFactor` at the vmselect level wasn't recommended. However, this optimization can still be very useful when a cluster has both slow and fast replicas. P2 remains viable and important unconditionally.

Hiding P1 behind the feature flag `search.skipSlowReplicas` should make the `-replicationFactor` flag usable again and let users choose whether they want P1 to be respected.

Related issues:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1207
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* docs: update changelog

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
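The two conditions in the commit message can be sketched in Go as follows. This is only an illustration under stated assumptions: the `outcome` type and the `evaluate` function are hypothetical names, not part of the VictoriaMetrics code base, and the threshold is taken literally from the commit message, so the actual implementation may differ in details.

```go
package main

import "fmt"

// outcome is a hypothetical summary type used only for this sketch.
type outcome struct {
	MayStopEarly bool // P1: enough nodes answered, slow replicas may be skipped
	Partial      bool // P2: too few nodes answered without an error
}

// evaluate applies the two conditions as worded in the commit message.
// totalNodes stands for len(storageNodes); okResponses counts the nodes
// that responded without an error. In a real query P2 would be checked
// once waiting has finished; here both conditions are evaluated for a
// given snapshot of responses.
func evaluate(totalNodes, replicationFactor, okResponses int, skipSlowReplicas bool) outcome {
	threshold := totalNodes - replicationFactor
	return outcome{
		// P1 applies only when the feature flag is enabled.
		MayStopEarly: skipSlowReplicas && okResponses >= threshold,
		// P2 applies regardless of the feature flag.
		Partial: okResponses < threshold,
	}
}

func main() {
	// Assumed example: 5 storage nodes, replication factor 2.
	fmt.Printf("flag off, 3 ok: %+v\n", evaluate(5, 2, 3, false))
	fmt.Printf("flag on,  3 ok: %+v\n", evaluate(5, 2, 3, true))
	fmt.Printf("flag on,  2 ok: %+v\n", evaluate(5, 2, 2, true))
}
```

With the flag off, the sketch never reports that waiting may stop early, which mirrors the new default described in this commit. The partial-response check stays the same in both modes.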
parent 45cec4728c
commit fb03762d4d
@@ -67,6 +67,7 @@ Released at 2023-06-30
 * BUGFIX: [storage](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html): prevent from possible crashloop after the migration from versions below `v1.90.0` to newer versions. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4336) for details.
 * BUGFIX: [vmui](https://docs.victoriametrics.com/#vmui): fix a memory leak issue associated with chart updates. See [this pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4455).
 * BUGFIX: [vmbackupmanager](https://docs.victoriametrics.com/vmbackupmanager.html): fix removing storage data dir before restoring from backup.
+* BUGFIX: vmselect: wait for all vmstorage nodes to respond when the `-replicationFactor` flag is set bigger than > 1. Before, vmselect could have [skip waiting for the slowest replicas](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711) to respond. This could have resulted in issues illustrated [here](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1207). Now, this optimization is disabled by default and could be re-enabled by passing `-search.skipSlowReplicas` cmd-line flag to vmselect. See more details [here](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4538).


 ## [v1.91.2](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.91.2)

@@ -659,6 +659,7 @@ It is available in the [helm-charts](https://github.com/VictoriaMetrics/helm-cha
 By default, VictoriaMetrics offloads replication to the underlying storage pointed by `-storageDataPath` such as [Google compute persistent disk](https://cloud.google.com/compute/docs/disks#pdspecs), which guarantees data durability. VictoriaMetrics supports application-level replication if replicated durable persistent disks cannot be used for some reason.

 The replication can be enabled by passing `-replicationFactor=N` command-line flag to `vminsert`. This instructs `vminsert` to store `N` copies for every ingested sample on `N` distinct `vmstorage` nodes. This guarantees that all the stored data remains available for querying if up to `N-1` `vmstorage` nodes are unavailable.
+Passing `-replicationFactor=N` command-line flag to `vmselect` instructs it to not mark responses as `partial` if less `replicationFactor` storage nodes failed to respond on query time.

 The cluster must contain at least `2*N-1` `vmstorage` nodes, where `N` is replication factor, in order to maintain the given replication factor for newly ingested data when `N-1` of storage nodes are unavailable.

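The sizing rule in the hunk above follows from simple arithmetic: with `2*N-1` nodes and `N-1` of them unavailable, exactly `N` nodes remain, which is the minimum needed to place `N` copies on distinct nodes. A small Go sketch of that reasoning, where the function names are hypothetical and chosen only for illustration:

```go
package main

import "fmt"

// minStorageNodes returns the smallest cluster size (2*N-1) that, per the
// documentation quoted above, keeps replication factor N for new data
// while N-1 storage nodes are unavailable.
func minStorageNodes(replicationFactor int) int {
	return 2*replicationFactor - 1
}

// maxUnavailableNodes returns how many storage nodes (N-1) may be down
// while all stored data remains available for querying.
func maxUnavailableNodes(replicationFactor int) int {
	return replicationFactor - 1
}

func main() {
	for n := 1; n <= 3; n++ {
		total := minStorageNodes(n)
		down := maxUnavailableNodes(n)
		// The remaining healthy nodes must be able to hold N distinct copies.
		fmt.Printf("N=%d: nodes=%d, may be down=%d, healthy=%d\n", n, total, down, total-down)
	}
}
```

For every `N`, the number of healthy nodes printed equals `N`, which is why a smaller cluster could not keep `N` distinct copies of newly ingested samples during such an outage.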
@@ -1207,6 +1208,8 @@ Below is the output for `/path/to/vmselect -help`:
      Optional authKey for resetting rollup cache via /internal/resetRollupResultCache call
   -search.setLookbackToStep
      Whether to fix lookback interval to 'step' query arg value. If set to true, the query model becomes closer to InfluxDB data model. If set to true, then -search.maxLookback and -search.maxStalenessInterval are ignored
+  -search.skipSlowReplicas
+     Whether to skip waiting for all replicas to respond during search query. Enabling this setting may improve query speed by serving results from the fastest vmstorage replicas in the cluster. But could also lead to incomplete results if replicas contain data gaps. Consider enabling this setting only if all replicas contain identical data.
   -search.treatDotsAsIsInRegexps
      Whether to treat dots as is in regexp label filters used in queries. For example, foo{bar=~"a.b.c"} will be automatically converted to foo{bar=~"a\\.b\\.c"}, i.e. all the dots in regexp filters will be automatically escaped in order to match only dot char instead of matching any char. Dots in ".+", ".*" and ".{n}" regexps aren't escaped. This option is DEPRECATED in favor of {__graphite__="a.*.c"} syntax for selecting metrics matching the given Graphite metrics filter
   -selectNode array