From 0dca3c40254e8be1dc7493a51fae550a4402134a Mon Sep 17 00:00:00 2001
From: Aliaksandr Valialkin
Date: Wed, 24 Jan 2024 13:32:13 +0200
Subject: [PATCH] app/{vmselect,vmstorage}: return compression of the data passed from vmstorage to vmselect

This reverts cd4f641d323cf0977c2e22915eb75edb765ad090, since it turned out that disabling compression for vmstorage->vmselect data increases network bandwidth usage by more than 10x on typical production workloads, while it decreases CPU usage at vmstorage by up to 10% and improves query latency by up to 10%.

The 10x increase in network usage is too high a price for 10% improvements in query latency and vmstorage CPU usage. It may result in network bandwidth bottlenecks, which can reduce the overall performance and stability of a VictoriaMetrics cluster. That is why vmstorage->vmselect data compression is enabled again by default.

The vmstorage->vmselect compression can be disabled by passing the -rpc.disableCompression command-line flag to vmstorage. The vmselect->vmselect compression in a multi-level cluster setup can be disabled by passing the -clusternative.disableCompression command-line flag.
---
 README.md                              | 6 +++---
 app/vmselect/clusternative/vmselect.go | 2 +-
 app/vmstorage/servers/vmselect.go      | 2 +-
 docs/CHANGELOG.md                      | 1 -
 docs/Cluster-VictoriaMetrics.md        | 6 +++---
 5 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/README.md b/README.md
index 6c441be37a..0b75d66e8e 100644
--- a/README.md
+++ b/README.md
@@ -585,7 +585,7 @@ Some capacity planning tips for VictoriaMetrics cluster:
 - Query latency can be reduced by increasing CPU resources per each `vmselect` node, since each incoming query is processed by a single `vmselect` node. Performance for heavy queries scales with the number of available CPU cores at `vmselect` node, since `vmselect` processes time series referred by the query on all the available CPU cores.
 - If the cluster needs to process incoming queries at a high rate, then its capacity can be increased by adding more `vmselect` nodes, so incoming queries could be spread among bigger number of `vmselect` nodes.
 - By default `vminsert` compresses the data it sends to `vmstorage` in order to reduce network bandwidth usage. The compression takes additional CPU resources at `vminsert`. If `vminsert` nodes have limited CPU, then the compression can be disabled by passing `-rpc.disableCompression` command-line flag at `vminsert` nodes.
-- By default `vmstorage` doesn't compress the data it sends to `vmselect` in order to reduce CPU usage at the cost of additional network bandwidth usage. Pass `-rpc.disableCompression=false` command-line flag at `vmstorage` in order to reduce network bandwidh usage needed for processing queries at the cost of increased CPU usage.
+- By default `vmstorage` compresses the data it sends to `vmselect` during queries in order to reduce network bandwidth usage. The compression takes additional CPU resources at `vmstorage`. If `vmstorage` nodes have limited CPU, then the compression can be disabled by passing `-rpc.disableCompression` command-line flag at `vmstorage` nodes.
 
 See also [resource usage limits docs](#resource-usage-limits).
 
@@ -1176,7 +1176,7 @@ Below is the output for `/path/to/vmselect -help`:
   -cluster.tlsKeyFile string
      Path to client-side TLS key file to use when connecting to -storageNode if -cluster.tls flag is set. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
   -clusternative.disableCompression
-     Whether to disable compression of the data sent to vmselect via -clusternativeListenAddr. This reduces CPU usage at the cost of higher network bandwidth usage (default true)
+     Whether to disable compression of the data sent to vmselect via -clusternativeListenAddr. This reduces CPU usage at the cost of higher network bandwidth usage
   -clusternative.maxConcurrentRequests int
      The maximum number of concurrent vmselect requests the server can process at -clusternativeListenAddr. It shouldn't be high, since a single request usually saturates a CPU core at the underlying vmstorage nodes, and many concurrently executed requests may require high amounts of memory. See also -clusternative.maxQueueDuration (default 32)
   -clusternative.maxQueueDuration duration
@@ -1597,7 +1597,7 @@ Below is the output for `/path/to/vmstorage -help`:
   -retentionTimezoneOffset duration
      The offset for performing indexdb rotation. If set to 0, then the indexdb rotation is performed at 4am UTC time per each -retentionPeriod. If set to 2h, then the indexdb rotation is performed at 4am EET time (the timezone with +2h offset)
   -rpc.disableCompression
-     Whether to disable compression of the data sent from vmstorage to vmselect. This reduces CPU usage at the cost of higher network bandwidth usage (default true)
+     Whether to disable compression of the data sent from vmstorage to vmselect. This reduces CPU usage at the cost of higher network bandwidth usage
   -search.maxConcurrentRequests int
      The maximum number of concurrent vmselect requests the vmstorage can process at -vmselectAddr. It shouldn't be high, since a single request usually saturates a CPU core, and many concurrently executed requests may require high amounts of memory. See also -search.maxQueueDuration (default 32)
   -search.maxQueueDuration duration
diff --git a/app/vmselect/clusternative/vmselect.go b/app/vmselect/clusternative/vmselect.go
index 68598e175d..ce29d2e7bb 100644
--- a/app/vmselect/clusternative/vmselect.go
+++ b/app/vmselect/clusternative/vmselect.go
@@ -25,7 +25,7 @@ var (
 	maxQueueDuration = flag.Duration("clusternative.maxQueueDuration", 10*time.Second, "The maximum time the incoming query to -clusternativeListenAddr waits for execution "+
 		"when -clusternative.maxConcurrentRequests limit is reached")
-	disableRPCCompression = flag.Bool("clusternative.disableCompression", true, "Whether to disable compression of the data sent to vmselect via -clusternativeListenAddr. "+
+	disableRPCCompression = flag.Bool("clusternative.disableCompression", false, "Whether to disable compression of the data sent to vmselect via -clusternativeListenAddr. "+
 		"This reduces CPU usage at the cost of higher network bandwidth usage")
 )
diff --git a/app/vmstorage/servers/vmselect.go b/app/vmstorage/servers/vmselect.go
index dfdedf2a7c..f697063e38 100644
--- a/app/vmstorage/servers/vmselect.go
+++ b/app/vmstorage/servers/vmselect.go
@@ -26,7 +26,7 @@ var (
 	maxQueueDuration = flag.Duration("search.maxQueueDuration", 10*time.Second, "The maximum time the incoming vmselect request waits for execution "+
 		"when -search.maxConcurrentRequests limit is reached")
-	disableRPCCompression = flag.Bool("rpc.disableCompression", true, "Whether to disable compression of the data sent from vmstorage to vmselect. "+
+	disableRPCCompression = flag.Bool("rpc.disableCompression", false, "Whether to disable compression of the data sent from vmstorage to vmselect. "+
 		"This reduces CPU usage at the cost of higher network bandwidth usage")
 	denyQueriesOutsideRetention = flag.Bool("denyQueriesOutsideRetention", false, "Whether to deny queries outside of the configured -retentionPeriod. "+
 		"When set, then /api/v1/query_range would return '503 Service Unavailable' error for queries with 'from' value outside -retentionPeriod. "+
diff --git a/docs/CHANGELOG.md b/docs/CHANGELOG.md
index ff1dc6066a..a86129c0ff 100644
--- a/docs/CHANGELOG.md
+++ b/docs/CHANGELOG.md
@@ -62,7 +62,6 @@ The sandbox cluster installation is running under the constant load generated by
 * FEATURE: [vmui](https://docs.victoriametrics.com/#vmui): add `-vmui.defaultTimezone` flag to set a default timezone. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5375) and [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/app/vmui#timezone-configuration).
 * FEATURE: [vmui](https://docs.victoriametrics.com/#vmui): include UTC in the timezone selection dropdown for standardized time referencing. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5375).
 * FEATURE: add [VictoriaMetrics datasource](https://github.com/VictoriaMetrics/grafana-datasource) to docker compose environment. See [this pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5363).
-* FEATURE: [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html): disable compression for RPC responses sent from `vmstorage` to `vmselect`. This improves query performance at the cost of increased network bandwidth usage. The compression can be enabled back by passing `-rpc.disableCompression=false` command-line flag to `vmstorage`. The compression can be enabled back in [multi-level cluster setup](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multi-level-cluster-setup) by passing `-clusternative.disableCompression=false` command-line flag to lower-level `vmselect` nodes.
 
 * BUGFIX: properly return errors from [export APIs](https://docs.victoriametrics.com/#how-to-export-time-series). Previously these errors were silently suppressed. See [this pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5649).
 * BUGFIX: [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html): properly return full results when `-search.skipSlowReplicas` command-line flag is passed to `vmselect` and when [vmstorage groups](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#vmstorage-groups-at-vmselect) are in use. Previously partial results could be returned in this case.
diff --git a/docs/Cluster-VictoriaMetrics.md b/docs/Cluster-VictoriaMetrics.md
index fe5da5d384..134ff19010 100644
--- a/docs/Cluster-VictoriaMetrics.md
+++ b/docs/Cluster-VictoriaMetrics.md
@@ -596,7 +596,7 @@ Some capacity planning tips for VictoriaMetrics cluster:
 - Query latency can be reduced by increasing CPU resources per each `vmselect` node, since each incoming query is processed by a single `vmselect` node. Performance for heavy queries scales with the number of available CPU cores at `vmselect` node, since `vmselect` processes time series referred by the query on all the available CPU cores.
 - If the cluster needs to process incoming queries at a high rate, then its capacity can be increased by adding more `vmselect` nodes, so incoming queries could be spread among bigger number of `vmselect` nodes.
 - By default `vminsert` compresses the data it sends to `vmstorage` in order to reduce network bandwidth usage. The compression takes additional CPU resources at `vminsert`. If `vminsert` nodes have limited CPU, then the compression can be disabled by passing `-rpc.disableCompression` command-line flag at `vminsert` nodes.
-- By default `vmstorage` doesn't compress the data it sends to `vmselect` in order to reduce CPU usage at the cost of additional network bandwidth usage. Pass `-rpc.disableCompression=false` command-line flag at `vmstorage` in order to reduce network bandwidh usage needed for processing queries at the cost of increased CPU usage.
+- By default `vmstorage` compresses the data it sends to `vmselect` during queries in order to reduce network bandwidth usage. The compression takes additional CPU resources at `vmstorage`. If `vmstorage` nodes have limited CPU, then the compression can be disabled by passing `-rpc.disableCompression` command-line flag at `vmstorage` nodes.
 
 See also [resource usage limits docs](#resource-usage-limits).
 
@@ -1187,7 +1187,7 @@ Below is the output for `/path/to/vmselect -help`:
   -cluster.tlsKeyFile string
      Path to client-side TLS key file to use when connecting to -storageNode if -cluster.tls flag is set. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
   -clusternative.disableCompression
-     Whether to disable compression of the data sent to vmselect via -clusternativeListenAddr. This reduces CPU usage at the cost of higher network bandwidth usage (default true)
+     Whether to disable compression of the data sent to vmselect via -clusternativeListenAddr. This reduces CPU usage at the cost of higher network bandwidth usage
   -clusternative.maxConcurrentRequests int
      The maximum number of concurrent vmselect requests the server can process at -clusternativeListenAddr. It shouldn't be high, since a single request usually saturates a CPU core at the underlying vmstorage nodes, and many concurrently executed requests may require high amounts of memory. See also -clusternative.maxQueueDuration (default 32)
   -clusternative.maxQueueDuration duration
@@ -1608,7 +1608,7 @@ Below is the output for `/path/to/vmstorage -help`:
   -retentionTimezoneOffset duration
      The offset for performing indexdb rotation. If set to 0, then the indexdb rotation is performed at 4am UTC time per each -retentionPeriod. If set to 2h, then the indexdb rotation is performed at 4am EET time (the timezone with +2h offset)
   -rpc.disableCompression
-     Whether to disable compression of the data sent from vmstorage to vmselect. This reduces CPU usage at the cost of higher network bandwidth usage (default true)
+     Whether to disable compression of the data sent from vmstorage to vmselect. This reduces CPU usage at the cost of higher network bandwidth usage
   -search.maxConcurrentRequests int
      The maximum number of concurrent vmselect requests the vmstorage can process at -vmselectAddr. It shouldn't be high, since a single request usually saturates a CPU core, and many concurrently executed requests may require high amounts of memory. See also -search.maxQueueDuration (default 32)
   -search.maxQueueDuration duration
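
Reviewer note: the behavioral effect of flipping the `flag.Bool` default from `true` to `false` can be checked in isolation. The sketch below is not code from this patch. It assumes a standalone `flag.FlagSet`, and the helper name `compressionEnabled` and the flag set name `vmstorage-sketch` are hypothetical. It only illustrates that, with a `false` default, compression stays on unless `-rpc.disableCompression` is passed.

```go
package main

import (
	"flag"
	"fmt"
)

// compressionEnabled is a hypothetical helper (not part of VictoriaMetrics).
// It parses args with a flag whose default is false, mirroring the new default
// in this patch, and reports whether compression would be enabled.
func compressionEnabled(args []string) (bool, error) {
	fs := flag.NewFlagSet("vmstorage-sketch", flag.ContinueOnError)
	disable := fs.Bool("rpc.disableCompression", false,
		"Whether to disable compression of the data sent from vmstorage to vmselect")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	// Compression is enabled exactly when the disable flag is not set.
	return !*disable, nil
}

func main() {
	cases := [][]string{
		nil,                               // no flag passed: new default applies
		{"-rpc.disableCompression"},       // bare boolean flag sets it to true
		{"-rpc.disableCompression=false"}, // explicit false keeps compression on
	}
	for _, args := range cases {
		enabled, err := compressionEnabled(args)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("args=%q compression enabled: %v\n", args, enabled)
	}
}
```

One consequence worth noting for operators: deployments that passed `-rpc.disableCompression=false` to opt back into compression keep working unchanged, since an explicit `false` matches the restored default.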