diff --git a/README.md b/README.md
index 3926be0761..739e923bdf 100644
--- a/README.md
+++ b/README.md
@@ -938,7 +938,7 @@ Below is the output for `/path/to/vminsert -help`:
 	Whether to use proxy protocol for connections accepted at -httpListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt . With enabled proxy protocol http server cannot serve regular /metrics endpoint. Use -pushmetrics.url for metrics pushing
   -import.maxLineLen size
 	The maximum length in bytes of a single line accepted by /api/v1/import; the line length can be limited with 'max_rows_per_line' query arg passed to /api/v1/export
-	Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 104857600)
+	Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 10485760)
   -influx.databaseNames array
 	Comma-separated list of database names to return from /query and /influx/query API. This can be needed for accepting data from Telegraf plugins such as https://github.com/fangli/fluent-plugin-influxdb
 	Supports an array of values separated by comma or specified via multiple flags.
diff --git a/docs/CHANGELOG.md b/docs/CHANGELOG.md
index 8e2fd6c3b9..2246e305e5 100644
--- a/docs/CHANGELOG.md
+++ b/docs/CHANGELOG.md
@@ -29,6 +29,7 @@ The sandbox cluster installation is running under the constant load generated by
 ## tip
 
 * FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add support for reading and writing samples via [Google PubSub](https://cloud.google.com/pubsub). See [these docs](https://docs.victoriametrics.com/vmagent.html#google-pubsub-integration).
+* FEATURE: reduce the default value for `-import.maxLineLen` command-line flag from 100MB to 10MB in order to prevent excessive memory usage during data import via [/api/v1/import](https://docs.victoriametrics.com/#how-to-import-data-in-json-line-format).
 * BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): prevent from `FATAL: cannot flush metainfo` panic when [`-remoteWrite.multitenantURL`](https://docs.victoriametrics.com/vmagent.html#multitenancy) command-line flag is set. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5357).
 * BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): properly decode zstd-encoded data blocks received via [VictoriaMetrics remote_write protocol](https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol). See [this issue comment](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5301#issuecomment-1815871992).
@@ -38,7 +39,6 @@ The sandbox cluster installation is running under the constant load generated by
 Released at 2023-11-16
 
 * FEATURE: dashboards: use `version` instead of `short_version` in version change annotation for single/cluster dashboards. The update should reflect version changes even if different flavours of the same release were applied (custom builds).
-* FEATURE: lower limit for `import.maxLineLen` cmd-line flag from 100MB to 10MB in order to prevent excessive memory usage during data import. Please note, the line length of exported data can be limited with `max_rows_per_line` query arg passed to `/api/v1/export`. The change affects vminsert/vmagent/VictoriaMetrics single-node.
 * BUGFIX: fix a bug, which could result in improper results and/or to `cannot merge series: duplicate series found` error during [range query](https://docs.victoriametrics.com/keyConcepts.html#range-query) execution.
   The issue has been introduced in [v1.95.0](https://docs.victoriametrics.com/CHANGELOG.html#v1950). See [this bugreport](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5332) for details.
 * BUGFIX: improve deadline detection when using buffered connection for communication between cluster components. Before, due to nature of a buffered connection the deadline could have been exceeded while reading or writing buffered data to connection. See [this pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5327).
diff --git a/docs/README.md b/docs/README.md
index 443ba34521..0f977ee125 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -1417,7 +1417,12 @@ For example, `/api/v1/import?extra_label=foo=bar` would add `"foo":"bar"` label
 
 Note that it could be required to flush response cache after importing historical data. See [these docs](#backfilling) for detail.
 
-VictoriaMetrics parses input JSON lines one-by-one. It loads the whole JSON line in memory, then parses it and then saves the parsed samples into persistent storage. This means that VictoriaMetrics can occupy big amounts of RAM when importing too long JSON lines. The solution is to split too long JSON lines into smaller lines. It is OK if samples for a single time series are split among multiple JSON lines.
+VictoriaMetrics parses input JSON lines one-by-one. It loads the whole JSON line in memory, then parses it and then saves the parsed samples into persistent storage.
+This means that VictoriaMetrics can occupy big amounts of RAM when importing too long JSON lines.
+The solution is to split too long JSON lines into shorter lines. It is OK if samples for a single time series are split among multiple JSON lines.
+JSON line length can be limited via the `max_rows_per_line` query arg when exporting via [/api/v1/export](#how-to-export-data-in-json-line-format).
+
+The maximum JSON line length which can be parsed by VictoriaMetrics is limited by the `-import.maxLineLen` command-line flag value.
 
 ### How to import data in native format
 
@@ -1594,15 +1599,18 @@ The format follows [JSON streaming concept](http://ndjson.org/), e.g. each line
 ```
 
 Note that every JSON object must be written in a single line, e.g. all the newline chars must be removed from it.
-Every line length is limited by the value passed to `-import.maxLineLen` command-line flag (by default this is 10MB).
+The [/api/v1/import](#how-to-import-data-in-json-line-format) handler doesn't accept JSON lines longer than the value
+passed to the `-import.maxLineLen` command-line flag (by default this is 10MB).
 It is recommended passing 1K-10K samples per line for achieving the maximum data ingestion performance at [/api/v1/import](#how-to-import-data-in-json-line-format).
 Too long JSON lines may increase RAM usage at VictoriaMetrics side.
 
+The [/api/v1/export](#how-to-export-data-in-json-line-format) handler accepts the `max_rows_per_line` query arg, which allows limiting the number of samples per exported line.
+
 It is OK to split [raw samples](https://docs.victoriametrics.com/keyConcepts.html#raw-samples) for the same [time series](https://docs.victoriametrics.com/keyConcepts.html#time-series) across multiple lines.
 
-The number of lines in JSON line document can be arbitrary.
+The number of lines in the request to [/api/v1/import](#how-to-import-data-in-json-line-format) can be arbitrary - they are imported in a streaming manner.
 
 ## Relabeling
diff --git a/docs/Single-server-VictoriaMetrics.md b/docs/Single-server-VictoriaMetrics.md
index 1a234a94e7..31745d4420 100644
--- a/docs/Single-server-VictoriaMetrics.md
+++ b/docs/Single-server-VictoriaMetrics.md
@@ -1425,7 +1425,12 @@ For example, `/api/v1/import?extra_label=foo=bar` would add `"foo":"bar"` label
 
 Note that it could be required to flush response cache after importing historical data. See [these docs](#backfilling) for detail.
 
-VictoriaMetrics parses input JSON lines one-by-one. It loads the whole JSON line in memory, then parses it and then saves the parsed samples into persistent storage. This means that VictoriaMetrics can occupy big amounts of RAM when importing too long JSON lines. The solution is to split too long JSON lines into smaller lines. It is OK if samples for a single time series are split among multiple JSON lines.
+VictoriaMetrics parses input JSON lines one-by-one. It loads the whole JSON line in memory, then parses it and then saves the parsed samples into persistent storage.
+This means that VictoriaMetrics can occupy big amounts of RAM when importing too long JSON lines.
+The solution is to split too long JSON lines into shorter lines. It is OK if samples for a single time series are split among multiple JSON lines.
+JSON line length can be limited via the `max_rows_per_line` query arg when exporting via [/api/v1/export](#how-to-export-data-in-json-line-format).
+
+The maximum JSON line length which can be parsed by VictoriaMetrics is limited by the `-import.maxLineLen` command-line flag value.
 
 ### How to import data in native format
 
@@ -1602,15 +1607,18 @@ The format follows [JSON streaming concept](http://ndjson.org/), e.g. each line
 ```
 
 Note that every JSON object must be written in a single line, e.g. all the newline chars must be removed from it.
-Every line length is limited by the value passed to `-import.maxLineLen` command-line flag (by default this is 10MB).
+The [/api/v1/import](#how-to-import-data-in-json-line-format) handler doesn't accept JSON lines longer than the value
+passed to the `-import.maxLineLen` command-line flag (by default this is 10MB).
 It is recommended passing 1K-10K samples per line for achieving the maximum data ingestion performance at [/api/v1/import](#how-to-import-data-in-json-line-format).
 Too long JSON lines may increase RAM usage at VictoriaMetrics side.
 
+The [/api/v1/export](#how-to-export-data-in-json-line-format) handler accepts the `max_rows_per_line` query arg, which allows limiting the number of samples per exported line.
+
 It is OK to split [raw samples](https://docs.victoriametrics.com/keyConcepts.html#raw-samples) for the same [time series](https://docs.victoriametrics.com/keyConcepts.html#time-series) across multiple lines.
 
-The number of lines in JSON line document can be arbitrary.
+The number of lines in the request to [/api/v1/import](#how-to-import-data-in-json-line-format) can be arbitrary - they are imported in a streaming manner.
 
 ## Relabeling
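
As a rough sketch of the export/import workflow these docs describe (the hostnames and the `1000` value below are illustrative placeholders; `/api/v1/export`, `/api/v1/import`, the `max_rows_per_line` query arg and the `-import.maxLineLen` flag are the documented pieces):

```sh
# Export with at most 1000 samples per JSON line, so each line stays far below
# the importing side's -import.maxLineLen limit (10MB by default).
curl http://src-victoriametrics:8428/api/v1/export \
  -d 'match[]={__name__!=""}' \
  -d 'max_rows_per_line=1000' > data.jsonl

# Re-import the exported lines. The request may contain an arbitrary number
# of lines, since they are processed in a streaming manner.
curl -X POST http://dst-victoriametrics:8428/api/v1/import -T data.jsonl
```

If longer lines must be imported as-is, the limit can instead be raised on the receiving side, e.g. `-import.maxLineLen=100MB`, using any of the size suffixes listed in the `-help` output above.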