update wiki pages

2025-01-20 07:19:17 +01:00 · 2023-07-20 23:53:54 +00:00 · 2023-07-20 23:53:54 +00:00 · d7850bb8db
commit d7850bb8db
parent 299c2ea68c
6 changed files with 89 additions and 23 deletions
--- a/README.md
+++ b/README.md
@@ -404,6 +404,11 @@ matching the specified [series selector](https://prometheus.io/docs/prometheus/l

 Cardinality explorer is built on top of [/api/v1/status/tsdb](#tsdb-stats).

+In the [cluster version of VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html), each vmstorage node tracks the stored time series individually.
+vmselect requests stats via the [/api/v1/status/tsdb](#tsdb-stats) API from each vmstorage node and merges the results by summing per-series stats.
+This may lead to inflated values when samples for the same time series are spread across multiple vmstorage nodes
+due to [replication](#replication) or [rerouting](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html?highlight=re-routes#cluster-availability).
+
 See [cardinality explorer playground](https://play.victoriametrics.com/select/accounting/1/6a716b0f-38bc-4856-90ce-448fd713e3fe/prometheus/graph/#/cardinality).
 See the example of using the cardinality explorer [here](https://victoriametrics.com/blog/cardinality-explorer/).

@@ -1787,6 +1792,11 @@ VictoriaMetrics returns TSDB stats at `/api/v1/status/tsdb` page in the way simi
 * `match[]=SELECTOR` where `SELECTOR` is an arbitrary [time series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors) for series to take into account during stats calculation. By default all the series are taken into account.
 * `extra_label=LABEL=VALUE`. See [these docs](#prometheus-querying-api-enhancements) for more details.

+In the [cluster version of VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html), each vmstorage node tracks the stored time series individually.
+vmselect requests stats via the [/api/v1/status/tsdb](#tsdb-stats) API from each vmstorage node and merges the results by summing per-series stats.
+This may lead to inflated values when samples for the same time series are spread across multiple vmstorage nodes
+due to [replication](#replication) or [rerouting](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html?highlight=re-routes#cluster-availability).
+
 VictoriaMetrics provides an UI on top of `/api/v1/status/tsdb` - see [cardinality explorer docs](#cardinality-explorer).

 ## Query tracing
--- a/Single-server-VictoriaMetrics.md
+++ b/Single-server-VictoriaMetrics.md
@@ -412,6 +412,11 @@ matching the specified [series selector](https://prometheus.io/docs/prometheus/l

 Cardinality explorer is built on top of [/api/v1/status/tsdb](#tsdb-stats).

+In the [cluster version of VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html), each vmstorage node tracks the stored time series individually.
+vmselect requests stats via the [/api/v1/status/tsdb](#tsdb-stats) API from each vmstorage node and merges the results by summing per-series stats.
+This may lead to inflated values when samples for the same time series are spread across multiple vmstorage nodes
+due to [replication](#replication) or [rerouting](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html?highlight=re-routes#cluster-availability).
+
 See [cardinality explorer playground](https://play.victoriametrics.com/select/accounting/1/6a716b0f-38bc-4856-90ce-448fd713e3fe/prometheus/graph/#/cardinality).
 See the example of using the cardinality explorer [here](https://victoriametrics.com/blog/cardinality-explorer/).

@@ -1795,6 +1800,11 @@ VictoriaMetrics returns TSDB stats at `/api/v1/status/tsdb` page in the way simi
 * `match[]=SELECTOR` where `SELECTOR` is an arbitrary [time series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors) for series to take into account during stats calculation. By default all the series are taken into account.
 * `extra_label=LABEL=VALUE`. See [these docs](#prometheus-querying-api-enhancements) for more details.

+In the [cluster version of VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html), each vmstorage node tracks the stored time series individually.
+vmselect requests stats via the [/api/v1/status/tsdb](#tsdb-stats) API from each vmstorage node and merges the results by summing per-series stats.
+This may lead to inflated values when samples for the same time series are spread across multiple vmstorage nodes
+due to [replication](#replication) or [rerouting](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html?highlight=re-routes#cluster-availability).
+
 VictoriaMetrics provides an UI on top of `/api/v1/status/tsdb` - see [cardinality explorer docs](#cardinality-explorer).

 ## Query tracing
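
Note on the cardinality paragraph added to both files above: the inflation it describes can be illustrated with a minimal Python sketch. The per-node stats shape (`{metric_name: series_count}`), the `merge_by_sum` helper and the sample data are all hypothetical; this is a model of "merge by summing", not the actual vmselect code.

```python
from collections import Counter


def merge_by_sum(per_node_stats):
    """Merge per-node {metric_name: series_count} maps by summing the counts.

    Hypothetical helper: it only models the summing behaviour described in
    the docs and knows nothing about which series overlap between nodes.
    """
    merged = Counter()
    for stats in per_node_stats:
        merged.update(stats)  # Counter.update adds counts per key
    return dict(merged)


# Assumed scenario: 3 distinct series of one metric, replication factor 2.
# Series s1 lives on nodes A and B, s2 on B and C, s3 on A and C,
# so every node reports 2 series for the metric.
node_a = {"http_requests_total": 2}
node_b = {"http_requests_total": 2}
node_c = {"http_requests_total": 2}

merged = merge_by_sum([node_a, node_b, node_c])
distinct_series = 3
inflation = merged["http_requests_total"] / distinct_series
```

In this assumed scenario the merged count is twice the number of distinct series, which matches the replication factor. Rerouting would presumably produce a smaller, less predictable overcount.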
--- a/VictoriaLogs/CHANGELOG.md
+++ b/VictoriaLogs/CHANGELOG.md
@@ -5,6 +5,8 @@ according to [these docs](https://docs.victoriametrics.com/VictoriaLogs/QuickSta

 ## tip

+* FEATURE: add support for data ingestion via Promtail (aka default log shipper for Grafana Loki). See [these](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/Promtail.html) and [these](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/#loki-json-api) docs.
+
 ## [v0.2.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v0.2.0-victorialogs)

 Released at 2023-07-17
--- a/VictoriaLogs/data-ingestion/Promtail.md
+++ b/VictoriaLogs/data-ingestion/Promtail.md
@@ -1,16 +1,20 @@
 # Promtail setup

+[Promtail](https://grafana.com/docs/loki/latest/clients/promtail/) is the default log shipper for Grafana Loki.
+Promtail can be configured to send the collected logs to VictoriaLogs according to the following docs.
+
 Specify [`clients`](https://grafana.com/docs/loki/latest/clients/promtail/configuration/#clients) section in the configuration file
 for sending the collected logs to [VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/):

 ```yaml
 clients:
-  - url: http://vlogs:9428/insert/loki/api/v1/push?_stream_fields=filename,job,stream,host,app,pid
+  - url: http://localhost:9428/insert/loki/api/v1/push?_stream_fields=instance,job,host,app,pid
 ```

-Substitute `vlogs:9428` address inside `clients` with the real TCP address of VictoriaLogs.
+Substitute `localhost:9428` address inside `clients` with the real TCP address of VictoriaLogs.

 See [these docs](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/#http-parameters) for details on the used URL query parameter section.
+There is no need to specify the `_msg_field` and `_time_field` query args, since VictoriaLogs automatically extracts the log message and timestamp from the ingested Loki data.

 It is recommended verifying whether the initial setup generates the needed [log fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model)
 and uses the correct [stream fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#stream-fields).
@@ -19,27 +23,28 @@ and inspecting VictoriaLogs logs then:

 ```yaml
 clients:
-  - url: http://vlogs:9428/insert/loki/api/v1/push?_stream_fields=filename,job,stream,host,app,pid&debug=1
+  - url: http://localhost:9428/insert/loki/api/v1/push?_stream_fields=instance,job,host,app,pid&debug=1
 ```

 If some [log fields](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model) must be skipped
 during data ingestion, then they can be put into `ignore_fields` [parameter](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/#http-parameters).
-For example, the following config instructs VictoriaLogs to ignore `log.offset` and `event.original` fields in the ingested logs:
+For example, the following config instructs VictoriaLogs to ignore `filename` and `stream` fields in the ingested logs:

 ```yaml
 clients:
-  - url: http://vlogs:9428/insert/loki/api/v1/push?_stream_fields=filename,job,stream,host,app,pid&debug=1
+  - url: http://localhost:9428/insert/loki/api/v1/push?_stream_fields=instance,job,host,app,pid&ignore_fields=filename,stream
 ```

 By default the ingested logs are stored in the `(AccountID=0, ProjectID=0)` [tenant](https://docs.victoriametrics.com/VictoriaLogs/#multitenancy).
-If you need storing logs in other tenant, then It is possible to either use `tenant_id` provided by Loki configuration, or to use `headers` and provide
-`AccountID` and `ProjectID` headers. Format for `tenant_id` is `AccountID:ProjectID`. 
-For example, the following config instructs VictoriaLogs to store logs in the `(AccountID=12, ProjectID=12)` tenant:
+If you need to store logs in another tenant, then specify the needed tenant via the `tenant_id` field
+in the [Loki client configuration](https://grafana.com/docs/loki/latest/clients/promtail/configuration/#clients).
+The `tenant_id` must have `AccountID:ProjectID` format, where `AccountID` and `ProjectID` are arbitrary uint32 numbers.
+For example, the following config instructs VictoriaLogs to store logs in the `(AccountID=12, ProjectID=34)` [tenant](https://docs.victoriametrics.com/VictoriaLogs/#multitenancy):

 ```yaml
 clients:
-  - url: http://vlogs:9428/insert/loki/api/v1/push?_stream_fields=filename,job,stream,host,app,pid&debug=1
-    tenant_id: "12:12"
+  - url: http://localhost:9428/insert/loki/api/v1/push?_stream_fields=instance,job,host,app,pid&debug=1
+    tenant_id: "12:34"
 ```

 The ingested log entries can be queried according to [these docs](https://docs.victoriametrics.com/VictoriaLogs/querying/).
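
Note on the `tenant_id` format described in the Promtail.md changes above: a minimal Python sketch of checking an `AccountID:ProjectID` string before putting it into the Promtail config. The `parse_tenant_id` helper is hypothetical, and the validation VictoriaLogs itself performs may differ in detail.

```python
from typing import Tuple

UINT32_MAX = 2**32 - 1


def parse_tenant_id(tenant_id: str) -> Tuple[int, int]:
    """Split an 'AccountID:ProjectID' string into two uint32 values.

    Hypothetical client-side check; raises ValueError on malformed input.
    """
    parts = tenant_id.split(":")
    if len(parts) != 2:
        raise ValueError(f"expected 'AccountID:ProjectID', got {tenant_id!r}")

    ids = []
    for name, part in zip(("AccountID", "ProjectID"), parts):
        # Accept plain ASCII decimal digits only (no signs, spaces or underscores).
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"{name} must be a decimal number, got {part!r}")
        value = int(part)
        if value > UINT32_MAX:
            raise ValueError(f"{name} does not fit into uint32: {value}")
        ids.append(value)
    return ids[0], ids[1]


account_id, project_id = parse_tenant_id("12:34")
```

For the `"12:34"` value used in the config above, this yields `AccountID=12` and `ProjectID=34`.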
--- a/VictoriaLogs/data-ingestion/README.md
+++ b/VictoriaLogs/data-ingestion/README.md
@@ -17,7 +17,7 @@ menu:
 - Fluentbit. See [how to setup Fluentbit for sending logs to VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/Fluentbit.html).
 - Logstash. See [how to setup Logstash for sending logs to VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/Logstash.html).
 - Vector. See [how to setup Vector for sending logs to VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/Vector.html).
-- Promtail. See [how to setup Promtail for sending logs to VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/Promtail.html).
+- Promtail (the default log shipper for Grafana Loki). See [how to setup Promtail for sending logs to VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/Promtail.html).

 The ingested logs can be queried according to [these docs](https://docs.victoriametrics.com/VictoriaLogs/querying/).

@@ -33,7 +33,7 @@ VictoriaLogs supports the following data ingestion HTTP APIs:

 - Elasticsearch bulk API. See [these docs](#elasticsearch-bulk-api).
 - JSON stream API aka [ndjson](http://ndjson.org/). See [these docs](#json-stream-api).
-- [Loki JSON API](https://grafana.com/docs/loki/latest/api/#push-log-entries-to-lokiq). See [these docs](#loki-json-api).
+- Loki JSON API. See [these docs](#loki-json-api).

 VictoriaLogs accepts optional [HTTP parameters](#http-parameters) at data ingestion HTTP APIs.

@@ -47,12 +47,18 @@ The following command pushes a single log line to VictoriaLogs:

 ```bash
 echo '{"create":{}}
-{"_msg":"cannot open file","_time":"2023-06-21T04:24:24Z","host.name":"host123"}
+{"_msg":"cannot open file","_time":"0","host.name":"host123"}
 ' | curl -X POST -H 'Content-Type: application/json' --data-binary @- http://localhost:9428/insert/elasticsearch/_bulk
 ```

 It is possible to push thousands of log lines in a single request to this API.

+If the [timestamp field](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#time-field) is set to `"0"`,
+then the current timestamp on the VictoriaLogs side is used for each ingested log line.
+Otherwise the timestamp field must be in the [ISO8601](https://en.wikipedia.org/wiki/ISO_8601) format. For example, `2023-06-20T15:32:10Z`.
+Optional fractional part of seconds can be specified after the dot - `2023-06-20T15:32:10.123Z`.
+Timezone can be specified instead of `Z` suffix - `2023-06-20T15:32:10+02:00`.
+
 See [these docs](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#data-model) for details on fields,
 which must be present in the ingested log messages.

@@ -88,17 +94,18 @@ VictoriaLogs accepts JSON line stream aka [ndjson](http://ndjson.org/) at `http:
 The following command pushes multiple log lines to VictoriaLogs:

 ```bash
-echo '{ "log": { "level": "info", "message": "hello world" }, "date": "2023-06-20T15:31:23Z", "stream": "stream1" }
-{ "log": { "level": "error", "message": "oh no!" }, "date": "2023-06-20T15:32:10.567Z", "stream": "stream1" }
-{ "log": { "level": "info", "message": "hello world" }, "date": "2023-06-20T15:35:11.567890+02:00", "stream": "stream2" }
+echo '{ "log": { "level": "info", "message": "hello world" }, "date": "0", "stream": "stream1" }
+{ "log": { "level": "error", "message": "oh no!" }, "date": "0", "stream": "stream1" }
+{ "log": { "level": "info", "message": "hello world" }, "date": "0", "stream": "stream2" }
 ' | curl -X POST -H 'Content-Type: application/stream+json' --data-binary @- \
  'http://localhost:9428/insert/jsonline?_stream_fields=stream&_time_field=date&_msg_field=log.message'
 ```

 It is possible to push unlimited number of log lines in a single request to this API.

-The [timestamp field](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#time-field) must be
-in the [ISO8601](https://en.wikipedia.org/wiki/ISO_8601) format. For example, `2023-06-20T15:32:10Z`.
+If the [timestamp field](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#time-field) is set to `"0"`,
+then the current timestamp on the VictoriaLogs side is used for each ingested log line.
+Otherwise the timestamp field must be in the [ISO8601](https://en.wikipedia.org/wiki/ISO_8601) format. For example, `2023-06-20T15:32:10Z`.
 Optional fractional part of seconds can be specified after the dot - `2023-06-20T15:32:10.123Z`.
 Timezone can be specified instead of `Z` suffix - `2023-06-20T15:32:10+02:00`.

@@ -134,15 +141,42 @@ See also:

 ### Loki JSON API

-VictoriaLogs accepts logs in [Loki JSON API](https://grafana.com/docs/loki/latest/api/#push-log-entries-to-lokiq) format at `http://localhost:9428/insert/loki/api/v1/push` endpoint.
+VictoriaLogs accepts logs in [Loki JSON API](https://grafana.com/docs/loki/latest/api/#push-log-entries-to-loki) format at `http://localhost:9428/insert/loki/api/v1/push` endpoint.

 The following command pushes a single log line to Loki JSON API at VictoriaLogs:

 ```bash
-curl -v -H "Content-Type: application/json" -XPOST -s "http://localhost:9428/insert/loki/api/v1/push?_stream_fields=foo" --data-raw \
-  '{"streams": [{ "stream": { "foo": "bar2" }, "values": [ [ "1570818238000000000", "fizzbuzz" ] ] }]}'
+curl -H "Content-Type: application/json" -XPOST "http://localhost:9428/insert/loki/api/v1/push?_stream_fields=instance,job" --data-raw \
+  '{"streams": [{ "stream": { "instance": "host123", "job": "app42" }, "values": [ [ "0", "foo fizzbuzz bar" ] ] }]}'
 ```

+It is possible to push thousands of log streams and log lines in a single request to this API.
+
+The API accepts various HTTP parameters, which can change the data ingestion behavior. See [these docs](#http-parameters) for details.
+
+The following command verifies that the data has been successfully ingested into VictoriaLogs by [querying](https://docs.victoriametrics.com/VictoriaLogs/querying/) it:
+
+```bash
+curl http://localhost:9428/select/logsql/query -d 'query=fizzbuzz'
+```
+
+The command should return the following response:
+
+```json
+{"_msg":"foo fizzbuzz bar","_stream":"{instance=\"host123\",job=\"app42\"}","_time":"2023-07-20T23:01:19.288676497Z"}
+```
+
+The response by default contains [`_msg`](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#message-field),
+[`_stream`](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#stream-fields) and
+[`_time`](https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#time-field) fields plus the explicitly mentioned fields.
+See [these docs](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html#querying-specific-fields) for details.
+
+See also:
+
+- [How to debug data ingestion](#troubleshooting).
+- [HTTP parameters, which can be passed to the API](#http-parameters).
+- [How to query VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/querying.html).
+
 ### HTTP parameters

 VictoriaLogs accepts the following parameters at [data ingestion HTTP APIs](#http-apis):
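
Note on the Loki JSON API section added above: the `curl` example can be mirrored by a minimal Python sketch that only builds the request URL and JSON body, without sending anything. The `build_loki_push` helper and its parameters are hypothetical; the endpoint path, the `_stream_fields` query arg and the `"0"` timestamp are taken from the example in the diff.

```python
import json
from urllib.parse import urlencode


def build_loki_push(base_url, stream_labels, messages, stream_fields):
    """Return (url, body) for a Loki JSON push request to VictoriaLogs.

    Hypothetical helper. Every log line gets the "0" timestamp, as in the
    curl example above.
    """
    # safe="," keeps the comma-separated field list readable in the URL.
    query = urlencode({"_stream_fields": ",".join(stream_fields)}, safe=",")
    url = f"{base_url}/insert/loki/api/v1/push?{query}"

    payload = {
        "streams": [
            {
                "stream": dict(stream_labels),
                "values": [["0", message] for message in messages],
            }
        ]
    }
    return url, json.dumps(payload)


url, body = build_loki_push(
    "http://localhost:9428",
    {"instance": "host123", "job": "app42"},
    ["foo fizzbuzz bar"],
    ["instance", "job"],
)
```

The resulting `url` and `body` can then be sent with any HTTP client as a `POST` request with the `Content-Type: application/json` header.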
--- a/stream-aggregation.md
+++ b/stream-aggregation.md
@@ -358,7 +358,7 @@ Below are aggregation functions that can be put in the `outputs` list at [stream
 ### total

 `total` generates output [counter](https://docs.victoriametrics.com/keyConcepts.html#counter) by summing the input counters.
-`total` only makes sense for aggregating [counter](https://docs.victoriametrics.com/keyConcepts.html#counter) type metrics.
+`total` only makes sense for aggregating [counter](https://docs.victoriametrics.com/keyConcepts.html#counter) metrics.

 The results of `total` is equal to the `sum(some_counter)` query.

@@ -383,7 +383,7 @@ This changes a label with pod's name in the series, but `total` account for such
 ### increase

 `increase` returns the increase of input [counters](https://docs.victoriametrics.com/keyConcepts.html#counter).
-`increase` only makes sense for aggregating [counter](https://docs.victoriametrics.com/keyConcepts.html#counter) type metrics.
+`increase` only makes sense for aggregating [counter](https://docs.victoriametrics.com/keyConcepts.html#counter) metrics.

 The results of `increase` with aggregation interval of `1m` is equal to the `increase(some_counter[1m])` query.

@@ -406,6 +406,7 @@ The results of `count_samples` with aggregation interval of `1m` is equal to the
 ### sum_samples

 `sum_samples` sums input [sample values](https://docs.victoriametrics.com/keyConcepts.html#raw-samples).
+`sum_samples` makes sense only for aggregating [gauge](https://docs.victoriametrics.com/keyConcepts.html#gauge) metrics.

 The results of `sum_samples` with aggregation interval of `1m` is equal to the `sum_over_time(some_metric[1m])` query.

@@ -455,12 +456,14 @@ For example, see below time series produced by config with aggregation interval
 ### stddev

 `stddev` returns [standard deviation](https://en.wikipedia.org/wiki/Standard_deviation) for the input [sample values](https://docs.victoriametrics.com/keyConcepts.html#raw-samples).
+`stddev` makes sense only for aggregating [gauge](https://docs.victoriametrics.com/keyConcepts.html#gauge) metrics.

 The results of `stddev` with aggregation interval of `1m` is equal to the `stddev_over_time(some_metric[1m])` query.

 ### stdvar

 `stdvar` returns [standard variance](https://en.wikipedia.org/wiki/Variance) for the input [sample values](https://docs.victoriametrics.com/keyConcepts.html#raw-samples).
+`stdvar` makes sense only for aggregating [gauge](https://docs.victoriametrics.com/keyConcepts.html#gauge) metrics.

 The results of `stdvar` with aggregation interval of `1m` is equal to the `stdvar_over_time(some_metric[1m])` query.

@@ -472,6 +475,7 @@ For example, see below time series produced by config with aggregation interval

 `histogram_bucket` returns [VictoriaMetrics histogram buckets](https://valyala.medium.com/improving-histogram-usability-for-prometheus-and-grafana-bc7e5df0e350)
  for the input [sample values](https://docs.victoriametrics.com/keyConcepts.html#raw-samples).
+`histogram_bucket` makes sense only for aggregating [gauge](https://docs.victoriametrics.com/keyConcepts.html#gauge) metrics.

 The results of `histogram_bucket` with aggregation interval of `1m` is equal to the `histogram_over_time(some_histogram_bucket[1m])` query.

@@ -480,6 +484,7 @@ The results of `histogram_bucket` with aggregation interval of `1m` is equal to
 `quantiles(phi1, ..., phiN)` returns [percentiles](https://en.wikipedia.org/wiki/Percentile) for the given `phi*`
 over the input [sample values](https://docs.victoriametrics.com/keyConcepts.html#raw-samples). 
 The `phi` must be in the range `[0..1]`, where `0` means `0th` percentile, while `1` means `100th` percentile.
+`quantiles(...)` makes sense only for aggregating [gauge](https://docs.victoriametrics.com/keyConcepts.html#gauge) metrics.

 The results of `quantiles(phi1, ..., phiN)` with aggregation interval of `1m` 
 is equal to the `quantiles_over_time("quantile", phi1, ..., phiN, some_histogram_bucket[1m])` query.