Commit Graph

907 Commits

Author SHA1 Message Date
Yury Molodov
c8de98e03f
vmui: add lists of top queries (#3065)
* feat: add lists of top queries

* fix: change the field label

* refactor: add handlers for readability

* app/vmselect: `make vmui-update`

* docs: document `top queries` tab

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-08 21:44:43 +03:00
Aliaksandr Valialkin
052b527b39
docs/CHANGELOG.md: document ec273eafef
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3076
2022-09-08 21:21:26 +03:00
Aliaksandr Valialkin
f0eea5b02d
app/vmselect/netstorage: fix a typo, which leads to incorrect query results in VictoriaMetrics cluster
The typo has been introduced in the commit 1a254ea20c

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3067
2022-09-08 13:46:40 +03:00
Aliaksandr Valialkin
f6990871a3
docs/vmctl.md: make docs-sync after c5261d5f56
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2733
2022-09-06 13:20:16 +03:00
Zakhar Bessarab
ef01f01b6c
vmctl: implement support of chunking data for vm-native export (#3044)
vmctl: implement support of chunking data for vm-native export process

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2733
2022-09-06 13:14:48 +03:00
Aliaksandr Valialkin
051e722112
lib/storage: verify that timestamps in block are in the range specified by blockHeader.{Min,Max}Timestamp when upacking the block
This should reduce chances of unnoticed on-disk data corruption.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2998
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3011

This change modifies the format for data exported via /api/v1/export/native -
now this data contains MaxTimestamp and PrecisionBits fields from blockHeader.

This is OK, since the native export format is undocumented.
2022-09-06 13:07:49 +03:00
Aliaksandr Valialkin
b7f3569522
app/vmselect/prometheus: follow-up after 50e2524bc2
- Add getCommonParamsWithDefaultDuration function and use it at /api/v1/series, /api/v1/labels and /api/v1/label/.../values
- Document the default behaviour for setting 5 minutes time range if start arg isn't passed to /api/v1/series, /api/v1/labels and /api/v1/label/.../values
- Document the change at docs/CHANGELOG.md

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3052
2022-09-05 11:57:08 +03:00
Aliaksandr Valialkin
c6b74148cf
app/vmselect/promql: consistently calculate rate_over_sum(m[d]) as sum_over_time(m[d])/d
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3045
2022-09-02 23:19:05 +03:00
Aliaksandr Valialkin
569843c0dd
docs/CHANGELOG.md: cut v1.81.1 2022-09-02 22:00:34 +03:00
Aliaksandr Valialkin
9cca3a0a1b
app/vmselect/netstorage: fix potential panic under high load
The panic may trigger during data blocks' processing received
from vmstorage nodes when some of vmstorage nodes return an error
or when `-replicationFactor` is set to values higher than 2 at `vmselect`.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3058
2022-09-02 21:36:15 +03:00
Aliaksandr Valialkin
024e2f18da
app/vmselect/promql: evaluate union() args in parallel in order to increase query performance
Note that the parallel execution of `union()` args may take more memory and CPU time
than the sequential execution if args contain heavy queries, which may load all the available CPU,
disk and memory resources and vmselect and vmstorage levels.
2022-09-02 21:01:04 +03:00
Aliaksandr Valialkin
a11eae3a76
docs/CHANGELOG.md: cut v1.81.0 2022-08-31 02:33:36 +03:00
Aliaksandr Valialkin
30a1d0c622
docs/CHANGELOG.md: add missing link to vmagent 2022-08-31 02:22:52 +03:00
Aliaksandr Valialkin
cdb22c5ca3
docs/CHANGELOG.md: document v1.79.3 LTS release 2022-08-31 00:15:26 +03:00
Aliaksandr Valialkin
529bcc0761
docs/CHANGELOG.md: document the 044d51b668 2022-08-30 09:42:51 +03:00
Aliaksandr Valialkin
8aaaf221cc
app/vmselect/promql: follow-up after 2d71b4859c
- Use getScalar() function for obtaining the expected scalar from phi arg
- Reduce the error message returned to the user when incorrect phi is passed to histogram_quantiles
- Improve the description of this bugfix in the docs/CHANGELOG.md

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3026
2022-08-27 01:38:17 +03:00
Dmytro Kozlov
2cf29cc4d1
vmselect/promql: fix panic in histogram_quantiles function (#3029)
* vmselect/promql: fix panic in histogram_quantiles function

* Update docs/MetricsQL.md

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2022-08-27 01:34:54 +03:00
Aliaksandr Valialkin
86394b4179
lib/promscrape: optimize discoveryutils.SanitizeLabelName()
Cache sanitized label names and return them next time.
This reduces the number of allocations and speeds up the SanitizeLabelName()
function for common case when the number of unique label names is smaller than 100k
2022-08-27 00:18:19 +03:00
Aliaksandr Valialkin
e1bd38fa97
lib/promrelabel: optimize matching for commonly used regex patterns in if option
The following regex patterns are optimized:

- literal string match, e.g. "foo"
- prefix match, e.g. "foo.*" and "foo.+"
- substring match, e.g. ".*foo.*" and ".+foo.+"
- alternate values match, e.g. "foo|bar|baz"
2022-08-26 14:55:13 +03:00
Aliaksandr Valialkin
d60654eb0a
lib/promrelabel: optimize action: {labeldrop,labelkeep,keep,drop} with regex containing alternate values
For example, the following relabeling rule must work much faster now:

- action: labeldrop
  regex: "foo|bar|baz"
2022-08-24 17:55:54 +03:00
Aliaksandr Valialkin
d4cf52f092
app/vmagent: follow-up after 2b22aa1537
- Document the change at docs/CHANGELOG.md
- Move auth token parsing from app/vmagent/opentsdbhttp/ to app/vmagent/main.go,
  since it must be parsed only when multitenancy support is enabled at vmagent side.
  See https://docs.victoriametrics.com/vmagent.html#multitenancy
2022-08-24 16:21:23 +03:00
Dmytro Kozlov
d32a6359b0
vmselect/promql: enable search.maxPointsSubqueryPerTimeseries for sub-queries (#2963)
* vmselect/promql: enable search.maxPointsPerTimeSeriesSubquery for sub-queries

* vmselect/promql: cleanup

* vmselect/promql: rename config flag

* vmselect/promql: add tests

* vmselect/promql: use test object instead of log

* vmselect/promql: fix posible panic is subquery has more points. add description

* vmselect/promql: update tests descriptions

* vmselect/promql: update doInternal validation

* vmselect/promql: fix linter

* vmselect/promql: fix linter

* vmselect/promql: update documentation and release notes

* wip

- Properly apply -search.maxPointsSubqueryPerTimeseries limit to subqueries.
  Previously the -search.maxPointsPerTimeseries limit was unexpectedly applied to subqueries
  if it was smaller than the -search.maxPointsSubqueryPerTimeseries .
- Clarify docs for -search.maxPointsSubqueryPerTimeseries command-line flag .
- Document -search.maxPointsPerTimeseries and -search.maxPointsSubqueryPerTimeseries flags at https://docs.victoriametrics.com/#resource-usage-limits .
- Update docs/CHANGELOG.md .

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2922

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-08-24 15:27:41 +03:00
Aliaksandr Valialkin
7b9ba456ff
app/vmstorage: expose vm_{hourly,daily}_series_limit_{max,current}_series metrics if -storage.max{Hourly,Daily}Series limits are set
These metrics allow alerting when the number of unique series approach the limit.
For example, the following query alerts when the number of series reaches 90% of the configured limit:

    vm_hourly_series_limit_current_series / vm_hourly_series_limit_max_series > 0.9
2022-08-24 13:41:57 +03:00
Roman Khavronenko
555a202d80
docs: follow-up after 88425bb285 (#3007)
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-24 01:23:37 +03:00
Aliaksandr Valialkin
c5325f2db5
docs/CHANGELOG.md: document d59d829cdb
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2673
2022-08-21 23:37:15 +03:00
Aliaksandr Valialkin
1c7f402598
app/vmagent: add ability to construct a label from multiple existing labels by referring them in the replacement field during relabeling
For example:

- target_label: composite-label
  replacement: {{source_label1}}-{{source_label2}}
2022-08-21 22:49:24 +03:00
Aliaksandr Valialkin
ecd8a03b28
docs: change links to Prometheus docs about instant and range queries to links to VictoriaMetrics docs 2022-08-21 19:02:46 +03:00
Aliaksandr Valialkin
76844e519b
docs/CHANGELOG.md: document 10402459d8 2022-08-19 11:44:29 +03:00
Roman Khavronenko
2256d51418
docs: fix docs formatting related to vmalert (#2994)
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-19 11:05:08 +03:00
Aliaksandr Valialkin
bbfa52bd75
docs: follow-up after 68e56b6fc5 2022-08-17 21:27:24 +03:00
Roman Khavronenko
cfcb5ab15b
vmalert: set alert's source link to UI instead of JSON source (#2986)
We switch default alert's source link to redirect user
to vmalert's UI instead of previous JSON object. While it breaks
compatibility, it also supposed to improve user's experience.
The old behavior can be achieved by updating `-external.alert.source`
command-line flag.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-17 21:27:24 +03:00
Aliaksandr Valialkin
1812d33a2d
lib/promscrape: automatically generate additional per-target labels for targets with non-zero series limit
The following metrics are generated:

- scrape_series_limit
- scrape_series_current
- scrape_series_limit_samples_dropped

These metrics simplify alerting on targets, which expose too many time series

See https://docs.victoriametrics.com/vmagent.html#automatically-generated-metrics
and https://docs.victoriametrics.com/vmagent.html#cardinality-limiter for more details
2022-08-17 13:22:02 +03:00
Aliaksandr Valialkin
e5c61cac22
docs/CHANGELOG.md: group vmalert features closer to each other 2022-08-17 11:49:53 +03:00
Aliaksandr Valialkin
aa37e6b438
lib/promscrape: retry http requests if the server returns 429 status code
The 429 status code means that the server is overwhelmed with requests.
The client can retry the request after some wait time.
Implement this strategy for service discovery and scrape requests.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2940
2022-08-16 14:57:26 +03:00
Aliaksandr Valialkin
dc929e0d16
lib/storage: improve performance for /api/v1/labels and /api/v1/label/.../values endpoints when match[] filter matches small number of time series
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2978
2022-08-16 13:34:23 +03:00
Roman Khavronenko
436b4d90af
docs: update vmalert docs (#2987)
* mention recently added `$alertID` and `$groupID` variables in the changelog
* properly escape template examples in the vmalert's README

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-16 12:10:08 +03:00
Aliaksandr Valialkin
7d7cf2b6fd
app/vmselect: follow-up after 63e0f16062
* Explicitly store a pointer to UserReadableError in the error interface.
  Previously Go automatically converted the value to a pointer before storing in the error interface.

* Add Unwrap() method to UserReadableError, so it can be used transparently with the other code,
  which calls errors.Is() and errors.As().

* Document the change in docs/CHANGELOG.md
2022-08-15 13:53:19 +03:00
Roman Khavronenko
79e396dac9
docs: mention default vmalert's behavior change in Update notes (#2981)
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-15 13:31:59 +03:00
Yury Molodov
dc52b283a3
vmui: shortcut keys legend (#2971)
* feat: add shortcut modal

* feat: add shortcut descriptions

* app/vmselect/vmui: `make vmui-update`

* docs/CHANGELOG.md: document the change

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-08-15 12:04:30 +03:00
Aliaksandr Valialkin
c27bd63f6c
lib/promscrape: update links to sd_configs from Prometheus site to https://docs.victoriametrics.com/sd_configs.html 2022-08-15 01:40:48 +03:00
Aliaksandr Valialkin
1a00c9ef03
lib/promscrape/discovery/kubernetes: add __meta_kubernetes_pod_container_image label in the same way as Prometheus 2.38 does
See https://github.com/prometheus/prometheus/pull/11034
2022-08-15 01:18:57 +03:00
Aliaksandr Valialkin
2fb63dda83
lib/promscrape/discovery/kubernetes: add __meta_kubernetes_service_port_number label to role: service in the same way as Prometheus 2.38 does
See https://github.com/prometheus/prometheus/pull/11002
2022-08-15 01:07:19 +03:00
Aliaksandr Valialkin
198d8eeeaa
app/vmalert/templates: add toTime() template function in the same way as Prometheus 2.38 does
See https://github.com/prometheus/prometheus/pull/10993
2022-08-15 00:49:52 +03:00
Aliaksandr Valialkin
2b58bd9876
lib/promscrape/discovery/dns: add support for resolving MX records
See https://github.com/prometheus/prometheus/pull/10099
2022-08-15 00:33:06 +03:00
Aliaksandr Valialkin
78f584fb0b
docs/CHANGELOG.md: clarify the change at 28441711e6 2022-08-11 23:53:32 +03:00
Roman Khavronenko
36660bcfe2
vmalert: follow-up after 28441711e6 (#2972)
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-11 23:52:18 +03:00
Aliaksandr Valialkin
ec3df0b913
app/vmselect/netstorage: improve scalability of blocks processing on systems with multiple CPU cores
Previously a single syncwg.WaitGroup was used for tracking the lifetime of processBlock callbacks
across all the per-vmstorage goroutines. This could be slow on systems with many CPU cores
because of inter-CPU synchronization overhead.

Use a separate per-vmstorage sync.WaitGroup instead in order to reduce inter-CPU synchronization overhead.
This should imrpove performance for heavy queries over big number of blocks on multi-CPU systems.
2022-08-11 21:37:24 +03:00
Roman Khavronenko
f42853275f
lib/storage: prevent excessive loops when storage is in RO (#2962)
* lib/storage: prevent excessive loops when storage is in RO

Returning nil error when storage is in RO mode results
into excessive loops and function calls which could
result into CPU exhaustion. Returning an err instead
will trigger delays in the for loop and save some resources.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* document the change

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-08-09 12:17:47 +03:00
Aliaksandr Valialkin
91d0b45537
docs/CHANGELOG.md: cut v1.80.0 2022-08-08 19:59:02 +03:00
Aliaksandr Valialkin
7e892d3322
docs/CHANGELOG.md: add changes for v1.79.2 2022-08-08 18:00:38 +03:00