Commit Graph

658 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
c584aece38 app/vmselect/promql: properly limit implicitly set rollup window to -search.maxStalenessInterval
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/784
2020-09-23 23:23:59 +03:00
Aliaksandr Valialkin
2985077c35 all: consistently use "%w" formatting in fmt.Errorf for wrapped errors 2020-09-23 22:46:34 +03:00
Aliaksandr Valialkin
27500d7d4c app/vmselect/prometheus: code cleanup after 3ba507000c 2020-09-23 13:04:17 +03:00
Aliaksandr Valialkin
3ba507000c app/vmselect/prometheus: return timestamps from /api/v1/query, which match the time query arg
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/720
2020-09-23 12:58:48 +03:00
Aliaksandr Valialkin
bed25e3c24 app/vmselect/netstorage: properly pre-allocate space for sbs 2020-09-22 23:49:55 +03:00
Aliaksandr Valialkin
09b0f7c202 app/vmselect/netstorage: release search resources on timeout errors
Previously these resources weren't released, which could lead to resource leaks.
2020-09-22 22:57:38 +03:00
Aliaksandr Valialkin
3b1e3a03e0 app/vmselect: make sure the request doesnt wait in pending queue more than the configured timeout
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711
2020-09-22 01:23:19 +03:00
Aliaksandr Valialkin
94f7d00537 docs/vmagent.md: typo fix 2020-09-21 21:49:22 +03:00
Aliaksandr Valialkin
00b5145c69 app/vmselect/searchutils: fixed tests after 2eb72e09ab 2020-09-21 21:31:38 +03:00
Aliaksandr Valialkin
2eb72e09ab app/vmselect: use time value rounded to seconds if it isnt passed to /api/v1/query
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/720
2020-09-21 21:24:40 +03:00
Nikolay Khramchikhin
312fead9a2
Add improvements to ec2_sd_discovery (#775)
* Add improvements to ec2 discovery

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/771

 role_arn support with aws sts
 instance iam_role support
 refreshing temporary tokens

* Apply suggestions from code review

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>

* changed implementation, removed tests, clean up code

* moved endpoint builder into getEC2APIResponse

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
2020-09-21 16:04:15 +03:00
Aliaksandr Valialkin
1e1a27d803 app/vmalert: remove unneeded UTC() call
UTC() doesn't change the underlying timestamp, so the call isn't needed here
2020-09-21 15:55:59 +03:00
Roman Khavronenko
5dffc7a553
vmalert: add support for datasource.lookback flag (#779)
New datasource flag `datasource.lookback` defines how far to look into
past when evaluating queries.

Address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/668
2020-09-21 15:53:49 +03:00
Roman Khavronenko
82c3bbce34
vmalert: fix the typo in error message (#782)
The error will be always nil so no sense in printing it.
2020-09-21 11:34:23 +03:00
Aliaksandr Valialkin
d50165ad59 app/vmagent: increase default value for -remoteWrite.queues from 1 to 4, since it has been appeared that many users hit this limit 2020-09-18 14:21:54 +03:00
Aliaksandr Valialkin
98d1cd0971 app/vmselect/graphite: return proper results /metrics/find?query=foo.*.bar according to Graphite Metrics API 2020-09-18 11:00:00 +03:00
Aliaksandr Valialkin
7a134b0fd7 app/vmstorage: added -forceMergeAuthKey command-line flag for protecting /internal/force_merge endpoint 2020-09-17 14:21:53 +03:00
Aliaksandr Valialkin
1f33dd717f lib/storage: add /internal/force_merge handler for running forced compactions on historical per-month partitions
This may be useful for freeing up storage space after time series deletion.

See https://victoriametrics.github.io/#force-merge for more details.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/686
2020-09-17 12:20:40 +03:00
Aliaksandr Valialkin
ab53cb6f7b app/vmagent: substitute -remoteWrite.url with secret-url value in logs, since it may contain sensitive info such as passwords or auth tokens
Pass `-remoteWrite.showURL` command-line flag in order to see real `-remoteWrite.url` values in logs and at `/metrics` page.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/773
2020-09-16 22:36:25 +03:00
Aliaksandr Valialkin
9f79bcf64a app/vmselect: improve description for -search.maxQueryDuration 2020-09-16 21:15:41 +03:00
Aliaksandr Valialkin
1fec47a289 app/vmselect/netstorage: reduce memory usage when the time range from query touches big number of samples per each time series 2020-09-15 21:08:28 +03:00
Aliaksandr Valialkin
8c3d7c1a59 app/vmselect: typo fix in -search.maxStalenessInterval description 2020-09-15 14:24:27 +03:00
Aliaksandr Valialkin
0e533d1a9c app/vmselect/promql: support composite durations like Prometheus 2.21 does
The following durations are supported now: `1h5m35s` or `1s543ms`

See https://github.com/prometheus/prometheus/releases/tag/v2.21.0
and https://github.com/prometheus/prometheus/pull/7713
2020-09-11 23:39:13 +03:00
Roman Khavronenko
6ad6480400
vmalert: add Group name as label to generated alerts and timeseries (#761)
Solves #611
2020-09-11 20:52:56 +01:00
Roman Khavronenko
4cdffb04a4
vmalert: update groups on config reload only if changes detected (#759)
On config reload event `vmalert` reloads configuration for every group. While
it works for simple configurations, the more complex and heavy installations may
suffer from frequent config reloads.
The change introduces the `checksum` field for every group and is set to md5 hash
of yaml configuration. The checksum will change if on any change to group
definition like rules order or annotation change. Comparing the `checksum` field
on config reload event helps to detect if group should be updated.
The groups update is now done concurrently, so reload duration will be limited by
the slowest group now.

Partially solves #691 by improving config reload speed.
2020-09-11 20:14:30 +01:00
Aliaksandr Valialkin
ca856284e4 app/vmagent: allow setting multiple identical -remoteWrite.url values
This may be useful when each url is authenticated via different `-remoteWrite.basicAuth.username`.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/755
2020-09-11 15:17:22 +03:00
Aliaksandr Valialkin
a2f647d142 app/vmselect/prometheus: typo fix in the description for -search.latencyOffset command-line flag 2020-09-11 14:16:46 +03:00
Aliaksandr Valialkin
2380e9b017 app/{vminsert,vmagent}: allow passing timestamp via timestamp query arg when ingesting data to /api/v1/import/prometheus
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/750
2020-09-11 13:27:14 +03:00
Aliaksandr Valialkin
f0005c3007 app/vmselect: move Deadline from netstorage to searchutils
This removes dependency on netstorage from searchutils.
2020-09-11 13:27:13 +03:00
Aliaksandr Valialkin
2114179e19 app/vmselect: substitute inf values at smooth_exponential with the previous values
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/757
2020-09-11 12:24:14 +03:00
Aliaksandr Valialkin
204ec415b4 app/vmselect: skip infinite values when calculating smooth_exponential
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/757
2020-09-11 11:29:58 +03:00
Aliaksandr Valialkin
8a8b5a73d3 app/vmselect/graphite: typo fix in label name for vm_request_duration_seconds metric 2020-09-11 01:58:28 +03:00
Aliaksandr Valialkin
f6bc608e86 app/vmselect: initial implementation of Graphite Metrics API
See https://graphite-api.readthedocs.io/en/latest/api.html#the-metrics-api
2020-09-11 00:30:01 +03:00
Aliaksandr Valialkin
9d8fdff6c5 lib/storage: reuse timestamp blocks for adjancent metric blocks with identical timestamps
This should reduce disk space usage when scraping targets containing metrics with identical names
such as `node_cpu_seconds_total`, histograms, quantiles, etc.

Expose `vm_timestamps_blocks_merged_total` and `vm_timestamps_bytes_saved_total` metrics for monitoring
the effectiveness of timestamp blocks merging.
2020-09-09 23:59:32 +03:00
Aliaksandr Valialkin
d7c04db1fc docs: sync docs for vmalert, vmauth, vmbackup and vmrestore 2020-09-09 21:10:34 +03:00
Aliaksandr Valialkin
11eaa37111 docs/vmagent.md: clarified the case when -remoteWrite.queues must be tuned
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/745
2020-09-08 20:15:27 +03:00
Aliaksandr Valialkin
62919eaf7e app/vmselect/promql: go fmt 2020-09-08 15:19:59 +03:00
Aliaksandr Valialkin
e6da63dffe app/vmselect/promql: adjust integrate() calculations to be more similar to calculations from InfluxDB: attempt #2
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/701
2020-09-08 14:35:50 +03:00
Aliaksandr Valialkin
8e85b56737 app/vmselect/promql: adjust integrate() calculations to be more similar to calculations from InfluxDB
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/701
2020-09-08 14:23:39 +03:00
Aliaksandr Valialkin
c0343a661b app/vmselect/promql: increase floating point calculations accuracy by dividing by 1e3 instead of multiplying by 1e-3 2020-09-08 14:00:47 +03:00
Aliaksandr Valialkin
f4e7e5fb90 app/vmselect/promql: add count_le_over_time(m[d], le) and count_gt_over_time(m[d], gt) functions
These functions returns the number of raw samples that don't exceed `le` or are bigger than `gt`.
These functions are complement to already existing `share_le_over_time(m[d], le)` and `share_gt_over_time(m[d], gt)`.
2020-09-03 15:29:10 +03:00
Aliaksandr Valialkin
e706e59d49 app/vmselect: unconditionally align time range boundaries to step for subqueries as Prometheus does 2020-09-03 13:29:50 +03:00
Aliaksandr Valialkin
ddabc13796 app/vmagent: properly flush big blocks of data
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/741

Thanks to @IceRain00 for the investigation and initial attempt to fix the issue
at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/742
2020-09-03 12:12:39 +03:00
Aliaksandr Valialkin
7a839b461f app/vmagent: fix data race when accessing writeRequest.lastFlushTime 2020-09-03 12:12:37 +03:00
Nikolay Khramchikhin
764b3d4fda
changed vmalert behaviour (#738)
* VMAlert start with empty rules dir

There are some applications (operator for instance), that generates alerts configuration at runtime
and vmalert must start correctly without rules to support this behaviour.
Later application will add rules files and send SIGHUP to vmalert,
which will trigger reading rules files and start rules exectuion.

Removing rules files with SIGHUP signal must stop rules execution and
vmalert will wait for new rules.

* imports sorted

* added test cases for empty rules, removed blank line

* fixed imports conflict

* updated tests
2020-09-03 11:04:42 +03:00
Aliaksandr Valialkin
5f16ceb294 app/vmalert: imrovements over 3f932c2db1 2020-09-03 01:00:55 +03:00
DexterZhang
3f932c2db1
feat: spread load of rule evaluation by group when starting new groups (#724)
* feat: spread load of rule evaluation by group when starting new groups

* review: reduce the resulting diff.

* Update app/vmalert/group.go

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
2020-09-03 00:58:54 +03:00
Aliaksandr Valialkin
f41b36bb9a app/{vminsert,vmagent}: allow adding extra labels when importing data via Prometheus, CSV and JSON line formats
Extra labels may be added to the imported data by passing `extra_label=name=value` query args.
Multiple query args may be passed in order to add multiple extra labels.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/719
2020-09-02 19:43:21 +03:00
Aliaksandr Valialkin
5af777469a app/vmagent: log unsuccessful attempt number when sending data to -remoteWrite.url 2020-08-30 21:40:22 +03:00
Aliaksandr Valialkin
2149733bd2 app/vmagent: apply sane limits to -remoteWrite.queues
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/707
2020-08-30 21:25:37 +03:00