Aliaksandr Valialkin
aadbd014ff
all: add native format for data export and import
...
The data can be exported via [/api/v1/export/native](https://victoriametrics.github.io/#how-to-export-data-in-native-format ) handler
and imported via [/api/v1/import/native](https://victoriametrics.github.io/#how-to-import-data-in-native-format ) handler.
2020-09-27 17:36:38 +03:00
Aliaksandr Valialkin
e83947a882
app/vminsert: code prettifying
2020-09-26 04:16:14 +03:00
Aliaksandr Valialkin
8fc9b77496
app/vmagent: reduce memory usage when importing data via /api/v1/import
...
Previously vmagent could use big amounts of RAM when each ingested JSON line
contained many samples.
2020-09-26 04:10:13 +03:00
Aliaksandr Valialkin
973df09686
app/vmselect/netstorage: do not spend CPU time on unpacking empty blocks during /api/v1/series
calls
2020-09-24 20:44:15 +03:00
Aliaksandr Valialkin
b8bce348c5
app/vmselect/promql: properly limit implicitly set rollup window to -search.maxStalenessInterval
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/784
2020-09-23 23:24:09 +03:00
Aliaksandr Valialkin
543f3aea97
all: consistently use "%w" formatting in fmt.Errorf for wrapped errors
2020-09-23 22:48:21 +03:00
Aliaksandr Valialkin
8627365b48
app/vmselect/prometheus: code cleanup after 3ba507000c
2020-09-23 13:04:51 +03:00
Aliaksandr Valialkin
1fce79518a
app/vmselect/prometheus: return timestamps from /api/v1/query
, which match the time
query arg
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/720
2020-09-23 12:59:23 +03:00
Aliaksandr Valialkin
0468cdf33e
app/vmselect/netstorage: properly pre-allocate space for sbs
2020-09-22 23:51:01 +03:00
Aliaksandr Valialkin
69eb9783e6
app/vmselect: make sure the request doesnt wait in pending queue more than the configured timeout
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711
2020-09-22 01:21:42 +03:00
Aliaksandr Valialkin
ed473c94ff
docs/vmagent.md: typo fix
2020-09-21 21:49:08 +03:00
Aliaksandr Valialkin
e564725641
app/vmselect/searchutils: fixed tests after 2eb72e09ab
2020-09-21 21:31:28 +03:00
Aliaksandr Valialkin
07c6226334
app/vmselect: use time
value rounded to seconds if it isnt passed to /api/v1/query
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/720
2020-09-21 21:24:46 +03:00
Nikolay Khramchikhin
0069353d5e
Add improvements to ec2_sd_discovery ( #775 )
...
* Add improvements to ec2 discovery
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/771
role_arn support with aws sts
instance iam_role support
refreshing temporary tokens
* Apply suggestions from code review
Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
* changed implementation, removed tests, clean up code
* moved endpoint builder into getEC2APIResponse
Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
2020-09-21 16:05:01 +03:00
Aliaksandr Valialkin
8cd89cb847
app/vmalert: remove unneeded UTC() call
...
UTC() doesn't change the underlying timestamp, so the call isn't needed here
2020-09-21 15:56:48 +03:00
Roman Khavronenko
d111969d39
vmalert: add support for datasource.lookback
flag ( #779 )
...
New datasource flag `datasource.lookback` defines how far to look into
past when evaluating queries.
Address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/668
2020-09-21 15:56:47 +03:00
Roman Khavronenko
0042b0f307
vmalert: fix the typo in error message ( #782 )
...
The error will be always nil so no sense in printing it.
2020-09-21 11:36:09 +03:00
Aliaksandr Valialkin
1c07d7bee7
app/vmagent: increase default value for -remoteWrite.queues
from 1 to 4, since it has been appeared that many users hit this limit
2020-09-18 14:22:02 +03:00
Aliaksandr Valialkin
a1bebb660c
app/vmselect/graphite: return proper results /metrics/find?query=foo.*.bar
according to Graphite Metrics API
2020-09-18 11:53:28 +03:00
Aliaksandr Valialkin
9b15b11f74
app/vmstorage: added -forceMergeAuthKey
command-line flag for protecting /internal/force_merge
endpoint
2020-09-17 14:24:20 +03:00
Aliaksandr Valialkin
d96858b921
lib/storage: add /internal/force_merge
handler for running forced compactions on historical per-month partitions
...
This may be useful for freeing up storage space after time series deletion.
See https://victoriametrics.github.io/#force-merge for more details.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/686
2020-09-17 12:20:56 +03:00
Aliaksandr Valialkin
406f4fe445
app/vmagent: substitute -remoteWrite.url
with secret-url
value in logs, since it may contain sensitive info such as passwords or auth tokens
...
Pass `-remoteWrite.showURL` command-line flag in order to see real `-remoteWrite.url` values in logs and at `/metrics` page.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/773
2020-09-16 22:36:18 +03:00
Aliaksandr Valialkin
9793008734
app/vmselect: add -search.storageTimeout command-line flag for limiting the maximum duration of query execution per each -storageNode
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711
2020-09-16 21:33:47 +03:00
Aliaksandr Valialkin
a9205fe308
app/vmselect: prevent from closing connection to vmstorage on query timeout by setting +2 secs deadline on connection comparing to query deadline
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711
2020-09-16 21:14:00 +03:00
Aliaksandr Valialkin
1587f83fa0
app/vmselect/netstorage: typo fix after 03dfccfbed
2020-09-16 00:10:33 +03:00
Aliaksandr Valialkin
03dfccfbed
app/vmselect/netstorage: reduce memory usage when the time range from query touches big number of samples per each time series
2020-09-15 21:08:09 +03:00
Aliaksandr Valialkin
27cd5555e6
app/vmselect/netstorage: mention that RunParallel or Cancel must be called on the returned results from ProcessSearchQuery
2020-09-15 20:39:43 +03:00
Aliaksandr Valialkin
1f0c0b0f6b
app/vmselect: typo fix in -search.maxStalenessInterval description
2020-09-15 14:25:34 +03:00
Roman Khavronenko
e2b31590e6
vmalert: add Group name as label to generated alerts and timeseries ( #761 )
...
Solves #611
2020-09-11 23:41:12 +03:00
Roman Khavronenko
16e0bb496e
vmalert: update groups on config reload only if changes detected ( #759 )
...
On config reload event `vmalert` reloads configuration for every group. While
it works for simple configurations, the more complex and heavy installations may
suffer from frequent config reloads.
The change introduces the `checksum` field for every group and is set to md5 hash
of yaml configuration. The checksum will change if on any change to group
definition like rules order or annotation change. Comparing the `checksum` field
on config reload event helps to detect if group should be updated.
The groups update is now done concurrently, so reload duration will be limited by
the slowest group now.
Partially solves #691 by improving config reload speed.
2020-09-11 23:41:12 +03:00
Aliaksandr Valialkin
b776a93608
app/vmselect/promql: support composite durations like Prometheus 2.21 does
...
The following durations are supported now: `1h5m35s` or `1s543ms`
See https://github.com/prometheus/prometheus/releases/tag/v2.21.0
and https://github.com/prometheus/prometheus/pull/7713
2020-09-11 22:17:24 +03:00
Aliaksandr Valialkin
6382e8081a
app/vmagent: allow setting multiple identical -remoteWrite.url
values
...
This may be useful when each url is authenticated via different `-remoteWrite.basicAuth.username`.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/755
2020-09-11 15:17:29 +03:00
Aliaksandr Valialkin
d3ad0d365e
app/vmselect: move Deadline from netstorage to searchutils
...
This removes dependency on netstorage from searchutils.
2020-09-11 13:39:13 +03:00
Aliaksandr Valialkin
58d3b82ae5
app/{vminsert,vmagent}: allow passing timestamp via timestamp
query arg when ingesting data to /api/v1/import/prometheus
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/750
2020-09-11 13:28:31 +03:00
Aliaksandr Valialkin
579c20756a
app/vmselect: substitute inf values at smooth_exponential with the previous values
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/757
2020-09-11 12:23:56 +03:00
Aliaksandr Valialkin
d67e6d3d2e
app/vmselect: skip infinite values when calculating smooth_exponential
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/757
2020-09-11 11:57:53 +03:00
Aliaksandr Valialkin
06427a184f
app/vmselect/graphite: typo fix in label name for vm_request_duration_seconds metric
2020-09-11 01:59:52 +03:00
Aliaksandr Valialkin
f307e6f432
app/vmselect: initial implementation of Graphite Metrics API
...
See https://graphite-api.readthedocs.io/en/latest/api.html#the-metrics-api
2020-09-11 00:30:20 +03:00
Aliaksandr Valialkin
f5cb213ef9
lib/storage: reuse timestamp blocks for adjancent metric blocks with identical timestamps
...
This should reduce disk space usage when scraping targets containing metrics with identical names
such as `node_cpu_seconds_total`, histograms, quantiles, etc.
Expose `vm_timestamps_blocks_merged_total` and `vm_timestamps_bytes_saved_total` metrics for monitoring
the effectiveness of timestamp blocks merging.
2020-09-09 23:59:21 +03:00
Aliaksandr Valialkin
475698d2ad
docs: sync docs for vmalert, vmauth, vmbackup and vmrestore
2020-09-09 21:10:48 +03:00
Aliaksandr Valialkin
5ab57f916b
docs/vmagent.md: clarified the case when -remoteWrite.queues
must be tuned
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/745
2020-09-08 20:15:49 +03:00
Aliaksandr Valialkin
f3a79abfb4
app/vmselect/promql: go fmt
2020-09-08 15:18:57 +03:00
Aliaksandr Valialkin
4f06eed1c1
app/vmselect/promql: adjust integrate()
calculations to be more similar to calculations from InfluxDB: attempt #2
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/701
2020-09-08 14:36:23 +03:00
Aliaksandr Valialkin
0d0b606455
app/vmselect/promql: adjust integrate()
calculations to be more similar to calculations from InfluxDB
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/701
2020-09-08 14:24:02 +03:00
Aliaksandr Valialkin
db91045348
app/vmselect/promql: increase floating point calculations accuracy by dividing by 1e3
instead of multiplying by 1e-3
2020-09-08 14:01:02 +03:00
Aliaksandr Valialkin
804304c365
app/vmselect: add missing deletion for temporary files on partial responses when -search.denyPartialResponse=true
2020-09-04 02:23:12 +03:00
Aliaksandr Valialkin
478d8f8393
app/vmselect/promql: add count_le_over_time(m[d], le)
and count_gt_over_time(m[d], gt)
functions
...
These functions returns the number of raw samples that don't exceed `le` or are bigger than `gt`.
These functions are complement to already existing `share_le_over_time(m[d], le)` and `share_gt_over_time(m[d], gt)`.
2020-09-03 15:28:58 +03:00
Aliaksandr Valialkin
3490160fd0
app/vmselect: unconditionally align time range boundaries to step for subqueries as Prometheus does
2020-09-03 13:22:06 +03:00
Aliaksandr Valialkin
a3cdef6b06
app/vmagent: properly flush big blocks of data
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/741
Thanks to @IceRain00 for the investigation and initial attempt to fix the issue
at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/742
2020-09-03 12:12:12 +03:00
Aliaksandr Valialkin
de216bab41
app/vmagent: fix data race when accessing writeRequest.lastFlushTime
2020-09-03 12:12:09 +03:00
Nikolay Khramchikhin
80a9dc79fe
changed vmalert behaviour ( #738 )
...
* VMAlert start with empty rules dir
There are some applications (operator for instance), that generates alerts configuration at runtime
and vmalert must start correctly without rules to support this behaviour.
Later application will add rules files and send SIGHUP to vmalert,
which will trigger reading rules files and start rules exectuion.
Removing rules files with SIGHUP signal must stop rules execution and
vmalert will wait for new rules.
* imports sorted
* added test cases for empty rules, removed blank line
* fixed imports conflict
* updated tests
2020-09-03 11:07:40 +03:00
Aliaksandr Valialkin
7ac10ee978
app/vmalert: imrovements over 3f932c2db1
2020-09-03 01:14:30 +03:00
DexterZhang
85f49ad439
feat: spread load of rule evaluation by group when starting new groups ( #724 )
...
* feat: spread load of rule evaluation by group when starting new groups
* review: reduce the resulting diff.
* Update app/vmalert/group.go
Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
2020-09-03 01:14:26 +03:00
Aliaksandr Valialkin
4fa97430d7
app/{vminsert,vmagent}: allow adding extra labels when importing data via Prometheus, CSV and JSON line formats
...
Extra labels may be added to the imported data by passing `extra_label=name=value` query args.
Multiple query args may be passed in order to add multiple extra labels.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/719
2020-09-02 19:47:02 +03:00
Aliaksandr Valialkin
fe08b1eb26
app/vminsert: improve error message when the data cannot be sent to vmstorage - log reroutedBR buffer size
...
This should improve debuggability for improperly configured cluster
2020-08-31 17:51:44 +03:00
Aliaksandr Valialkin
6f9c1bc078
app/vmagent: log unsuccessful attempt number when sending data to -remoteWrite.url
2020-08-30 21:40:15 +03:00
Aliaksandr Valialkin
3b1ecac04b
app/vmagent: apply sane limits to -remoteWrite.queues
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/707
2020-08-30 21:25:51 +03:00
Roman Khavronenko
08b76cb26f
vmalert: update -rule
flag description to enforce quotes using ( #709 )
...
Description for `-rule` flag uses as example specific chars like asterisks
which could be interpreted wrong by different shells. To avoid this, description
now contains quoted flag values.
See also #708
2020-08-28 09:46:35 +03:00
Roman Khavronenko
4b89da9463
lib/decimal: rename significant decimal digits
to significant figures
( #698 )
...
The previous notion was inconsistent with what `decimal.Round` does.
According to [wiki](https://en.wikipedia.org/wiki/Significant_figures ) rounding
applied to all significant figures, not just decimal ones.
2020-08-16 17:22:40 +03:00
Aliaksandr Valialkin
6aab2f4989
all: allow using KB
, MB
, GB
, KiB
, MiB
and GiB
suffixes in command-line flag values related to byte sizes or byte rates
2020-08-16 17:08:28 +03:00
Aliaksandr Valialkin
285665e93b
app/vmselect/promql: allow passing multiple args to aggregate functions such as avg(q1, q2, q3)
2020-08-15 01:15:16 +03:00
Aliaksandr Valialkin
a2021d0dde
docs/vmagent.md: mention that gaps in remote storage may appear if vmagent cannot keep up with data ingestion
2020-08-14 20:48:17 +03:00
Aliaksandr Valialkin
e7c0b2ca56
docs: update docs
2020-08-14 19:14:46 +03:00
Aliaksandr Valialkin
b996280c65
app/{vminsert,vmagent}: improve documentation for -influxListenAddr
command-line flag
2020-08-14 18:03:08 +03:00
Aliaksandr Valialkin
60c7397be5
all: support %{ENV_VAR}
placeholders in yaml configs in all the vm* components
...
Such placeholders are substituted by the corresponding environment variable values.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/583
2020-08-13 17:17:06 +03:00
Aliaksandr Valialkin
6721e47ae9
app: respect CPU limits set via cgroups
...
Update GOMAXPROCS to limits set via cgroups. This should reduce CPU trashing and reduce memory usage
for cases when VictoriaMetrics components run in containers with CPU limits.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685
2020-08-11 23:01:03 +03:00
Aliaksandr Valialkin
62b6e54622
app/vmselect: reduce memory usage when exporting time series with big number of samples via /api/v1/export
if max_rows_per_line
is set to non-zero value
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685
2020-08-10 20:57:43 +03:00
Aliaksandr Valialkin
c9f5c5623f
app/vmselect/netstorage: vary batch size for data unpacking depending on the available CPU cores
...
This should reduce contention on the channel with unpack work for systems with high number of CPU cores
2020-08-10 15:16:48 +03:00
Aliaksandr Valialkin
b3d4ff7ee2
app/vmstorage: improve error logging when the request times out
2020-08-10 13:17:24 +03:00
Roman Khavronenko
78afc61896
app/vmalert: extend metrics set exported by vmalert
#573 ( #654 )
...
* app/vmalert: extend metrics set exported by `vmalert` #573
New metrics were added to improve observability:
+ vmalert_alerts_pending{alertname, group} - number of pending alerts per group
per alert;
+ vmalert_alerts_acitve{alertname, group} - number of active alerts per group
per alert;
+ vmalert_alerts_error{alertname, group} - is 1 if alertname ended up with error
during prev execution, is 0 if no errors happened;
+ vmalert_recording_rules_error{recording, group} - is 1 if recording rule
ended up with error during prev execution, is 0 if no errors happened;
* vmalert_iteration_total{group, file} - now contains group and file name labels.
This should improve control over specific groups;
* vmalert_iteration_duration_seconds{group, file} - now contains group and file name labels. This should improve control over specific groups;
Some collisions for alerts and recording rules are possible, because neither
group name nor alert/recording rule name are unique for compatibility reasons.
Commit contains list of TODOs for Unregistering metrics since groups and rules
are ephemeral and could be removed without application restart. In order to
unlock Unregistering feature corresponding PR was filed - https://github.com/VictoriaMetrics/metrics/pull/13
* app/vmalert: extend metrics set exported by `vmalert` #573
The changes are following:
* add an ID label to rules metrics, since `name` collisions within one group is
a common case - see the k8s example alerts;
* supports metrics unregistering on rule updates. Consider the case when one rule
was added or removed from the group, or the whole group was added or removed.
The change depends on https://github.com/VictoriaMetrics/metrics/pull/16
where race condition for Unregister method was fixed.
2020-08-09 09:42:05 +03:00
ofen
3fea7c39be
401 Unauthorize HTTP error added ( #681 )
...
401 Unauthorize HTTP error added to trigger browser credentials pop-up promt [RFC 7235 https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication ]
2020-08-09 09:39:37 +03:00
Aliaksandr Valialkin
67cacb22ac
lib/httpserver: add -tls
, -tlsCertFile
and -tlsKeyFile
command-line flags in every vm binary
...
This makes such binaries compatible with binaries from `master` branch (aka single-node version)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/677
2020-08-07 10:57:32 +03:00
Aliaksandr Valialkin
95a8c492ef
app/vmselect/promql: properly handle -n^m
like Prometheus does
...
`-n^m` must be handled as `-(n^m)` instead of `(-n)^m`.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/675
2020-08-07 07:42:42 +03:00
Aliaksandr Valialkin
7f93d61a56
app/vmselect/promql: remove metric name after applying clamp_min
and clamp_max
functions in order to be consistent with Prometheus
...
This improves VictoriaMetrics score at https://promlabs.com/promql-compliance-test-results-victoriametrics/
2020-08-06 23:42:55 +03:00
Aliaksandr Valialkin
01000505a0
app/vmselect/promql: remove metric name after applying ceil
, floor
and round
functions in order to be more consistent with Prometheus
...
This improves VictoriaMetrics score at https://promlabs.com/promql-compliance-test-results-victoriametrics/
2020-08-06 23:34:03 +03:00
Aliaksandr Valialkin
75bff1a567
app/vmselect/promql: remove metric name from results of certain rollup functions in order to be consistent with Prometheus
...
Rollup functions:
- avg_over_time
- min_over_time
- max_over_time
- quantile_over_time
This improves VictoriaMetrics results at https://promlabs.com/promql-compliance-test-results-victoriametrics/
2020-08-06 23:29:18 +03:00
Aliaksandr Valialkin
8835004a4c
app/vmselect: properly handle PromQL queries like scalar1 < metric < scalar2
like Prometheus does
...
This fixes some cases from https://promlabs.com/promql-compliance-test-results-victoriametrics/
2020-08-06 23:21:14 +03:00
Aliaksandr Valialkin
14ddb8a34e
app/vmselect/netstorage: reduce CPU contention when upacking time series blocks by unpacking batches of such blocks instead of a single block
...
This should improve query performance on systems with big number of CPU cores (16 and more)
2020-08-06 17:50:13 +03:00
Aliaksandr Valialkin
46c98cd97a
app/vmselect/netstorage: reduce contention on unpackworkCh and timeseriesWorkCh for multi-CPU system by providing more capacity for these chans
2020-08-06 17:22:39 +03:00
Aliaksandr Valialkin
a455930ab4
app/vmstorage: rename vm_cache_size_entries{type="storage/prefetchedMetricIDs"}
to vm_cache_entries{type="storage/prefetchedMetricIDs"}
to be consistent with other vm_cache_entries
metrics
2020-08-06 16:34:18 +03:00
Aliaksandr Valialkin
a3e91c593b
lib/storage: limit the number of concurrent calls to storage.searchTSIDs to GOMAXPROCS*2
...
This should limit the maximum memory usage and reduce CPU trashing on vmstorage
when multiple heavy queries are executed.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2020-08-05 18:27:21 +03:00
Aliaksandr Valialkin
a930460236
app/vmagent: tune http client for sending data to remote storage in order to disable closing keep-alive connections
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/663
2020-08-04 21:01:40 +03:00
Aliaksandr Valialkin
a04f4a3d9a
app/vmselect: use warning level instead of info level for logging slow queries that take longer than -search.logSlowQueryDuration
2020-08-04 20:24:38 +03:00
Aliaksandr Valialkin
bdb881c43b
app/vmselect/promql: add zscore-related functions: zscore_over_time(m[d])
and zscore(q) by (...)
2020-08-03 21:52:15 +03:00
Aliaksandr Valialkin
94471a1273
app: remove duplicate *-pure makefile rules
2020-07-31 20:01:30 +03:00
Aliaksandr Valialkin
a2aa3a60eb
app/vmselect: show X-Forwarded-For
contents on /api/v1/status/active_queries
page
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/659
2020-07-31 20:01:09 +03:00
Aliaksandr Valialkin
106e302d7a
all: add mssing APP_NAME to vm*-GOARCH builds
2020-07-31 13:45:32 +03:00
Aliaksandr Valialkin
945645f38f
docs/{vmagent,vmalert}: add instruction on how to build for ARM
2020-07-31 09:25:41 +03:00
Aliaksandr Valialkin
0c00fe70cf
app/vmselect: do not adjust start
and end
query args passed to /api/v1/query_range
when -search.disableCache
command-line flag is set
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/563
2020-07-30 23:14:56 +03:00
Aliaksandr Valialkin
29bbab0ec9
lib/storage: remove prioritizing of merging small parts over merging big parts, since it doesn't work as expected
...
The prioritizing could lead to big merge starvation, which could end up in too big number of parts that must be merged into big parts.
Multiple big merges may be initiated after the migration from v1.39.0 or v1.39.1. It is OK - these merges should be finished soon,
which should return CPU and disk IO usage to normal levels.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/618
2020-07-30 20:02:22 +03:00
Aliaksandr Valialkin
338ee47d60
app/vmselect/promql: return non-empty value from rate_over_sum(m[d])
even if a single data point is located in the given [d]
window
...
Just divide the data point value by the window duration in this case.
2020-07-29 12:37:34 +03:00
Aliaksandr Valialkin
717c554fb0
app/vmselect/promql: remove rollupFuncArg.realPrevValue handling, since the corner case in increase()
is handled in another way now
...
See e00cfc854d
for the approach used now.
2020-07-29 12:37:34 +03:00
Aliaksandr Valialkin
d9037b3970
app/vmselect/promql: fill gaps with 0 in rate_over_sum
response when the last value before the selected time window isnt empty
2020-07-29 12:37:34 +03:00
Aliaksandr Valialkin
f6d4275087
app/{vmagent,vminsert}: properly preserve db
tag from query string passed to Influx line protocol query
...
Previously `db` tag from the query string wasn't added to metrics after encountering `db` tag in the Influx line
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/653
2020-07-28 21:25:49 +03:00
Aliaksandr Valialkin
baebe86844
app/vmagent/remotewrite: add missing resp.Body.Close()
after pushing data to remote storage
...
Missing body close could disable HTTP keep-alive connections.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/653
2020-07-28 21:00:25 +03:00
Aliaksandr Valialkin
0f6f0d30d3
app/vmselect: show query origin (aka remote_addr or client address) on the /api/v1/status/active_queries
page for every query
2020-07-28 15:14:40 +03:00
Roman Khavronenko
ec6ed467c6
app/vmalert: support external.label
to specify global labelset for all rules #622 ( #652 )
...
`external.label` flag supposed to help to distinguish alert or recording rules
source in situations when more than one `vmalert` runs for the same datasource
or AlertManager.
2020-07-28 14:23:04 +03:00
Aliaksandr Valialkin
9dccedc599
app/vmselect/promql: return empty values from group()
if all the time series have no values at the given timestamp
...
This aligns `group()` behaviour to Prometheus
2020-07-28 13:41:04 +03:00
Aliaksandr Valialkin
d5057f6d04
app/vmagent/remotewrite: create new request on failure to send a block of data to remote storage
...
Previously the request body was already consumed before the retry, so this led to the following error:
http: ContentLength=... with Body length 0
2020-07-27 17:33:05 +03:00
Aliaksandr Valialkin
b191e425b3
app/vmselect/promql: improve further the accuracy of buckets_limit() function
...
The accuracy is increased by mergin the smallest bucket with the smallest adjacent bucket.
2020-07-26 12:10:56 +03:00
Aliaksandr Valialkin
43871e79c6
app/vmselect/promql: avoid dropping inf
bucket in buckets_limit
...
The `le="inf"` bucket must be preserved in order to maintain the maximum level of accuracy.
2020-07-25 17:00:25 +03:00
Aliaksandr Valialkin
978c1e930e
app/vmselect/promql: optimize buckets_limit(k, buckets)
for big number of buckets
2020-07-25 13:24:33 +03:00
Aliaksandr Valialkin
51cbf27077
app/vmselect/promql: improve the accuracy of buckets_limit(k, buckets)
function
...
Now it properly merges the bucket with the previous bucket after deletion.
2020-07-24 17:07:30 +03:00
Aliaksandr Valialkin
cf69b1ea6f
app/vmselect/promql: add buckets_limit(k, buckets)
function, which limits the number of buckets per time series to k
...
This function works with both Prometheus-style and VictoriaMetrics-style buckets.
The function removes buckets with the lowest values in order to reserve the highest precision.
The function is useful for building heatmaps in Grafana from too big number of buckets.
2020-07-24 16:14:12 +03:00
Aliaksandr Valialkin
45334f61de
app/vmselect: fix tests for rate_over_sum
2020-07-24 02:35:09 +03:00
Aliaksandr Valialkin
3526e8768a
app/vmselect/promql: typo fix after 3e557c9861
2020-07-24 02:15:23 +03:00
Aliaksandr Valialkin
8d1721d128
app/vmselect/promql: add rate_over_sum(m[d])
function to MetricsQL, which returns rate over sum of m
values over d
duration
...
Something like `sum_over_time(m[d]) / d`, but more accurate.
2020-07-24 01:17:15 +03:00
Aliaksandr Valialkin
88e8bed0c9
app/vmselect/promql: allow setting [d]
window smaller than the interval between raw points for avg_over_time
...
This makes `avg_over_time` behavior consistent with `sum_over_time` and `count_over_time` behaviors.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/636
2020-07-23 22:25:33 +03:00
Aliaksandr Valialkin
fb3d1380ac
lib/storage: respect -search.maxQueryDuration
when searching for time series in inverted index
...
Previously the time spent on inverted index search could exceed the configured `-search.maxQueryDuration`.
This commit stops searching in inverted index on query timeout.
2020-07-23 21:22:05 +03:00
Aliaksandr Valialkin
dbf3038637
lib/storage: add more fine-grained pace limiting for search
2020-07-23 19:21:49 +03:00
Aliaksandr Valialkin
16a4b1b20c
app/vmselect/netstorage: protect from too smart compiler, which may break memory usage optimization in tmpBlocksFileWrapper.WriteBlocks
2020-07-23 17:57:24 +03:00
Aliaksandr Valialkin
0750d2cec1
app/vminsert: export vm_relabel_metrics_dropped_total
metric that shows the number of metrics dropped due to relabeling
2020-07-23 14:58:02 +03:00
Aliaksandr Valialkin
55ed07add1
app/vmselect: typo fix after 0168e21fe32776e2f7f003f88e0e6e490eb2dcb0g
2020-07-23 14:11:15 +03:00
Aliaksandr Valialkin
7aa5b48508
app/vmselect: reduce memory usage when querying big number of time series with long labels
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/646
2020-07-23 13:48:58 +03:00
Aliaksandr Valialkin
49a0011837
app/vminsert: do not call ApplyRelabeling function if relabeling is disabled
...
This should reduce CPU usage a bit when `-relabelConfig` isn't set
2020-07-23 13:35:36 +03:00
Aliaksandr Valialkin
c91ccce50c
app/vminsert: fix relabeling for metrics ingested via Influx line protocol
...
Previously the enabled relabeling with `-relabelConfig` command-line flag could result in missing labels
if a single Influx line protocol message contains multiple field values.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/638
2020-07-23 13:25:37 +03:00
Aliaksandr Valialkin
b8303afcd8
lib/storage: improve prioritizing of data ingestion over querying
...
Prioritize also small merges over big merges.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2020-07-23 01:40:38 +03:00
Aliaksandr Valialkin
20d0c41ac5
app/vmselect/prometheus: support d
, w
and y
suffixes for durations passed to step
in /api/v1/query_range
like Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/641
2020-07-22 16:27:27 +03:00
Aliaksandr Valialkin
bd4299fafe
app/vmselect/netstorage: reduce memory allocations when unpacking time series data by using a pool for unpackWork entries
...
This should slightly reduce load on GC when processing queries that touch big number of time series.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/646 according to the provided memory profile
2020-07-22 15:04:42 +03:00
Aliaksandr Valialkin
a3f48e395e
app/vmagent: add -remoteWrite.decimalPlaces
command-line flag, which may be used for reducing disk space usage on the remote storage
2020-07-21 21:55:42 +03:00
Aliaksandr Valialkin
5bb4fe1ba4
app/vmselect: take into account the time spent in wait queue before query execution as time spent on the query
2020-07-21 19:00:00 +03:00
Aliaksandr Valialkin
0755cb3b50
app/vmselect/promql: skip the first value in time series passed to increase()
if it exceeds by more than 10x the delta between the next value and the first value
...
This should prvent from inflated `increase()` results for time series that start from big initial values.
Such cases may occur when a label value changes in a metric without counter reset.
2020-07-21 17:24:28 +03:00
Aliaksandr Valialkin
71eba8dcf5
app/vmselect: log the total available memory for concurrent requests on not enough memory
errors
...
This should simplify root cause analysis
2020-07-20 19:51:58 +03:00
Aliaksandr Valialkin
3b246aa569
app/vmagent: add -remoteWrite.proxyURL
command-line option
...
This option allows writing data to `-remoteWrite.url` via http, https or socks5 proxy.
This is similar to `proxy_url` option in `remote_write` section of Prometheus.
See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write
2020-07-20 19:31:08 +03:00
Aliaksandr Valialkin
8bee3ef91b
docs/vmagent.md: sync with app/vmagent/README.md
2020-07-20 17:09:30 +03:00
Roman Khavronenko
8949ec961d
app/vmagent: mention grafana dashboard in README ( #639 )
2020-07-20 17:09:27 +03:00
Aliaksandr Valialkin
86b54f3768
app/vmagent/remotewrite: allow passing empty -remoteWrite.urlRelabelConfig
entries
2020-07-20 15:49:13 +03:00
Aliaksandr Valialkin
141e84b5a4
app/vmselect/prometheus: do not return time series with empty list of datapoints from /api/v1/query_range
...
This matches Prometheus behaviour.
This should fix https://github.com/jacksontj/promxy/issues/329
2020-07-20 15:30:13 +03:00
Aliaksandr Valialkin
4d2011a87d
app/vmselect/promql: add mode()
aggregate function
2020-07-20 15:30:11 +03:00
Aliaksandr Valialkin
31ef39e8da
lib/httpserver: log remote address in error message from httpserver.Errorf
...
This should improve detection of the root cause of errors.
Thanks to Anant for the idea.
2020-07-20 14:06:29 +03:00
Aliaksandr Valialkin
427fa43ce2
app/vmselect/promql: add mode_over_time(m[d])
function
...
See https://en.wikipedia.org/wiki/Mode_(statistics) and https://stackoverflow.com/questions/61134078/promql-query-to-return-the-value-from-a-range-vector-which-occurs-maximum-no-of
2020-07-17 18:29:10 +03:00
Aliaksandr Valialkin
eb402a17bd
app/vmselect/promql: optimize group(rollup(m))
calculations
2020-07-17 16:47:30 +03:00
Aliaksandr Valialkin
ea8dc85ba8
app/vmselect/promql: check that any()
doesn't touch metric name
2020-07-17 16:23:11 +03:00
Aliaksandr Valialkin
fc8fe38a82
app/vmselect/promql: add group()
aggregate function to MetricsQL
...
This function has been added in Prometheus 2.20. See https://github.com/prometheus/prometheus/pull/7480
2020-07-17 15:17:38 +03:00
Aliaksandr Valialkin
c64914a7e4
app/vmselect/promql: keep all labels for time series from any()
call
2020-07-17 15:17:37 +03:00
Aliaksandr Valialkin
f9b38f7f2d
app/vminsert/influx: properly handle the case when certain labels with empty values are removed by ApplyRelabeling() call
...
Previously this could lead to `out of range` panic
2020-07-17 00:05:24 +03:00
Aliaksandr Valialkin
14dc426b45
app/vmselect: fix nil pointer dereference
panic when unsuccessfully querying vmstorage
2020-07-16 19:15:18 +03:00
Aliaksandr Valialkin
ce381b3868
app/vmalert: consistently use "%w" instead of "%s" in fmt.Errorf
when wrapping errors
2020-07-15 13:55:13 +03:00
Aliaksandr Valialkin
e6d96bb0bd
docs/vmagent.md: make filtering rules for init container pods less confusing
2020-07-14 20:33:19 +03:00
Aliaksandr Valialkin
c2b4b9138d
app/vmagent/remotewrite: return proper value from tssRelabelPool.New
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/599
2020-07-14 14:28:14 +03:00
Aliaksandr Valialkin
86044f6561
app/{vminsert,vmagent}: add -influxSkipMeasurement
command-line flag for using field name as metric name
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/626
2020-07-14 14:18:40 +03:00
Aliaksandr Valialkin
0e7b2008b2
app/vmselect/prometheus: do not adjust last points in time series with timestamps exceeding the current time
...
Such timestamps usually mean that the query contains `offset`.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/625
2020-07-14 12:56:21 +03:00
Aliaksandr Valialkin
3898cc0285
app/vmselect/prometheus: minimize the diff for the change 1033dc7e2a
over 619b0a25c9
2020-07-13 21:41:17 +03:00
faceair
bf39e67ade
fix empty response template ( #617 )
2020-07-13 21:41:15 +03:00
Aliaksandr Valialkin
b6a5c29549
docs/vmagent.md: sync with app/vmagent/README.md
2020-07-13 21:26:00 +03:00
ofen
9ffa688846
Update README.md ( #621 )
...
Troubleshooting section updated to help out with duplicate targets detection
2020-07-13 21:25:59 +03:00
Aliaksandr Valialkin
4353ff7ef1
app/vmagent: fix data race when multiple -remoteWrite.urlRelabelConfig
options are set
...
Previously multiple goroutines could access remoteWriteCtx.tss concurrently, which could lead to data race
and improper relabeling. Now each goroutine has its own copy of tss during relabeling.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/599
2020-07-10 15:17:23 +03:00
Aliaksandr Valialkin
805a90f642
app/vmagent/remotewrite: typo fix in -remoteWrite.showURL
help message
2020-07-10 14:07:14 +03:00
Aliaksandr Valialkin
6373d377ef
app/{vminsert,vmagent}: add ability to import data in Prometheus exposition format via /api/v1/import/prometheus
2020-07-10 12:13:28 +03:00
Aliaksandr Valialkin
d449d0a0e1
app/vmselect/promql: add missing tests for ifnot
binary operation
2020-07-09 13:24:12 +03:00
Aliaksandr Valialkin
7e706eea13
app/vmselect/promql: refactor implementations for and
and unless
binary operations, so they are closer to or
implementation
2020-07-09 13:06:01 +03:00
Aliaksandr Valialkin
6c1a47b5e0
app/vmselect/promql/active_queries.go: simplify code a bit by inlining getNextActiveQueryID function
2020-07-09 11:18:53 +03:00
Aliaksandr Valialkin
fb86071552
app/vmselect: add /api/v1/status/active_queries
page with the list of currently running queries
...
This is a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/575
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/528
2020-07-08 19:09:31 +03:00
DexterZhang
9930ce1fa9
Feat/query list vmselect ( #575 )
...
* feat(vmselect): add support for listing current running queries and canceling specific query
* fix(vmselect): change current queries' pid from int64 counter to uuid
* feat(vmselect): add auth to internal operations like `/resetRollupResultCache`, `/query/list` and `/query/kill`. add flag `internalAuthKey` for these auth
* fix(vmselect): add more info to current queries
* review: delete some unnecessary code and use function instead of init
* review: returen *queriesMap in newQueriesMap
* review: delete unused var in struct queriesMap, add comments to exported functions
* review: add return if error occurs
* feat(vmselect): truncate query string in current running query list API since the size of query string might be large;
use query string's pointer in struct `query` for the same reason;
add query info API to get full access of query's info;
2020-07-08 19:04:29 +03:00
Aliaksandr Valialkin
0bff96fe4b
lib/storage: prioritize data ingestion over heavy queries
...
Heavy queries could result in the lack of CPU resources for processing the current data ingestion stream.
Prevent this by delaying queries' execution until free resources are available for data ingestion.
Expose `vm_search_delays_total` metric, which may be used in for alerting when there is no enough CPU resources
for data ingestion and/or for executing heavy queries.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
2020-07-05 19:44:04 +03:00
Roman Khavronenko
9afd19d375
app/vmalert: add retries to remotewrite ( #605 )
...
* app/vmalert: add retries to remotewrite
Remotewrite pkg now does limited number of retries if write request failed.
This suppose to make vmalert state persisting more reliable.
New metrics were added to remotewrite in order to track rows/bytes sent/dropped.
defaultFlushInterval was increased from 1s to 5s for sanity reasons.
* fix
* wip
* wip
* wip
* fix bits alignment bug for 32-bit systems
* fix mistakenly dropped field
2020-07-05 18:47:38 +03:00
Aliaksandr Valialkin
82871fb7a5
app/vmselect/prometheus: small fixes on top of 8bb762124a
2020-07-05 18:17:53 +03:00
faceair
17f175ff5a
fix adjust last points avoid influence earlier value ( #606 )
2020-07-05 18:17:52 +03:00
Ween
d28fb0baf9
[VMAlert] Fix error log when remoteWrite queue size is full ( #602 )
...
* Fix Auto metrics relabeled errors
* Finalize auto-genenated Labels
* Fix Test Errors
* fix error logs when queue is full
Co-authored-by: xinyulong <xinyulong@kuaishou.com>
2020-07-03 16:50:43 +03:00
Aliaksandr Valialkin
8bb3622e9d
app/vminsert: prevent from adding and/or selecting labels with empty values
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/600
2020-07-02 23:17:12 +03:00
Aliaksandr Valialkin
6ebac3ab63
app/vminsert: add ability to apply relabeling to all the incoming metrics if -relabelConfig
command-line arg points to a file with a list of relabel_config
entries
...
See https://victoriametrics.github.io/#relabeling
2020-07-02 20:36:33 +03:00
Aliaksandr Valialkin
a45856570b
all: typo fix: exptected -> expected
2020-07-02 18:06:21 +03:00
Aliaksandr Valialkin
f10e8809c0
app/vmselect: add interpolate
function for filling gaps with linearly interpolated values
...
See https://stackoverflow.com/q/62565021/274937 for details
2020-07-02 14:54:46 +03:00
Aliaksandr Valialkin
2361ad8ab4
lib/promscrape: add ability to set disable_compression
and disable_keepalive
options in scrape_config
section of the config passed to -promscrape.config
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/580
2020-07-02 14:19:34 +03:00
BigFish
aa26b94f33
fix: spelling mistakes ( #594 )
...
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-07-01 01:36:40 +03:00
Aliaksandr Valialkin
4cb3e7595c
app/vmstorage: add -denyQueriesOutsideRetention
command-line flag for denying queries outside the configured retention
2020-07-01 00:58:42 +03:00
Aliaksandr Valialkin
0c4e8aeb2b
all: use errors.As
for inspecting errors that implement httpserver.ErrorWithStatusCode
2020-07-01 00:03:11 +03:00
Aliaksandr Valialkin
d962568e93
all: use %w instead of %s for wrapping errors in fmt.Errorf
...
This will simplify examining the returned errors such as httpserver.ErrorWithStatusCode .
See https://blog.golang.org/go1.13-errors for details.
2020-06-30 23:33:46 +03:00
Roman Khavronenko
156c83d112
app/vmalert: support multiple notifier urls ( #584 ) ( #590 )
...
* app/vmalert: support multiple notifier urls (#584 )
User now can set multiple notifier URLs in the same fashion
as for other vmutils (e.g. vmagent). The same is correct for
TLS setting for every configured URL. Alerts sending is done
in sequential way for respecting the specified URLs order.
* app/vmalert: add basicAuth support for notifier client (#585 )
The change adds possibility to set basicAuth creds for notifier
client in the same fasion as for remote write/read and datasource.
2020-06-29 22:21:56 +03:00
Roman Khavronenko
bbeab70de6
app/vmalert: move flags description and initialization into subpackages
...
The change adds no new functionality and aims to move flags definitions
to subpackages that are using them. This should improve readability
of the main function.
2020-06-29 22:18:29 +03:00
kreedom
63c36e2e69
app/vmalert: properly set transport for HTTP clients
...
Fixes issue #586
2020-06-29 22:18:25 +03:00
Aliaksandr Valialkin
2b504f17de
docs: update the info that docker images are built on top of alpine
image now
...
A follow-up after the commit ff624c9125
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/522
2020-06-26 13:52:25 +03:00
Aliaksandr Valialkin
a586b8b6d4
app/vminsert/netstorage: do not re-route every time series to more than two vmstorage nodes when certain vmstorage nodes are temporarily slower than the rest of them
...
Previously vminsert may spread data for a single time series across all the available vmstorage nodes
when vmstorage nodes couldn't handle the given ingestion rate. This could lead to increased usage
of CPU and memory on every vmstorage node, since every vmstorage node had to register all the time
series seen in the cluster. Now a time series may spread to maximum two vmstorage nodes under heavy load.
Every time series is routed to a single vmstorage node under normal load.
2020-06-25 16:42:37 +03:00
Aliaksandr Valialkin
12b87b2088
app/vmselect/netstorage: reset big result values every 10 seconds instead of after processing every time series
...
This should reduce GC pressure when processing time series with big number of rows
2020-06-24 19:37:35 +03:00
nicbaz
46c5c0772c
vmselect: fix label_replace when mismatch ( #579 )
...
As per documentation on `label_replace` function: "If the regular
expression doesn't match then the timeseries is returned unchanged".
Currently this behavior is not enforced, if a regexp on an existing
tag doesn't match then the tag value is copied as-is in the destination
tag. This fix first checks that the regular expression matches the
source tag before applying anything.
Given the current implementation, this fix also changes the behavior
of the **MetricsQL** `label_transform` function which does not
document this behavior at the moment.
2020-06-23 23:54:29 +03:00
nicbaz
ea2ed4b7e8
vmalert: add support for TLS configuration ( #578 )
...
app/vmalert: add support for TLS configuration
Add support for TLS optional configuration in a similar fashion to what
is currently supported in other vmutils such as vmagent. TLS
configuration options are distinct for datasource, remoteRead,
remoteWrite as well as notifier.
2020-06-23 22:47:23 +03:00
Aliaksandr Valialkin
0fdbe5de25
app/vmselect/netstorage: increase concurrency when processing small number of time series with big number of data points per each time series
...
Previously VictoriaMetrics was processing up to 32 time series in a single goroutine.
This could be slow if each time series contains big number of data points (10M+ or more), since only a single CPU core could be loaded with work,
while other CPU cores were idle. Fix this by launching GOMAXPROCS workers for time series processing.
This should help with https://github.com/VictoriaMetrics/VictoriaMetrics/issues/572
2020-06-23 22:45:57 +03:00
Aliaksandr Valialkin
3a444bb7bb
lib/promrelabel: add support for keep_if_equal
and drop_if_equal
actions to relabel configs
...
These actions may be useful for filtering out unneeded targets and/or metrics if they contain equal label values.
For example, the following rule would leave the target only if __meta_kubernetes_annotation_prometheus_io_port
equals __meta_kubernetes_pod_container_port_number:
- action: keep_if_equal
source_labels: [__meta_kubernetes_annotation_prometheus_io_port, __meta_kubernetes_pod_container_port_number]
2020-06-23 17:29:19 +03:00
kreedom
f227799c87
Support of custom URL path for alert ( #560 )
...
app/vmalert: Support custom URL for alerts source
Add flag `external.alert.source` for configuring custom URL
for alert's source. This may be handy to re-point default source
URL to other systems like Grafana.
Updates #517
2020-06-21 16:33:58 +03:00
Aliaksandr Valialkin
70bf8218bb
app/vmselect/promql: properly override label values from group_left
and group_right
lists like Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/577
2020-06-21 16:32:27 +03:00
Aliaksandr Valialkin
2fc2679a3f
app/vminsert/netstorage: remove possible race condition when broken connection may be recovered before acquiring storageNode.bcLock
2020-06-20 16:38:08 +03:00
Aliaksandr Valialkin
9409a31c07
docs/vmauth.md: mention that we can provide custom integration with SAML
2020-06-19 13:13:53 +03:00
Aliaksandr Valialkin
4400700832
app/vminsert: properly replicate data for the last RF-1
storage nodes for -replicationFactor=RF
...
Previously the data for the last `RF-1` storage noes has been incorrectly replicated to the first storage node.
2020-06-19 12:40:22 +03:00
Aliaksandr Valialkin
4f673a5201
app/vminsert: export metrics for determining ingested rows with dropped or truncated labels
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/565
2020-06-19 01:12:44 +03:00
Aliaksandr Valialkin
6939e36fdd
app/vmselect/promql: fill gaps on right side with values from left side of or
operator in the same way as Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/552
2020-06-18 23:05:23 +03:00
Aliaksandr Valialkin
85c1ccb8b8
app/vminsert/netstorage: add missing return
in storageNode.checkHealth on connection failure
2020-06-18 20:51:51 +03:00
Aliaksandr Valialkin
464682f380
app/vminsert/netstorage: periodically check for each -storageNode
health, so it could be marked as healthy when it is ready to accept data
...
This fixes uneven data routing in cluster version when `-replicationFactor` is set to 1 (default value),
i.e. when the replication is disabled.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/546
2020-06-18 20:42:43 +03:00
Roman Khavronenko
1a01fe2cf2
vmalert-537: allow name duplication for rules within one group. ( #559 )
...
Uniqueness of rule is now defined by combination of its name, expression and
labels. The hash of the combination is now used as rule ID and identifies rule within the group.
Set of rules from coreos/kube-prometheus was added for testing purposes to
verify compatibility. The check also showed that `vmalert` doesn't support
`query` template function that was mentioned as limitation in README.
2020-06-18 18:54:35 +03:00
Aliaksandr Valialkin
87151e825e
docs/vmbackup.md: mention that backups from single-node and cluster versions are incompatible
2020-06-18 18:54:34 +03:00
Aliaksandr Valialkin
cc2225cc49
app/vmselect: fix the error after 936f35920a
2020-06-12 22:00:45 +03:00
Aliaksandr Valialkin
936f35920a
app/vmselect/prometheus: allow returning partial response from /api/v1/export
if -search.denyPartialResponse=false
...
This makes `/api/v1/export` behaviour consistent with other `/api/v1/*` handlers.
2020-06-12 21:11:48 +03:00
Clémence Saussez
0b53e380cf
app/vmalert: fix link to testdata ( #547 )
...
Fix broken link to vmalert test data
Signed-off-by: Clemence Saussez <clemence@zen.ly>
2020-06-10 19:37:21 +03:00
Roman Khavronenko
d71b6e6584
vmalert-491: allow to configure concurrent rules execution per group. ( #542 )
...
The feature allows to speed up group rules execution by
executing them concurrently.
Change also contains README changes to reflect configuration
details.
2020-06-09 15:22:11 +03:00
Roman Khavronenko
5c049bf4dd
vmalert-521: allow to disable rules expression validation. ( #536 )
...
This feature may be useful for using `vmalert` with PromQL
compatible datasources like Loki.
2020-06-09 15:19:25 +03:00
Aliaksandr Valialkin
c1be462d42
app/vmauth: disable automatic response compression/uncompression, since it may work improperly in some cases
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/535
2020-06-05 20:14:07 +03:00
Aliaksandr Valialkin
7680b7155d
app/vmauth: emit fatal errors instead of panics when incorrect command-line flags are set
2020-06-05 20:14:05 +03:00
Aliaksandr Valialkin
01719f4949
app/vmstorage/transport: simplify setupTfss in order to prevent the possibility of nil tfs
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/534
2020-06-05 13:17:26 +03:00
Aliaksandr Valialkin
e4cef1b678
app/vmstorage: prevent from serving conns from vminsert and vmselect after the server is closed
...
Previously it was possible that the connection is served after the server is closed if the following
steps are performed:
1) Server accepts new connection.
2) Server.MustClose() is called and successfully finished.
3) Server starts processing the connection accepted at step 1. There could be various crashes
like in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/534 since the storage may be already closed.
Now the server closes the connection at step 3 without processing it.
2020-06-05 11:55:48 +03:00
Aliaksandr Valialkin
58069f5a6a
app/vmalert: print brief usage info for vmalert -help
2020-06-05 10:43:24 +03:00
Aliaksandr Valialkin
3848ea3a4a
app/vmauth: print brief usage info for vmauth -help
2020-06-05 10:40:11 +03:00
Aliaksandr Valialkin
8ad8ca350a
app/vmagent: print brief usage info for vmagent -help
2020-06-05 10:40:10 +03:00
Aliaksandr Valialkin
3d0a0b3785
lib/fs: optimize MustGetFreeSpace performance by caching the results for up to 2 seconds
2020-06-04 13:14:04 +03:00
DexterZhang
fa103875a0
feat(vmselect): add tmp block dir size metrics vm_tmp_blocks_files_size_total
( #527 )
...
* feat(vmselect): add tmp block dir size metrics `vm_tmp_blocks_files_size_total`
* refactor(vmselect): use free space instead of used space in tmp block file metrics
* fix: add `bytes` suffix to tmp dir free space metric
2020-06-04 13:05:50 +03:00
Aliaksandr Valialkin
faea804b88
app/vmauth: log when -auth.config is reloaded in SIGHUP
2020-06-03 23:22:20 +03:00
Aliaksandr Valialkin
045b87c662
app/vmalert: fix comment for UpdateWith exported methods
2020-06-01 14:35:03 +03:00
Aliaksandr Valialkin
43b14b9569
app/vminsert/netstorage: free up unused memory in buffer after memory usage spikes
2020-06-01 14:33:35 +03:00
Roman Khavronenko
44c51c627f
vmalert: Add recording rules support. ( #519 )
...
* vmalert: Add recording rules support.
Recording rules support required additional service refactoring since
it wasn't planned to support them from the very beginning. The list
of changes is following:
* new entity RecordingRule was added for writing results of MetricsQL
expressions into remote storage;
* interface Rule now unites both recording and alerting rules;
* configuration parser was moved to separate package and now performs
more strict validation;
* new endpoint for listing all groups and rules in json format was added;
* evaluation interval may be set to every particular group;
* vmalert: uncomment tests
* vmalert: rm outdated TODO
* vmalert: fix typos in README
2020-06-01 13:53:46 +03:00
Aliaksandr Valialkin
37aa4fe282
app/vmagent: reload -remoteWrite.relabelConfig and -remoteWrite.urlRelabelConfig on SIGHUP and on /-/reload
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/518
2020-05-30 14:37:02 +03:00
Aliaksandr Valialkin
a646131a33
app/vmagent: log fatal errors instead of panics when improper command-line flags are passed to vmagent
2020-05-30 14:22:38 +03:00
Aliaksandr Valialkin
f41a01332a
app/vminsert/netstorage: evenly distribute rerouted rows among all the availalbe storage nodes
...
Previously such rows were distributed to the original storage node or to the next storage node.
This may result to uneven load among the remaining storage nodes.
2020-05-30 13:51:09 +03:00
Aliaksandr Valialkin
02b2064d8e
app/vminsert/netstorage: do not increment vm_rpc_rows_lost_total when all the vmstorage nodes are unavailable, since vminsert retries sending the data instead of dropping it
2020-05-28 22:36:56 +03:00
Aliaksandr Valialkin
7a61357b5d
app/vminsert/netstorage: make sure that the the data is always replicated among -replicationFactor vmstorage nodes
...
Previously vminsert could write multiple copies of the data to a single vmstorage node when the ingestion rate
exceeds the maximum throughput for connections to vmstorage nodes.
2020-05-28 19:59:07 +03:00
Aliaksandr Valialkin
77e5165e7b
app/vminsert: add -replicationFactor
command-line flag for enabling data replication among available -storageNode instances
2020-05-27 17:29:44 +03:00
Aliaksandr Valialkin
b4e3bffe4b
app/vminsert/netstorage: emit warnings instead of errors when re-routing data to healthy storage nodes
2020-05-27 16:31:41 +03:00
Aliaksandr Valialkin
75f2f3b09d
app/vminsert/netstorage: improve ingestion performance when a single vmstorage node is slower than other vmstorage nodes
...
Previously the ingestion performance has been limited by the slowest vmstorage node.
Now vminsert should re-route data from the slowest vmstorage node to the remaining nodes.
2020-05-27 15:08:22 +03:00
Aliaksandr Valialkin
9844845d79
app/vminsert: tune the maximum summary buffer size for pending data to 1/4 of available RAM, since 1/2 of RAM is too big considering GOGC overhead
2020-05-25 02:00:37 +03:00
Aliaksandr Valialkin
4a82631e44
app/vminsert: limit the summary buffer sizes for all the storage nodes to a half of the allowed memory
2020-05-25 01:39:33 +03:00
Aliaksandr Valialkin
4bd3d4b148
app/vminsert/netstorage: do not return error from storageNode.flushBufLocked when the buffer has been successfully re-routed to healthy nodes
...
This should reduce the number of false errors in the log and the number of falsely lost rows
2020-05-22 18:29:43 +03:00
Aliaksandr Valialkin
6edc33d9bb
app/vminsert/netstorage: capture the first error instead of the last error when sending data to vmstorage
...
The first error has more chances to point to the real root cause of the issue.
2020-05-22 17:49:33 +03:00
Aliaksandr Valialkin
bb4a2bf1aa
app/vmauth: fix make run-vmauth
command
2020-05-22 16:45:19 +03:00
Aliaksandr Valialkin
dcbdc009f5
app/vmagent: check for error returned from flag.Set
2020-05-21 16:30:48 +03:00
Aliaksandr Valialkin
b59e089ac7
app/vmagent: add -dryRun
option for checking all the configs mentioned in command-line flags without running vmagent
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/362
2020-05-21 15:23:18 +03:00
Aliaksandr Valialkin
901093279e
app/vmstorage/transport: update stale comment - vmstorage now sends small ack
packets to vminsert
2020-05-21 14:04:52 +03:00
kreedom
2752d6cb26
vmalert add quotes escape function ( #510 )
...
* vmalert add quotes escape function
Co-authored-by: kreedom
2020-05-21 12:10:35 +03:00
Aaron France
b26245c48b
Update README.md
2020-05-21 12:10:33 +03:00
Aliaksandr Valialkin
d83c68ca03
app/vmselect/promql: add ascent_over_time(m[d])
and descent_over_time(m[d])
functions
...
These functions could be useful in GPS tracking apps for calculating the summary for height gain/loss
over the given duration `d`.
2020-05-21 12:06:34 +03:00
Aliaksandr Valialkin
8ff28f5b91
app/vmselect/promql: update numbers after the upgrade of github.com/VictoriaMetrics/metrics from v1.11.2 to v1.11.3
2020-05-20 03:07:07 +03:00
Aliaksandr Valialkin
ddc9e69bd6
docs/vmagent.md: mention an alternative to refresh_interval
option in scrape configs
2020-05-19 23:10:16 +03:00
Aliaksandr Valialkin
7d46dd452a
app/vmselect/promql: move common code from aggrFuncOutliersK and newAggrFuncRangeTopK into getRangeTopKTimeseries
2020-05-19 16:11:03 +03:00
Aliaksandr Valialkin
37068064dd
app/vmselect/promql: fix outilersk
calculations
2020-05-19 14:45:10 +03:00
Aliaksandr Valialkin
fc81ea38d4
app/vmselect/promql: add outliersk(N, m)
aggregate function for anomaly detection across groups of similar time series
2020-05-19 13:52:44 +03:00
Aliaksandr Valialkin
9ca781b8f0
app/vmalert/notifier: go fmt
2020-05-19 13:00:18 +03:00
kreedom
27911ae179
vmalert - add expr to variables, add escape functions ( #495 )
...
* vmalert - add expr to variables, add escape functions
Co-authored-by: kreedom
2020-05-19 11:55:03 +03:00
Roman Khavronenko
c7f3e58032
vmalert: avoid sending resolves for pending alerts ( #498 )
...
Before the change we were sending notifications to notifier
if following conditions are met:
* alert is in Fire state
* alert is in Inactive state
We were sending Inactive notifications to resolve alert ASAP.
Unfortunately, we were sending resolves for Pending alerts that become
Inactive, which is wrong.
In this change we delete alert from the active list if
it was Pending and become Inactive. In this way we now
have Inactive alerts only if they were in state Fire before.
See test change for example.
2020-05-19 11:55:00 +03:00
Roman Khavronenko
e5f5342e18
vmalert: fix potential race during configuration reloads ( #497 )
...
Configuration reload and rules evaluation can't be executed
in same time now. This may make reload time longer but
prevents from potential races.
2020-05-19 11:54:55 +03:00
Aliaksandr Valialkin
b99d03a956
app/vmalert: run make quicktemplate-gen
from the root dir of the repository
2020-05-16 22:45:45 +03:00
Aliaksandr Valialkin
2784015a4d
all: print --help
output to stdout instead of stderr
...
This is easier to grep and pipe
2020-05-16 12:03:06 +03:00
Aliaksandr Valialkin
dbf8048134
app/vmrestore: document better that vmrestore
works like rsync --delete
, i.e. it deletes files in -storageDataPath
, which are missing in the backup
2020-05-16 09:02:46 +03:00
Aliaksandr Valialkin
e544155a82
app/vmagent/Makefile: fix make run-vmagent
rule
2020-05-15 19:35:16 +03:00
Aliaksandr Valialkin
6c43ba1cb1
app/vmagent/remotewrite: remove unused import after the commit 93267f143f
2020-05-15 17:42:31 +03:00
Aliaksandr Valialkin
1d71253653
app/vmagent/remotewrite: allow ingesting time series with multiple samples at once
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/481
2020-05-15 17:37:27 +03:00
Aliaksandr Valialkin
a853869e75
app/vmstorage/transport: prevent from uncontrolled memory usage growth when vminsert
sends big packets with too long labels
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/490
2020-05-15 15:42:54 +03:00
Aliaksandr Valialkin
1e5c1d7eaa
app/vmstorage: add vm_slow_metric_name_loads_total
metric, which could be used as an indicator when more RAM is needed for improving query performance
2020-05-15 14:12:24 +03:00
Aliaksandr Valialkin
d6b9a49481
app/vmstorage: add vm_slow_row_inserts_total
and vm_slow_per_day_index_inserts_total
metrics for determining whether VictoriaMetrics required more RAM for the current number of active time series
2020-05-15 13:46:57 +03:00
Roman Khavronenko
e850bf0eff
vmalert: fix the access to rules slice element by wrong index ( #486 )
...
During group's update rules deletion was causing slice
mutations while slice index was assumed to be unchanged.
This caused "slice bounds out of range" errors when multiple
rules were deleted sequentially.
2020-05-15 13:26:06 +03:00
hagen1778
d369450f27
vmalert: update README
2020-05-15 13:26:04 +03:00
Aliaksandr Valialkin
3845420a8f
lib: extract common code for returning fast unix timestamp into lib/fasttime
2020-05-14 23:06:50 +03:00
Roman Khavronenko
e208e76222
vmalert: check if remoteRead object was initied before calling Restore ( #473 )
...
The check for non-nil remoteRead was mistakenly dropped
during refactoring which caused panics when `vmalert`
wasn't configured with `remoteRead` flag.
2020-05-13 22:57:26 +03:00
Roman Khavronenko
1523890742
vmalert: fix flag names and description in README ( #475 )
...
Change also adds the recommendation for `remotewrite`
queue error.
2020-05-13 22:57:20 +03:00
肖贝贝
8c3e9adf7f
Feat/vmalert add max queue size ( #472 )
...
* feat: add remoteWrite.maxQueueSize to reduce queue full
* rename remote(write|read) flags to remote(Write|Read) for the sake of consistency
Co-authored-by: xiaobeibei <xiaobeibei@bigo.sg>
2020-05-13 22:57:16 +03:00
Aliaksandr Valialkin
bac9a684e8
docs/vmbackup.md: add a link to vmbackuper tool
2020-05-13 22:57:11 +03:00
Aliaksandr Valialkin
f3d9a5b0ec
app/vmselect/promql: suppress "SA4006: this value of dstValues
is never used" error in golangci-lint
2020-05-13 11:46:05 +03:00
Aliaksandr Valialkin
3b0f66a227
app/vmagent: fix a bug with improper relabeling when multiple -remoteWrite.urlRelableConfig
args are set
...
This bug could result in incorrect relabeling and metrics' drop.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/467
2020-05-12 22:03:45 +03:00
Aliaksandr Valialkin
18a0caee43
app/vmselect/promql: fix any(..)
calculations - return all the data points instead of the first one
2020-05-12 20:36:49 +03:00
Aliaksandr Valialkin
3d3f41b961
app/vmstorage/transport: fix panic during server stop on 32-bit arches
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212
2020-05-12 20:21:40 +03:00
Aliaksandr Valialkin
81b8811cf4
app/vmselect/promql: remove -search.maxPointsPerTimeseries
command-line flag
...
Limit the estimated time series count after aggregation with grouping by the number of source time series.
2020-05-12 19:54:44 +03:00
Aliaksandr Valialkin
408ade27a9
app/vmselect/promql: add any(x) by (y)
aggregate function, which returns any time series from q
for each group y
2020-05-12 19:50:29 +03:00
Aliaksandr Valialkin
21c2982ac8
app/vmselect/promql: support for sum(x) by (y) limit N
syntax in order to limit the number of output time series after aggregation
2020-05-12 19:50:12 +03:00
Aliaksandr Valialkin
f341c6fcc4
Revert "app/vmselect: add -search.estimatedSeriesCountAfterAggregation
command-line flag for tuning the probability of OOMs or false-positive not enough memory
errors"
...
This reverts commit fbb7986dd2380fce2fc8633b7eda8b67f419e74c.
Reason for revert: this commit has been removed from single-node version
2020-05-12 19:50:08 +03:00
Aliaksandr Valialkin
d54a93fc81
app/vmagent: fix scraping mTLS targets, which has been broken in v1.35.1
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/470
2020-05-12 17:23:43 +03:00
Aliaksandr Valialkin
405cf44aed
app/vmagent,lib/promscrape: do not set HostClient.DialDualStack, since it isnt used if HostClient.Dial is set
2020-05-12 15:24:53 +03:00
Aliaksandr Valialkin
da6a84e147
app/vmagent/remotewrite: properly dial TCP6 addresses set via -remoteWrite.url
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/469
2020-05-12 15:24:50 +03:00
Aliaksandr Valialkin
4e237b4670
app/vminsert/influx: support passing AccountID and ProjectID via plain TCP and UDP
...
Now `vminsert` accepts AccountID and ProjectID via `VictoriaMetrics_AccountID` and `VictoriaMetrics_ProjectID` tags
when reading Influx line protocol data via plain TCP or UDP (i.e. when `-influxListenAddr` is set).
2020-05-12 13:13:04 +03:00
Aliaksandr Valialkin
f7753b1469
lib/storage: gradually pre-populate per-day inverted index for the next day
...
This should prevent from CPU usage spikes at 00:00 UTC every day when
inverted index for new day must be quickly created for all the active time series.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/430
2020-05-12 12:13:32 +03:00
Roman Khavronenko
0157566fdb
vmalert: cleanup and restructure of code to improve maintainability ( #471 )
...
The change introduces new entity `manager` which replaces
`watchdog`, decouples requestHandler and groups. Manager
supposed to control life cycle of groups, rules and
config reloads.
Groups export an ID method which returns a hash
from filename and group name. ID supposed to be unique
identifier across all loaded groups.
Some tests were added to improve coverage.
Bug with wrong annotation value if $value is used in
templates after metrics being restored fixed.
Notifier interface was extended to accept context.
New set of metrics was introduced for config reload.
2020-05-11 14:35:55 +03:00
Nikolay Khramchikhin
0e8c345ffb
vmalert config reload
...
added config hot reload for vmalert with sighup and api call
2020-05-11 14:35:50 +03:00
Aliaksandr Valialkin
6646b380ef
docs/vmauth.md: fix a link to docker images
2020-05-08 14:11:10 +03:00
Aliaksandr Valialkin
28ad350a31
app/vmagent: return 200 from /-/reload
endpoint as Prometheus does
2020-05-07 19:29:48 +03:00
Aliaksandr Valialkin
3052b479b7
lib/httpserver: reduce typical duration for http server graceful shutdown
...
Previously the duration for graceful shutdown for http server could take more than a minute
because of imporperly set timeouts in setNetworkTimeout.
Now typical duration for graceful shutdown should be reduced to less than 5 seconds.
2020-05-07 14:16:38 +03:00
Aliaksandr Valialkin
dc04040781
docs/{vmagent,vmauth}: small clarifications in the docs
2020-05-07 12:55:06 +03:00
Aliaksandr Valialkin
2b403d3f42
app/vmauth: prevent from attacks with ..
in path for accessing resources outside the configured url_prefix
2020-05-07 12:55:04 +03:00
Aliaksandr Valialkin
20538a2a5d
app/vmagent: allow setting independent auth configs per each configured -remoteWrite.url
2020-05-06 16:52:32 +03:00
Aliaksandr Valialkin
12dbb9e22c
app/vmagent: properly set client-side TLS certificates for -remoteWrite.url
. Previously they were mistakenly set as server-side
2020-05-06 16:50:37 +03:00
Aliaksandr Valialkin
8665c2edb1
docs/vmagent.md: small fixes
2020-05-06 14:49:25 +03:00
Aliaksandr Valialkin
8ab5e47b5c
lib/promscrape: add Prometheus-compatible DNS-based service discovery aka dns_sd_configs
2020-05-06 00:02:41 +03:00
Aliaksandr Valialkin
21b91599c2
docs/{vmauth,vmagent}: fix ports for profiling
2020-05-05 20:16:09 +03:00
Aliaksandr Valialkin
309700ab8c
docs/vmauth.md: mention that we can help creating customized proxy
2020-05-05 12:34:08 +03:00
Aliaksandr Valialkin
20e958789a
docs/{vmagent,vmauth}: add Profiling
section
2020-05-05 11:45:29 +03:00
Aliaksandr Valialkin
1153f30fee
docs: add vmauth.md
2020-05-05 11:17:45 +03:00
Aliaksandr Valialkin
782fb30cd0
app/vmauth: build fixes
2020-05-05 11:03:25 +03:00
Aliaksandr Valialkin
de31d16154
app/vmauth: add initial version of vmauth. See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmauth/README.md for details
2020-05-05 10:56:20 +03:00
Aliaksandr Valialkin
61df59b9ea
docs/vmagent.md: /targets
page doesnt expose infomration about imporperly configured scrape configs now. It is written in error log instead
2020-05-05 10:56:18 +03:00
Roman Khavronenko
abce2b092f
app/vmalert: restore alerts state from datasource metrics ( #461 )
...
* app/vmalert: restore alerts state from datasource metrics
Vmalert will restore alerts state for rules that have `rule.For` > 0 from previously written timeseries via `remotewrite.url` flag.
* app/vmalert: mention remotewerite and remoteread configuration in README
2020-05-05 00:52:19 +03:00
Aliaksandr Valialkin
89aa6dbf56
lib/promscrape: add Prometheus-compatible service discovery for Consul aka consul_sd_configs
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/330
2020-05-04 20:53:06 +03:00
Aliaksandr Valialkin
b21b73115a
app/vminsert: add /-/reload
handler in the same way as for vmagent
2020-04-30 02:18:08 +03:00
DexterZhang
ae215e5538
feat(vmagent): add promscrap config reload suppport via http ( #450 )
...
* feat(vmagent): add promscrap config reload suppport via http endpoint `/-/reload`
* fix: typo fix
2020-04-30 02:18:01 +03:00
Artem Navoiev
121f7e1d56
Update README.md
2020-04-29 17:41:04 +03:00
Aliaksandr Valialkin
b6d88bac04
vendor: use github.com/VictoriaMetrics/fasthttp instead of github.com/fasthttp/fasthttp
...
The upstream fasthttp may contain issues like 996610f021
,
plus a code that isn't used by VictoriaMetrics. So let's use a private copy under our control instead.
2020-04-29 16:43:09 +03:00
Aliaksandr Valialkin
9ed4951ec8
lib/metricsql: move it to a separate repository - github.com/VictoriaMetrics/metrics
2020-04-28 15:30:06 +03:00
Aliaksandr Valialkin
cd1145e5f4
app/vmselect: add -search.estimatedSeriesCountAfterAggregation
command-line flag for tuning the probability of OOMs or false-positive not enough memory
errors
2020-04-28 12:51:48 +03:00
Aliaksandr Valialkin
a858b7e393
app/vmalert: added missing comments for public entities
2020-04-28 11:19:48 +03:00
Aliaksandr Valialkin
716bbe79d4
app/vminsert/netstorage: increase timeout for waiting for ack
message after sending big data block to vmstorage
2020-04-28 11:19:46 +03:00
Aliaksandr Valialkin
50af16baf2
app/vmalert: fix build
2020-04-28 00:34:01 +03:00
Aliaksandr Valialkin
e3db2c73a6
app/vmalert: sync with master branch
2020-04-28 00:19:42 +03:00
Aliaksandr Valialkin
7644f40763
app/vmalert: include it into the next release
2020-04-28 00:11:41 +03:00
Aliaksandr Valialkin
86a1d9cb0c
lib/promscrape: add initial support for Prometheus-compatible service discovery for Amazon EC2 aka ec2_sd_configs
2020-04-27 19:29:22 +03:00
Aliaksandr Valialkin
0daa37fa02
lib/promscrape/discovery/gce: allow empty project and zone for gce_sd_config
2020-04-27 11:45:45 +03:00
Aliaksandr Valialkin
989d84cf3f
app/{vminsert,vmstorage}: wait for ack
from vmstorage
after each packet sent to it from vminsert
...
This should protect from possible data loss when `vmstorage` is stopped while the packet is sent from `vminsert`.
This commit switches to new protocol between vminsert and vmstorage, which is incompatible
with the previous protocol. So it is required that both vminsert and vmstorage nodes are updated.
2020-04-27 09:53:26 +03:00
Aliaksandr Valialkin
e933cbac16
lib/storage: postpone reading data from blocks during search
...
This eliminates the need for storing block data into temporary files on a single-node VictoriaMetrics
during heavy queries, which touch big number of time series over long time ranges.
This improves single-node VM performance on heavy queries by up to 2x.
2020-04-27 08:44:01 +03:00
Aliaksandr Valialkin
23a310cc68
app/vmselect/netstorage: substitute sorting packedTimeseries with the natural order of the fetched blocks
...
This should minimize the number of disk seeks when reading data from temporary file.
2020-04-26 16:46:17 +03:00
Aliaksandr Valialkin
31861c5b8e
lib/promscrape/discovery/gce: allow empty zone
arg in gce_sd_config
- in this case zones for the given project are automatically discovered
2020-04-26 14:37:38 +03:00
Aliaksandr Valialkin
d9bdda408c
docs/{vmbackup,vmrestore}.md: update -help
output
2020-04-24 22:44:45 +03:00
Jason Gardner
7a6b2839b4
app/vmbackup: added ability to create and delete snapshots during backup ( #428 )
...
* app/vmbackup: added ability to create and delete snapshots during backup
Resolves: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/422
* Add snapshot create and delete url flags
* Fixed errcheck warnings in build
2020-04-24 22:35:50 +03:00
Aliaksandr Valialkin
32b3f959fc
app/vmselect: fix description for -search.resetCacheAuthKey
2020-04-24 19:44:35 +03:00
Aliaksandr Valialkin
069690e3bd
lib/promscrape: initial implementation for gce_sd_configs
aga Prometheus-compatible service discovery for Google Compute Engine
2020-04-24 17:53:43 +03:00
Aliaksandr Valialkin
f9526809e5
app/vmselect: add /api/v1/status/tsdb
page with useful stats for locating root cause for high cardinality issues
...
See https://prometheus.io/docs/prometheus/latest/querying/api/#tsdb-stats
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/425
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/268
2020-04-22 22:03:23 +03:00
Aliaksandr Valialkin
b59f1f1504
app/vmselect: add -search.minStalenessInterval
command-line flag for removing gaps on graphs built from time series with irregular duration between samples
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/426
2020-04-20 19:42:41 +03:00
Aliaksandr Valialkin
603d4c9217
app/vmselect: merge -search.maxLookback
and -search.maxStalenessInterval
flags, since it has been appeared they have identical purpose :(
...
Leave both flags for backwards compatibility reasons.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/209
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/426
2020-04-20 19:28:28 +03:00
Aliaksandr Valialkin
db5fe03170
deployment/docker: allow building docker images on top of any base image set via ROOT_IMAGE environment var
...
For example, the following command will build VictoriaMetrics docker image on top of alpine image:
ROOT_IMAGE=alpine make package-victoria-metrics
2020-04-20 01:16:21 +03:00
Aliaksandr Valialkin
1b911f6965
app/vmagent/remotewrite: retry sending data if the server closes keep-alive connection
...
This should fix the following error when sending data to remote storage:
couldn't send a block with size XX bytes to "YYY": the server closed connection before returning the first response byte. Make sure the server returns 'Connection: close' response header before closing the connection
2020-04-17 15:53:17 +03:00
Aliaksandr Valialkin
9105f72f17
docs/vmagent.md: typo fix: unvailable -> unavailable
2020-04-17 13:12:13 +03:00
Aliaksandr Valialkin
d46311fd93
app/vmagent/README.md: mention about prodmscrape.suppressScrapeErrors
2020-04-17 13:09:08 +03:00
Aliaksandr Valialkin
b9b5641c2f
app/vmselect: properly apply -search.maxLookback
to queries sent to /api/v1/query
2020-04-17 12:31:18 +03:00
Aliaksandr Valialkin
d4bc60d63c
lib/logger: add WARN level for logging expected errors such as invalid user queries
2020-04-15 20:50:45 +03:00
Aliaksandr Valialkin
a873b553cf
app/vmselect: handle timestamp(metric offset X)
the same way as Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/415
2020-04-15 12:01:05 +03:00
Aliaksandr Valialkin
6ec582acb9
lib/promscrape: show information on improperly configured scrape targets at the bottom of /targets
page
...
This is a common error whith improperly configured target autodiscovery and/or relabeling.
This error leads to duplicate scraping of the same targets with the same set of labels, which leads
to duplicate samples in time series.
2020-04-14 14:55:13 +03:00
Aliaksandr Valialkin
755f649c72
docs/vmagent.md: mention that vmagent supports kubernetes_sd_configs
now
2020-04-13 21:07:00 +03:00
Aliaksandr Valialkin
38256bd66d
docs: update minimum supported Go version from 1.12 to 1.13
2020-04-07 13:39:15 +03:00
Aliaksandr Valialkin
2b4d3effad
app/vmagent/remotewrite: add "X-Prometheus-Remote-Write-Version: 0.1.0" http header to remote_write request
...
This header is required by Cortex (and, probably, other remote storage systems).
See 9c1f44d090/docs/apis.md (remote-api)
.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/399
2020-04-04 16:24:47 +03:00
Aliaksandr Valialkin
87da127fbf
app/victoria-metrics: remove accidentally added testdata for single-node VM
2020-04-04 16:09:08 +03:00
Aliaksandr Valialkin
a012f6fe70
app/vmselect/promql: keep metric name after applying first_over_time
and last_over_time
functions
2020-04-04 14:54:02 +03:00
Aliaksandr Valialkin
a53e332a93
app/vmstorage: add missing shutdown for http server on graceful shutdown
...
This could result in the following panic during graceful shutdown when `/metrics` page is requested:
http: panic serving 10.101.66.5:57366: runtime error: invalid memory address or nil pointer dereference
goroutine 2050 [running]:
net/http.(*conn).serve.func1(0xc00ef22000)
net/http/server.go:1772 +0x139
panic(0xa0fc00, 0xe91d80)
runtime/panic.go:973 +0x3e3
github.com/VictoriaMetrics/VictoriaMetrics/lib/workingsetcache.(*Cache).UpdateStats(0x0, 0xc0000516c8)
github.com/VictoriaMetrics/VictoriaMetrics/lib/workingsetcache/cache.go:224 +0x37
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*indexDB).UpdateMetrics(0xc00b931d00, 0xc02c41acf8)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/index_db.go:258 +0x9f
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*Storage).UpdateMetrics(0xc0000bc7e0, 0xc02c41ac00)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/storage.go:413 +0x4c5
main.registerStorageMetrics.func1(0x0)
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/main.go:186 +0xd9
main.registerStorageMetrics.func3(0xc00008c380)
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/main.go:196 +0x26
main.registerStorageMetrics.func7(0xc)
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/main.go:211 +0x26
github.com/VictoriaMetrics/metrics.(*Gauge).marshalTo(0xc000010148, 0xaa407d, 0x20, 0xb50d60, 0xc005319890)
github.com/VictoriaMetrics/metrics@v1.11.2/gauge.go:38 +0x3f
github.com/VictoriaMetrics/metrics.(*Set).WritePrometheus(0xc000084300, 0x7fd56809c940, 0xc005319860)
github.com/VictoriaMetrics/metrics@v1.11.2/set.go:51 +0x1e1
github.com/VictoriaMetrics/metrics.WritePrometheus(0x7fd56809c940, 0xc005319860, 0xa16f01)
github.com/VictoriaMetrics/metrics@v1.11.2/metrics.go:42 +0x41
github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver.writePrometheusMetrics(0x7fd56809c940, 0xc005319860)
github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver/metrics.go:16 +0x44
github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver.handlerWrapper(0xb5a120, 0xc005319860, 0xc005018f00, 0xc00002cc90)
github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver/httpserver.go:154 +0x58d
github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver.gzipHandler.func1(0xb5a120, 0xc005319860, 0xc005018f00)
github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver/httpserver.go:119 +0x8e
net/http.HandlerFunc.ServeHTTP(0xc00002d110, 0xb5a660, 0xc0044141c0, 0xc005018f00)
net/http/server.go:2012 +0x44
net/http.serverHandler.ServeHTTP(0xc004414000, 0xb5a660, 0xc0044141c0, 0xc005018f00)
net/http/server.go:2807 +0xa3
net/http.(*conn).serve(0xc00ef22000, 0xb5bf60, 0xc010532080)
net/http/server.go:1895 +0x86c
created by net/http.(*Server).Serve
net/http/server.go:2933 +0x35c
2020-04-02 21:09:55 +03:00
Aliaksandr Valialkin
3b744f3c32
app/vmstorage: typo fix
2020-04-01 23:43:09 +03:00
Aliaksandr Valialkin
f838cdc86e
app/vmstorage: add vm_free_disk_space_bytes
metric for monitoring the remaining disk space at -storageDataPath
2020-04-01 23:10:44 +03:00
Aliaksandr Valialkin
5270b7a097
app/victoria-metrics/testdata: add a test for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/395
2020-03-31 12:51:46 +03:00
Aliaksandr Valialkin
d450249955
lib/storage: properly handle {label=~"foo|"}
filters as Prometheus does
...
Such filters must match all the time series with `label="foo"` plus all the time series without `label`
Previously only time series with `label="foo"` were matched.
2020-03-30 20:21:47 +03:00
Aliaksandr Valialkin
c6cbc0bd19
app/vmselect/prometheus: allow passing relative time to start
, end
and time
args of /api/v1/*
queries
2020-03-29 21:56:52 +03:00
Aliaksandr Valialkin
cb8696699a
app/vmselect/prometheus: code simplification: (d.Seconds()/1e3) -> d.Milliseconds()
2020-03-29 21:50:35 +03:00
Aliaksandr Valialkin
ceb6d1459f
docs/vmagent.md: add prometheus remote_write proxy
use case
2020-03-28 23:17:41 +02:00
Dmitry Naumov
b84071fc25
Rootless docker images by default ( #358 )
...
* Rootless docker images by default
* Migrate to rootless base image
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-03-27 21:18:32 +02:00
Aliaksandr Valialkin
58cb7fc476
app/vmselect: adjust label_map()
handling for corner cases
...
The following corner cases now supported:
* label_map(q, "label", "", "foo") - adds `label="foo"` to series with missing `label`
* label_map(q, "label", "foo", "") - removes `label="foo"` from series
All the unmatched labels are kept unchanged.
2020-03-13 18:41:52 +02:00
Aliaksandr Valialkin
0e7a71a245
app/vmselect: add label_map(q, label, srcValue1, dstValue1, ... srcValueN, dstValueN)
function to MetricsQL
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/369
2020-03-12 19:13:56 +02:00
Aliaksandr Valialkin
50555d89d3
app/vmselect: add -search.maxStalenessInterval
for tuning Prometheus data model closer to Influx-style data model
2020-03-11 16:44:03 +02:00
Aliaksandr Valialkin
b46af9678e
app/vmagent: mention that vmagent can filter data
2020-03-11 16:23:10 +02:00
Aliaksandr Valialkin
8939c19281
app/vmstorage: return 500 status code instead of 200 status code on internal errors inside /snapshot/*
handlers
2020-03-10 23:54:27 +02:00
Aliaksandr Valialkin
f6410ff2bf
app/vmselect: add optional max_rows_per_line
query arg to /api/v1/export
...
This arg allows limiting the number of data points that may be exported on a single line.
2020-03-10 21:47:43 +02:00
Aliaksandr Valialkin
2f0a36044c
app/{vmagent,vminsert}: add support for importing csv data via /api/v1/import/csv
2020-03-10 21:17:40 +02:00
Aliaksandr Valialkin
3fc6599aa2
app/vmagent: properly apply -remoteWrite.sendTimeout
to fasthttp.HostClient
2020-03-09 13:31:22 +02:00
Aliaksandr Valialkin
47e986c26f
app/vmagent: properly add labels set via -remoteWrite.label
to metrics before sending them to -remoteWrite.url
2020-03-06 19:28:14 +02:00
Aliaksandr Valialkin
0d893eff36
Makefile: add build and test rules with enabled race detector. These rules have -race
suffix
...
Fix also `unsafe pointer conversion` errors detected by Go1.14. See https://golang.org/doc/go1.14#compiler .
2020-03-05 12:05:16 +02:00
Aliaksandr Valialkin
0176fc4206
app/vmagent/README.md: small fixes
2020-03-04 18:15:24 +02:00
Aliaksandr Valialkin
ac03be5a2c
app/vmagent/README.md: typo fix
2020-03-04 18:05:43 +02:00
Aliaksandr Valialkin
9354b9177a
app/vmagent/README.md: clarification
2020-03-04 18:04:06 +02:00
Aliaksandr Valialkin
4302555228
app/vmagent/README.md: add iot and edge monitoring
use case
2020-03-04 18:01:40 +02:00
Aliaksandr Valialkin
ea5904fd76
app/vmagent/README.md: add use cases
section
2020-03-04 17:42:57 +02:00
Aliaksandr Valialkin
f01d1bf4a8
app/vmagent: add -remoteWrite.maxDiskUsagePerURL
for limiting the maximum disk usage for each -remoteWrite.url
buffer
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/352
2020-03-03 19:49:20 +02:00
Aliaksandr Valialkin
808c17e250
app/vmagent/remotewrite: do not reset empty relabelCtx
2020-03-03 15:01:21 +02:00
Aliaksandr Valialkin
af19ca2483
app/vmagent: add -remoteWrite.urlRelabelConfig
for applying individual relabeling for each -remoteWrite.url
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/320
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/308
2020-03-03 13:13:06 +02:00
Aliaksandr Valialkin
d23df53ba2
app/vmselect/prometheus: do not add __name__!=
filter when searching for all the matching metric names via /api/v1/label/__name__/values
with non-empty label filter
...
This should reduce query time.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/343
2020-02-28 23:36:38 +02:00
Aliaksandr Valialkin
0eed71c7f4
app/vmagent/remotewrite: yet another typo fix
2020-02-28 20:07:00 +02:00
Aliaksandr Valialkin
6cdc97a53f
app/vmagent/remotewrite: typo fix
2020-02-28 19:05:11 +02:00
Aliaksandr Valialkin
cc39c9d74b
app/vmagent/remotewrite: limit memory usage when big scrape blocks are pushed to remote storage
2020-02-28 18:58:13 +02:00
Aliaksandr Valialkin
45d21d18a8
docs: add a doc for vmagent
2020-02-28 12:23:44 +02:00
Aliaksandr Valialkin
8fa1cd24d8
app/vmselect/prometheus: properly pass filter for labelName=__name__
in labelValuesWithMatches
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/343
2020-02-28 12:17:30 +02:00
Aliaksandr Valialkin
cf9aee4ec3
all: properly split vm_deduplicated_samples_total
among cluster components
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/345
2020-02-27 23:47:51 +02:00
Aliaksandr Valialkin
1286cead75
app/vminsert: properly initialize InsertCtx
...
This should prevent from panic described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/339
2020-02-26 21:21:02 +02:00
Aliaksandr Valialkin
0597f1e39a
app/vmagent: allow setting -httpListenAddr
to empty string in order to disable listening for http requests
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/340
2020-02-26 20:58:26 +02:00
Aliaksandr Valialkin
266101feb4
app/vmagent/README.md: list service discovery mechanisms, which will be added soon
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/334
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/330
2020-02-26 19:28:19 +02:00
Aliaksandr Valialkin
c6c7843e93
app/vmagent: add -remoteWrite.maxBlockSize
command-line flag for limiting the maximum size of unpacked block to send to remote storage
2020-02-25 19:58:11 +02:00
Aliaksandr Valialkin
c4194020ef
app/vmagent: do not allow sending unpacked requests with sizes exceeding -maxInsertRequestSize
2020-02-25 19:35:43 +02:00
Aliaksandr Valialkin
2471340e0d
app/vmagent: add ability to accept Influx line protocol data via TCP and UDP
...
Just set `-influxListenAddr` command-line flag
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/333
2020-02-25 19:18:01 +02:00
Aliaksandr Valialkin
f96fb93ca5
app/vmagent: add a counter for /targets
handler calls
2020-02-25 18:17:25 +02:00
Aliaksandr Valialkin
25c570dae7
app/vmagent/README.md: mention that vmagent
exposes target statuses at /targets
page
2020-02-25 18:16:08 +02:00
Aliaksandr Valialkin
ca28a3e805
app/vmagent: logo fix
2020-02-25 00:09:55 +02:00
Aliaksandr Valialkin
777a39f7a1
app/vmagent: update docs
2020-02-25 00:09:53 +02:00
Aliaksandr Valialkin
61e67b8922
app/vmagent/README.md: small fixes
2020-02-24 21:26:12 +02:00
Aliaksandr Valialkin
6ca1e58d98
app/vmselect/promql: properly take into account the first datapoint when calculating rollup_candlestick
2020-02-24 13:25:07 +02:00
Aliaksandr Valialkin
b58e3fc8a9
app/vmselect/promql: do not take into account values outside the current window in rollup_candlestick
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309
2020-02-23 18:06:26 +02:00
Yaroslav
c69d4b01f0
fix rollupOpen(), rollupHigh(), rollupLow() functions ( #328 )
2020-02-23 18:06:25 +02:00
Aliaksandr Valialkin
7ee7614e90
app/vmagent: initial implementation for vmagent
2020-02-23 17:31:54 +02:00
Aliaksandr Valialkin
f22aefdb16
app/vmselect/promql: log when rollupResult cache is cleared
2020-02-21 20:06:53 +02:00
Aliaksandr Valialkin
d5c2a0ce64
app/vmselect: add -search.cacheTimestampOffset
command-line flag
...
This flag can be used for removing gaps on graphs if the difference between the current time
and the timestamps from the ingested data exceeds 5 minutes.
This is the case when the time between data sources and VictoriaMetrics is improperly synchronized.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/312
2020-02-21 14:02:15 +02:00
Aliaksandr Valialkin
c70822db50
app/vmselect: add /internl/resetRollupResultCache
handler for resetting response cache
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/312
2020-02-21 14:02:12 +02:00
Aliaksandr Valialkin
afecb34491
app/vmstorage: limit the maximum error message size before sending it to client
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/315
2020-02-13 17:33:12 +02:00
Aliaksandr Valialkin
846d7fa7e9
app/vmselect: add sort_by_label(q, label)
and sort_by_label_desc(q, label)
functions
...
This is implementation of https://github.com/prometheus/prometheus/pull/1533 for VictoriaMetrics.
2020-02-13 17:01:50 +02:00
Aliaksandr Valialkin
347aaba79d
lib/{storage,mergeset}: use time.Ticker instead of time.Timer where appropriate
...
It has been appeared that time.Timer was used in places where time.Ticker must be used instead.
This could result in blocked goroutines as in the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/316 .
2020-02-13 13:21:48 +02:00
Aliaksandr Valialkin
6e0013ca39
app/vmselect/prometheus: typo fix in -latencyOffset
description
2020-02-12 14:00:38 +02:00
Edouard Hur
e8f92a4ee8
Cluster - prometheus metrics fix ( #314 )
...
* add missing '/{}' in prom query range requests
* fix missing leading '/' on prom lavelValuesErrors path
2020-02-10 22:15:21 +02:00
Aliaksandr Valialkin
1010a57882
all: allow setting flags via environment vars
...
Now flags can be set via environment vars with the same names as flags.
Command-line flags override flags set via env vars.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/311
2020-02-10 13:31:21 +02:00
Aliaksandr Valialkin
ea66212c93
lib/storage: move -dedup.minScrapeInterval
flag outside lib/storage, so it doesnt show up in vminsert
in cluster version
2020-02-10 13:07:25 +02:00
Aliaksandr Valialkin
e6d9ea3094
app/vmselect/promql: do not add step to range end, since this hack became obsolete since commit 9e1119dab8
2020-02-05 21:23:44 +02:00
Aliaksandr Valialkin
4a1de7fee9
app/vmselect/promql: properly adjust time range for data to select
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309
2020-02-05 21:23:43 +02:00
Aliaksandr Valialkin
8e77b54846
app/vmselect: unconditionally offset -step
to rollup_candlestick
. This makes results more consistent
2020-02-04 23:31:47 +02:00
Aliaksandr Valialkin
ce38b176bc
app/vmselect/promql: automatically apply offset -step
to rollup_candlestick
function in order to obtain the expected OHLC results
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309
2020-02-04 23:24:04 +02:00
Aliaksandr Valialkin
4f7116d1ee
app/vmselect/promql: adjust rollup_candlestick calculations to the exepcted results
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309
2020-02-04 22:43:28 +02:00
Aliaksandr Valialkin
ccd3aa4f15
app/vmselect: take into account the time the requests wait in the queue if -search.maxConcurrentRequests
is exceeded
...
This will prevent from excess CPU usage for timed out queries.
2020-02-04 16:20:48 +02:00
Aliaksandr Valialkin
e6bf88a4d4
app/vmselect: add a placeholder for /api/v1/metadata
, which could be requested by Grafana
...
See https://prometheus.io/docs/prometheus/latest/querying/api/#querying-metric-metadata
VictoriaMetrics doesn't collect any metadata for metrics, so just return empty response.
2020-02-04 15:56:01 +02:00
Aliaksandr Valialkin
7cde594696
all: do not clash flag description with back-quoted flag types
...
See https://golang.org/pkg/flag/#PrintDefaults for more details.
2020-02-04 15:56:01 +02:00
Aliaksandr Valialkin
45bc6c62f2
app/vmselect/promql: adjust and
and unless
binary operator handling to be consistent with Prometheus
2020-01-31 18:52:51 +02:00
Aliaksandr Valialkin
e3adc095bd
all: add -dedup.minScrapeInterval
command-line flag for data de-duplication
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/86
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/278
2020-01-31 01:18:54 +02:00
Aliaksandr Valialkin
cb5c39ee70
lib/fs: optimize small reads for ReaderAt.MustReadAt
by reading from memory-mapped space instead of reading from file descriptor
...
This should improve performance when reading many small blocks.
2020-01-30 15:16:16 +02:00
Aliaksandr Valialkin
4ed5e9a7ce
lib/storage: pre-fetch metricNames for the found metricIDs in Search.Init
...
This should speed up Search.NextMetricBlock loop for big number of found time series.
2020-01-30 15:16:16 +02:00
Aliaksandr Valialkin
170c1c3a4e
app/vmselect/promql: add keep_next_value(q)
for filling gaps with the next non-empty value
2020-01-29 00:48:14 +02:00
Aliaksandr Valialkin
a9c1d5b351
app/vminsert: moved -maxInsertRequestSize
command-line flag out of lib/prompb
in order to prevent its inclusion in vmselect
and vmstorage
apps
2020-01-28 22:53:50 +02:00
Aliaksandr Valialkin
b28c9a3944
app/vmselect/promql: return expected results from increase()
over the beginning of time series, which start from big value
...
Examples for such counters: OS-level counters for network or cpu stats.
2020-01-28 16:31:05 +02:00
Aliaksandr Valialkin
3e304890a6
app/vmselect/promql: fix panic on a single zero vmrange bucket in prometheus_buckets() function
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/296
2020-01-27 18:05:12 +02:00
Aliaksandr Valialkin
4d70a81e18
app/vminsert: do not drop pending rows if all the vmstorage backends are unavailable
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/294
2020-01-24 22:10:10 +02:00
Aliaksandr Valialkin
0cda6afa8e
app/vminsert: move ingestion protocol parsers to lib/protoparser, so they could be re-used in the upcoming vmagent
2020-01-24 16:55:18 +02:00
Aliaksandr Valialkin
ea53a21b02
all: consistently log durations in seconds with millisecond precision
...
This should improve logs readability
2020-01-22 18:35:24 +02:00
Aliaksandr Valialkin
e1a264173a
app/vmselect: mention the original query and time range in error messages
...
This should simplify debugging invalid or heavy queries.
2020-01-22 17:34:35 +02:00
Aliaksandr Valialkin
e127173984
app/vmselect: mention command-line flag, which could be used for adjusting query timeouts, in timeout errors
2020-01-22 15:53:42 +02:00
Aliaksandr Valialkin
f3b9f8b823
app/vmselect/prometheus: increase default value -maxExportDuration
to 30 days, since 10 minutes beat users exporting bit amounts of data
2020-01-22 15:53:41 +02:00
Aliaksandr Valialkin
40e564eb9c
app/vmselect/promql: add range_over_time(m[d])
function for calculating value range for m
over d
2020-01-21 19:05:29 +02:00
Aliaksandr Valialkin
ecddba30fe
app/vminsert/netstorage: increase timeout for pushing data from vminsert to vmstorage by 3x
...
Our clients report that the previous timeout could lead to frequent errors when
vmstorage starts background merge for big parts on slow HDD.
2020-01-21 18:21:49 +02:00
Aliaksandr Valialkin
9eaa2ab871
app/vmselect/promql: add label_match(q, label, regexp)
and label_mismatch(q, label, regexp)
functions for filtering out time series with labels matching the given regexp
2020-01-21 15:00:35 +02:00
Aliaksandr Valialkin
cbd0452317
app/vminsert: increase default value for -insert.maxQueueDuration
from 30s to 60s
...
This should help catching up with high ingestion rate after VictoriaMetrics restart.
2020-01-18 14:39:30 +02:00
Aliaksandr Valialkin
2084921e64
all: use github.com/klauspost/compress/gzip
instead of compress/gzip
...
`github.com/klauspost/compress/gzip` is more optimized than `compress/gzip`.
This gives better gzip compression and decompression speeds.
2020-01-17 23:59:17 +02:00
Aliaksandr Valialkin
54db08a60f
app/vmselect/netstorage: make fmt
2020-01-17 17:46:20 +02:00
Aliaksandr Valialkin
d21cc2d16a
app/vmselect/netstorage: limit the maximum size for in-memory buffer for temporary blocks file
...
This should reduce memory usage on systems with more than 8GB RAM.
2020-01-17 16:28:28 +02:00
Aliaksandr Valialkin
b05f6cf11c
app/vmselect: limit the default value for -search.maxConcurrentRequests
, so it plays well on systems with more than 16 vCPUs
...
A single heavy request can saturate all the available CPUs, so let's limit the number of concurrent requests to lower value.
This will give more chances for executing insert path.
2020-01-17 16:16:43 +02:00
Aliaksandr Valialkin
a9f683423c
app/{vminsert,vmselect}: improve error messages when VictoriaMetrics cannot handle too high number of concurrent inserts / selects
2020-01-17 16:16:43 +02:00
Aliaksandr Valialkin
4b16b7fd11
all: mention command-line flags used for limiting the incoming request size in error messages
...
This should improve error logs usability.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/287
2020-01-16 13:06:43 +02:00
Aliaksandr Valialkin
ce0b602405
app/vmselect/promql: fix panic on sum(aggr_over_time(...))
with incorrect number of args
2020-01-15 16:26:16 +02:00
Aliaksandr Valialkin
bcd3f0c5bd
app/vmselect/promql: add hoeffding_bound_upper(phi, m[d])
and hoeffding_bound_lower(phi, m[d])
functions
...
These functions can be used for calculating Hoeffding bounds
for `m` over `d` time range and for the given `phi` in the range `[0..1]`.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/283
2020-01-11 14:47:13 +02:00
Aliaksandr Valialkin
fc01b11ddc
app/vmselect/promql: return continuous values for min_over_time
and max_over_time
when step
is smaller than scrape_interval
2020-01-11 12:47:57 +02:00
Aliaksandr Valialkin
16fb128bbc
app/vmselect/promql: do not take into account the previous point before time window in square brackets for min_over_time
, max_over_time
, rollup_first
and rollup_last
functions
...
This makes the behaviour for these functions similar to Prometheus when processing broken time series with irregular data points
like `gitlab_runner_jobs`. See https://gitlab.com/gitlab-org/gitlab-exporter/issues/50 for details.
2020-01-11 00:26:11 +02:00
Aliaksandr Valialkin
1c445bf7eb
vendor: update github.com/valyala/fastjson from v1.4.2 to v1.4.5
...
This should fix parsing Inf values in `/api/v1/import`. The previous attempt to fix this in VictoriaMetrics v1.32.1 was unsuccessful.
2020-01-10 23:14:34 +02:00
Aliaksandr Valialkin
adc36d00b7
app/vmselect/promql: properly handle aggr(aggr_over_time(...))
2020-01-10 21:57:11 +02:00
Aliaksandr Valialkin
87a106702b
app/vmselect/promql: add aggr_over_time(("aggr_func1", "aggr_func2", ...), m[d])
function
...
This function can be used for simultaneous calculating of multiple `aggr_func*` functions
that accept range vector. For example, `aggr_over_time(("min_over_time", "max_over_time"), m[d])`
would calculate `min_over_time` and `max_over_time` for `m[d]`.
2020-01-10 21:18:12 +02:00
Aliaksandr Valialkin
c314d9a219
app/vmselect/promql: add tmin_over_time(m[d])
and tmax_over_time(m[d])
functions
...
These functions return timestamp in seconds for the minimum and maximum value for `m` over time range `d`
2020-01-10 19:39:34 +02:00
Aliaksandr Valialkin
705af61587
app/{vmbackup,vmrestore}: add backup complete
file to backup when it is complete and check for this file before restoring from backup
...
This should prevent from restoring from incomplete backups.
Add `-skipBackupCompleteCheck` command-line flag to `vmrestore` in order to be able restoring from old backups without `backup complete` file.
2020-01-09 15:35:45 +02:00
Aliaksandr Valialkin
7c6df1e51d
app/vmselect/promql: skip rate
calculation for the first point on time series
2020-01-08 14:42:44 +02:00
Aliaksandr Valialkin
76707b2ab9
app/vmselect/promql: fix calculations for histogram_share
2020-01-04 14:44:15 +02:00
Aliaksandr Valialkin
accad01b3e
app/vmselect/promql: add missing MetricName into netstorage.Result in tests
2020-01-04 12:53:14 +02:00
Aliaksandr Valialkin
6f29d37cb5
app/vmselect/promql: add histogram_share(le, buckets)
function
2020-01-04 12:53:08 +02:00
Aliaksandr Valialkin
2290503140
app/vmselect/promql: add absent_over_time(m[d])
func similar to the function in Prometheus 2.16
...
See https://github.com/prometheus/prometheus/issues/2882
2020-01-04 12:53:01 +02:00
Aliaksandr Valialkin
67f94bbe12
app/vmselect/promql: add histogram_over_time(m[d])
rollup function
2020-01-04 12:52:56 +02:00
Aliaksandr Valialkin
9a1f6848ca
app/vmselect/promql: fix results caching for multi-arg rollup functions such as quantile_over_time
...
Previosly only a single arg was taken into account, so caching didn't work properly for multi-arg rollup funcs.
2020-01-03 20:44:54 +02:00
Aliaksandr Valialkin
3d0c7b095a
app/vmselect/promql: use scrapeInterval instead of window in denominator when calculating rate
for the first point on the time series
...
This should provide better estimation for `rate` in the beginning of time series.
2020-01-03 19:03:32 +02:00
Aliaksandr Valialkin
6ea7f23446
app/vmselect/promql: increase the estimated number of time series returned by aggr() by (something)
from 100 to 1K, since 100 may result in OOM for high number of time series
2020-01-03 01:02:30 +02:00
Aliaksandr Valialkin
e0abf45d45
app/vmselect/promql: add share_le_over_time
and share_gt_over_time
functions for SLI and SLO calculations
2020-01-03 00:41:36 +02:00
Aliaksandr Valialkin
eb1a66c577
lib/metricsq: add ExpandWithExprs
2019-12-25 22:20:21 +02:00
Aliaksandr Valialkin
453d71d082
Rename lib/promql to lib/metricsql and apply small fixes
2019-12-25 22:09:09 +02:00
Mike Poindexter
009d1559db
Split Extended PromQL parsing to a separate library
2019-12-25 22:09:07 +02:00
Aliaksandr Valialkin
ff18101d30
app/vmselect/promql: make sure AdjustStartEnd returns time range covering the same number of points as the initial time range
...
This should prevent from the following panic at app/vmselect/promql/binary_op.go:255:
BUG: len(leftVaues) must match len(rightValues) and len(dstValues)
2019-12-24 22:45:49 +02:00
Aliaksandr Valialkin
e24ee43109
app/vmselect/promql: adjust calculations for rate
and increase
for the first value
...
These calculations should trigger alerts on `/api/v1/query` for counters starting from values greater than 0.
2019-12-24 19:41:03 +02:00
Aliaksandr Valialkin
9a2554691c
app/vmselect/promql: properly calculate rate
on the first data point
...
It is calculated as `value / scrape_interval`, since the value was missing on the previous scrape,
i.e. we can assume its value was 0 at this time.
2019-12-24 15:55:15 +02:00
Aliaksandr Valialkin
97de50dd4c
app/vmselect/netstorage: improve error message when reading data size in readBytes
2019-12-24 14:40:14 +02:00
Aliaksandr Valialkin
29d2ce54cb
all: use gozstd instead of pure Go zstd for GOARCH=amd64
2019-12-24 12:43:59 +02:00
Aliaksandr Valialkin
6358cf3d47
app/vmselect/netstorage: move MustAdviseSequentialRead to lib/fs
2019-12-23 23:16:26 +02:00
Aliaksandr Valialkin
cc8a1bae0e
app/vmselect: add -search.maxExportDuration
command-line flag for limiting /api/v1/export
duration
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/275
2019-12-20 11:37:18 +02:00
Aliaksandr Valialkin
8b56b849e9
app/vminsert: return StatusNoContent http response for /api/v1/import
to be consistent with other insert handlers
2019-12-19 01:22:01 +02:00
Aliaksandr Valialkin
6a185b7809
app/vmselect: add ability to pass match[]
, start
and end
to /api/v1/labels
...
This makes the `/api/v1/labels` handler consistent with already existing functionality for `/api/v1/label/.../values`.
See https://github.com/prometheus/prometheus/issues/6178 for more details.
2019-12-15 00:20:43 +02:00
Aliaksandr Valialkin
a7bf8e77af
app/vminsert: simultaneously accept telnet put
and HTTP /api/put
OpenTSDB metrics at -opentsdbListenAddr
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/266
2019-12-14 00:42:18 +02:00
Aliaksandr Valialkin
b238997a84
all: rename Extended PromQL
to PromQL extensions
2019-12-12 19:29:59 +02:00
Aliaksandr Valialkin
cffaeda0f1
all: publish Docker images for the following GOARCH: amd64, arm, arm64, ppc64le and 386
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/258
2019-12-11 23:33:11 +02:00
Aliaksandr Valialkin
c25b97829f
app/vmselect/promql: return lower
and upper
bounds for the estimated percentile from histogram_quantile
if third arg is passed
...
Updates https://github.com/prometheus/prometheus/issues/5706
2019-12-11 14:00:18 +02:00
Aliaksandr Valialkin
f79b61e2a1
app/vmselect/promql: return matrix instead of vector on subqueries to /api/v1/query
like Prometheus does
2019-12-11 00:57:54 +02:00
Aliaksandr Valialkin
5d2ff573aa
app/vmselect/promql: allow negative offsets
...
Updates https://github.com/prometheus/prometheus/issues/6282
2019-12-11 00:57:51 +02:00
Aliaksandr Valialkin
c81a89a8ed
app/vminsert: add /api/v1/import
handler
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6
2019-12-09 22:37:49 +02:00
Aliaksandr Valialkin
0c304439d4
app/vminsert: consistency renaming for counters
2019-12-09 16:44:26 +02:00
Aliaksandr Valialkin
c217a53c35
make vendor-update
2019-12-07 23:11:26 +02:00
Aliaksandr Valialkin
3534e71c96
app/vminsert/influx: add a test case from https://community.librenms.org/t/integration-with-victoriametrics/9689
2019-12-07 23:00:54 +02:00
Aliaksandr Valialkin
d39bba3547
app/vmselect/promql: add {topk|bottomk}_{min|max|avg|median}
aggregate functions for returning the exact k time series on the given time range
...
The full list of functions added:
- `topk_min(k, q)` - returns top K time series with the max minimums on the given time range
- `topk_max(k, q)` - returns top K time series with the max maximums on the given time range
- `topk_avg(k, q)` - returns top K time series with the max averages on the given time range
- `topk_median(k, q)` - returns top K time series with the max medians on the given time range
- `bottomk_min(k, q)` - returns bottom K time series with the min minimums on the given time range
- `bottomk_max(k, q)` - returns bottom K time series with the min maximums on the given time range
- `bottomk_avg(k, q)` - returns bottom K time series with the min averages on the given time range
- `bottomk_median(k, q)` - returns bottom K time series with the min medians on the given time range
2019-12-05 19:27:45 +02:00
Aliaksandr Valialkin
e0f43e1f66
app/vmselect: add placeholders for /api/v1/rules
and /api/v1/alerts
2019-12-03 19:38:09 +02:00
Aliaksandr Valialkin
4e22b521c2
lib/storage: remove metricID with missing metricID->metricName entry
...
The metricID->metricName entry can be missing in the indexdb after unclean shutdown
when only a part of entries for new time series is written into indexdb.
Recover from such a situation by removing the broken metricID. New metricID
will be automatically created for time series with the given metricName
when new data point will arive to it.
2019-12-02 20:52:13 +02:00
Aliaksandr Valialkin
819bb36852
app/vmselect/promql: estimate per-series scrape interval as 0.6 quantile for the first 100 intervals
...
This should improve scrape interval estimation for tiem series with gaps.
2019-12-02 13:43:04 +02:00
Aliaksandr Valialkin
1595dcd3d9
app/vminsert/influx: allow empty measurement in Influx line protocol
...
In this case metric names are mapped directly from field names without any prefixes.
2019-11-30 21:59:07 +02:00
Aliaksandr Valialkin
1e2019b1b6
app/vmselect/promql: fix corner case for increase
over time series with gaps
...
In this case `increase` could return invalid high value for the first point after the gap.
2019-11-30 01:34:18 +02:00
Aliaksandr Valialkin
4c63caa37c
deployment/docker/certs: update TLS certs source from alpine:3.9 to alpine:3.10
2019-11-29 19:55:36 +02:00
Aliaksandr Valialkin
7e734433a3
lib/backup: cosmetic fixes after #243
2019-11-29 18:07:41 +02:00
glebsam
4a192cb832
Add option to provide custom endpoint for S3, add option to specify S3 config profile ( #243 )
...
* Add option to provide custom endpoint for S3 for use with s3-compatible storages, add option to specify S3 config profile
* make fmt
2019-11-29 18:07:39 +02:00
Aliaksandr Valialkin
def9ccd360
app/vmselect/prometheus: consistently apply nocache
arg to /api/v1/query
the same way ast to /api/v1/query_range
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/241
2019-11-26 22:55:50 +02:00
Aliaksandr Valialkin
e0ac068112
app/vmselect/prometheus: fix content-type for /api/v1/export
responses
...
The correct Content-Type should be `application/stream+json` instead of `application/json`
Thanks to Joshua Ryder for pointing to this.
2019-11-26 17:44:27 +02:00
Aliaksandr Valialkin
28cc4c09b5
app/vmselect/promql: remove zero timeseries from prometheus_buckets
output
2019-11-25 19:10:13 +02:00
Aliaksandr Valialkin
8811bec14e
app/vmselect/prometheus: reduce default value for -search.latencyOffset
from 60s to 30s
...
30 seconds should be enough for almost all the cases
2019-11-25 16:33:36 +02:00
Aliaksandr Valialkin
f7da9b2db2
app/vmselect/promql: allow nested parens
2019-11-25 16:13:33 +02:00
Aliaksandr Valialkin
d2619d6dce
vendor: update github.com/VictoriaMetrics/metrics from v1.9.0 to v1.9.1
2019-11-25 15:22:50 +02:00
Aliaksandr Valialkin
f46fb6c740
app/vmselect/promql: re-use metrics.Histogram when calculating histogram function for each point on the graph
...
This should reduce the amounts memory allocations
2019-11-25 14:24:30 +02:00
Aliaksandr Valialkin
0f184affa7
app/vmselect/promql: optimize binary search over big number of samples during rollup calculations
2019-11-25 14:01:54 +02:00
Aliaksandr Valialkin
dbd07041ae
app/vmselect/promql: adjust tests after the upgrade of github.com/VictoriaMetrics/metrics from v1.8.3 to v1.9.0
2019-11-25 13:44:08 +02:00
Aliaksandr Valialkin
8bb254d960
app/vmselect/promql: add histogram
aggregate function, which is useful for building heatmaps from multiple time series
2019-11-24 00:04:15 +02:00
Aliaksandr Valialkin
414259f47b
app/vmselect/promql: do not take into account buckets with negative counters in prometheus_buckets
2019-11-23 14:19:19 +02:00
Aliaksandr Valialkin
193d553f6d
app/vmselect/promql: properly handle histogram_quantile(0, ...)
with zero buckets
2019-11-23 14:02:25 +02:00
Aliaksandr Valialkin
f8298c7f13
app/vmselect: add vm_per_query_{rows,series}_processed_count
histograms
2019-11-23 13:23:03 +02:00
Aliaksandr Valialkin
4d76977745
app/vmselect/promql: transparently apply prometheus_buckets
in histogram_quantile
2019-11-23 11:49:16 +02:00
Aliaksandr Valialkin
5f6f03c692
app/vmselect/promql: add prometheus_buckets
function for converting the upcoming histogram buckets from github.com/VictoriaMetrics/metrics
to Prometheus-compatible buckets
2019-11-23 00:21:56 +02:00
Aliaksandr Valialkin
17d08c1fe0
app/vmselect: adjust end
arg instead of adjusting start
arg if start > end
...
`start` arg has higher chances to be set properly comparing to `end` arg,
so it is expected that the `end` arg could be adjusted if it was set incorrectly.
2019-11-22 16:12:53 +02:00
Aliaksandr Valialkin
216a260ced
app/{vmbackup,vmrestore}: add -maxBytesPerSecond
command-line flag for limiting the used network bandwidth during backup / restore
2019-11-19 20:32:43 +02:00
Aliaksandr Valialkin
5ae47e8940
app/vmselect/prometheus: properly adjust too big time time
on /api/v1/query
...
Too big `time` must be adjusted to `now()-queryOffset`.
2019-11-19 00:42:07 +02:00
Aliaksandr Valialkin
77bb66a5be
app/vmselect/promql: properly calculate integrate(q[d])
2019-11-13 21:11:03 +02:00
Aliaksandr Valialkin
c33640664a
app/vmselect/promql: use universal approach for determining maxByteSliceLen on 32-bit and 64-bit archs
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/235
2019-11-13 20:26:07 +02:00
Aliaksandr Valialkin
d297b65089
lib/storage: add vm_cache_size_bytes{type="storage/hour_metric_ids"}
metric
2019-11-13 20:26:05 +02:00
Aliaksandr Valialkin
494ad0fdb3
lib/storage: remove inmemory index for recent hour, since it uses too much memory
...
Production workload shows that the index requires ~4Kb of RAM per active time series.
This is too much for high number of active time series, so let's delete this index.
Now the queries should fall back to the index for the current day instead of the index
for the recent hour. The query performance for the current day index should be good enough
given the 100M rows/sec scan speed per CPU core.
2019-11-13 18:08:58 +02:00
Aliaksandr Valialkin
633dd81bb5
lib/storage: add -disableRecentHourIndex
flag for disabling inmemory index for recent hour
...
This may be useful for saving RAM on high number of time series aka high cardinality
2019-11-13 15:10:12 +02:00
Aliaksandr Valialkin
f1620ba7c0
lib/storage: fix inmemory inverted index issues found in v1.29
...
Issues fixed:
- Slow startup times. Now the index is loaded from cache during start.
- High memory usage related to superflouos index copies every 10 seconds.
2019-11-13 13:35:38 +02:00
Aliaksandr Valialkin
87b39222be
Revert "lib/fs: do not postpone directory removal on NFS error"
...
This reverts commit 21aeb02b46649ac9906cb37733f7b155a77a0db9.
2019-11-12 16:29:50 +02:00
Oleg Kovalov
74ba42d111
fix misspelled words ( #229 )
2019-11-12 00:18:24 +02:00
Aliaksandr Valialkin
c48e39eea9
lib/storage: add tests for dateMetricIDCache
2019-11-11 13:21:05 +02:00
Aliaksandr Valialkin
5f52eb7653
lib/fs: do not postpone directory removal on NFS error
...
Continue trying to remove NFS directory on temporary errors for up to a minute.
The previous async removal process breaks in the following case during VictoriaMetrics start
- VictoriaMetrics opens index, finds incomplete merge transactions and starts replaying them.
- The transaction instructs removing old directories for parts, which were already merged into bigger part.
- VictoriaMetrics removes these directories, but their removal is delayed due to NFS errors.
- VictoriaMetrics scans partition directory after all the incomplete merge transactions are finished
and finds directories, which should be removed, but weren't still removed due to NFS errors.
- VictoriaMetrics panics when it finds unexpected empty directory.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162
2019-11-10 13:27:16 +02:00
Aliaksandr Valialkin
9ea2bd822e
lib/storage: implement per-day inverted index
2019-11-10 00:20:32 +02:00
Aliaksandr Valialkin
5d8de72414
app/vmrestore: the upcoming release would be 1.29.0
2019-11-10 00:20:18 +02:00
Aliaksandr Valialkin
dea2f3efed
lib/storage: use specialized cache for (date, metricID) entries
...
This improves ingestion performance.
2019-11-09 23:09:18 +02:00
Aliaksandr Valialkin
46e67bb78c
lib/storage: export vm_new_timeseries_created_total
metric for determining time series churn rate
2019-11-08 19:58:21 +02:00
Aliaksandr Valialkin
0063c857f5
lib/storage: add inmemory inverted index for the last hour
...
It should improve performance for `last N hours` dashboards with update intervals smaller than 1 hour.
2019-11-08 19:37:46 +02:00
Aliaksandr Valialkin
33abbec6b4
app/vmselect/promql: adjust memory limits calculations for incremental aggregate functions
...
Incremental aggregate functions don't keep all the selected time series in memory -
they keep only up to GOMAXPROCS time series for incremental aggregations.
Take into account that the number of time series in RAM can be higher if they are split
into many groups with `by (...)` or `without (...)` modifiers.
This should reduce the number of `not enough memory for processing ... data points` false
positive errors.
2019-11-08 19:37:43 +02:00
Aliaksandr Valialkin
7d7fbf890e
app/{vmbackup,vmrestore}: add vmbackup
and vmrestore
tools for creating backups on s3 or gcs from instant snapshots
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/203
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/38
2019-11-07 21:26:43 +02:00
Aliaksandr Valialkin
1c777e0245
lib/storage: substitute error message about unsorted items in the index block after metricIDs merge with counter
...
The origin of the error has been detected and documented in the code,
so it is enough to export a counter for such errors at `vm_index_blocks_with_metric_ids_incorrect_order_total`,
so it could be monitored and alerted on high error rates.
Export also the counter for processed index blocks with metricIDs - `vm_index_blocks_with_metric_ids_processed_total`,
so its' rate could be compared to `rate(vm_index_blocks_with_metric_ids_incorrect_order_total)`.
2019-11-06 14:32:41 +02:00
Aliaksandr Valialkin
4a8251feff
app/vmselect/promql: add lag(q[d])
function, which returns the lag between the current timestamp and the timstamp for the last data point in q
2019-11-01 12:21:43 +02:00
Aliaksandr Valialkin
6ab9c98a1e
app/vmstorage: add -bigMergeConcurrency
and -smallMergeConcurrency
flags for tuning the maximum number of CPU cores used during merges
2019-10-31 16:17:29 +02:00
Aliaksandr Valialkin
4e6bf6f538
app/vmselect: add -search.latencyOffset
flag for tuning the time after data collection when data points become visible in query results
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/218
2019-10-28 12:32:36 +02:00
Aliaksandr Valialkin
78fc35c9b1
all: make fmt
2019-10-17 20:05:12 +03:00
Aliaksandr Valialkin
5b01b7fb01
all: add support for GOARCH=386 and fix all the issues related to 32-bit architectures such as GOARCH=arm
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212
2019-10-17 18:27:49 +03:00
Aliaksandr Valialkin
99786c2864
app/vmselect/prometheus: add -search.maxLookback
command-line flag for overriding dynamic calculations for max lookback interval
...
This flag is similar to `-search.lookback-delta` if set. The max lookback interval is determined dynamically
from interval between datapoints for each input time series if the flag isn't set.
The interval can be overriden on per-query basis by passing `max_lookback=<duration>` query arg to `/api/v1/query` and `/api/v1/query_range`.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/209
2019-10-15 21:37:17 +03:00
Aliaksandr Valialkin
a5302a6651
app/vmselect/promql: take into account the previous point when calculating max_over_time
and min_over_time
...
This lines up with `first_over_time` function used in `rollup_candlestick`, so `rollup=low` always returns
the minimum value.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/204
2019-10-08 12:30:16 +03:00
Aliaksandr Valialkin
483af3a97a
app/vmselect/netstorage: hint the OS that tmpBlocksFile is read almost sequentially
...
This became the case after b7ee2e7af2
.
2019-09-30 00:13:33 +03:00
Aliaksandr Valialkin
946ca438a6
app/vmselect/netstorage: marshal block outside tmpBlocksFile.WriteBlock
...
This also allows marshaling outside lock, thus reducing the amount of work under the lock.
2019-09-28 20:57:20 +03:00
Aliaksandr Valialkin
e92e39eddf
app/vmselect/netstorage: reduce the number of disk seeks when the query processes big number of time series
2019-09-28 20:57:20 +03:00
Aliaksandr Valialkin
56dff57f77
app/vmselect/netstorage: reduce memory usage when fetching big number of data blocks from vmstorage
...
Dump data blocks directly to temporary file instead of buffering them in RAM
2019-09-28 12:21:57 +03:00
Aliaksandr Valialkin
ba460f62e6
app/vmselect/promql: do not generate timestamps for NaN values in timestamp
function according to Prometheus logic
2019-09-27 18:55:16 +03:00
Aliaksandr Valialkin
bd1cf053f6
app/vmselect/promql: add increases_over_time
and decreases_over_time
functions
...
`increases_over_time(q[d])` returns the number of `q` increases during the given duration `d`.
`decreases_over_time(q[d])` returns the number of `q` decreases during the given duration `d`.
2019-09-25 20:38:51 +03:00
Aliaksandr Valialkin
8d398af92f
app/vminsert/netstorage: mention the data size that cannot be sent to vmstorage
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/175
2019-09-25 12:53:41 +03:00
Aliaksandr Valialkin
73ac7b8dd6
app/vminsert/netstorage: make sure the conn exists before closing it in storageNode.closeBrokenConn
...
The conn can be missing or already closed during the call to storageNode.closeBrokenConn.
Prevent `nil pointer dereference` panic by verifying whether the conn is already closed.
Thanks to @CH-anhngo for reporting the issue.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/189
2019-09-25 10:36:50 +03:00
Aliaksandr Valialkin
b0c738ae8b
app/vminsert/opentsdbhttp: remove FATAL prefix from logger.Fatalf errors for the sake of consistency with other logger.Fatalf calls
2019-09-19 22:16:11 +03:00
Aliaksandr Valialkin
550a12415a
app/vminsert/netstorage: log network errors when sending data to vmstorage nodes
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/175
2019-09-13 22:26:24 +03:00
Aliaksandr Valialkin
ee4585db33
app/vmselect/promql: properly handle subqueries like aggr_func(rollup_func(metric[window:step]))
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/184
2019-09-13 21:42:11 +03:00
Aliaksandr Valialkin
828e5f6d26
app/vmselect/promql: binary operation fixes according to Prometheus behaviour
...
The follosing issues were fixed:
- VictoriaMetrics could leave superflouos labels when using `on` or `ignoring` modifiers
- VictoriaMetrics could return `duplicate timeseries` error when using `group_left` or `group_right` with non-empty label list
2019-09-13 17:43:09 +03:00
Aliaksandr Valialkin
ed50b8792b
app/vminsert/netstorage: reduce the maximum buffer size for rerouted rows, so it occupies less RAM
2019-09-11 14:50:30 +03:00
Aliaksandr Valialkin
b101064f8b
all: report the number of bytes read on io.ReadFull error
...
This should simplify error investigation similar to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/175
2019-09-11 14:50:24 +03:00
Aliaksandr Valialkin
2f4c950fe9
app/vminsert/netstorage: send per-storageNode bufs to vmstorage nodes in parallel
...
This should improve the maximum ingestion throughput.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/175
2019-09-11 14:50:19 +03:00
Aliaksandr Valialkin
694cc59ed1
app/vminsert/netstorage: dynamically adjust timeouts for sending packets from vminsert to vmstorage depending on packet size
...
Bigger packets will have more chances to be sent to vmstorage.
2019-09-11 14:50:14 +03:00
Aliaksandr Valialkin
2c654258ef
lib/fs: add MustStopDirRemover for waiting until pending directories are removed on graceful shutdown
...
This patch is mainly required for laggy NFS. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162
2019-09-05 11:17:17 +03:00
Aliaksandr Valialkin
d0953e9f02
app/vmselect/promql: ignore grouping by destination label in count_values
, since such a grouping is performed automatically
2019-09-04 19:59:02 +03:00
Aliaksandr Valialkin
f78ffe565f
app/vmselect/promql: do not return artificial points beyond the last point in time series
2019-09-04 16:34:29 +03:00
Aliaksandr Valialkin
a7d5d611fe
app/vmselect/prometheus: do not adjust start
and end
args in /api/v1/query_range
if nocache=1
arg is set
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/171
2019-09-04 13:10:17 +03:00
Aliaksandr Valialkin
b08f085082
app/vmselect/promql: reset timeseries name on group_left and group_right as Prometheus does
2019-09-03 20:43:29 +03:00
Aliaksandr Valialkin
458d412bb6
app/vmselect/netstorage: adaptively adjust the maximum inmemory file size for storing temporary blocks
...
The maximum inmemory file size now depends on `-memory.allowedPercent`.
This should improve performance and reduce the number of filesystem calls
on machines with big amounts of RAM when performing heavy queries
over big number of samples and time series.
2019-09-03 13:32:18 +03:00
Aliaksandr Valialkin
604a4312f9
all: port to FreeBSD on GOARCH=amd64
2019-08-28 01:46:09 +03:00
Aliaksandr Valialkin
5893a9f9a3
app/vmstorage: increase default values for search.maxTagKeys, search.maxTagValues and search.maxUniqueTimeseries
2019-08-27 14:28:26 +03:00
Aliaksandr Valialkin
38711526d3
app/vminsert/influx: set db
label only if Influx line doesnt have db
tag
2019-08-24 13:55:01 +03:00
Aliaksandr Valialkin
1ee536f9fd
app/vminsert: skip empty tags
2019-08-24 13:36:41 +03:00
Aliaksandr Valialkin
a283023d16
app/vminsert/opentsdbhttp: skip invalid rows and continue parsing the remaining rows
...
Invalid rows are logged and counted in `vm_rows_invalid_total{type="opentsdb-http"}` metric
2019-08-24 13:36:41 +03:00
Aliaksandr Valialkin
38b9615c53
app/vminsert/opentsdb: skip invalid rows and continue parsing the remaining rows
...
Invalid rows are logged and counted in `vm_rows_invalid_total{type="opentsdb"}` metric
2019-08-24 13:36:41 +03:00
Aliaksandr Valialkin
2a8fc41bab
app/vminsert/graphite: skip invalid rows and continue parsing the remaining rows
...
Invalid rows are logged and counted in `vm_rows_invalid_total{type="graphite"}` metric
2019-08-24 13:36:41 +03:00
Aliaksandr Valialkin
22685ef94d
app/vminsert/influx: skip invalid rows and continue parsing the remaining rows
...
Invalid influx lines are logged and counted in `vm_rows_invalid_total{type="influx"}` metric.
2019-08-24 13:36:41 +03:00
Aliaksandr Valialkin
425a81a6c7
app/vminsert/influx: do not allow escaping newline char, since they dont occur in real life
...
The prefious report with escaped newline chars in influx line protocol was false alarm.
2019-08-23 18:43:00 +03:00
Aliaksandr Valialkin
8da8dd0876
app/vminsert/opentsdbhttp: allow timestamp as float64 and as string, since it occurs in real life
2019-08-23 18:35:52 +03:00
Aliaksandr Valialkin
0ea21eb9dc
app/vminsert/influx: handle \r\n
aka crlf
influx line endings from windows world
...
Such lines exist in real life.
2019-08-23 18:28:54 +03:00
Aliaksandr Valialkin
b3502b2b39
app/vminsert/influx: allow escaping newline char
...
Though newline char isn't mentioned in escape rules at https://docs.influxdata.com/influxdb/v1.7/write_protocols/line_protocol_tutorial/ ,
there are reports that such chars occur in real life
2019-08-23 15:14:58 +03:00
Aliaksandr Valialkin
f1f8fce4f7
app/vminsert/influx: skip comments starting with #
in influx line protocol
2019-08-23 14:43:24 +03:00
Aliaksandr Valialkin
697de90893
app/vminsert: do not drop data in reroutedBuf if all the storage nodes are unhealthy
2019-08-23 10:38:19 +03:00
Aliaksandr Valialkin
a5dc54efc3
app/vminsert: properly limit the size of reroutedBuf
2019-08-23 10:29:51 +03:00
Aliaksandr Valialkin
c197641978
all: return 503 http error if service is temporarily unavailable
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/156
2019-08-23 09:49:50 +03:00
Aliaksandr Valialkin
e734076f0f
app/vminsert: allow setting the maximum number of labels per time series via -maxLabelsPerTimeseries
2019-08-23 08:47:18 +03:00
Aliaksandr Valialkin
e9db22a551
app/vmselect/promql: attempt to repair invalid bucket counts passed to histogram_quantile
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/136
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/154
2019-08-22 14:39:24 +03:00
Aliaksandr Valialkin
0697164b4f
app/vminsert: add ability to ingest data via HTTP OpenTSDB /api/put
requests
...
This is manual merge of the https://github.com/VictoriaMetrics/VictoriaMetrics/pull/152
Thanks to nustinov@gmail.com for the initial pull request.
2019-08-22 12:46:54 +03:00
Aliaksandr Valialkin
4d555c7c87
app/vminsert/opentsdb: fix BenchmarkRowsUnmarshal by adding missing put
prefixes to each line
2019-08-21 19:15:04 +03:00
Aliaksandr Valialkin
90a4b00b10
app/vmselect/promql: fix panic on -search.disableCache
...
Reset the cache if it is disabled instead of stopping, since it is stopped on graceful shutdown.
2019-08-21 17:12:01 +03:00
Aliaksandr Valialkin
491b1762c8
app/vmselect/promql: explain why empty timeseries arent removed in transformLabelValue
2019-08-21 11:29:41 +03:00
Aliaksandr Valialkin
db1de4277c
app/vmselect/promql: remove NaNs from /api/v1/query_range
output like Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/153
2019-08-20 23:01:59 +03:00
Aliaksandr Valialkin
99331606e1
app/vmselect/promql: pre-allocate memory for map for checking for duplicate timeseries
...
This should reduce memory allocations for big number of timeseries
2019-08-20 23:01:57 +03:00
Aliaksandr Valialkin
1101765adb
app/vmselect/promql: add label_value(q, label_name)
func, which returns numeric value labels with name label_name
in q
2019-08-20 00:28:44 +03:00
Aliaksandr Valialkin
940349ccb9
app/vmselect/promql: independently track offset hints for tStart and tEnd
...
This should improve performance if timeseries starts or ends on the selected time range
2019-08-19 13:40:24 +03:00
Aliaksandr Valialkin
6ae4b4190f
app/vmselect/promql: optimize search for timestamp boundaries in rollupConfig.Do
...
This should improve the performance of queries over big number of time series
with big number of output points.
2019-08-19 13:03:38 +03:00
Aliaksandr Valialkin
005aabd305
app/vmselect/promql: add scrape_interval(q[d])
function, which would return scrape interval for q
over d
2019-08-18 21:08:15 +03:00
Aliaksandr Valialkin
218cb4623a
app/vmselect/promql: hande comparisons with NaN
similar to Prometheus
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/150
2019-08-18 00:25:58 +03:00
Aliaksandr Valialkin
dcce92c63c
app/vmselect/promql: add lifetime(q[d])
function, which returns the lifetime of q
over d
in seconds.
...
This function is useful for determining time series lifetime.
`d` must exceed the expected lifetime of the time series, otherwise
the function would return values close to `d`.
2019-08-16 11:59:51 +03:00
Aliaksandr Valialkin
0cb66a8f95
app/vmselect/promql: fix corner-case calculation for ideriv
2019-08-16 11:59:50 +03:00
Aliaksandr Valialkin
1b5b9ced27
app/vmselect/promql: properly handle corner cases for rollup functions
2019-08-15 23:31:28 +03:00
Aliaksandr Valialkin
b8bbe92de1
app/vmselect/promql: store compressed results in the cache
...
This should increase rollup results cache capacity.
2019-08-14 02:32:16 +03:00
Aliaksandr Valialkin
8c2158af24
all: use workingsetcache instead of fastcache
...
This should reduce the amount of RAM required for processing time series
with non-zero churn rate.
The previous cache behavior can be restored with `-cache.oldBehavior` command-line flag.
2019-08-13 21:40:28 +03:00
Aliaksandr Valialkin
f56c1298ad
app/vmstorage: add vm_concurrent_addrows_*
metrics for tracking concurrency for Storage.AddRows calls
...
Track also the number of dropped rows due to the exceeded timeout
on concurrency limit for Storage.AddRows. This number is tracked in `vm_concurrent_addrows_dropped_rows_total`
2019-08-06 15:08:43 +03:00
Aliaksandr Valialkin
8e05758ff5
app: add vm_concurrent_
metrics for visibility in concurrency limiters for vminsert and vmselect
2019-08-05 18:30:29 +03:00
Aliaksandr Valialkin
53c8f56436
app/vmselect: allow passing match[]
, start
and time
to /api/v1/label/<label_name>/values
...
`/api/v1/label/<label_name>/values?match[]=q` emulates emulates `label_values(q, <label_name>)`
call in Grafana templating.
2019-08-04 23:07:00 +03:00
Aliaksandr Valialkin
880b1d80b1
app/vmselect: optimize /api/v1/series
by skipping storage data
...
Fetch and process only time series metainfo.
2019-08-04 23:00:46 +03:00
Aliaksandr Valialkin
7f5afae1e3
app/vmselect/prometheus: prevent from fetching and scanning all the data on /api/v1/searies
call by default
2019-08-04 19:42:45 +03:00
Aliaksandr Valialkin
000c154641
app/vmselect/promql: tune automatic window adjustement
...
Increase the windows adjustement for small scrape intervals,
since they usually have higher jitter.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/139
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/134
2019-08-04 19:34:11 +03:00
Aliaksandr Valialkin
1d4ddadbb1
app/vmselect/promql: further increase the allowed jitter for scrape interval
...
Real-world production data shows higher jitter than 1/8 of scrape interval.
This may results in gaps on the graph. So increase the allowed jitter to 1/4
of scrape interval in order to reduce the probability of gaps on the graphs
over time series with high jitter for scrape_interval.
2019-08-02 20:16:41 +03:00
Aliaksandr Valialkin
8ed84a4713
app/vminsert/influx: round automatically generated timestamp according to the given precision
arg
2019-08-02 00:24:39 +03:00
Aliaksandr Valialkin
ade7bc30db
app/vmselect/promql: tolerate higher jitter in scrape interval
...
Allow jitter for up to 1/8 instead of 1/16 for the scrape interval.
This should imrpove graphs when `step` is smaller than the `scrape_interval`.
2019-08-01 23:25:53 +03:00
Aliaksandr Valialkin
c994fbf500
app/vmselect/promql: add vm_slow_queries_total
metric for counting slow queries
...
The query is slow if its execution time exceeds `-search.logSlowQueryDuration`
2019-07-31 03:36:45 +03:00
Aliaksandr Valialkin
071a122119
app/vmselect/promql: return NaN from histogram_quantile if at least a single bucket is broken
2019-07-31 01:18:11 +03:00
Aliaksandr Valialkin
b9a16b93e7
app/vmselect/promql: allow adjusting window for default rollup function
...
Default rollup function is `last_over_time`. It must support adjusting
the provided window in order to prevent from gaps on the graph
for window values smaller than scrape interval.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/134
2019-07-31 00:45:58 +03:00
Aliaksandr Valialkin
c901a6472f
app/vmselect/promql: return NaN values if invalid bucket counts are passed to histogram_quantile
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/136
2019-07-30 22:05:55 +03:00
Aliaksandr Valialkin
5b8526e925
app/vmselect/netstorage: improve error message when reading data blocks from storage
...
Mention the block number in the error. This should simplify troubleshooting in this code.
2019-07-28 12:17:33 +03:00
Aliaksandr Valialkin
b7089705b7
app/vminsert: add vm_rows_per_insert
summary metric
...
This metric should help tuning batch sizes on clients writing data to VictoriaMetrics
2019-07-27 13:28:20 +03:00
Aliaksandr Valialkin
1fd4e9fb5c
app/vminsert: improve error messages for Influx, OpenTSDB and Graphite parsing
...
Include in the error message the line which failed to parse.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/127
2019-07-26 22:09:21 +03:00
Aliaksandr Valialkin
8253790157
app/vmstorage: consistency renaming for ignored rows
metrics
...
vm_too_big_timestamp_rows_total -> vm_rows_ignored_total{reason="big_timestamp"}
vm_too_small_timestamp_rows_total -> vm_rows_ignored_total{reason="small_timestamp"}
2019-07-26 20:02:24 +03:00
Aliaksandr Valialkin
c6bec48927
lib/storage: add metrics for calculating skipped rows outside the retention
...
The metrics are:
- vm_too_big_timestamp_rows_total
- vm_too_small_timestamp_rows_total
2019-07-26 14:11:56 +03:00
Aliaksandr Valialkin
aac482517f
app/vmselect/promql: return NaN from count()
over zero time series
...
This aligns `count` behavior with Prometheus.
2019-07-25 22:02:34 +03:00
Aliaksandr Valialkin
0e52357f35
app/vmselect/promql: properly calculate incremental aggregations grouped by __name__
...
Previously the following query may fail on multiple distinct metric names match:
sum(count_over_time{__name__!=''}) by (__name__)
2019-07-25 21:53:26 +03:00
Aliaksandr Valialkin
54f035d4ce
all: small updates after PR #114
2019-07-24 17:43:43 +03:00
Aliaksandr Valialkin
cb8104cf77
app: clarify error messages when -storageNode
arg is missing in vminsert and vmselect
2019-07-20 10:21:59 +03:00
Aliaksandr Valialkin
2b4254d01f
app/vminsert: use netutil.TCPListener for collecting network-related metrics for Graphite and OpenTSDB TCP traffic
2019-07-15 22:58:35 +03:00
Aliaksandr Valialkin
092c9b39a8
app/vmselect/promql: remove empty time series after applying filters like q > 0
...
This should reduce CPU and RAM usage for queries over high number of time series.
2019-07-12 19:59:49 +03:00
Aliaksandr Valialkin
6875fb411a
app/vmselect/promql: parallelize incremental aggregation to multiple CPU cores
...
This may reduce response times for aggregation over big number of time series
with small step between output data points.
2019-07-12 15:53:12 +03:00
Aliaksandr Valialkin
4a8e6f47fe
app/vmselect/prometheus: set start
arg in /api/v1/series
to the minimum allowed time by default as Prometheus does
...
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/91
2019-07-11 17:11:37 +03:00
Aliaksandr Valialkin
3313cdf816
app/vmselect/prometheus: convert negative times to 0, since they arent supported by the storage
2019-07-11 17:11:35 +03:00
Aliaksandr Valialkin
cbab86fd9d
app/vmselect/promql: reduce RAM usage for aggregates over big number of time series
...
Calculate incremental aggregates for `aggr(metric_selector)` function instead of
keeping all the time series matching the given `metric_selector` in memory.
2019-07-10 13:03:36 +03:00
Aliaksandr Valialkin
ba8195c58e
all: consistency renaming: bytesSize -> sizeBytes
2019-07-10 00:47:42 +03:00
Aliaksandr Valialkin
df6f17b82c
app/vmselect/promql: mention -search.logSlowQueryDuration
flag value in the slow query log message
2019-07-10 00:43:01 +03:00
Aliaksandr Valialkin
73ae889244
app/vmselect/promql: extract rmoeveGroupTags
function for removing unneeded tags from MetricName according to the given modifierExpr
2019-07-09 23:20:58 +03:00
Aliaksandr Valialkin
603b34edbd
app/vmselect/promql: properly preserve metric name after applying functions in any case from transformFuncsKeepMetricGroup
2019-07-09 23:10:49 +03:00
Aliaksandr Valialkin
d6ec95693d
app/vmselect/prometheus: typo fix
2019-07-07 23:34:04 +03:00
Aliaksandr Valialkin
36636c1f6f
app/vmselect/prometheus: handle minTime
and maxTime
values that may be set by Promxy or Prometheus client
...
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/88
2019-07-07 21:53:52 +03:00
Aliaksandr Valialkin
bba07d05fe
app/vmselect/promql: remove empty timeseries left after topk
call
2019-07-04 19:43:07 +03:00
Aliaksandr Valialkin
41f512af1c
all: add vm_data_size_bytes
metrics for easy monitoring of on-disk data size and on-disk inverted index size
2019-07-04 19:43:04 +03:00
Aliaksandr Valialkin
512a627855
app/vmselect/prometheus: update adjustLastPoints function
...
- Do not overwrite last points by the previous NaNs, since this may result in empty time series.
- Overwrite the last 2 points instead of 3. This should be enough in most cases.
2019-07-04 09:30:56 +03:00
Aliaksandr Valialkin
858746fa6c
app/vmselect/promql: gracefully handle duplicate timestamps in irate
and rollup_rate
funcs
...
Previously such timestamps result in `+Inf` results. Now the previous timestamp is used
for the calculations.
2019-07-03 12:41:30 +03:00
Aliaksandr Valialkin
a3abed80ff
app/vmselect: do not return empty time series in /api/v1/query
result
2019-07-01 17:16:26 +03:00
Aliaksandr Valialkin
c3c60bee45
app/vmselect: add -search.denyPartialResponse
flag for disabling partial responses if some of vmstorage nodes are unavailable
...
Also accept `deny_partial_response` query arg in Prometheus API handlers. If it is set to true,
then return error if some of vmstorage nodes are unavailable.
2019-06-30 01:27:07 +03:00
Aliaksandr Valialkin
72a3050c41
app/vmselect/promql: consistency renaming: candlestick -> rollup_candlestick
2019-06-29 03:13:25 +03:00
Aliaksandr Valialkin
391bc8bf38
app/vmselect: fix 32bit arm build
...
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/83
2019-06-27 19:37:17 +03:00
Aliaksandr Valialkin
96342f1422
app/vmselect: add candlestick(m[d])
func for returning open
, close
, low
and high
rollups on the given time range d
...
This function is frequently used in financial apps. See https://en.wikipedia.org/wiki/Candlestick_chart
2019-06-27 18:46:54 +03:00
Jiri Tyr
e7a0bf1a71
Change the default influxMeasurementFieldSeparator
2019-06-26 13:22:54 +03:00
Aliaksandr Valialkin
d5cb9fddd8
app/vminsert: fix inifinite loop when reading two lines without newline in the end
...
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/82
2019-06-26 02:52:56 +03:00
Aliaksandr Valialkin
4f54bcf90b
app/vmselect/promql: suppress error when template func is used inside modifier list. Just leave it as is
...
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/78
2019-06-25 20:43:57 +03:00
Aliaksandr Valialkin
d0bf4393a9
app/vmselect/promql: increase default value for -search.maxPointsPerTimeSeries from 10k to 30k
...
This may be required for subqueries with small steps. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/77
2019-06-24 22:53:25 +03:00
Aliaksandr Valialkin
334cf253c7
app/vmselect/promql: adjust value returned by linearRegression to the end of time range like Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/71
2019-06-24 22:46:03 +03:00
Aliaksandr Valialkin
14cd628948
app/vmselect/promql: add sum2
and sum2_over_time
, geomean
and geomean_over_time
funcs.
...
These functions may be useful for statistic calculations.
2019-06-24 16:45:00 +03:00
Aliaksandr Valialkin
0eac538fc8
app/vmselect/promql: adjust the provided window only for range functions with dt in denominator
...
This should fix range function calculations such as `changes(m[d])` where `d` is smaller
than the scrape interval.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/72
2019-06-23 19:27:25 +03:00
Aliaksandr Valialkin
ec57e59154
app/vmselect/promql: use deriv_fast
instead of deriv
in ttf
, since deriv
calculations have been changed recently
2019-06-23 15:54:12 +03:00
Aliaksandr Valialkin
516062b162
app/vmselect/promql: adjust ttf
calculation, so deriv(freev)
for freev
=m[d]
could be properly calculated
2019-06-23 14:31:36 +03:00
Aliaksandr Valialkin
a4e040f5ef
app/vmselect/promql: typo fixes in comments
2019-06-21 23:22:54 +03:00
Aliaksandr Valialkin
c05d443791
app/vmselect/promql: add deriv_fast
function for calculating fast derivative
...
`deriv_fast` calculates derivative based on the first and the last point on the interval
instead of calculating linear regression based on all the data points on the interval.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/73
2019-06-21 23:05:48 +03:00
Aliaksandr Valialkin
98eafdbd58
app/vmselect/promql: use linear regression in deriv
func like Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/73
2019-06-21 22:54:34 +03:00
Aliaksandr Valialkin
f334908c22
app/vmselect/promql: ajdust data model to the model used in Prometheus
...
Do not take into account data points on the range `[timestamp .. timestamp+step)`
when calculating value on the given `timestamp`.
Use only data points from the past when performing these calculations like Prometheus does.
This should reduce discrepancies between results returned by VictoriaMetrics
and results returned by Prometheus.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/72
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/71
2019-06-21 21:55:25 +03:00
Aliaksandr Valialkin
837e349b7d
app/vmselect/promql: do not strip __name__
form time series after binary comparison operation
...
Example:
foo > 10
Would leave `foo` name for all the matching time series on the left.
2019-06-21 13:08:02 +03:00
Aliaksandr Valialkin
e84b7641ef
app/vmselect/promql: remove unused func keepLastValue
; updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/69
2019-06-20 14:35:19 +03:00
Aliaksandr Valialkin
db042bf6d6
app/vmselect/promql: typo fix; updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/69
2019-06-20 14:33:52 +03:00
Aliaksandr Valialkin
3838d224d5
app/vminsert/opentsdb: remove unused const maxReadPacketSize
; update https://github.com/VictoriaMetrics/VictoriaMetrics/issues/69
2019-06-20 14:30:02 +03:00
Aliaksandr Valialkin
a3a53647ba
app/vmselect/prometheus: return better error messages on missing args to /api/v1/*
2019-06-20 14:07:44 +03:00
Aliaksandr Valialkin
a0c22a6830
app/vmstorage: add vm_cache_entries{type="storage/hour_metric_ids"}
metric for tracking active time series count
2019-06-19 18:37:38 +03:00
Aliaksandr Valialkin
d4ed6189d4
app/vminsert/graphite: allow skipping timestamps in Graphite plaintext protocol
...
In this case VictoriaMetrics uses the ingestion time as a timestamp.
2019-06-18 19:05:46 +03:00
Aliaksandr Valialkin
e40224d5de
lib/flagutil: add NewArray helper func
2019-06-18 10:44:09 +03:00
Aliaksandr Valialkin
3b16d49514
app/vminsert/influx: add -influxSkipSingleField
flag for using {measurement}
instead of {measurement}{separator}{field_name}
for Influx lines with a single field
...
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/66
2019-06-17 19:05:46 +03:00
Aliaksandr Valialkin
5f0b3589b2
app/vminsert/influx: add -influxMeasurementFieldSeparator flag for the ability to change separator for {measurement}{separator}{field_name}
metric name
2019-06-14 09:57:13 +03:00
Aliaksandr Valialkin
947bc16f8c
app/vmselect/promql: use dynamic limit on memory for concurrent queries
2019-06-12 23:18:23 +03:00
Aliaksandr Valialkin
8567e3463d
app/vmselect/promql: merge non-overlapping duplicate time series in group_left
and group_right
joins
2019-06-12 20:33:01 +03:00
Aliaksandr Valialkin
88005237f4
app/vmselect/promql: swap binary operation with modifier in the error message for improved readability
2019-06-12 17:14:33 +03:00
Aliaksandr Valialkin
a71381ad2a
app/vmselect/promql: list a sample of duplicate time series in the error message for group_left
or group_right
...
This should improve troubleshooting for complex queries involving `group_left` and `group_right` modifiers.
2019-06-12 16:57:34 +03:00
Aliaksandr Valialkin
18d6f293f7
lib/fs: consolidate *RemoveAll* funcs into a single MustRemoveAll func
...
The func syncs parent dir in order to persist directory removal
in the event of power loss
2019-06-12 01:55:18 +03:00
Aliaksandr Valialkin
3437c30180
all: try hard removing directory with contents
...
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/61
2019-06-11 01:58:08 +03:00
Aliaksandr Valialkin
eea7da8e0c
app/vmselect/promql: prevent from count_values
explosion of timeseries, which could result in OOM
2019-06-11 01:03:18 +03:00
Aliaksandr Valialkin
e87a602209
app/vmselect/promql: skip superflouos timestamps copying in count_values
2019-06-11 00:44:09 +03:00
Aliaksandr Valialkin
ec84febc1c
app/vmselect/promql: remove superflouos timeseries copy in histogram_quantile
func
2019-06-11 00:39:35 +03:00
Aliaksandr Valialkin
1fab34fb5c
app/vmselect/promql: remove superflouos timeseries copy in union
func
2019-06-11 00:35:09 +03:00
Aliaksandr Valialkin
a6f368499d
app/vmselect/promql: skip NaN values in count_values
func
2019-06-10 22:42:41 +03:00
Aliaksandr Valialkin
945894e049
app/vmselect: properly handle empty label (aka __name__) in LabelEntries handler
2019-06-10 19:55:02 +03:00
Aliaksandr Valialkin
75a0acf72d
app/vmselect: add /api/v1/labels/count
handler for quick detection of labels with the maximum number of distinct values
2019-06-10 19:54:55 +03:00
Aliaksandr Valialkin
547bcdce63
app/vmstorage: enable compression of responses to vmselect by default
...
This should save vmstorage => vmselect network bandwidth in common case
when recently added data is queried.
2019-06-10 14:54:59 +03:00
Aliaksandr Valialkin
d54f5fec0b
lib/storage: skip adaptive searching for tag filter matching the minimum number of metrics if the identical previous search didn't found such filter
...
This should improve speed for searching metrics among high number of time series
with high churn rate like in big Kubernetes clusters with frequent deployments.
2019-06-10 14:07:47 +03:00
Aliaksandr Valialkin
4c3913290a
app/vmstorage: add missing _total
suffixes to newly added metrics
2019-06-09 22:11:41 +03:00
Aliaksandr Valialkin
d882afa905
lib/storage: optimize time series lookup for recent hours when the db contains many millions of time series with high churn rate (aka frequent deployments in Kubernetes)
2019-06-09 19:14:04 +03:00
Aliaksandr Valialkin
5fcdb4a59a
app/vminsert: improve handling of unhealthy vmstorage nodes
...
* Spread load evenly among remaining healthy nodes instead of hammering
the next node after the unhealthy node.
* Make sure that the packet is flushed to storage node before returning success.
Previously packets could stay in local buffers and thus lost on connection errors.
* Keep rows in the limited memory when all the storage nodes are unhealthy.
2019-06-09 00:42:36 +03:00
Aliaksandr Valialkin
0f64673327
app/vminsert/concurrencylimiter: typo fix in the error message
2019-06-08 22:43:56 +03:00
Aliaksandr Valialkin
89a113cb5d
app/vminsert: really fix #60
...
ReadLinesBlock may accept dstBuf with non-zero length. In this case the last line without trailing newline isn't read.
Fix this by comparing len(dstBuf) to 0 instead of its original length.
2019-06-07 23:40:10 +03:00
Aliaksandr Valialkin
e1c45b314a
app/vminsert: properly read trailing line without newline in the end
...
This fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/60
2019-06-07 23:18:34 +03:00
Aliaksandr Valialkin
8cf0a0e59c
app/vminsert: split vm_rows_inserted_total into per-(accountID, projectID) metrics
...
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/59
2019-06-07 22:11:20 +03:00
Aliaksandr Valialkin
913f888d0c
app/vmselect/promql: properly handle {__name__ op "string"}
queries
...
This has been broken in 7294ef333ad26f4f6578b783e97649e58b1f8945 .
2019-06-07 02:02:09 +03:00
Aliaksandr Valialkin
11979e4d85
app/vmselect/prometheus: report about incorrect time or duration instead of silently using the default value
...
This should prevent from incorrect usage of the querying API.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/52
2019-06-06 22:17:15 +03:00
Aliaksandr Valialkin
5f2aa4539a
app/vminsert: add multi-tenancy support for OpenTSDB and Graphite ingestion via custom tags
...
* VictoriaMetrics_AccountID tag may be used for setting AccountID
* VictoriaMetrics_ProjectID tag may be used for setting ProjectID
2019-06-06 18:07:30 +03:00
Aliaksandr Valialkin
8f4790625d
app/vmselect/promql: return the correct time series from quantile
...
Previously arbitrary time series could be returned from `quantile`
depending on sort order for the last data point in the selected range.
Fix this by returning the calculated time series.
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/55
2019-06-06 17:33:53 +03:00
Aliaksandr Valialkin
2ff0d595b0
app/vmselect/promql: add -search.disableCache
flag for disabling response caching
...
This may be useful for data back-filling, when the response caching
could interfere badly with newly added data points with timestamps
in the past.
2019-06-04 17:30:41 +03:00
Aliaksandr Valialkin
ba58af9d8c
app/vminsert/influx: take into account all the tags for consistent hash calculations
2019-06-03 22:54:21 +03:00
Aliaksandr Valialkin
db21d46417
app/vminsert: emulate influx/query request, which is required for TSBS benchmark
2019-06-03 18:39:46 +03:00
Aliaksandr Valialkin
31d6566aff
app/vminsert: accept data on /insert/<accountID>/prometheus/api/v1/write
2019-06-03 18:18:09 +03:00
Aliaksandr Valialkin
a06b7f7f84
app/vmselect/netstorage: remove spammy error message when certain vmstorage nodes are unavailable during query execution
...
The amount of partial responses may be tracked by `vm_partial_search_results_total` metric.
2019-06-03 17:09:50 +03:00
Aliaksandr Valialkin
53242105fb
app/vmselect/promql: allow escaping identifiers with \
and \xXX
...
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/42
2019-05-31 17:35:54 +03:00
Aliaksandr Valialkin
ee776ca8fc
app/vminsert: add -maxConcurrentInserts
command-line flag for limiting the number of concurrent inserts
2019-05-29 12:40:22 +03:00
Aliaksandr Valialkin
a4ec139a4a
app/vminsert: reduce memory usage for Influx, Graphite and OpenTSDB protocols
...
Do not buffer per-connection data and just store it as it arrives
2019-05-28 18:47:52 +03:00
Aliaksandr Valialkin
a6d02ff275
lib/timerpool: use timer pool in concurrency limiters
...
This should reduce the number of memory allocations in highly loaded system
2019-05-28 17:30:10 +03:00
Aliaksandr Valialkin
b7a91d6ba7
app/vmselect: update comment according to the updated code
2019-05-26 22:39:09 +03:00
Aliaksandr Valialkin
15d1e15ae6
app/vminsert/influx: try converting string values to numeric values, since Influx agents may send numeric values as strings
...
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/34
2019-05-26 22:12:55 +03:00
Aliaksandr Valialkin
a2c71f18a3
app/vmselect/promql: misspeling fix
2019-05-25 21:53:48 +03:00
Aliaksandr Valialkin
bdf696ef18
all: fix misspellings
2019-05-25 21:51:24 +03:00
Aliaksandr Valialkin
121a920a18
Makefile: add -s flag to go fmt
in make fmt
command
2019-05-25 21:44:36 +03:00
Aliaksandr Valialkin
2ff996e276
app/vmselect: log slow queries if their execution time exceeds -search.logSlowQueryDuration
2019-05-24 16:14:46 +03:00
Aliaksandr Valialkin
628708ad76
app/vmselect: consume resultsCh data in exportHandler if writeResponseFunc failed to consume it
2019-05-24 14:54:54 +03:00
Aliaksandr Valialkin
364f4ec3bb
all: remove -p XXXX:XXXX
from docker run
options, since it is unnesessary if --net=host
is set
2019-05-24 12:53:12 +03:00
Aliaksandr Valialkin
f37903adb3
app/vminsert: add -rpc.disableCompression
command-line flag for reducing CPU usage at the cost of higher network bandwidth usage
2019-05-24 12:51:07 +03:00
Aliaksandr Valialkin
8e3eb5b39d
app/vmselect/promql: add alias(q, name)
function that sets the given name to all the time series in q
2019-05-24 02:42:10 +03:00
Aliaksandr Valialkin
bb048937bc
app/vmselect/promql: add label_transform(q, label, regexp, replacement)
function for replacing all the occurences of regexp with replacement in the given label for q
2019-05-23 16:26:07 +03:00
Aliaksandr Valialkin
24578b4bb1
all: open-sourcing cluster version
2019-05-23 00:25:38 +03:00
Aliaksandr Valialkin
1836c415e6
all: open-sourcing single-node version
2019-05-23 00:18:06 +03:00