3105 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
c994fbf500 app/vmselect/promql: add vm_slow_queries_total metric for counting slow queries
The query is slow if its execution time exceeds `-search.logSlowQueryDuration`
2019-07-31 03:36:45 +03:00
Aliaksandr Valialkin
071a122119 app/vmselect/promql: return NaN from histogram_quantile if at least a single bucket is broken 2019-07-31 01:18:11 +03:00
Aliaksandr Valialkin
b9a16b93e7 app/vmselect/promql: allow adjusting window for default rollup function
Default rollup function is `last_over_time`. It must support adjusting
the provided window in order to prevent gaps on the graph
for window values smaller than scrape interval.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/134
2019-07-31 00:45:58 +03:00
Aliaksandr Valialkin
c901a6472f app/vmselect/promql: return NaN values if invalid bucket counts are passed to histogram_quantile
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/136
2019-07-30 22:05:55 +03:00
Aliaksandr Valialkin
5b8526e925 app/vmselect/netstorage: improve error message when reading data blocks from storage
Mention the block number in the error. This should simplify troubleshooting in this code.
2019-07-28 12:17:33 +03:00
Aliaksandr Valialkin
b7089705b7 app/vminsert: add vm_rows_per_insert summary metric
This metric should help tune batch sizes on clients writing data to VictoriaMetrics
2019-07-27 13:28:20 +03:00
Aliaksandr Valialkin
1fd4e9fb5c app/vminsert: improve error messages for Influx, OpenTSDB and Graphite parsing
Include in the error message the line which failed to parse.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/127
2019-07-26 22:09:21 +03:00
Aliaksandr Valialkin
8253790157 app/vmstorage: consistency renaming for ignored rows metrics
vm_too_big_timestamp_rows_total -> vm_rows_ignored_total{reason="big_timestamp"}
vm_too_small_timestamp_rows_total -> vm_rows_ignored_total{reason="small_timestamp"}
2019-07-26 20:02:24 +03:00
Aliaksandr Valialkin
c6bec48927 lib/storage: add metrics for calculating skipped rows outside the retention
The metrics are:

    - vm_too_big_timestamp_rows_total
    - vm_too_small_timestamp_rows_total
2019-07-26 14:11:56 +03:00
Aliaksandr Valialkin
aac482517f app/vmselect/promql: return NaN from count() over zero time series
This aligns `count` behavior with Prometheus.
2019-07-25 22:02:34 +03:00
Aliaksandr Valialkin
0e52357f35 app/vmselect/promql: properly calculate incremental aggregations grouped by __name__
Previously the following query could fail when it matched multiple distinct metric names:

    sum(count_over_time{__name__!=''}) by (__name__)
2019-07-25 21:53:26 +03:00
Aliaksandr Valialkin
54f035d4ce all: small updates after PR #114 2019-07-24 17:43:43 +03:00
Aliaksandr Valialkin
cb8104cf77 app: clarify error messages when -storageNode arg is missing in vminsert and vmselect 2019-07-20 10:21:59 +03:00
Aliaksandr Valialkin
2b4254d01f app/vminsert: use netutil.TCPListener for collecting network-related metrics for Graphite and OpenTSDB TCP traffic 2019-07-15 22:58:35 +03:00
Aliaksandr Valialkin
092c9b39a8 app/vmselect/promql: remove empty time series after applying filters like q > 0
This should reduce CPU and RAM usage for queries over a high number of time series.
2019-07-12 19:59:49 +03:00
Aliaksandr Valialkin
6875fb411a app/vmselect/promql: parallelize incremental aggregation to multiple CPU cores
This may reduce response times for aggregation over big number of time series
with small step between output data points.
2019-07-12 15:53:12 +03:00
Aliaksandr Valialkin
4a8e6f47fe app/vmselect/prometheus: set start arg in /api/v1/series to the minimum allowed time by default as Prometheus does
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/91
2019-07-11 17:11:37 +03:00
Aliaksandr Valialkin
3313cdf816 app/vmselect/prometheus: convert negative times to 0, since they aren't supported by the storage 2019-07-11 17:11:35 +03:00
Aliaksandr Valialkin
cbab86fd9d app/vmselect/promql: reduce RAM usage for aggregates over big number of time series
Calculate incremental aggregates for `aggr(metric_selector)` function instead of
keeping all the time series matching the given `metric_selector` in memory.
2019-07-10 13:03:36 +03:00
Aliaksandr Valialkin
ba8195c58e all: consistency renaming: bytesSize -> sizeBytes 2019-07-10 00:47:42 +03:00
Aliaksandr Valialkin
df6f17b82c app/vmselect/promql: mention -search.logSlowQueryDuration flag value in the slow query log message 2019-07-10 00:43:01 +03:00
Aliaksandr Valialkin
73ae889244 app/vmselect/promql: extract removeGroupTags function for removing unneeded tags from MetricName according to the given modifierExpr 2019-07-09 23:20:58 +03:00
Aliaksandr Valialkin
603b34edbd app/vmselect/promql: properly preserve metric name after applying functions in any case from transformFuncsKeepMetricGroup 2019-07-09 23:10:49 +03:00
Aliaksandr Valialkin
d6ec95693d app/vmselect/prometheus: typo fix 2019-07-07 23:34:04 +03:00
Aliaksandr Valialkin
36636c1f6f app/vmselect/prometheus: handle minTime and maxTime values that may be set by Promxy or Prometheus client
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/88
2019-07-07 21:53:52 +03:00
Aliaksandr Valialkin
bba07d05fe app/vmselect/promql: remove empty timeseries left after topk call 2019-07-04 19:43:07 +03:00
Aliaksandr Valialkin
41f512af1c all: add vm_data_size_bytes metrics for easy monitoring of on-disk data size and on-disk inverted index size 2019-07-04 19:43:04 +03:00
Aliaksandr Valialkin
512a627855 app/vmselect/prometheus: update adjustLastPoints function
- Do not overwrite last points by the previous NaNs, since this may result in empty time series.
- Overwrite the last 2 points instead of 3. This should be enough in most cases.
2019-07-04 09:30:56 +03:00
Aliaksandr Valialkin
858746fa6c app/vmselect/promql: gracefully handle duplicate timestamps in irate and rollup_rate funcs
Previously such timestamps resulted in `+Inf` values. Now the previous timestamp is used
for the calculations.
2019-07-03 12:41:30 +03:00
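The duplicate-timestamp fix described in the commit above can be illustrated with a minimal Go sketch. It is not the code from the commit: the `Sample` type and the `irate` helper are hypothetical, and the sketch assumes that "the previous timestamp is used" means falling back to the nearest earlier sample whose timestamp differs from the last one.

```go
package main

import "math"

// Sample is a hypothetical raw data point; TimestampMs is in milliseconds.
type Sample struct {
	TimestampMs int64
	Value       float64
}

// irate returns the per-second rate between the last sample and the nearest
// earlier sample that has a different timestamp. Dividing by a zero time
// delta, which is what yields +Inf, is therefore never attempted.
// It returns NaN if no such pair of samples exists.
func irate(samples []Sample) float64 {
	n := len(samples)
	if n < 2 {
		return math.NaN()
	}
	last := samples[n-1]
	for i := n - 2; i >= 0; i-- {
		prev := samples[i]
		dt := last.TimestampMs - prev.TimestampMs
		if dt > 0 {
			return (last.Value - prev.Value) / (float64(dt) / 1000)
		}
	}
	return math.NaN()
}
```

With samples at 1000 ms, 2000 ms and 2000 ms, the sketch skips the duplicate and computes the rate against the sample at 1000 ms.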
Aliaksandr Valialkin
a3abed80ff app/vmselect: do not return empty time series in /api/v1/query result 2019-07-01 17:16:26 +03:00
Aliaksandr Valialkin
c3c60bee45 app/vmselect: add -search.denyPartialResponse flag for disabling partial responses if some of vmstorage nodes are unavailable
Also accept `deny_partial_response` query arg in Prometheus API handlers. If it is set to true,
then return error if some of vmstorage nodes are unavailable.
2019-06-30 01:27:07 +03:00
Aliaksandr Valialkin
72a3050c41 app/vmselect/promql: consistency renaming: candlestick -> rollup_candlestick 2019-06-29 03:13:25 +03:00
Aliaksandr Valialkin
391bc8bf38 app/vmselect: fix 32bit arm build
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/83
2019-06-27 19:37:17 +03:00
Aliaksandr Valialkin
96342f1422 app/vmselect: add candlestick(m[d]) func for returning open, close, low and high rollups on the given time range d
This function is frequently used in financial apps. See https://en.wikipedia.org/wiki/Candlestick_chart
2019-06-27 18:46:54 +03:00
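As a rough illustration of the rollup described in the commit above, the following Go sketch computes open, close, low and high over the values that fall into one time range. The `Candle` type and the `candlestick` helper are hypothetical names chosen for this example and do not come from the repository.

```go
package main

// Candle holds the four rollups for a single time range.
type Candle struct {
	Open, Close, Low, High float64
}

// candlestick computes the rollups over values ordered by time.
// The second return value is false when there are no values in the range.
func candlestick(values []float64) (Candle, bool) {
	if len(values) == 0 {
		return Candle{}, false
	}
	c := Candle{
		Open:  values[0],
		Close: values[len(values)-1],
		Low:   values[0],
		High:  values[0],
	}
	for _, v := range values[1:] {
		if v < c.Low {
			c.Low = v
		}
		if v > c.High {
			c.High = v
		}
	}
	return c, true
}
```

For the values 3, 1, 4, 2 the sketch yields open 3, close 2, low 1 and high 4.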
Jiri Tyr
e7a0bf1a71 Change the default influxMeasurementFieldSeparator 2019-06-26 13:22:54 +03:00
Aliaksandr Valialkin
d5cb9fddd8 app/vminsert: fix infinite loop when reading two lines without newline in the end
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/82
2019-06-26 02:52:56 +03:00
Aliaksandr Valialkin
4f54bcf90b app/vmselect/promql: suppress error when template func is used inside modifier list. Just leave it as is
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/78
2019-06-25 20:43:57 +03:00
Aliaksandr Valialkin
d0bf4393a9 app/vmselect/promql: increase default value for -search.maxPointsPerTimeSeries from 10k to 30k
This may be required for subqueries with small steps. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/77
2019-06-24 22:53:25 +03:00
Aliaksandr Valialkin
334cf253c7 app/vmselect/promql: adjust value returned by linearRegression to the end of time range like Prometheus does
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/71
2019-06-24 22:46:03 +03:00
Aliaksandr Valialkin
14cd628948 app/vmselect/promql: add sum2 and sum2_over_time, geomean and geomean_over_time funcs.
These functions may be useful for statistic calculations.
2019-06-24 16:45:00 +03:00
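The statistics named in the commit above can be sketched in Go as follows. The sketch assumes that `sum2` denotes the sum of squares and `geomean` the geometric mean. The helpers operate on a plain slice of values and are illustrative only, not the functions added by the commit.

```go
package main

import "math"

// sum2 returns the sum of squares of the given values.
func sum2(values []float64) float64 {
	var s float64
	for _, v := range values {
		s += v * v
	}
	return s
}

// geomean returns the geometric mean of the given values.
// It averages logarithms instead of multiplying the values directly,
// which avoids overflow on long inputs. The result is NaN for empty
// input and for inputs containing negative values.
func geomean(values []float64) float64 {
	if len(values) == 0 {
		return math.NaN()
	}
	var logSum float64
	for _, v := range values {
		logSum += math.Log(v)
	}
	return math.Exp(logSum / float64(len(values)))
}
```

For example, `sum2` of 1, 2, 3 is 14, and `geomean` of 2 and 8 is approximately 4.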
Aliaksandr Valialkin
0eac538fc8 app/vmselect/promql: adjust the provided window only for range functions with dt in denominator
This should fix range function calculations such as `changes(m[d])` where `d` is smaller
than the scrape interval.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/72
2019-06-23 19:27:25 +03:00
Aliaksandr Valialkin
ec57e59154 app/vmselect/promql: use deriv_fast instead of deriv in ttf, since deriv calculations have been changed recently 2019-06-23 15:54:12 +03:00
Aliaksandr Valialkin
516062b162 app/vmselect/promql: adjust ttf calculation, so deriv(freev) for freev=m[d] could be properly calculated 2019-06-23 14:31:36 +03:00
Aliaksandr Valialkin
a4e040f5ef app/vmselect/promql: typo fixes in comments 2019-06-21 23:22:54 +03:00
Aliaksandr Valialkin
c05d443791 app/vmselect/promql: add deriv_fast function for calculating fast derivative
`deriv_fast` calculates derivative based on the first and the last point on the interval
instead of calculating linear regression based on all the data points on the interval.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/73
2019-06-21 23:05:48 +03:00
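The difference between the two derivative estimates described in the commit above can be sketched in Go. The `Point` type and both helpers are hypothetical and written for illustration under the assumption that timestamps are expressed in seconds. They are not the functions from the repository.

```go
package main

import "math"

// Point is a hypothetical (timestamp, value) sample; T is in seconds.
type Point struct {
	T float64
	V float64
}

// derivFast estimates the derivative from the first and the last point only.
// It returns NaN for fewer than two points or a zero time span.
func derivFast(pts []Point) float64 {
	if len(pts) < 2 {
		return math.NaN()
	}
	first, last := pts[0], pts[len(pts)-1]
	dt := last.T - first.T
	if dt == 0 {
		return math.NaN()
	}
	return (last.V - first.V) / dt
}

// derivLinReg estimates the derivative as the slope of the least-squares
// line fitted through all the points on the interval.
func derivLinReg(pts []Point) float64 {
	if len(pts) < 2 {
		return math.NaN()
	}
	n := float64(len(pts))
	var sumT, sumV, sumTV, sumTT float64
	for _, p := range pts {
		sumT += p.T
		sumV += p.V
		sumTV += p.T * p.V
		sumTT += p.T * p.T
	}
	den := n*sumTT - sumT*sumT
	if den == 0 {
		return math.NaN()
	}
	return (n*sumTV - sumT*sumV) / den
}
```

For the points (0, 0), (1, 5), (2, 5), (3, 3), `derivFast` gives 1 while `derivLinReg` gives about 0.9, since the regression takes the middle points into account.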
Aliaksandr Valialkin
98eafdbd58 app/vmselect/promql: use linear regression in deriv func like Prometheus does
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/73
2019-06-21 22:54:34 +03:00
Aliaksandr Valialkin
f334908c22 app/vmselect/promql: adjust data model to the model used in Prometheus
Do not take into account data points on the range `[timestamp .. timestamp+step)`
when calculating value on the given `timestamp`.
Use only data points from the past when performing these calculations like Prometheus does.

This should reduce discrepancies between results returned by VictoriaMetrics
and results returned by Prometheus.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/72
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/71
2019-06-21 21:55:25 +03:00
Aliaksandr Valialkin
837e349b7d app/vmselect/promql: do not strip __name__ from time series after binary comparison operation
Example:

  foo > 10

Would leave `foo` name for all the matching time series on the left.
2019-06-21 13:08:02 +03:00
Aliaksandr Valialkin
e84b7641ef app/vmselect/promql: remove unused func keepLastValue; updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/69 2019-06-20 14:35:19 +03:00
Aliaksandr Valialkin
db042bf6d6 app/vmselect/promql: typo fix; updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/69 2019-06-20 14:33:52 +03:00
Aliaksandr Valialkin
3838d224d5 app/vminsert/opentsdb: remove unused const maxReadPacketSize; update https://github.com/VictoriaMetrics/VictoriaMetrics/issues/69 2019-06-20 14:30:02 +03:00
Aliaksandr Valialkin
a3a53647ba app/vmselect/prometheus: return better error messages on missing args to /api/v1/* 2019-06-20 14:07:44 +03:00
Aliaksandr Valialkin
a0c22a6830 app/vmstorage: add vm_cache_entries{type="storage/hour_metric_ids"} metric for tracking active time series count 2019-06-19 18:37:38 +03:00
Aliaksandr Valialkin
d4ed6189d4 app/vminsert/graphite: allow skipping timestamps in Graphite plaintext protocol
In this case VictoriaMetrics uses the ingestion time as a timestamp.
2019-06-18 19:05:46 +03:00
Aliaksandr Valialkin
e40224d5de lib/flagutil: add NewArray helper func 2019-06-18 10:44:09 +03:00
Aliaksandr Valialkin
3b16d49514 app/vminsert/influx: add -influxSkipSingleField flag for using {measurement} instead of {measurement}{separator}{field_name} for Influx lines with a single field
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/66
2019-06-17 19:05:46 +03:00
Aliaksandr Valialkin
5f0b3589b2 app/vminsert/influx: add -influxMeasurementFieldSeparator flag for the ability to change separator for {measurement}{separator}{field_name} metric name 2019-06-14 09:57:13 +03:00
Aliaksandr Valialkin
947bc16f8c app/vmselect/promql: use dynamic limit on memory for concurrent queries 2019-06-12 23:18:23 +03:00
Aliaksandr Valialkin
8567e3463d app/vmselect/promql: merge non-overlapping duplicate time series in group_left and group_right joins 2019-06-12 20:33:01 +03:00
Aliaksandr Valialkin
88005237f4 app/vmselect/promql: swap binary operation with modifier in the error message for improved readability 2019-06-12 17:14:33 +03:00
Aliaksandr Valialkin
a71381ad2a app/vmselect/promql: list a sample of duplicate time series in the error message for group_left or group_right
This should improve troubleshooting for complex queries involving `group_left` and `group_right` modifiers.
2019-06-12 16:57:34 +03:00
Aliaksandr Valialkin
18d6f293f7 lib/fs: consolidate *RemoveAll* funcs into a single MustRemoveAll func
The func syncs parent dir in order to persist directory removal
in the event of power loss
2019-06-12 01:55:18 +03:00
Aliaksandr Valialkin
3437c30180 all: try hard removing directory with contents
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/61
2019-06-11 01:58:08 +03:00
Aliaksandr Valialkin
eea7da8e0c app/vmselect/promql: prevent count_values from producing an explosion of time series, which could result in OOM 2019-06-11 01:03:18 +03:00
Aliaksandr Valialkin
e87a602209 app/vmselect/promql: skip superfluous timestamps copying in count_values 2019-06-11 00:44:09 +03:00
Aliaksandr Valialkin
ec84febc1c app/vmselect/promql: remove superfluous timeseries copy in histogram_quantile func 2019-06-11 00:39:35 +03:00
Aliaksandr Valialkin
1fab34fb5c app/vmselect/promql: remove superfluous timeseries copy in union func 2019-06-11 00:35:09 +03:00
Aliaksandr Valialkin
a6f368499d app/vmselect/promql: skip NaN values in count_values func 2019-06-10 22:42:41 +03:00
Aliaksandr Valialkin
945894e049 app/vmselect: properly handle empty label (aka __name__) in LabelEntries handler 2019-06-10 19:55:02 +03:00
Aliaksandr Valialkin
75a0acf72d app/vmselect: add /api/v1/labels/count handler for quick detection of labels with the maximum number of distinct values 2019-06-10 19:54:55 +03:00
Aliaksandr Valialkin
547bcdce63 app/vmstorage: enable compression of responses to vmselect by default
This should save vmstorage => vmselect network bandwidth in common case
when recently added data is queried.
2019-06-10 14:54:59 +03:00
Aliaksandr Valialkin
d54f5fec0b lib/storage: skip adaptive searching for tag filter matching the minimum number of metrics if the identical previous search didn't find such filter
This should improve speed for searching metrics among high number of time series
with high churn rate like in big Kubernetes clusters with frequent deployments.
2019-06-10 14:07:47 +03:00
Aliaksandr Valialkin
4c3913290a app/vmstorage: add missing _total suffixes to newly added metrics 2019-06-09 22:11:41 +03:00
Aliaksandr Valialkin
d882afa905 lib/storage: optimize time series lookup for recent hours when the db contains many millions of time series with high churn rate (aka frequent deployments in Kubernetes) 2019-06-09 19:14:04 +03:00
Aliaksandr Valialkin
5fcdb4a59a app/vminsert: improve handling of unhealthy vmstorage nodes
* Spread load evenly among remaining healthy nodes instead of hammering
  the next node after the unhealthy node.
* Make sure that the packet is flushed to storage node before returning success.
  Previously packets could stay in local buffers and thus be lost on connection errors.
* Keep rows in the limited memory when all the storage nodes are unhealthy.
2019-06-09 00:42:36 +03:00
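The first point of the commit above, spreading load among the remaining healthy nodes, can be sketched in Go as follows. This is one possible approach under stated assumptions, not the routing code from the repository. The `pickNode` helper, the `healthy` slice and the numeric `key` (standing in for a hash of the row) are all hypothetical.

```go
package main

// pickNode chooses a storage node index for the given key.
// healthy[i] reports whether node i is currently healthy.
// When the preferred node is unhealthy, the key is re-mapped across all
// the healthy nodes instead of being sent to the next node in order,
// so the extra load is shared rather than concentrated on one neighbor.
// It returns -1 when no healthy node is available.
func pickNode(healthy []bool, key uint64) int {
	if len(healthy) == 0 {
		return -1
	}
	preferred := int(key % uint64(len(healthy)))
	if healthy[preferred] {
		return preferred
	}
	// Collect the indexes of the healthy nodes.
	alive := make([]int, 0, len(healthy))
	for i, ok := range healthy {
		if ok {
			alive = append(alive, i)
		}
	}
	if len(alive) == 0 {
		return -1
	}
	return alive[key%uint64(len(alive))]
}
```

With four nodes where node 1 is unhealthy, keys 1, 5 and 9 (which all prefer node 1) are re-mapped to nodes 2, 3 and 0 respectively, instead of all landing on node 2.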
Aliaksandr Valialkin
0f64673327 app/vminsert/concurrencylimiter: typo fix in the error message 2019-06-08 22:43:56 +03:00
Aliaksandr Valialkin
89a113cb5d app/vminsert: really fix #60
ReadLinesBlock may accept dstBuf with non-zero length. In this case the last line without trailing newline isn't read.
Fix this by comparing len(dstBuf) to 0 instead of its original length.
2019-06-07 23:40:10 +03:00
Aliaksandr Valialkin
e1c45b314a app/vminsert: properly read trailing line without newline in the end
This fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/60
2019-06-07 23:18:34 +03:00
Aliaksandr Valialkin
8cf0a0e59c app/vminsert: split vm_rows_inserted_total into per-(accountID, projectID) metrics
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/59
2019-06-07 22:11:20 +03:00
Aliaksandr Valialkin
913f888d0c app/vmselect/promql: properly handle {__name__ op "string"} queries
This has been broken in 7294ef333ad26f4f6578b783e97649e58b1f8945 .
2019-06-07 02:02:09 +03:00
Aliaksandr Valialkin
11979e4d85 app/vmselect/prometheus: report about incorrect time or duration instead of silently using the default value
This should prevent incorrect usage of the querying API.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/52
2019-06-06 22:17:15 +03:00
Aliaksandr Valialkin
5f2aa4539a app/vminsert: add multi-tenancy support for OpenTSDB and Graphite ingestion via custom tags
* VictoriaMetrics_AccountID tag may be used for setting AccountID
* VictoriaMetrics_ProjectID tag may be used for setting ProjectID
2019-06-06 18:07:30 +03:00
Aliaksandr Valialkin
8f4790625d app/vmselect/promql: return the correct time series from quantile
Previously arbitrary time series could be returned from `quantile`
depending on sort order for the last data point in the selected range.

Fix this by returning the calculated time series.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/55
2019-06-06 17:33:53 +03:00
Aliaksandr Valialkin
2ff0d595b0 app/vmselect/promql: add -search.disableCache flag for disabling response caching
This may be useful for data back-filling, when the response caching
could interfere badly with newly added data points with timestamps
in the past.
2019-06-04 17:30:41 +03:00
Aliaksandr Valialkin
ba58af9d8c app/vminsert/influx: take into account all the tags for consistent hash calculations 2019-06-03 22:54:21 +03:00
Aliaksandr Valialkin
db21d46417 app/vminsert: emulate influx/query request, which is required for TSBS benchmark 2019-06-03 18:39:46 +03:00
Aliaksandr Valialkin
31d6566aff app/vminsert: accept data on /insert/<accountID>/prometheus/api/v1/write 2019-06-03 18:18:09 +03:00
Aliaksandr Valialkin
a06b7f7f84 app/vmselect/netstorage: remove spammy error message when certain vmstorage nodes are unavailable during query execution
The amount of partial responses may be tracked by `vm_partial_search_results_total` metric.
2019-06-03 17:09:50 +03:00
Aliaksandr Valialkin
53242105fb app/vmselect/promql: allow escaping identifiers with \ and \xXX
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/42
2019-05-31 17:35:54 +03:00
Aliaksandr Valialkin
ee776ca8fc app/vminsert: add -maxConcurrentInserts command-line flag for limiting the number of concurrent inserts 2019-05-29 12:40:22 +03:00
Aliaksandr Valialkin
a4ec139a4a app/vminsert: reduce memory usage for Influx, Graphite and OpenTSDB protocols
Do not buffer per-connection data and just store it as it arrives
2019-05-28 18:47:52 +03:00
Aliaksandr Valialkin
a6d02ff275 lib/timerpool: use timer pool in concurrency limiters
This should reduce the number of memory allocations in highly loaded systems
2019-05-28 17:30:10 +03:00
Aliaksandr Valialkin
b7a91d6ba7 app/vmselect: update comment according to the updated code 2019-05-26 22:39:09 +03:00
Aliaksandr Valialkin
15d1e15ae6 app/vminsert/influx: try converting string values to numeric values, since Influx agents may send numeric values as strings
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/34
2019-05-26 22:12:55 +03:00
Aliaksandr Valialkin
a2c71f18a3 app/vmselect/promql: misspelling fix 2019-05-25 21:53:48 +03:00
Aliaksandr Valialkin
bdf696ef18 all: fix misspellings 2019-05-25 21:51:24 +03:00
Aliaksandr Valialkin
121a920a18 Makefile: add -s flag to go fmt in make fmt command 2019-05-25 21:44:36 +03:00
Aliaksandr Valialkin
2ff996e276 app/vmselect: log slow queries if their execution time exceeds -search.logSlowQueryDuration 2019-05-24 16:14:46 +03:00
Aliaksandr Valialkin
628708ad76 app/vmselect: consume resultsCh data in exportHandler if writeResponseFunc failed to consume it 2019-05-24 14:54:54 +03:00
Aliaksandr Valialkin
364f4ec3bb all: remove -p XXXX:XXXX from docker run options, since it is unnecessary if --net=host is set 2019-05-24 12:53:12 +03:00
Aliaksandr Valialkin
f37903adb3 app/vminsert: add -rpc.disableCompression command-line flag for reducing CPU usage at the cost of higher network bandwidth usage 2019-05-24 12:51:07 +03:00
Aliaksandr Valialkin
8e3eb5b39d app/vmselect/promql: add alias(q, name) function that sets the given name to all the time series in q 2019-05-24 02:42:10 +03:00
Aliaksandr Valialkin
bb048937bc app/vmselect/promql: add label_transform(q, label, regexp, replacement) function for replacing all the occurrences of regexp with replacement in the given label for q 2019-05-23 16:26:07 +03:00
Aliaksandr Valialkin
24578b4bb1 all: open-sourcing cluster version 2019-05-23 00:25:38 +03:00
Aliaksandr Valialkin
1836c415e6 all: open-sourcing single-node version 2019-05-23 00:18:06 +03:00