Aliaksandr Valialkin
6ab9c98a1e
app/vmstorage: add -bigMergeConcurrency
and -smallMergeConcurrency
flags for tuning the maximum number of CPU cores used during merges
2019-10-31 16:17:29 +02:00
Aliaksandr Valialkin
4e6bf6f538
app/vmselect: add -search.latencyOffset
flag for tuning the time after data collection when data points become visible in query results
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/218
2019-10-28 12:32:36 +02:00
Aliaksandr Valialkin
78fc35c9b1
all: make fmt
2019-10-17 20:05:12 +03:00
Aliaksandr Valialkin
5b01b7fb01
all: add support for GOARCH=386 and fix all the issues related to 32-bit architectures such as GOARCH=arm
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212
2019-10-17 18:27:49 +03:00
Aliaksandr Valialkin
99786c2864
app/vmselect/prometheus: add -search.maxLookback
command-line flag for overriding dynamic calculations for max lookback interval
...
This flag is similar to `-search.lookback-delta` if set. The max lookback interval is determined dynamically
from interval between datapoints for each input time series if the flag isn't set.
The interval can be overriden on per-query basis by passing `max_lookback=<duration>` query arg to `/api/v1/query` and `/api/v1/query_range`.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/209
2019-10-15 21:37:17 +03:00
Aliaksandr Valialkin
a5302a6651
app/vmselect/promql: take into account the previous point when calculating max_over_time
and min_over_time
...
This lines up with `first_over_time` function used in `rollup_candlestick`, so `rollup=low` always returns
the minimum value.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/204
2019-10-08 12:30:16 +03:00
Aliaksandr Valialkin
483af3a97a
app/vmselect/netstorage: hint the OS that tmpBlocksFile is read almost sequentially
...
This became the case after b7ee2e7af2
.
2019-09-30 00:13:33 +03:00
Aliaksandr Valialkin
946ca438a6
app/vmselect/netstorage: marshal block outside tmpBlocksFile.WriteBlock
...
This also allows marshaling outside lock, thus reducing the amount of work under the lock.
2019-09-28 20:57:20 +03:00
Aliaksandr Valialkin
e92e39eddf
app/vmselect/netstorage: reduce the number of disk seeks when the query processes big number of time series
2019-09-28 20:57:20 +03:00
Aliaksandr Valialkin
56dff57f77
app/vmselect/netstorage: reduce memory usage when fetching big number of data blocks from vmstorage
...
Dump data blocks directly to temporary file instead of buffering them in RAM
2019-09-28 12:21:57 +03:00
Aliaksandr Valialkin
ba460f62e6
app/vmselect/promql: do not generate timestamps for NaN values in timestamp
function according to Prometheus logic
2019-09-27 18:55:16 +03:00
Aliaksandr Valialkin
bd1cf053f6
app/vmselect/promql: add increases_over_time
and decreases_over_time
functions
...
`increases_over_time(q[d])` returns the number of `q` increases during the given duration `d`.
`decreases_over_time(q[d])` returns the number of `q` decreases during the given duration `d`.
2019-09-25 20:38:51 +03:00
Aliaksandr Valialkin
8d398af92f
app/vminsert/netstorage: mention the data size that cannot be sent to vmstorage
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/175
2019-09-25 12:53:41 +03:00
Aliaksandr Valialkin
73ac7b8dd6
app/vminsert/netstorage: make sure the conn exists before closing it in storageNode.closeBrokenConn
...
The conn can be missing or already closed during the call to storageNode.closeBrokenConn.
Prevent `nil pointer dereference` panic by verifying whether the conn is already closed.
Thanks to @CH-anhngo for reporting the issue.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/189
2019-09-25 10:36:50 +03:00
Aliaksandr Valialkin
b0c738ae8b
app/vminsert/opentsdbhttp: remove FATAL prefix from logger.Fatalf errors for the sake of consistency with other logger.Fatalf calls
2019-09-19 22:16:11 +03:00
Aliaksandr Valialkin
550a12415a
app/vminsert/netstorage: log network errors when sending data to vmstorage nodes
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/175
2019-09-13 22:26:24 +03:00
Aliaksandr Valialkin
ee4585db33
app/vmselect/promql: properly handle subqueries like aggr_func(rollup_func(metric[window:step]))
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/184
2019-09-13 21:42:11 +03:00
Aliaksandr Valialkin
828e5f6d26
app/vmselect/promql: binary operation fixes according to Prometheus behaviour
...
The follosing issues were fixed:
- VictoriaMetrics could leave superflouos labels when using `on` or `ignoring` modifiers
- VictoriaMetrics could return `duplicate timeseries` error when using `group_left` or `group_right` with non-empty label list
2019-09-13 17:43:09 +03:00
Aliaksandr Valialkin
ed50b8792b
app/vminsert/netstorage: reduce the maximum buffer size for rerouted rows, so it occupies less RAM
2019-09-11 14:50:30 +03:00
Aliaksandr Valialkin
b101064f8b
all: report the number of bytes read on io.ReadFull error
...
This should simplify error investigation similar to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/175
2019-09-11 14:50:24 +03:00
Aliaksandr Valialkin
2f4c950fe9
app/vminsert/netstorage: send per-storageNode bufs to vmstorage nodes in parallel
...
This should improve the maximum ingestion throughput.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/175
2019-09-11 14:50:19 +03:00
Aliaksandr Valialkin
694cc59ed1
app/vminsert/netstorage: dynamically adjust timeouts for sending packets from vminsert to vmstorage depending on packet size
...
Bigger packets will have more chances to be sent to vmstorage.
2019-09-11 14:50:14 +03:00
Aliaksandr Valialkin
2c654258ef
lib/fs: add MustStopDirRemover for waiting until pending directories are removed on graceful shutdown
...
This patch is mainly required for laggy NFS. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162
2019-09-05 11:17:17 +03:00
Aliaksandr Valialkin
d0953e9f02
app/vmselect/promql: ignore grouping by destination label in count_values
, since such a grouping is performed automatically
2019-09-04 19:59:02 +03:00
Aliaksandr Valialkin
f78ffe565f
app/vmselect/promql: do not return artificial points beyond the last point in time series
2019-09-04 16:34:29 +03:00
Aliaksandr Valialkin
a7d5d611fe
app/vmselect/prometheus: do not adjust start
and end
args in /api/v1/query_range
if nocache=1
arg is set
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/171
2019-09-04 13:10:17 +03:00
Aliaksandr Valialkin
b08f085082
app/vmselect/promql: reset timeseries name on group_left and group_right as Prometheus does
2019-09-03 20:43:29 +03:00
Aliaksandr Valialkin
458d412bb6
app/vmselect/netstorage: adaptively adjust the maximum inmemory file size for storing temporary blocks
...
The maximum inmemory file size now depends on `-memory.allowedPercent`.
This should improve performance and reduce the number of filesystem calls
on machines with big amounts of RAM when performing heavy queries
over big number of samples and time series.
2019-09-03 13:32:18 +03:00
Aliaksandr Valialkin
604a4312f9
all: port to FreeBSD on GOARCH=amd64
2019-08-28 01:46:09 +03:00
Aliaksandr Valialkin
5893a9f9a3
app/vmstorage: increase default values for search.maxTagKeys, search.maxTagValues and search.maxUniqueTimeseries
2019-08-27 14:28:26 +03:00
Aliaksandr Valialkin
38711526d3
app/vminsert/influx: set db
label only if Influx line doesnt have db
tag
2019-08-24 13:55:01 +03:00
Aliaksandr Valialkin
1ee536f9fd
app/vminsert: skip empty tags
2019-08-24 13:36:41 +03:00
Aliaksandr Valialkin
a283023d16
app/vminsert/opentsdbhttp: skip invalid rows and continue parsing the remaining rows
...
Invalid rows are logged and counted in `vm_rows_invalid_total{type="opentsdb-http"}` metric
2019-08-24 13:36:41 +03:00
Aliaksandr Valialkin
38b9615c53
app/vminsert/opentsdb: skip invalid rows and continue parsing the remaining rows
...
Invalid rows are logged and counted in `vm_rows_invalid_total{type="opentsdb"}` metric
2019-08-24 13:36:41 +03:00
Aliaksandr Valialkin
2a8fc41bab
app/vminsert/graphite: skip invalid rows and continue parsing the remaining rows
...
Invalid rows are logged and counted in `vm_rows_invalid_total{type="graphite"}` metric
2019-08-24 13:36:41 +03:00
Aliaksandr Valialkin
22685ef94d
app/vminsert/influx: skip invalid rows and continue parsing the remaining rows
...
Invalid influx lines are logged and counted in `vm_rows_invalid_total{type="influx"}` metric.
2019-08-24 13:36:41 +03:00
Aliaksandr Valialkin
425a81a6c7
app/vminsert/influx: do not allow escaping newline char, since they dont occur in real life
...
The prefious report with escaped newline chars in influx line protocol was false alarm.
2019-08-23 18:43:00 +03:00
Aliaksandr Valialkin
8da8dd0876
app/vminsert/opentsdbhttp: allow timestamp as float64 and as string, since it occurs in real life
2019-08-23 18:35:52 +03:00
Aliaksandr Valialkin
0ea21eb9dc
app/vminsert/influx: handle \r\n
aka crlf
influx line endings from windows world
...
Such lines exist in real life.
2019-08-23 18:28:54 +03:00
Aliaksandr Valialkin
b3502b2b39
app/vminsert/influx: allow escaping newline char
...
Though newline char isn't mentioned in escape rules at https://docs.influxdata.com/influxdb/v1.7/write_protocols/line_protocol_tutorial/ ,
there are reports that such chars occur in real life
2019-08-23 15:14:58 +03:00
Aliaksandr Valialkin
f1f8fce4f7
app/vminsert/influx: skip comments starting with #
in influx line protocol
2019-08-23 14:43:24 +03:00
Aliaksandr Valialkin
697de90893
app/vminsert: do not drop data in reroutedBuf if all the storage nodes are unhealthy
2019-08-23 10:38:19 +03:00
Aliaksandr Valialkin
a5dc54efc3
app/vminsert: properly limit the size of reroutedBuf
2019-08-23 10:29:51 +03:00
Aliaksandr Valialkin
c197641978
all: return 503 http error if service is temporarily unavailable
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/156
2019-08-23 09:49:50 +03:00
Aliaksandr Valialkin
e734076f0f
app/vminsert: allow setting the maximum number of labels per time series via -maxLabelsPerTimeseries
2019-08-23 08:47:18 +03:00
Aliaksandr Valialkin
e9db22a551
app/vmselect/promql: attempt to repair invalid bucket counts passed to histogram_quantile
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/136
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/154
2019-08-22 14:39:24 +03:00
Aliaksandr Valialkin
0697164b4f
app/vminsert: add ability to ingest data via HTTP OpenTSDB /api/put
requests
...
This is manual merge of the https://github.com/VictoriaMetrics/VictoriaMetrics/pull/152
Thanks to nustinov@gmail.com for the initial pull request.
2019-08-22 12:46:54 +03:00
Aliaksandr Valialkin
4d555c7c87
app/vminsert/opentsdb: fix BenchmarkRowsUnmarshal by adding missing put
prefixes to each line
2019-08-21 19:15:04 +03:00
Aliaksandr Valialkin
90a4b00b10
app/vmselect/promql: fix panic on -search.disableCache
...
Reset the cache if it is disabled instead of stopping, since it is stopped on graceful shutdown.
2019-08-21 17:12:01 +03:00
Aliaksandr Valialkin
491b1762c8
app/vmselect/promql: explain why empty timeseries arent removed in transformLabelValue
2019-08-21 11:29:41 +03:00
Aliaksandr Valialkin
db1de4277c
app/vmselect/promql: remove NaNs from /api/v1/query_range
output like Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/153
2019-08-20 23:01:59 +03:00
Aliaksandr Valialkin
99331606e1
app/vmselect/promql: pre-allocate memory for map for checking for duplicate timeseries
...
This should reduce memory allocations for big number of timeseries
2019-08-20 23:01:57 +03:00
Aliaksandr Valialkin
1101765adb
app/vmselect/promql: add label_value(q, label_name)
func, which returns numeric value labels with name label_name
in q
2019-08-20 00:28:44 +03:00
Aliaksandr Valialkin
940349ccb9
app/vmselect/promql: independently track offset hints for tStart and tEnd
...
This should improve performance if timeseries starts or ends on the selected time range
2019-08-19 13:40:24 +03:00
Aliaksandr Valialkin
6ae4b4190f
app/vmselect/promql: optimize search for timestamp boundaries in rollupConfig.Do
...
This should improve the performance of queries over big number of time series
with big number of output points.
2019-08-19 13:03:38 +03:00
Aliaksandr Valialkin
005aabd305
app/vmselect/promql: add scrape_interval(q[d])
function, which would return scrape interval for q
over d
2019-08-18 21:08:15 +03:00
Aliaksandr Valialkin
218cb4623a
app/vmselect/promql: hande comparisons with NaN
similar to Prometheus
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/150
2019-08-18 00:25:58 +03:00
Aliaksandr Valialkin
dcce92c63c
app/vmselect/promql: add lifetime(q[d])
function, which returns the lifetime of q
over d
in seconds.
...
This function is useful for determining time series lifetime.
`d` must exceed the expected lifetime of the time series, otherwise
the function would return values close to `d`.
2019-08-16 11:59:51 +03:00
Aliaksandr Valialkin
0cb66a8f95
app/vmselect/promql: fix corner-case calculation for ideriv
2019-08-16 11:59:50 +03:00
Aliaksandr Valialkin
1b5b9ced27
app/vmselect/promql: properly handle corner cases for rollup functions
2019-08-15 23:31:28 +03:00
Aliaksandr Valialkin
b8bbe92de1
app/vmselect/promql: store compressed results in the cache
...
This should increase rollup results cache capacity.
2019-08-14 02:32:16 +03:00
Aliaksandr Valialkin
8c2158af24
all: use workingsetcache instead of fastcache
...
This should reduce the amount of RAM required for processing time series
with non-zero churn rate.
The previous cache behavior can be restored with `-cache.oldBehavior` command-line flag.
2019-08-13 21:40:28 +03:00
Aliaksandr Valialkin
f56c1298ad
app/vmstorage: add vm_concurrent_addrows_*
metrics for tracking concurrency for Storage.AddRows calls
...
Track also the number of dropped rows due to the exceeded timeout
on concurrency limit for Storage.AddRows. This number is tracked in `vm_concurrent_addrows_dropped_rows_total`
2019-08-06 15:08:43 +03:00
Aliaksandr Valialkin
8e05758ff5
app: add vm_concurrent_
metrics for visibility in concurrency limiters for vminsert and vmselect
2019-08-05 18:30:29 +03:00
Aliaksandr Valialkin
53c8f56436
app/vmselect: allow passing match[]
, start
and time
to /api/v1/label/<label_name>/values
...
`/api/v1/label/<label_name>/values?match[]=q` emulates emulates `label_values(q, <label_name>)`
call in Grafana templating.
2019-08-04 23:07:00 +03:00
Aliaksandr Valialkin
880b1d80b1
app/vmselect: optimize /api/v1/series
by skipping storage data
...
Fetch and process only time series metainfo.
2019-08-04 23:00:46 +03:00
Aliaksandr Valialkin
7f5afae1e3
app/vmselect/prometheus: prevent from fetching and scanning all the data on /api/v1/searies
call by default
2019-08-04 19:42:45 +03:00
Aliaksandr Valialkin
000c154641
app/vmselect/promql: tune automatic window adjustement
...
Increase the windows adjustement for small scrape intervals,
since they usually have higher jitter.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/139
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/134
2019-08-04 19:34:11 +03:00
Aliaksandr Valialkin
1d4ddadbb1
app/vmselect/promql: further increase the allowed jitter for scrape interval
...
Real-world production data shows higher jitter than 1/8 of scrape interval.
This may results in gaps on the graph. So increase the allowed jitter to 1/4
of scrape interval in order to reduce the probability of gaps on the graphs
over time series with high jitter for scrape_interval.
2019-08-02 20:16:41 +03:00
Aliaksandr Valialkin
8ed84a4713
app/vminsert/influx: round automatically generated timestamp according to the given precision
arg
2019-08-02 00:24:39 +03:00
Aliaksandr Valialkin
ade7bc30db
app/vmselect/promql: tolerate higher jitter in scrape interval
...
Allow jitter for up to 1/8 instead of 1/16 for the scrape interval.
This should imrpove graphs when `step` is smaller than the `scrape_interval`.
2019-08-01 23:25:53 +03:00
Aliaksandr Valialkin
c994fbf500
app/vmselect/promql: add vm_slow_queries_total
metric for counting slow queries
...
The query is slow if its execution time exceeds `-search.logSlowQueryDuration`
2019-07-31 03:36:45 +03:00
Aliaksandr Valialkin
071a122119
app/vmselect/promql: return NaN from histogram_quantile if at least a single bucket is broken
2019-07-31 01:18:11 +03:00
Aliaksandr Valialkin
b9a16b93e7
app/vmselect/promql: allow adjusting window for default rollup function
...
Default rollup function is `last_over_time`. It must support adjusting
the provided window in order to prevent from gaps on the graph
for window values smaller than scrape interval.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/134
2019-07-31 00:45:58 +03:00
Aliaksandr Valialkin
c901a6472f
app/vmselect/promql: return NaN values if invalid bucket counts are passed to histogram_quantile
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/136
2019-07-30 22:05:55 +03:00
Aliaksandr Valialkin
5b8526e925
app/vmselect/netstorage: improve error message when reading data blocks from storage
...
Mention the block number in the error. This should simplify troubleshooting in this code.
2019-07-28 12:17:33 +03:00
Aliaksandr Valialkin
b7089705b7
app/vminsert: add vm_rows_per_insert
summary metric
...
This metric should help tuning batch sizes on clients writing data to VictoriaMetrics
2019-07-27 13:28:20 +03:00
Aliaksandr Valialkin
1fd4e9fb5c
app/vminsert: improve error messages for Influx, OpenTSDB and Graphite parsing
...
Include in the error message the line which failed to parse.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/127
2019-07-26 22:09:21 +03:00
Aliaksandr Valialkin
8253790157
app/vmstorage: consistency renaming for ignored rows
metrics
...
vm_too_big_timestamp_rows_total -> vm_rows_ignored_total{reason="big_timestamp"}
vm_too_small_timestamp_rows_total -> vm_rows_ignored_total{reason="small_timestamp"}
2019-07-26 20:02:24 +03:00
Aliaksandr Valialkin
c6bec48927
lib/storage: add metrics for calculating skipped rows outside the retention
...
The metrics are:
- vm_too_big_timestamp_rows_total
- vm_too_small_timestamp_rows_total
2019-07-26 14:11:56 +03:00
Aliaksandr Valialkin
aac482517f
app/vmselect/promql: return NaN from count()
over zero time series
...
This aligns `count` behavior with Prometheus.
2019-07-25 22:02:34 +03:00
Aliaksandr Valialkin
0e52357f35
app/vmselect/promql: properly calculate incremental aggregations grouped by __name__
...
Previously the following query may fail on multiple distinct metric names match:
sum(count_over_time{__name__!=''}) by (__name__)
2019-07-25 21:53:26 +03:00
Aliaksandr Valialkin
54f035d4ce
all: small updates after PR #114
2019-07-24 17:43:43 +03:00
Aliaksandr Valialkin
cb8104cf77
app: clarify error messages when -storageNode
arg is missing in vminsert and vmselect
2019-07-20 10:21:59 +03:00
Aliaksandr Valialkin
2b4254d01f
app/vminsert: use netutil.TCPListener for collecting network-related metrics for Graphite and OpenTSDB TCP traffic
2019-07-15 22:58:35 +03:00
Aliaksandr Valialkin
092c9b39a8
app/vmselect/promql: remove empty time series after applying filters like q > 0
...
This should reduce CPU and RAM usage for queries over high number of time series.
2019-07-12 19:59:49 +03:00
Aliaksandr Valialkin
6875fb411a
app/vmselect/promql: parallelize incremental aggregation to multiple CPU cores
...
This may reduce response times for aggregation over big number of time series
with small step between output data points.
2019-07-12 15:53:12 +03:00
Aliaksandr Valialkin
4a8e6f47fe
app/vmselect/prometheus: set start
arg in /api/v1/series
to the minimum allowed time by default as Prometheus does
...
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/91
2019-07-11 17:11:37 +03:00
Aliaksandr Valialkin
3313cdf816
app/vmselect/prometheus: convert negative times to 0, since they arent supported by the storage
2019-07-11 17:11:35 +03:00
Aliaksandr Valialkin
cbab86fd9d
app/vmselect/promql: reduce RAM usage for aggregates over big number of time series
...
Calculate incremental aggregates for `aggr(metric_selector)` function instead of
keeping all the time series matching the given `metric_selector` in memory.
2019-07-10 13:03:36 +03:00
Aliaksandr Valialkin
ba8195c58e
all: consistency renaming: bytesSize -> sizeBytes
2019-07-10 00:47:42 +03:00
Aliaksandr Valialkin
df6f17b82c
app/vmselect/promql: mention -search.logSlowQueryDuration
flag value in the slow query log message
2019-07-10 00:43:01 +03:00
Aliaksandr Valialkin
73ae889244
app/vmselect/promql: extract rmoeveGroupTags
function for removing unneeded tags from MetricName according to the given modifierExpr
2019-07-09 23:20:58 +03:00
Aliaksandr Valialkin
603b34edbd
app/vmselect/promql: properly preserve metric name after applying functions in any case from transformFuncsKeepMetricGroup
2019-07-09 23:10:49 +03:00
Aliaksandr Valialkin
d6ec95693d
app/vmselect/prometheus: typo fix
2019-07-07 23:34:04 +03:00
Aliaksandr Valialkin
36636c1f6f
app/vmselect/prometheus: handle minTime
and maxTime
values that may be set by Promxy or Prometheus client
...
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/88
2019-07-07 21:53:52 +03:00
Aliaksandr Valialkin
bba07d05fe
app/vmselect/promql: remove empty timeseries left after topk
call
2019-07-04 19:43:07 +03:00
Aliaksandr Valialkin
41f512af1c
all: add vm_data_size_bytes
metrics for easy monitoring of on-disk data size and on-disk inverted index size
2019-07-04 19:43:04 +03:00
Aliaksandr Valialkin
512a627855
app/vmselect/prometheus: update adjustLastPoints function
...
- Do not overwrite last points by the previous NaNs, since this may result in empty time series.
- Overwrite the last 2 points instead of 3. This should be enough in most cases.
2019-07-04 09:30:56 +03:00
Aliaksandr Valialkin
858746fa6c
app/vmselect/promql: gracefully handle duplicate timestamps in irate
and rollup_rate
funcs
...
Previously such timestamps result in `+Inf` results. Now the previous timestamp is used
for the calculations.
2019-07-03 12:41:30 +03:00