Aliaksandr Valialkin
01bb3c06c7
lib/storage: remove inmemory inverted index for recent hours
...
Production load with >10M active time series showed it could
slow down VictoriaMetrics startup times and could eat
all the memory leading to OOM.
Remove inmemory inverted index for recent hours until thorough
testing on production data shows it works OK.
2019-11-13 10:45:53 +02:00
Aliaksandr Valialkin
66c4961ff8
README.md: mention that VictoriaMetrics executable is small
2019-11-12 16:58:15 +02:00
Aliaksandr Valialkin
3e16248ed6
README.md: small updates
2019-11-12 16:54:18 +02:00
Aliaksandr Valialkin
5e6c1cd986
README.md: typo fix
2019-11-12 16:48:40 +02:00
Aliaksandr Valialkin
6c2303764e
Revert "lib/fs: do not postpone directory removal on NFS error"
...
This reverts commit 4c02e496f7
.
Reason for revert: the commit breaks on NFS - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/234
2019-11-12 16:18:09 +02:00
Mike Poindexter
f3ad330635
Add test for invalid caching of tsids ( #232 )
...
* Add test for invalid caching of tsids
* Clean up error handling
2019-11-12 15:09:33 +02:00
Aliaksandr Valialkin
6c362d82cb
README.md: mention that backups are made to S3 or GCS
2019-11-12 14:32:37 +02:00
Aliaksandr Valialkin
661dd190bb
Refer to https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883 from multiple places in README.md
2019-11-12 13:02:39 +02:00
Aliaksandr Valialkin
630ba810f1
deployment/docker: upgrade Go from v1.13.4 to v1.13.4
2019-11-12 03:49:19 +02:00
Oleg Kovalov
b4f44befa3
fix misspelled words ( #229 )
2019-11-12 00:16:42 +02:00
Roman Khavronenko
5fc8fb1323
add churn rate panel ( #230 )
2019-11-12 00:14:53 +02:00
Aliaksandr Valialkin
8e8f98f712
lib/storage: add tests for dateMetricIDCache
2019-11-11 13:21:57 +02:00
Aliaksandr Valialkin
c342f5e37e
lib/storage: eliminate data race when updating lastSyncTime in dateMetricIDCache.Has
2019-11-10 22:04:01 +02:00
Aliaksandr Valialkin
56d7cc8a0d
app/victoria-metrics: remove deprecated fs.MustStopDirRemover from main_test.go
2019-11-10 13:37:13 +02:00
Aliaksandr Valialkin
4c02e496f7
lib/fs: do not postpone directory removal on NFS error
...
Continue trying to remove NFS directory on temporary errors for up to a minute.
The previous async removal process breaks in the following case during VictoriaMetrics start
- VictoriaMetrics opens index, finds incomplete merge transactions and starts replaying them.
- The transaction instructs removing old directories for parts, which were already merged into bigger part.
- VictoriaMetrics removes these directories, but their removal is delayed due to NFS errors.
- VictoriaMetrics scans partition directory after all the incomplete merge transactions are finished
and finds directories, which should be removed, but weren't still removed due to NFS errors.
- VictoriaMetrics panics when it finds unexpected empty directory.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162
2019-11-10 13:24:51 +02:00
Aliaksandr Valialkin
3956003dd0
lib/storage: reorganize the code in getStartDateForPerDayInvertedIndex according to golangci-lint
2019-11-10 00:38:59 +02:00
Aliaksandr Valialkin
5c3fa59181
app/vmrestore: the upcoming release would be 1.29.0
2019-11-10 00:20:41 +02:00
Aliaksandr Valialkin
ee7765b10d
lib/storage: implement per-day inverted index
2019-11-10 00:02:46 +02:00
Aliaksandr Valialkin
5810ba57c2
lib/storage: use specialized cache for (date, metricID) entries
...
This improves ingestion performance.
2019-11-09 23:06:11 +02:00
Aliaksandr Valialkin
e573ef2126
lib/storage: remove unused code from getMetricIDsForTimeRange: it is expected that time range is always non-zero
2019-11-09 19:03:34 +02:00
Aliaksandr Valialkin
823fa085ef
lib/storage: properly set time range when deleting time series
2019-11-09 18:49:49 +02:00
Aliaksandr Valialkin
695c1dc5eb
lib/storage: obtain all the time series ids from (tag->metricIDs) rows instead of (metricID->TSID) rows, since this much faster
2019-11-09 18:04:33 +02:00
Aliaksandr Valialkin
cdbe848102
lib/storage: small code prettifying
2019-11-09 14:19:52 +02:00
Aliaksandr Valialkin
5c25070556
lib/uint64set: remove superflouos check for item existence before deleting it in Set.Subtract
2019-11-09 14:19:47 +02:00
Aliaksandr Valialkin
bb08bab263
lib/storage: inmemoryInvertedIndex prettifying
2019-11-09 14:19:41 +02:00
Aliaksandr Valialkin
6ad7fe8eeb
lib/storage: export vm_new_timeseries_created_total
metric for determining time series churn rate
2019-11-08 21:21:07 +02:00
Aliaksandr Valialkin
9ea549ed24
lib/storage: sync with cluster changes
2019-11-08 21:21:07 +02:00
Aliaksandr Valialkin
63b05c0b9f
app/vmselect/promql: adjust memory limits calculations for incremental aggregate functions
...
Incremental aggregate functions don't keep all the selected time series in memory -
they keep only up to GOMAXPROCS time series for incremental aggregations.
Take into account that the number of time series in RAM can be higher if they are split
into many groups with `by (...)` or `without (...)` modifiers.
This should reduce the number of `not enough memory for processing ... data points` false
positive errors.
2019-11-08 21:21:07 +02:00
Aliaksandr Valialkin
d888b21657
lib/storage: add inmemory inverted index for the last hour
...
It should improve performance for `last N hours` dashboards with update intervals smaller than 1 hour.
2019-11-08 21:21:07 +02:00
Aliaksandr Valialkin
1e46961d68
app/{vmbackup,vmrestore}: add vmbackup
and vmrestore
tools for creating backups on s3 or gcs from instant snapshots
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/203
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/38
2019-11-08 21:21:07 +02:00
Roman Khavronenko
72756ab8c7
#224 : add slow_queries, on-going merges and merge speed panels to dashboard ( #226 )
2019-11-08 21:20:38 +02:00
Aliaksandr Valialkin
543dc8d337
lib/storage: populate partition names from both small
and big
directories
...
Certain partition directories may be missing after restoring from backups
if they had no data. Re-create such directories on start.
2019-11-06 19:49:34 +02:00
Aliaksandr Valialkin
e472f0b23b
lib/storage: substitute error message about unsorted items in the index block after metricIDs merge with counter
...
The origin of the error has been detected and documented in the code,
so it is enough to export a counter for such errors at `vm_index_blocks_with_metric_ids_incorrect_order_total`,
so it could be monitored and alerted on high error rates.
Export also the counter for processed index blocks with metricIDs - `vm_index_blocks_with_metric_ids_processed_total`,
so its' rate could be compared to `rate(vm_index_blocks_with_metric_ids_incorrect_order_total)`.
2019-11-06 14:28:11 +02:00
Aliaksandr Valialkin
c51ca04a43
lib/storage: take into account the requested time range when caching TSIDs for the given tag filters
2019-11-06 14:28:11 +02:00
Aliaksandr Valialkin
e37f06dc52
lib/storage: dump incorrectly sorted items on a single line; this should simplify error reporting
2019-11-05 18:44:22 +02:00
Aliaksandr Valialkin
5c2099ecfe
lib/storage: return back finalPartsToMerge from 2 to 3 in order to prevent from excessive merges in old partitions
2019-11-05 17:27:48 +02:00
Aliaksandr Valialkin
885ba17905
lib/storage: separate the max inverted index scan loops per metric into fast and slow loops
...
Slow loops could require seeks and expensive regexp matching, while fast loops just scans
all the metricIDs for the given `tag=value` prefix. So these operations must have separate
max loops multiplier.
2019-11-05 17:27:48 +02:00
Aliaksandr Valialkin
b9a06e8e74
lib/storage: skip repeated useless work when intersection of metricIDs with the given filter is too expensive
...
This should improve performance for query filters over big number of time series.
2019-11-05 14:19:13 +02:00
Aliaksandr Valialkin
30c8301b11
lib/storage: reduce the maximum inverted index scans before giving up to label filters matching by metric name
...
The new value reduces the amount of wasted work during index scans over big number of time series.
2019-11-05 14:19:06 +02:00
Aliaksandr Valialkin
e53f9e553d
lib/storage: try potentially faster tag filters at first, then apply slower tag filters
...
The fastest tag filters are non-negative non-regexp, since they are the most specific.
The slowest tag filters are negative regexp, since they require scanning
all the entries for the given label.
2019-11-05 14:19:01 +02:00
Aliaksandr Valialkin
d6ade02fd3
Makefile: add pprof-cpu
rule for inspecting CPU profiles with PPROF_FILE=/path/to/cpu.pprof make pprof-cpu
2019-11-04 12:44:09 +02:00
Aliaksandr Valialkin
3c90d77858
lib/storage: pass pointer to MetricName in Fatalf, so it is properly detected as an interface with String() method
...
This fixes lint errors
2019-11-04 01:07:19 +02:00
Artem Navoiev
478767d0ed
add unittests for bytesutil and storage ( #221 )
2019-11-04 00:54:46 +02:00
Aliaksandr Valialkin
02e0b19a62
lib/storage: tune the returned value from adjustMaxMetricsAdaptive
2019-11-04 00:44:37 +02:00
Aliaksandr Valialkin
6be4456d88
lib/{storage,uint64set}: add Set.Union() function and use it
2019-11-04 00:44:37 +02:00
Aliaksandr Valialkin
9becc26f4b
lib/storage: remove interface conversion in hot path during block merging
...
This should improve merge speed a bit for parts with big number of small blocks.
2019-11-03 12:33:34 +02:00
Aliaksandr Valialkin
c62399eb3e
lib/{storage,mergeset}: create missing partition directories after restoring from backups
...
Backup tools could skip empty directories. So re-create such directories on the first run.
2019-11-02 02:27:11 +02:00
Aliaksandr Valialkin
55d728c849
lib/{decimal,encoding}: optimize float64<->decimal conversion for arrays with zeros or ones
...
Time series with only zeros or ones frequently occur in monitoring, so it is worth optimizing their handling.
2019-11-01 16:48:12 +02:00
Aliaksandr Valialkin
808fc0971f
lib/{encoding,decimal}: add benchmarks for blocks containing zeros or ones
...
Time series with such values are quite common in monitoring space,
so it would be great to have benchmarks for them.
2019-11-01 16:48:12 +02:00
Aliaksandr Valialkin
370cfbb365
lib/uint64set: return an emptry set instead of nil set from Set.Clone
, since the caller may add data to the cloned set
...
This fixes the following panic in v1.28.1:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x783a7e]
goroutine 1155 [running]:
github.com/VictoriaMetrics/VictoriaMetrics/lib/uint64set.(*Set).Add(0x0, 0x15b3bfb41e8b71ec)
github.com/VictoriaMetrics/VictoriaMetrics@/lib/uint64set/uint64set.go:57 +0x2e
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*indexSearch).getMetricIDsForRecentHours(0xc5bdc0dd40, 0x16e273f6b50, 0x16e2745d3f0, 0x5b8d95, 0x10, 0x4a2f51, 0xaa01000000000000)
github.com/VictoriaMetrics/VictoriaMetrics@/lib/storage/index_db.go:1951 +0x260
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*indexSearch).getMetricIDsForTimeRange(0xc5bdc0dd40, 0x16e273f6b50, 0x16e2745d3f0, 0x5b8d95, 0x10, 0xb296c0, 0xc00009cd80, 0x9bc640)
2019-11-01 16:12:44 +02:00