Aliaksandr Valialkin
c1f81f08d4
all: add support for Prometheus staleness markers
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1509
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1530
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/845
2021-08-13 12:13:15 +03:00
Aliaksandr Valialkin
c473d8ffe1
li/storage: re-use the per-day inverted index search code for searching in global index
...
This allows removing a big pile of outdated code for global index search.
This may help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1486
2021-07-30 10:28:20 +03:00
Aliaksandr Valialkin
1950f57316
lib/storage: yet another attempt to properly determine disk space shortage, which prevents from optimal merges
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1373
2021-07-27 12:03:31 +03:00
Aliaksandr Valialkin
9d3f9da5ad
lib/storage: make sure the second call to DeduplicateSamples and deduplicateSamplesDuringMerge doesnt change samples
2021-07-15 12:18:38 +03:00
Aliaksandr Valialkin
e992754e79
lib/storage: remove cache directory if it contains reset_cache_on_startup file
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1447
2021-07-13 17:59:51 +03:00
Aliaksandr Valialkin
3f705fe8d7
lib/storage: properly limit the size of storage/date_metricID
cache
2021-07-12 14:25:28 +03:00
Aliaksandr Valialkin
ef3c58d7a3
lib/storage: properly determine when the deduplication is needed in needsDedup
...
Previously needsDedup() could return true if the de-duplication wasn't needed for the following case:
d < interval
/ \
| v | v |
interval interval
Now it properly returns false for this case
2021-07-12 10:54:51 +03:00
Aliaksandr Valialkin
74ace9340d
lib/storage: periodically reset prefetchedMetricIDs cache in order to limit its size under high churn rate
2021-07-07 10:59:39 +03:00
Aliaksandr Valialkin
a846febc89
Revert "lib/uint64set: allow reusing bucket16 structs inside uint64set.Set via uint64set.Release method"
...
This reverts commit 7c6d3981bf
.
Reason for revert: high contention at bucket16Pool on systems with big number of CPU cores.
This slows down query processing significantly.
2021-07-06 18:26:56 +03:00
Aliaksandr Valialkin
b805a675f3
lib/{mergeset,storage}: switch from sync.Pool to chan-based pool for inmemoryPart objects
...
This should reduce memory usage on systems with big number of CPU cores,
since every inmemoryPart object occupies at least 64KB of memory and sync.Pool maintains
a separate pool inmemoryPart objects per each CPU core.
Though the new scheme for the pool worsens per-cpu cache locality, this should be amortized
by big sizes of inmemoryPart objects.
2021-07-06 16:33:25 +03:00
Aliaksandr Valialkin
d8e7c1ef27
lib/uint64set: allow reusing bucket16 structs inside uint64set.Set via uint64set.Release method
...
This reduces the load on memory allocator in Go runtime in production workload.
2021-07-06 16:33:24 +03:00
Aliaksandr Valialkin
22c6e64bbc
lib/storage: consistency renaming: tagCache -> tagFiltersCache
...
This improves code readability
2021-07-06 11:03:30 +03:00
Aliaksandr Valialkin
21abf487c3
lib/workingsetcache: properly update stats for requests and cache misses
...
Previously the stats for cache misses could be improperly counted, because it had inflated cache misses
if the entry was missing in the curr cache, but was existing in the prev cache.
The same applies to cache requests - they were inflated if the entry was missing in the curr cache.
2021-07-06 10:54:38 +03:00
Aliaksandr Valialkin
51516b96e6
lib/storage: tune cache sizes according to production workload
2021-07-05 15:14:45 +03:00
Aliaksandr Valialkin
f12f97daa1
lib/{storage,mergeset}: increase cache timeout for data and index blocks from a minute to two minutes
...
One minute cache timeout result in slower queries in some production workloads where the interval
between query execution is in the range 1 minute - 2 minutes.
2021-07-05 14:25:59 +03:00
Aliaksandr Valialkin
8055439fe4
lib/storage: properly detect free disk space shortage during data merge
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1373
2021-07-02 17:42:23 +03:00
Aliaksandr Valialkin
bced9ee666
lib/{mergeset,storage}: reduce the maximum lifetime for cached indexdb and data blocks from 2 minutes to a minute
...
This should reduce memory usage on a system with high number of active time series and a high churn rate.
One minute is enough for caching the blocks needed for repeated queries (e.g. alerting rules, recording rules and dashboard refreshes).
2021-06-29 19:57:53 +03:00
Aliaksandr Valialkin
609ad6d9bf
lib/storage: put indexDBName into the key for dateTagFilter cache and for uselessTagFilters cache
...
This should prevent from stats overwriting when the previous indexdb is queried.
2021-06-29 13:11:32 +03:00
Aliaksandr Valialkin
b84aea1e6e
lib/protoparser/clusternative: do not pool unmarshalWork structs, since they can occupy big amounts of memory (more than 100MB per each struct)
...
This should reduce memory usage for vmstorage under high ingestion rate when the vmstorage runs on a system with big number of CPU cores
2021-06-23 15:45:08 +03:00
Aliaksandr Valialkin
a22f37599b
lib/storage: tune tag filters search logic
...
Tune the logic according to the logs provided at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338#issuecomment-864293624
The previous logic had a race when multiple concurrent queries execute the same tag filter without prior stats.
This could result in incorrectly stored stats for such tag filter, which then could result in non-optimal sorting of tag filters
for further queries.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
2021-06-23 13:30:36 +03:00
Aliaksandr Valialkin
a207be3ffb
lib/storage: fix infinite loop introduced in aa9b56a046
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
2021-06-17 14:27:30 +03:00
Aliaksandr Valialkin
0efd37cec1
lib/{mergeset,storage}: reduce the number of fsync calls on data ingestion path on systems with many cpu cores
...
VictoriaMetrics maintains a buffer per CPU core for the ingested data. These buffers are flushed to disk every second.
These buffers are flushed to disk in parallel starting from the commit 56b6b893ce
.
This resulted in increased write disk IO usage on systems with many cpu cores
as described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338#issuecomment-863046999 .
This commit merges the per-CPU buffers into bigger in-memory buffers before flushing them to disk.
This should reduce the rate of fsync syscalls and, consequently, the write disk IO on systems with many CPU cores.
This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
See also https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
2021-06-17 13:51:42 +03:00
Aliaksandr Valialkin
b133de1e37
lib/storage: move deletedMetricIDs set from indexDB to Storage
...
This makes consitent the list of deleted metricIDs when it is used from both the current indexDB and the previous indexDB (aka extDB).
This should fix the issue, which could lead to storing new samples under deleted metricIDs after indexDB rotation.
See more details at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347#issuecomment-861232136 .
Thanks to @tangqipengleoo for the initial analysis and the pull request - https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 .
This commit resolves the issue in more generic way compared to https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 .
The downside of the commit is the deletedMetricIDs set isn't cleaned from the metricIDs outside the retention. It needs app restart.
This should be OK in most cases.
2021-06-15 15:07:54 +03:00
Aliaksandr Valialkin
ce10bdc82a
lib/storage: reset cache on disk during series deletion and during indexdb rotation
...
This should prevent from inconsistent behavior (aka partially missing data for some time series) after unclean shutdown.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347
2021-06-11 12:54:36 +03:00
Aliaksandr Valialkin
eb335d2c29
lib/storage: consistency renaming: getMaxRawRowsPerPartition -> getMaxRawRowsPerShard
2021-06-11 10:52:31 +03:00
Aliaksandr Valialkin
d06c0e7a94
lib/storage: reduce the amounts of memory which can be occupied by rawRow items during data ingestion on a system with many CPU cores
2021-06-11 10:49:02 +03:00
Aliaksandr Valialkin
1e4a64844d
lib/storage: properly account the number of loops spent when matching for or suffixes
...
This may help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
2021-06-08 13:07:14 +03:00
Aliaksandr Valialkin
fc2565b4ee
lib/storage: reduce memory allocations when syncing dateMetricIDCache
2021-06-03 16:20:02 +03:00
Aliaksandr Valialkin
10b2855949
lib/storage: fix spelling typo: borken->broken
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1336
2021-05-27 12:09:17 +03:00
Aliaksandr Valialkin
1c16cbacf5
lib/storage: do not stop data ingestion on the first error in Storage.AddRows
...
Continue data ingestion for the rest of blocks.
2021-05-24 15:32:24 +03:00
Aliaksandr Valialkin
2601844de3
lib/storage: limit the number of rows per each block in Storage.AddRows()
...
This should reduce memory usage when ingesting big blocks or rows.
2021-05-24 15:32:24 +03:00
Aliaksandr Valialkin
95b735a883
lib/storage: allow filling all the rows up to their capacity in rawRowsShard.addRows
...
This should reduce memory usage a bit on data ingestion path
2021-05-24 15:32:24 +03:00
Aliaksandr Valialkin
402a8ca710
lib/storage: do not populate MetricID->MetricName cache during data ingestion
...
This cache isn't needed during data ingestion, so there is no need in spending RAM on it.
This reduces RAM usage on data ingestion path by 30%
2021-05-24 03:06:40 +03:00
Aliaksandr Valialkin
0fc857d363
lib/{mergeset,storage}: reduce the number of IFNO log messages like merged ... items across ... blocks in ... seconds
...
Log these messages if the merge takes more than 30 seconds instead of 10 seconds.
2021-05-23 14:15:49 +03:00
Aliaksandr Valialkin
165a9f9200
app/vmstorage: add ability to limit series cardinality via -storage.maxHourlySeries
and -storage.maxDailySeries
command-line flags
2021-05-20 15:31:57 +03:00
Aliaksandr Valialkin
e228f479a5
lib/storage: remove possible data race when logging dropped labels
2021-05-20 11:54:06 +03:00
Aliaksandr Valialkin
2839055513
lib/storage: substitute GetTSDBStatusForDate with GetTSDBStatusWithFiltersForDate with nil tfss
2021-05-13 09:01:05 +03:00
Aliaksandr Valialkin
008ae25b3a
lib/storage: merge getTSDBStatusForDate with getTSDBStatusWithFiltersForDate
...
These functions are non-trivial, while their code has minimal differences.
It is better from maintainability PoV to merge these functions into a single function.
2021-05-12 18:01:08 +03:00
Nikolay
be87be34a4
Adds tsdb match filters ( #1282 )
...
* init work on filters
* init propose for status filters
* fixes tsdb status
adds test
* fix bug
* removes checks from test
2021-05-12 17:16:58 +03:00
Aliaksandr Valialkin
4e59cf4380
lib/storage: properly apply time range when matching an empty filter
...
It must match all the time series on the given time range.
Previously it was matched to all the time series without the restriction on the given time range.
2021-05-11 01:09:35 +03:00
Aliaksandr Valialkin
326cf83eb4
lib/storage: remove dead code after the commit 3ccf7ea20c
2021-05-08 20:15:59 +03:00
Aliaksandr Valialkin
4a5f45c77e
app/vminsert: add support for data ingestion via other vminsert nodes
2021-05-08 19:53:45 +03:00
Aliaksandr Valialkin
43c52ff77a
lib/storage: use WARNING instead of INFO level for logging dropped labels
2021-05-03 13:57:28 +03:00
Nikolay
62d58324dd
adds stalePartsRemover ( #1261 )
...
for new created partitions
2021-05-03 11:34:33 +03:00
Aliaksandr Valialkin
b43ba6d85f
lib/storage: log dropped labels if the number of labels in a metric exceeds -maxLabelsPerTimeseries
command-line flag value
...
This should improve debuggability for this case.
2021-05-01 09:29:56 +03:00
Aliaksandr Valialkin
e37e1b1e34
lib/{storage,mergeset}: fix unaligned 64-bit atomic operation
panic for 32-bit architectures
...
The panic has been introduced in 56b6b893ce
2021-04-27 16:42:19 +03:00
Aliaksandr Valialkin
2d1d60118d
lib/mergeset: split rows ingestion among multiple shards
...
This improves rows ingestion on systems with many CPU cores by reducing lock contention.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
Thanks to @waldoweng for the original idea and draft implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1243
2021-04-27 15:45:11 +03:00
Aliaksandr Valialkin
cba2d13456
lib/storage: typo fix in info message when deleting the part outside the configured retention
...
Previously the message was displaying incorrect retention time
2021-04-27 13:33:36 +03:00
Aliaksandr Valialkin
ab8008d6d7
lib/{storage,mergeset}: remove empty directories on startup. Such directories can be left after unclean shutdown on NFS storage
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1142
2021-04-22 13:03:29 +03:00
Aliaksandr Valialkin
72c41323fa
lib/storage: code clarification: remove caching the found metricName in searchMetricName
2021-04-13 10:20:35 +03:00