Commit Graph

563 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
74ace9340d lib/storage: periodically reset prefetchedMetricIDs cache in order to limit its size under high churn rate 2021-07-07 10:59:39 +03:00
Aliaksandr Valialkin
a846febc89 Revert "lib/uint64set: allow reusing bucket16 structs inside uint64set.Set via uint64set.Release method"
This reverts commit 7c6d3981bf.

Reason for revert: high contention at bucket16Pool on systems with big number of CPU cores.
This slows down query processing significantly.
2021-07-06 18:26:56 +03:00
Aliaksandr Valialkin
b805a675f3 lib/{mergeset,storage}: switch from sync.Pool to chan-based pool for inmemoryPart objects
This should reduce memory usage on systems with big number of CPU cores,
since every inmemoryPart object occupies at least 64KB of memory and sync.Pool maintains
a separate pool of inmemoryPart objects per CPU core.

Though the new scheme for the pool worsens per-cpu cache locality, this should be amortized
by big sizes of inmemoryPart objects.
2021-07-06 16:33:25 +03:00
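The trade-off described in b805a675f3 can be illustrated with a minimal sketch of a bounded pool backed by a buffered channel. This is not the code from the commit; the `part` type, the `chanPool` name and the 64KB buffer size are assumptions chosen for illustration only. Unlike sync.Pool, the number of retained objects is capped by the channel capacity instead of growing with the number of CPU cores.

```go
package main

import "fmt"

// part stands in for a large pooled object (hypothetical type).
type part struct {
	buf []byte
}

// chanPool is a bounded pool: at most cap(ch) idle objects are retained.
type chanPool struct {
	ch chan *part
}

func newChanPool(capacity int) *chanPool {
	return &chanPool{ch: make(chan *part, capacity)}
}

// Get returns an idle object if one is available, otherwise allocates a new one.
func (p *chanPool) Get() *part {
	select {
	case v := <-p.ch:
		return v
	default:
		return &part{buf: make([]byte, 0, 64*1024)}
	}
}

// Put resets the object and returns it to the pool.
// If the pool is already full, the object is dropped and left to the GC.
func (p *chanPool) Put(v *part) {
	v.buf = v.buf[:0]
	select {
	case p.ch <- v:
	default:
	}
}

func main() {
	p := newChanPool(2)
	a := p.Get()
	p.Put(a)
	b := p.Get()
	fmt.Println(a == b) // true: the pooled object is reused
}
```

The shared channel is a single point of synchronization, which is the loss of per-CPU cache locality mentioned in the commit message.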
Aliaksandr Valialkin
d8e7c1ef27 lib/uint64set: allow reusing bucket16 structs inside uint64set.Set via uint64set.Release method
This reduces the load on memory allocator in Go runtime in production workload.
2021-07-06 16:33:24 +03:00
Aliaksandr Valialkin
22c6e64bbc lib/storage: consistency renaming: tagCache -> tagFiltersCache
This improves code readability
2021-07-06 11:03:30 +03:00
Aliaksandr Valialkin
21abf487c3 lib/workingsetcache: properly update stats for requests and cache misses
Previously cache misses could be overcounted: a miss was recorded
if the entry was missing in the curr cache, even when it existed in the prev cache.

The same applies to cache requests - they were inflated if the entry was missing in the curr cache.
2021-07-06 10:54:38 +03:00
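The accounting rule from 21abf487c3 can be sketched with a two-generation cache. This is an illustrative model only, not the lib/workingsetcache implementation; the `twoGenCache` type and its fields are hypothetical. Each lookup counts as exactly one request, and a miss is counted only when the entry is absent from both generations.

```go
package main

import "fmt"

// twoGenCache is a hypothetical cache with a current and a previous generation.
type twoGenCache struct {
	curr, prev map[string]string
	requests   uint64
	misses     uint64
}

func newTwoGenCache() *twoGenCache {
	return &twoGenCache{
		curr: make(map[string]string),
		prev: make(map[string]string),
	}
}

// Get counts one request per call and one miss only if both generations miss.
func (c *twoGenCache) Get(key string) (string, bool) {
	c.requests++
	if v, ok := c.curr[key]; ok {
		return v, true
	}
	if v, ok := c.prev[key]; ok {
		// Promote the entry to the current generation; this is a hit, not a miss.
		c.curr[key] = v
		return v, true
	}
	c.misses++
	return "", false
}

func main() {
	c := newTwoGenCache()
	c.prev["a"] = "1"
	c.Get("a") // hit via prev
	c.Get("b") // miss in both generations
	fmt.Println(c.requests, c.misses) // 2 1
}
```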
Aliaksandr Valialkin
51516b96e6 lib/storage: tune cache sizes according to production workload 2021-07-05 15:14:45 +03:00
Aliaksandr Valialkin
f12f97daa1 lib/{storage,mergeset}: increase cache timeout for data and index blocks from a minute to two minutes
A one-minute cache timeout results in slower queries in some production workloads where the interval
between query executions is in the range of 1 to 2 minutes.
2021-07-05 14:25:59 +03:00
Aliaksandr Valialkin
8055439fe4 lib/storage: properly detect free disk space shortage during data merge
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1373
2021-07-02 17:42:23 +03:00
Aliaksandr Valialkin
bced9ee666 lib/{mergeset,storage}: reduce the maximum lifetime for cached indexdb and data blocks from 2 minutes to a minute
This should reduce memory usage on a system with high number of active time series and a high churn rate.
One minute is enough for caching the blocks needed for repeated queries (e.g. alerting rules, recording rules and dashboard refreshes).
2021-06-29 19:57:53 +03:00
Aliaksandr Valialkin
609ad6d9bf lib/storage: put indexDBName into the key for dateTagFilter cache and for uselessTagFilters cache
This should prevent stats from being overwritten when the previous indexdb is queried.
2021-06-29 13:11:32 +03:00
Aliaksandr Valialkin
b84aea1e6e lib/protoparser/clusternative: do not pool unmarshalWork structs, since they can occupy big amounts of memory (more than 100MB per each struct)
This should reduce memory usage for vmstorage under high ingestion rate when the vmstorage runs on a system with big number of CPU cores
2021-06-23 15:45:08 +03:00
Aliaksandr Valialkin
a22f37599b lib/storage: tune tag filters search logic
Tune the logic according to the logs provided at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338#issuecomment-864293624

The previous logic had a race when multiple concurrent queries execute the same tag filter without prior stats.
This could result in incorrectly stored stats for such tag filter, which then could result in non-optimal sorting of tag filters
for further queries.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
2021-06-23 13:30:36 +03:00
Aliaksandr Valialkin
a207be3ffb lib/storage: fix infinite loop introduced in aa9b56a046
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
2021-06-17 14:27:30 +03:00
Aliaksandr Valialkin
0efd37cec1 lib/{mergeset,storage}: reduce the number of fsync calls on data ingestion path on systems with many cpu cores
VictoriaMetrics maintains a buffer per CPU core for the ingested data. These buffers are flushed to disk every second.
These buffers are flushed to disk in parallel starting from the commit 56b6b893ce .
This resulted in increased write disk IO usage on systems with many cpu cores
as described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338#issuecomment-863046999 .

This commit merges the per-CPU buffers into bigger in-memory buffers before flushing them to disk.
This should reduce the rate of fsync syscalls and, consequently, the write disk IO on systems with many CPU cores.

This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
See also https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
2021-06-17 13:51:42 +03:00
Aliaksandr Valialkin
b133de1e37 lib/storage: move deletedMetricIDs set from indexDB to Storage
This makes the list of deleted metricIDs consistent when it is used from both the current indexDB and the previous indexDB (aka extDB).
This should fix the issue, which could lead to storing new samples under deleted metricIDs after indexDB rotation.
See more details at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347#issuecomment-861232136 .

Thanks to @tangqipengleoo for the initial analysis and the pull request - https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 .

This commit resolves the issue in more generic way compared to https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 .

The downside of the commit is that metricIDs outside the retention aren't removed from the deletedMetricIDs set until the app is restarted.
This should be OK in most cases.
2021-06-15 15:07:54 +03:00
Aliaksandr Valialkin
ce10bdc82a lib/storage: reset cache on disk during series deletion and during indexdb rotation
This should prevent inconsistent behavior (aka partially missing data for some time series) after unclean shutdown.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347
2021-06-11 12:54:36 +03:00
Aliaksandr Valialkin
eb335d2c29 lib/storage: consistency renaming: getMaxRawRowsPerPartition -> getMaxRawRowsPerShard 2021-06-11 10:52:31 +03:00
Aliaksandr Valialkin
d06c0e7a94 lib/storage: reduce the amounts of memory which can be occupied by rawRow items during data ingestion on a system with many CPU cores 2021-06-11 10:49:02 +03:00
Aliaksandr Valialkin
1e4a64844d lib/storage: properly account the number of loops spent when matching for or suffixes
This may help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
2021-06-08 13:07:14 +03:00
Aliaksandr Valialkin
fc2565b4ee lib/storage: reduce memory allocations when syncing dateMetricIDCache 2021-06-03 16:20:02 +03:00
Aliaksandr Valialkin
10b2855949 lib/storage: fix spelling typo: borken->broken
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1336
2021-05-27 12:09:17 +03:00
Aliaksandr Valialkin
1c16cbacf5 lib/storage: do not stop data ingestion on the first error in Storage.AddRows
Continue data ingestion for the rest of blocks.
2021-05-24 15:32:24 +03:00
Aliaksandr Valialkin
2601844de3 lib/storage: limit the number of rows per each block in Storage.AddRows()
This should reduce memory usage when ingesting big blocks of rows.
2021-05-24 15:32:24 +03:00
Aliaksandr Valialkin
95b735a883 lib/storage: allow filling all the rows up to their capacity in rawRowsShard.addRows
This should reduce memory usage a bit on data ingestion path
2021-05-24 15:32:24 +03:00
Aliaksandr Valialkin
402a8ca710 lib/storage: do not populate MetricID->MetricName cache during data ingestion
This cache isn't needed during data ingestion, so there is no need to spend RAM on it.

This reduces RAM usage on data ingestion path by 30%
2021-05-24 03:06:40 +03:00
Aliaksandr Valialkin
0fc857d363 lib/{mergeset,storage}: reduce the number of INFO log messages like merged ... items across ... blocks in ... seconds
Log these messages if the merge takes more than 30 seconds instead of 10 seconds.
2021-05-23 14:15:49 +03:00
Aliaksandr Valialkin
165a9f9200 app/vmstorage: add ability to limit series cardinality via -storage.maxHourlySeries and -storage.maxDailySeries command-line flags 2021-05-20 15:31:57 +03:00
Aliaksandr Valialkin
e228f479a5 lib/storage: remove possible data race when logging dropped labels 2021-05-20 11:54:06 +03:00
Aliaksandr Valialkin
2839055513 lib/storage: substitute GetTSDBStatusForDate with GetTSDBStatusWithFiltersForDate with nil tfss 2021-05-13 09:01:05 +03:00
Aliaksandr Valialkin
008ae25b3a lib/storage: merge getTSDBStatusForDate with getTSDBStatusWithFiltersForDate
These functions are non-trivial, while their code has minimal differences.
It is better from maintainability PoV to merge these functions into a single function.
2021-05-12 18:01:08 +03:00
Nikolay
be87be34a4 Adds tsdb match filters (#1282)
* init work on filters

* init propose for status filters

* fixes tsdb status
adds test

* fix bug

* removes checks from test
2021-05-12 17:16:58 +03:00
Aliaksandr Valialkin
4e59cf4380 lib/storage: properly apply time range when matching an empty filter
It must match all the time series on the given time range.
Previously it was matched to all the time series without the restriction on the given time range.
2021-05-11 01:09:35 +03:00
Aliaksandr Valialkin
326cf83eb4 lib/storage: remove dead code after the commit 3ccf7ea20c 2021-05-08 20:15:59 +03:00
Aliaksandr Valialkin
4a5f45c77e app/vminsert: add support for data ingestion via other vminsert nodes 2021-05-08 19:53:45 +03:00
Aliaksandr Valialkin
43c52ff77a lib/storage: use WARNING instead of INFO level for logging dropped labels 2021-05-03 13:57:28 +03:00
Nikolay
62d58324dd adds stalePartsRemover (#1261)
for new created partitions
2021-05-03 11:34:33 +03:00
Aliaksandr Valialkin
b43ba6d85f lib/storage: log dropped labels if the number of labels in a metric exceeds -maxLabelsPerTimeseries command-line flag value
This should improve debuggability for this case.
2021-05-01 09:29:56 +03:00
Aliaksandr Valialkin
e37e1b1e34 lib/{storage,mergeset}: fix unaligned 64-bit atomic operation panic for 32-bit architectures
The panic has been introduced in 56b6b893ce
2021-04-27 16:42:19 +03:00
Aliaksandr Valialkin
2d1d60118d lib/mergeset: split rows ingestion among multiple shards
This improves rows ingestion on systems with many CPU cores by reducing lock contention.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244

Thanks to @waldoweng for the original idea and draft implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1243
2021-04-27 15:45:11 +03:00
Aliaksandr Valialkin
cba2d13456 lib/storage: typo fix in info message when deleting the part outside the configured retention
Previously the message was displaying incorrect retention time
2021-04-27 13:33:36 +03:00
Aliaksandr Valialkin
ab8008d6d7 lib/{storage,mergeset}: remove empty directories on startup. Such directories can be left after unclean shutdown on NFS storage
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1142
2021-04-22 13:03:29 +03:00
Aliaksandr Valialkin
72c41323fa lib/storage: code clarification: remove caching the found metricName in searchMetricName 2021-04-13 10:20:35 +03:00
Aliaksandr Valialkin
59ccc43e3a lib/storage: properly handle big time ranges passed to /api/v1/labels and /api/v1/label/<labelName>/values
It should be faster to query all the labels and/or all the values instead of querying per-day labels/values on time ranges exceeding maxDaysForPerDaySearch
2021-04-07 13:33:10 +03:00
Aliaksandr Valialkin
512addc608 app/{vminsert,vmagent}: add -sortLabels command-line option for sorting time series labels before ingesting them in the storage
This option can be useful when samples for the same time series are ingested with distinct order of labels.
For example, metric{k1="v1",k2="v2"} and metric{k2="v2",k1="v1"}.
2021-03-31 23:27:21 +03:00
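Why label order matters for 512addc608 can be shown with a short sketch: without sorting, the two spellings of the same series produce different keys. The `label` type and the `canonicalKey` helper below are hypothetical and only illustrate the idea of sorting labels by name before building a series key; they are not the code behind -sortLabels.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// label is a hypothetical name/value pair.
type label struct {
	name, value string
}

// canonicalKey sorts a copy of the labels by name and renders metric{k="v",...}.
func canonicalKey(metric string, labels []label) string {
	sorted := append([]label(nil), labels...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].name < sorted[j].name })
	var sb strings.Builder
	sb.WriteString(metric)
	sb.WriteByte('{')
	for i, l := range sorted {
		if i > 0 {
			sb.WriteByte(',')
		}
		fmt.Fprintf(&sb, "%s=%q", l.name, l.value)
	}
	sb.WriteByte('}')
	return sb.String()
}

func main() {
	a := canonicalKey("metric", []label{{"k1", "v1"}, {"k2", "v2"}})
	b := canonicalKey("metric", []label{{"k2", "v2"}, {"k1", "v1"}})
	fmt.Println(a == b) // true: both orders map to the same key
}
```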
Aliaksandr Valialkin
ae1c653d55 lib/storage: reduce memory usage when ingesting samples for the same time series with distinct order of labels 2021-03-31 21:22:40 +03:00
Aliaksandr Valialkin
940a547116 lib/storage: do not update b.nextIdx if no samples are removed because of retention 2021-03-29 12:13:38 +03:00
Aliaksandr Valialkin
9c2be144cf app/vmselect: log the metric which triggers rollup result cache reset
This should help find the source of stale metrics
2021-03-25 21:32:28 +02:00
Aliaksandr Valialkin
f971fe86cd lib/storage: tune loopsCountPerMetricNameMatch according to production workload 2021-03-25 13:48:17 +02:00
Aliaksandr Valialkin
9947c65df3 lib/storage: do not reload metricName for the same metricID in Search.NextMetricBlock
This should speed up Search.NextMetricBlock a bit
2021-03-23 17:59:34 +02:00
Aliaksandr Valialkin
12ca0efc19 lib/storage: respect the deadline passed to Storage.SearchMetricNames 2021-03-22 23:03:00 +02:00
Aliaksandr Valialkin
40e47935e7 lib/storage: improve Search.NextMetricBlock performance by using MetricID->MetricName cache 2021-03-22 23:02:59 +02:00
Aliaksandr Valialkin
1618b0ca6d lib/storage: tune loopsCountPerMetricNameMatch 2021-03-22 12:57:54 +02:00
Aliaksandr Valialkin
7503111feb lib/storage: small code simplification after 6cee5338b2 2021-03-18 15:22:39 +02:00
Aliaksandr Valialkin
4443254fb9 lib/storage: prevent an infinite loop if {__graphite__="..."} filter matches a metric name with *, [ or { chars
The idea has been borrowed from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1137
2021-03-18 14:57:39 +02:00
Aliaksandr Valialkin
b859fe7879 lib/storage: move heavy filters to the end of the list faster 2021-03-17 15:11:56 +02:00
Aliaksandr Valialkin
41fe707bec lib/storage: limit loops count in order to reduce max CPU usage during filter search 2021-03-17 00:48:44 +02:00
Aliaksandr Valialkin
e2a0c8bd72 lib/storage: do not modify filterLoopsCount stats with loopsCount stats
Such a modification can result in incorrect filter sorting later
2021-03-17 00:48:44 +02:00
Aliaksandr Valialkin
727ded9d4e lib/storage: time series search optimization according to production workload profiling
Do not pass filter metric ids to getMetricIDsForTagFilter, since it turned out that this slows down
the function several-fold when it finds a large number of metricIDs (tens of millions).
2021-03-16 20:08:43 +02:00
Aliaksandr Valialkin
f4a44d6c0d lib/storage: further tuning for time series search 2021-03-16 18:47:29 +02:00
Aliaksandr Valialkin
d074326970 app/vmstorage: add -logNewSeries command-line flag for determining the source of series churn rate 2021-03-15 22:40:28 +02:00
Aliaksandr Valialkin
fc902734d9 lib/storage: further tuning for time series selector code 2021-03-15 20:32:37 +02:00
Aliaksandr Valialkin
1c26020080 lib/storage: tune per-day index search 2021-03-15 13:36:36 +02:00
Aliaksandr Valialkin
b2732575f7 lib/storage: further tune filters sorting logic 2021-03-12 00:51:35 +02:00
John Belmonte
edf39aa225 spelling fix: adjacent (#1115) 2021-03-09 09:19:16 +02:00
Aliaksandr Valialkin
c4a0bd5eac lib/storage: go fmt 2021-03-08 11:59:31 +02:00
Aliaksandr Valialkin
c76a904bb0 lib/storage: tune loopsCount estimations in getMetricIDsForTagFilterSlow
The adjusted estimations give up to 2x lower median response times on 200qps /api/v1/query_range workload
2021-03-07 21:17:48 +02:00
Aliaksandr Valialkin
c8dde1fd6b lib/storage: typo fix: umarshal -> unmarshal 2021-03-02 20:48:44 +02:00
Aliaksandr Valialkin
06676c8feb lib/storage: consistency renaming: durationsPerDateTagFilterCache -> loopsPerDateTagFilterCache 2021-02-23 15:50:08 +02:00
faceair
b1409a7413 lib/storage: correct tagfilter match cost (#1079) 2021-02-22 21:54:37 +02:00
Aliaksandr Valialkin
587132555f lib/mergeset: reduce memory usage for inmemoryBlock by using more compact items representation
This also should reduce CPU time spent by GC, since inmemoryBlock.items don't have pointers now,
so GC doesn't need visiting them.
2021-02-21 22:09:10 +02:00
Aliaksandr Valialkin
bd3bcdc43c lib/storage: do not re-calculate stats for heavy tag filters
This should reduce the number of slow queries when stats for heavy tag filters were recalculated.
2021-02-21 21:43:37 +02:00
Aliaksandr Valialkin
b8a5ee2e93 lib/{mergeset,storage}: allow merging smaller number of small parts
While this may increase CPU and disk IO usage needed for background merge,
this also reduces CPU usage during queries in production. This is because
such queries tend to read recently added data and it is better to have lower number
of parts for such data in order to reduce CPU usage.

This partially reverts ebf8da3730
2021-02-21 21:43:37 +02:00
Aliaksandr Valialkin
34195218e1 lib/{mergeset,storage}: do not use pools for indexBlock and inmemoryBlock during their caching, since this results in higher memory usage in production without any performance gains 2021-02-21 21:43:37 +02:00
Aliaksandr Valialkin
10ccb92e4d lib/storage: use composite index for a query with a name filter and negative filters 2021-02-18 18:57:45 +02:00
Aliaksandr Valialkin
418de71509 lib/storage: properly handle queries containing a filter on metric name plus any number of negative filters and zero non-negative filters
Example: `node_cpu_seconds_total{mode!="idle"}`
2021-02-18 18:33:05 +02:00
Aliaksandr Valialkin
628b8eb55e lib/storage: prevent identical heavy tag filters from running in concurrent queries when measuring the number of loops for such tag filter.
This should reduce CPU usage spikes when measuring the number of loops needed for heavy tag filters
2021-02-18 14:01:18 +02:00
Aliaksandr Valialkin
fd41f070db lib/storage: sort tag filters by the number of loops they need for the execution
This metric should work better than the filter execution duration, since it cannot be distorted
by concurrently running queries.
2021-02-18 12:52:29 +02:00
Aliaksandr Valialkin
9566015a36 Revert "lib/mergeset: tune lifetime for entries inside block caches"
This reverts commit 458c89324d.

Production testing revealed zero improvements for memory usage with reduced lifetime for entries in block caches.
2021-02-17 20:42:15 +02:00
Aliaksandr Valialkin
b4c7d5992b lib/storage: move composite filters to the top during sorting 2021-02-17 20:26:32 +02:00
Aliaksandr Valialkin
2005d8212a lib/storage: return back filter arg to getMetricIDsForTagFilter function
The filter arg has been removed in the commit c7ee2fabb8
because it prevented caching the number of matching time series per each tf.

Now the cache contains duration for tf execution, so the filter shouldn't break such caching.
2021-02-17 19:33:15 +02:00
Aliaksandr Valialkin
83da939947 app/vmstorage: export vm_composite_filter_success_conversions_total and vm_composite_filter_missing_conversions_total metrics 2021-02-17 19:13:49 +02:00
Aliaksandr Valialkin
0c5bb2a397 lib/storage: revert ecf132933e, since negative filters require the same amount of work as non-negative filters 2021-02-17 18:56:13 +02:00
Aliaksandr Valialkin
bbc287ea6a lib/storage: tag filters sorting... 2021-02-17 18:56:12 +02:00
Aliaksandr Valialkin
f5e841c1e9 lib/storage: further tune tag filters sorting 2021-02-17 17:28:36 +02:00
Aliaksandr Valialkin
9d1f14d94c lib/storage: tune the logic for sorting tag filters according to their execution times 2021-02-17 15:02:19 +02:00
Aliaksandr Valialkin
1a19702d92 lib/storage: make sure that nobody uses partitions when closing the table 2021-02-17 15:02:18 +02:00
Aliaksandr Valialkin
1ab7a8dfd5 lib/storage: more tuning for tag filters sorting according to the time they take 2021-02-16 21:27:27 +02:00
Aliaksandr Valialkin
35a23234ca lib/mergeset: tune lifetime for entries inside block caches
This should reduce memory usage in general case without significant CPU usage increase
2021-02-16 18:12:32 +02:00
Aliaksandr Valialkin
55952f8f2e lib/storage: tune sorting for tag filters 2021-02-16 13:07:42 +02:00
Aliaksandr Valialkin
3eae03a337 lib/storage: increase match cost for negative tag filters, since they need to scan all the label pairs 2021-02-15 16:39:52 +02:00
Aliaksandr Valialkin
46e98ed490 vendor: update github.com/VictoriaMetrics/metrics from v1.13.1 to v1.14.0
The new version switches from log-linear histograms to log-based histograms,
which provide up to 3.6 times better accuracy.
2021-02-15 15:11:15 +02:00
Aliaksandr Valialkin
93ff866e91 lib/storage: reduce the minimum supported retention for inverted index from one month to one day 2021-02-15 15:11:15 +02:00
Aliaksandr Valialkin
6f3bbf21b8 lib/storage: sort tag filters by actual execution time instead of by the number of matching time series
This should improve query speed for queries with regexp filters matching small number of time series
on a label with big number of unique values.
2021-02-15 00:19:46 +02:00
Aliaksandr Valialkin
9e3993c585 lib/storage: properly handle regexp tag filters with dots, which can be converted to full string match filters.
For example `{label=~"foo\.bar"}` should be converted to `{label="foo.bar"}`. Previously it was mistakenly converted to `{label="foo\.bar"}`.
This could result in missing time series for such tag filters.
2021-02-14 23:39:19 +02:00
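The conversion described in 9e3993c585 can be approximated with the standard library. The sketch below uses `Regexp.LiteralPrefix` from Go's `regexp` package, which reports whether an expression consists of a literal string only and returns that literal with escapes such as `\.` already unescaped. The `toLiteral` helper is hypothetical and is not the code used in lib/storage.

```go
package main

import (
	"fmt"
	"regexp"
)

// toLiteral returns the unescaped literal and true if expr contains no regexp
// operators, so an anchored regexp match can be replaced with a plain string
// comparison. It returns "", false otherwise.
func toLiteral(expr string) (string, bool) {
	re, err := regexp.Compile(expr)
	if err != nil {
		return "", false
	}
	lit, complete := re.LiteralPrefix()
	if !complete {
		return "", false
	}
	return lit, true
}

func main() {
	lit, ok := toLiteral(`foo\.bar`)
	fmt.Println(lit, ok) // foo.bar true

	_, ok = toLiteral(`foo.bar`)
	fmt.Println(ok) // false: the unescaped dot matches any char
}
```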
Aliaksandr Valialkin
4e645a5fd3 lib/storage: return back in-order applying of tag filters, since concurrently executing tag filters can result in CPU and RAM waste in common case 2021-02-10 22:43:07 +02:00
Aliaksandr Valialkin
eeb92eb7fc lib/storage: load metadata before loading indexdb, since indexdb depends on the metadata 2021-02-10 17:55:51 +02:00
Aliaksandr Valialkin
08f21d8761 app/vmstorage: export vm_composite_index_min_timestamp metric 2021-02-10 17:14:00 +02:00
Aliaksandr Valialkin
b27288f1b0 lib/storage: parallelize tag filters execution a bit
This should reduce execution time when a query contains multiple tag filters and each such filter matches big number of time series.
2021-02-10 16:32:27 +02:00
Aliaksandr Valialkin
4262c2f7c2 lib/storage: remove filter arg from getMetricIDsForDateTagFilter function
The `filter` arg breaks the logic for sorting tag filters by the matching metrics,
which may result in non-optimal performance during time series search.
2021-02-10 16:32:26 +02:00
Aliaksandr Valialkin
681dfb7485 lib/storage: fix inconsistencies in error logs 2021-02-10 16:32:21 +02:00
Aliaksandr Valialkin
148422bcba lib/storage: disable composite index usage when querying old data 2021-02-10 14:57:58 +02:00
Aliaksandr Valialkin
17d5a03f6e lib/storage: fix metric name match for composite filter 2021-02-10 01:27:34 +02:00
Aliaksandr Valialkin
fa0ef143b1 lib/storage: optimize search by label filters matching big number of time series 2021-02-10 00:46:17 +02:00
Aliaksandr Valialkin
5c9715a89a lib/storage: reduce lock contention in dateMetricIDCache when registering new time series for the current day
This should help systems with multiple CPU cores
2021-02-10 00:04:19 +02:00
Aliaksandr Valialkin
9ed7789fef optimize Storage.updatePerDateData() 2021-02-09 02:59:53 +02:00
Aliaksandr Valialkin
ea328b7391 lib/storage: skip deduplication when creating inmemory data blocks
The deduplication will be performed later during merging such blocks.
2021-02-09 02:26:16 +02:00
Aliaksandr Valialkin
7b7963a77f lib/mergeset: unconditionally cache indexdb blocks
Production workloads show that indexdb blocks must be cached unconditionally for reducing CPU usage.
This shouldn't increase memory usage too much, since unused blocks are removed from the cache every two minutes.
2021-02-09 00:49:59 +02:00
Aliaksandr Valialkin
e8ee9fa7fe app/vmstorage: export missing vm_cache_size_bytes metrics for indexdb and data caches 2021-02-09 00:49:58 +02:00
Aliaksandr Valialkin
2dbb12563b lib/storage: optimize data ingestion in the beginning of every hour
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1046
2021-02-08 12:04:51 +02:00
Aliaksandr Valialkin
c6a7288109 lib/storage: check for prevHourMetricIDs cache before falling back to checking for (date, metricID) entries during data ingestion
This should reduce possible CPU usage spikes at the beginning of every hour.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1046
2021-02-04 18:46:23 +02:00
Aliaksandr Valialkin
8249f13104 app/vmselect,lib/storage: properly parse Graphite selectors with inner wildcards
Example: foo{bar{x,yz},a[b-c],*de}
2021-02-03 20:16:28 +02:00
Aliaksandr Valialkin
2976ec89b8 lib/storage: fix a bug, which breaks searching by Graphite wildcard filters 2021-02-03 20:15:50 +02:00
Aliaksandr Valialkin
45a63a1da9 sort orSuffixes in tagFilter.InitFromGraphiteQuery for faster seeks 2021-02-03 20:15:37 +02:00
Aliaksandr Valialkin
4b930b9ffe app/vmselect: add ability to set Graphite-compatible filter via {__graphite__="foo.*.bar"} syntax 2021-02-03 01:17:19 +02:00
Aliaksandr Valialkin
3d79471fb3 lib/storage: inline marshalTags function and remove the code for handling duplicate tags from here
This is a follow-up commit after c8ea697db8
2021-01-12 15:20:22 +02:00
Aliaksandr Valialkin
719ad49adf lib/storage: de-duplicate tags in MetricName.sortTags
Leave only the last tag among tags with duplicate keys. This is needed for reliable addition of extra_labels
during data ingestion. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1007 for details.
2021-01-12 15:03:22 +02:00
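The "leave only the last tag among tags with duplicate keys" rule from 719ad49adf can be sketched as a stable sort followed by a single pass. The `tag` type and the `sortAndDedup` helper are hypothetical stand-ins, not the actual MetricName.sortTags code.

```go
package main

import (
	"fmt"
	"sort"
)

// tag is a hypothetical key/value pair.
type tag struct {
	key, value string
}

// sortAndDedup returns the tags sorted by key. Among tags sharing a key,
// only the one that appeared last in the input is kept.
func sortAndDedup(tags []tag) []tag {
	sorted := append([]tag(nil), tags...)
	// A stable sort keeps the input order among equal keys,
	// so the last duplicate in the input stays last in its run.
	sort.SliceStable(sorted, func(i, j int) bool { return sorted[i].key < sorted[j].key })
	out := make([]tag, 0, len(sorted))
	for i, t := range sorted {
		if i+1 < len(sorted) && sorted[i+1].key == t.key {
			continue // a later tag with the same key overrides this one
		}
		out = append(out, t)
	}
	return out
}

func main() {
	tags := []tag{{"job", "a"}, {"env", "dev"}, {"job", "b"}}
	fmt.Println(sortAndDedup(tags)) // [{env dev} {job b}]
}
```

With this rule, a label appended later (for example via extra_labels) reliably overrides an earlier label with the same key.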
Aliaksandr Valialkin
d5a2b120e9 app/vmstorage: disable final merge by default, since it may result in high disk IO and CPU usage without measurable benefits such as increased query performance and reduced disk space usage 2021-01-08 00:12:12 +02:00
Aliaksandr Valialkin
ca8919e8e1 lib/storage: wait for pending transactions before closing and dropping the partition
This deflakes `make test-full-386` test
2020-12-25 11:46:47 +02:00
Aliaksandr Valialkin
c0511144e3 lib/storage: physically remove stale parts
Previously they were removed from partition struct, but the corresponding directories weren't removed.

This is a follow-up for 46dba00756
2020-12-24 16:56:09 +02:00
Aliaksandr Valialkin
66f8fbbb32 lib/storage: do not remove parts outside the configured retention if they are currently merged
These parts are automatically removed after the merge is complete.
2020-12-24 09:02:12 +02:00
Aliaksandr Valialkin
fa3bcf220f lib/storage: remove stale parts as soon as they go outside the configured retention
Previously such parts could remain undeleted for long durations until they are merged with other parts.
This should help for `-retentionPeriod` values smaller than one month.
2020-12-22 19:55:07 +02:00
Aliaksandr Valialkin
6859737329 lib/storage: properly determine max rows for output part when merging small parts 2020-12-18 23:26:28 +02:00
Aliaksandr Valialkin
edbe35509e lib/{storage,mergeset}: tune background merge process in order to reduce CPU usage and disk IO usage 2020-12-18 20:01:20 +02:00
Aliaksandr Valialkin
1a237c6903 all: properly handle CPU limits set on the host system/container
This can reduce memory usage on systems with enabled CPU limits.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/946
2020-12-08 21:07:03 +02:00
Aliaksandr Valialkin
9eca96596f lib/storage: add missing (AccountID, ProjectID) in MetricName.String() test 2020-11-29 01:25:50 +02:00
Aliaksandr Valialkin
03002f1fe1 lib/storage: log metric name plus all its labels when the metric timestamp is outside the configured retention
This should simplify debugging when the source of the metric with unexpected timestamp must be found.
2020-11-25 14:44:29 +02:00
Aliaksandr Valialkin
4848a05924 lib/storage: typo fix in error message: allowd->allowed 2020-11-25 14:15:54 +02:00
Aliaksandr Valialkin
7f3e884a31 all: spelling fix: superflouos->superfluous. This is a follow-up for 0acdab3ab9 2020-11-24 12:42:04 +02:00
Aliaksandr Valialkin
f4fd917e4f lib/fs: replace fs.OpenReaderAt with fs.MustOpenReaderAt
All the callers for fs.OpenReaderAt expect that the file will be opened.
So it is better to log fatal error inside fs.MustOpenReaderAt instead of leaving this to the caller.
2020-11-23 09:57:30 +02:00
Aliaksandr Valialkin
7d76fdedcc app/vmselect: use storage.NewSearchQuery() instead of constructing storage.SearchQuery in-place
This should prevent bugs when AccountID and ProjectID aren't set in storage.SearchQuery.
2020-11-16 18:04:33 +02:00
Aliaksandr Valialkin
a9287cf564 lib/storage: do not pass (accountID, projectID) to SearchTagNames(), since they are already passed via tfss 2020-11-16 18:04:30 +02:00
Aliaksandr Valialkin
ac7460abdd lib/storage: add a test for Storage.SearchMetricNames 2020-11-16 13:18:48 +02:00
Aliaksandr Valialkin
eea1be0d5c app/vmselect/graphite: add /tags/findSeries handler from Graphite Tags API
See https://graphite.readthedocs.io/en/stable/tags.html#exploring-tags
2020-11-16 12:52:23 +02:00
Aliaksandr Valialkin
4be5b5733a app/vminsert: add /tags/tagSeries and /tags/tagMultiSeries handlers from Graphite Tags API
See https://graphite.readthedocs.io/en/stable/tags.html#adding-series-to-the-tagdb
2020-11-16 02:40:04 +02:00
Aliaksandr Valialkin
9ec964bff8 lib/storage: do not show artificially created label for reverse Graphite labels at /api/v1/labels page 2020-11-16 00:44:54 +02:00
immerrr again
1ec1a9f27f app/vmstorage: add "/internal/force_flush" endpoint (#893) 2020-11-11 14:46:37 +02:00
Aliaksandr Valialkin
72011bcc45 app/vmselect: properly handle errors in GetLabelsOnTimeRange and GetLabelValuesOnTimeRange 2020-11-05 01:36:34 +02:00
Aliaksandr Valialkin
f2bff64933 lib/storage: remove data race when updating rowsDeleted 2020-11-05 01:19:30 +02:00
Aliaksandr Valialkin
c5e6c5f5a6 app/vmselect: optimize querying for /api/v1/labels and /api/v1/label/<name>/values when start and end args are set 2020-11-05 01:19:29 +02:00
Aliaksandr Valialkin
c736339843 lib/{storage,mergeset}: clean cached index blocks and inmemory blocks more aggressively
Previously such blocks were cleaned after they weren't accessed for 10 minutes.
Now they are cleaned after one minute without access. This should reduce memory usage in general case.
2020-11-04 16:44:15 +02:00
Aliaksandr Valialkin
c0bd208c77 lib/storage: do not report about the need of free disk space if parts cannot be merged due to too big write amplification 2020-11-03 15:32:09 +02:00
Aliaksandr Valialkin
1b9778a756 lib/storage: remove unneeded fmt.Sprintf 2020-11-03 14:21:04 +02:00
Aliaksandr Valialkin
f3a7e6f6e3 lib/storage: remove obsolete code 2020-11-02 19:17:30 +02:00
Aliaksandr Valialkin
901514be88 lib/storage: drop more samples outside the given retention during background merge
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/17
2020-10-31 20:44:47 +02:00
Aliaksandr Valialkin
7599e5c835 lib/storage: properly handle the case when key="__name__" is passed to MetricName.AddTag* 2020-10-20 20:09:52 +03:00
Aliaksandr Valialkin
9c5cd5a6c5 lib/storage: code cleanup after 5bfd4e6218 2020-10-20 16:10:53 +03:00
Aliaksandr Valialkin
0db7c2b500 app/vmstorage: support for -retentionPeriod smaller than one month
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/173
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/17
2020-10-20 14:42:46 +03:00
Aliaksandr Valialkin
efb1989193 lib/storage: small code adjustments after d2960a20e0
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/781
2020-10-17 01:17:12 +03:00
faceair
8ddf089deb evaluate the execution cost of all tag filters (#824)
* evaluate the execution cost of all tag filters

* fix suffixes typo
2020-10-17 01:13:20 +03:00
Aliaksandr Valialkin
d2e917d1cb app/vmstorage: add vm_rows_added_to_storage_total metric, which shows the total number of rows added to storage since app start 2020-10-09 13:36:17 +03:00
Aliaksandr Valialkin
b51fa16177 app/vmstorage: add -finalMergeDelay command-line flag for configuring the delay before final merge for per-month partitions after no new data is ingested to it 2020-10-07 17:42:31 +03:00
Aliaksandr Valialkin
fd7dd5064a lib/storage: code cleanup after 10f2eedee0
Remove the code that uses metricIDs caches for the current and the previous hour during metricIDs search,
since this code became unused after implementing per-day inverted index almost a year ago.

While at it, fix a bug that could prevent finding time series with names containing dots (aka Graphite-like names
such as `foo.bar.baz`).
2020-10-01 19:12:04 +03:00
Aliaksandr Valialkin
3ad7566a87 lib/storage: improve cache effectiveness for time series ids matching the given filters
Previously the maximum cache lifetime has been limited by 10 seconds. Now it is extended up to a day.
This should reduce CPU usage in the following cases:

* when querying recently added data with small churn rate for time series
* when querying historical data
2020-10-01 14:39:46 +03:00
Aliaksandr Valialkin
7c2e4e267a lib/storage: allow setting values higher than 1 for vm_merge_need_free_disk_space if there are multiple partitions with deferred merges due to disk space shortage 2020-09-29 22:53:34 +03:00
Aliaksandr Valialkin
097a4c10dd app/vmstorage: add metrics for determining whether background merges need additional disk space to complete
These metrics are:

* vm_small_merge_need_free_disk_space
* vm_big_merge_need_free_disk_space

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/686
2020-09-29 21:47:47 +03:00
Aliaksandr Valialkin
338a53ccf9 lib/storage: fix tests for 32-bit arches such as GOARCH=386 and GOARCH=arm 2020-09-29 13:10:37 +03:00
Aliaksandr Valialkin
ef416c72c2 lib/storage: fix 32-bit builds for GOARCH=386 or GOARCH=arm 2020-09-29 12:42:25 +03:00
Aliaksandr Valialkin
6d8c23fdbd app/{vminsert,vmselect}: skip accountID and projectID when marshaling/unmarshaling MetricName in /api/v1/export/native and /api/v1/import/native
This is needed in order to be able to migrate native data from/to single-node VictoriaMetrics
2020-09-28 00:58:58 +03:00
Aliaksandr Valialkin
aadbd014ff all: add native format for data export and import
The data can be exported via [/api/v1/export/native](https://victoriametrics.github.io/#how-to-export-data-in-native-format) handler
and imported via [/api/v1/import/native](https://victoriametrics.github.io/#how-to-import-data-in-native-format) handler.
2020-09-27 17:36:38 +03:00
Aliaksandr Valialkin
533bf76a12 lib/storage: correctly use maxBlockSize in various checks
Previously `maxBlockSize` has been multiplied by 8 in certain checks. This is unnecessary.
2020-09-24 18:13:15 +03:00
Aliaksandr Valialkin
31e341371b lib/storage: code prettifying after be5e1222f3
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/781
2020-09-22 00:42:20 +03:00
faceair
ad41e39350 add filter to getMetricIDs (#783)
* add getMetricIDs filter

* check nil filter before use
2020-09-22 00:42:19 +03:00
Aliaksandr Valialkin
a9321f6a60 lib/storage: reduce CPU load for idle VictoriaMetrics by reducing the frequency for the need for background merges 2020-09-21 15:51:26 +03:00
Aliaksandr Valialkin
778ea183ca lib/decimal: properly store Inf values
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/752
2020-09-18 19:08:53 +03:00
Aliaksandr Valialkin
d96858b921 lib/storage: add /internal/force_merge handler for running forced compactions on historical per-month partitions
This may be useful for freeing up storage space after time series deletion.

See https://victoriametrics.github.io/#force-merge for more details.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/686
2020-09-17 12:20:56 +03:00
Aliaksandr Valialkin
3abbb38254 lib/{mergeset,storage}: compare errors with errors.Is() 2020-09-17 03:03:10 +03:00
Aliaksandr Valialkin
ddb3519e17 lib/{mergeset,storage}: code prettifying 2020-09-17 02:06:37 +03:00
Aliaksandr Valialkin
bf826dd828 lib/storage: removed duplicate checks for empty parts during merge - another check is in the beginning of mergeParts functions 2020-09-17 01:49:08 +03:00
Aliaksandr Valialkin
81c05f669b lib/storage: do not store inf values, since they may lead to significant precision loss for previously stored values
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/752
2020-09-11 14:45:20 +03:00
Aliaksandr Valialkin
f307e6f432 app/vmselect: initial implementation of Graphite Metrics API
See https://graphite-api.readthedocs.io/en/latest/api.html#the-metrics-api
2020-09-11 00:30:20 +03:00
Aliaksandr Valialkin
f5cb213ef9 lib/storage: reuse timestamp blocks for adjacent metric blocks with identical timestamps
This should reduce disk space usage when scraping targets containing metrics with identical names
such as `node_cpu_seconds_total`, histograms, quantiles, etc.

Expose `vm_timestamps_blocks_merged_total` and `vm_timestamps_bytes_saved_total` metrics for monitoring
the effectiveness of timestamp blocks merging.
2020-09-09 23:59:21 +03:00
Aliaksandr Valialkin
4ce1368e4b lib/storage: mention time range used in the query that led to error message
This should improve detecting slow queries with too big time ranges
2020-08-10 13:46:29 +03:00
Aliaksandr Valialkin
f92255e803 lib/storage: mention tag filters used in the query that led to error message
This should improve detecting invalid or heavy queries that lead to errors.
2020-08-10 13:36:54 +03:00
Aliaksandr Valialkin
b3d4ff7ee2 app/vmstorage: improve error logging when the request times out 2020-08-10 13:17:24 +03:00
Aliaksandr Valialkin
307281e922 lib/storage: slow down concurrent searches when the number of concurrent inserts reaches the limit
This should improve data ingestion performance when heavy searches are executed

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/618
2020-08-07 08:49:13 +03:00
Aliaksandr Valialkin
dd1d59f57a lib/storage: properly check timeouts and pace limits
Previously they were checked on every iteration for small number of iterations
2020-08-07 08:40:56 +03:00
Aliaksandr Valialkin
a2039b3bbc app/vmselect: return the upper bound on the number of found time series from storage.Search.Init
This is used by a single-node version in order to reduce memory allocations during search.
See bc8381613d for details.
2020-08-06 19:20:31 +03:00
Aliaksandr Valialkin
b690eeff53 lib/storage: reduce the frequency (and overhead) for timeout and pace limiter checks by 4x 2020-08-06 18:45:47 +03:00
Aliaksandr Valialkin
13f8644f8e lib/storage: optimize prefetching metric names for the given metricIDs 2020-08-06 16:52:58 +03:00
Aliaksandr Valialkin
a3e91c593b lib/storage: limit the number of concurrent calls to storage.searchTSIDs to GOMAXPROCS*2
This should limit the maximum memory usage and reduce CPU thrashing on vmstorage
when multiple heavy queries are executed.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2020-08-05 18:27:21 +03:00
Aliaksandr Valialkin
3149af624d lib/storage: reduce the maximum number of concurrent merge workers to GOMAXPROCS/2
Previously the limit has been raised to GOMAXPROCS, but it turned out that this
increases query latencies since more CPUs are busy with merges.

While at it, substitute `*MergeConcurrencyLimitCh` channels with simple integer limits.
2020-07-31 17:53:13 +03:00
Aliaksandr Valialkin
29bbab0ec9 lib/storage: remove prioritizing of merging small parts over merging big parts, since it doesn't work as expected
The prioritizing could lead to big merge starvation, which could end up in too big number of parts that must be merged into big parts.

Multiple big merges may be initiated after the migration from v1.39.0 or v1.39.1. It is OK - these merges should be finished soon,
which should return CPU and disk IO usage to normal levels.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/618
2020-07-30 20:02:22 +03:00
Aliaksandr Valialkin
96039dcb40 lib/storage: properly update vm_slow_row_inserts_total metric when importing multiple data points per time series at once
Previously the `vm_slow_row_inserts_total` metric may be incremented multiple times for different data points per a single time series,
while only a single increment is needed when inserting the first data point for this time series.
2020-07-30 16:17:19 +03:00
Sasasu
96bc476e53 lib/storage: make metaindexRow use memory more efficiently (#655)
Due to memory alignment, the metaindexRow structure uses 64 bytes per object.
This commit changes the order of fields, making metaindexRow use 56 bytes per
object.

Signed-off-by: Sasasu <su@sasasu.me>
2020-07-27 23:23:25 +03:00
Aliaksandr Valialkin
94cc677b0c lib/storage: slightly reduce code difference between single-node and cluster versions 2020-07-24 01:18:05 +03:00
Aliaksandr Valialkin
fb3d1380ac lib/storage: respect -search.maxQueryDuration when searching for time series in inverted index
Previously the time spent on inverted index search could exceed the configured `-search.maxQueryDuration`.
This commit stops searching in inverted index on query timeout.
2020-07-23 21:22:05 +03:00
Aliaksandr Valialkin
dbf3038637 lib/storage: add more fine-grained pace limiting for search 2020-07-23 19:21:49 +03:00
Aliaksandr Valialkin
b8303afcd8 lib/storage: improve prioritizing of data ingestion over querying
Prioritize also small merges over big merges.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2020-07-23 01:40:38 +03:00
Aliaksandr Valialkin
7d0743422b lib/storage: properly calculate global metrics in UpdateStats() 2020-07-23 00:35:31 +03:00
Aliaksandr Valialkin
23fa44e56e lib/storage: reorder mergeBlockStreams() args in order to make them more consistent 2020-07-22 21:58:25 +03:00
Aliaksandr Valialkin
754eac676d lib/storage: prevent possible race condition when all the goroutines exit Storage.AddRows before other goroutines are blocked on searchTSIDsCond inside Storage.searchTSIDs
This condition may occur after the following sequence of events:

1) A goroutine enters the loop body when len(addRowsConcurrencyCh) == cap(addRowsConcurrencyCh) inside Storage.searchTSIDs.
2) All the goroutines return from Storage.AddRows.
3) The goroutine from step 1 blocks on searchTSIDsCond.Wait() inside the loop body.

The goroutine remains blocked until the next call to Storage.AddRows, which calls searchTSIDsCond.Signal().
This may take indefinite time.
2020-07-22 21:52:42 +03:00
Aliaksandr Valialkin
67be79a0bc lib/uint64set: optimize adding items to the set via Set.AddMulti 2020-07-21 20:57:05 +03:00
Aliaksandr Valialkin
be0ab4fbfe lib/storage: reset MetricName->TSID cache after marking metricIDs as deleted
This is a follow-up commit after 12b16077c4 ,
which didn't reset the `tsidCache` in all the required places.
This could result in indefinite errors like:

    missing metricName by metricID ...; this could be the case after unclean shutdown; deleting the metricID, so it could be re-created next time

Fix this by resetting the cache inside deleteMetricIDs function.
2020-07-14 14:05:19 +03:00
Aliaksandr Valialkin
7335743d57 lib/storage: limit the maximum concurrency for data ingestion to GOMAXPROCS
Previously the concurrency has been limited to GOMAXPROCS*2. This made little sense,
since every call to Storage.AddRows is bound to CPU, so the maximum ingestion bandwidth
is achieved when the number of concurrent calls to Storage.AddRows is limited to the number of CPUs,
i.e. to GOMAXPROCS.
2020-07-08 17:34:27 +03:00
Aliaksandr Valialkin
fad008df7e lib/storage: clarify out of retention period error message by mentioning -retentionPeriod command-line flag 2020-07-08 13:54:13 +03:00
Aliaksandr Valialkin
fe58462bef lib/storage: reset MetricName->TSID cache after deleting time series
This should prevent adding new data points to deleted time series
without the need to check for the deleted time series.

This improves ingestion performance a bit when the `deleted time series ids` aka `dmis` set
contains big number of time series.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/596

Based on the idea from @n4mine at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/604
2020-07-06 22:01:24 +03:00
Aliaksandr Valialkin
0bff96fe4b lib/storage: prioritize data ingestion over heavy queries
Heavy queries could result in the lack of CPU resources for processing the current data ingestion stream.
Prevent this by delaying queries' execution until free resources are available for data ingestion.

Expose `vm_search_delays_total` metric, which may be used for alerting when there are not enough CPU resources
for data ingestion and/or for executing heavy queries.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
2020-07-05 19:44:04 +03:00
Aliaksandr Valialkin
8bb3622e9d app/vminsert: prevent from adding and/or selecting labels with empty values
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/600
2020-07-02 23:17:12 +03:00
Aliaksandr Valialkin
4cb3e7595c app/vmstorage: add -denyQueriesOutsideRetention command-line flag for denying queries outside the configured retention 2020-07-01 00:58:42 +03:00