Commit Graph

577 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
544ea89f91
lib/{mergeset,storage}: add start background workers via startBackgroundWorkers() function 2022-12-04 00:01:04 -08:00
Aliaksandr Valialkin
33dda2809b
lib/mergeset: panic when too long item is passed to Table.AddItems() 2022-12-03 23:32:16 -08:00
Aliaksandr Valialkin
932c1f90ae
lib/storage: remove duplicate logging for filepath on errors 2022-12-03 23:15:22 -08:00
Aliaksandr Valialkin
044a304adb
lib/storage: pass a single arg - rowsPerBlock - to getCompressLevel() function instead of two args 2022-12-03 23:10:16 -08:00
Aliaksandr Valialkin
cb44976716
lib/{storage,mergeset}: use a single sync.WaitGroup for all background workers
This simplifies the code
2022-12-03 23:03:08 -08:00
Aliaksandr Valialkin
28e6d9e1ff
lib/storage: properly pass retentionMsecs to OpenStorage() at TestIndexDBRepopulateAfterRotation 2022-12-03 23:02:10 -08:00
Aliaksandr Valialkin
343c69fc15
lib/{mergeset,storage}: pass compressLevel to blockStreamWriter.InitFromInmemoryPart
This allows packing in-memory blocks with different compression levels
depending on its contents. This may save memory usage.
2022-12-03 22:46:48 -08:00
Aliaksandr Valialkin
f3e3a3daeb
lib/{mergeset,storage}: take into account byte slice capacity when returning the size of in-memory part
This results in more correct reporting of memory usage for in-memory parts
2022-12-03 22:30:36 -08:00
Aliaksandr Valialkin
45299efe22
lib/{storage,mergeset}: consistency rename: `flushRaw{Rows,Items} -> flushPending{Rows,Items} 2022-12-03 22:17:46 -08:00
Aliaksandr Valialkin
5ca58cc4fb
lib/storage: optimization: do not scan block for rows outside retention if it is covered by the retention 2022-12-03 22:14:12 -08:00
Aliaksandr Valialkin
152ac564ab
lib/storage: remove logging redundant path values in a single error message 2022-12-03 22:13:13 -08:00
Aliaksandr Valialkin
05c65bd83f
lib/storage: speed up search for data block for the given tsids
Use binary search instead of linear scan for looking up the needed
data block inside index block.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425
2022-12-03 20:58:32 -08:00
Aliaksandr Valialkin
299285b147
lib/storage: fix TestUpdateCurrHourMetricIDs test when it runs on the first hour of the day by UTC 2022-12-02 18:52:37 -08:00
Aliaksandr Valialkin
e9636b4c69
lib/{mergeset,storage}: re-use the code for removing isInMerge flag at parts
Move the common code into releasePartsToMerge() method and consistently use it throughout the code.
2022-12-02 18:52:37 -08:00
匠心零度
fa0ce10275
lib/storage: remove extra error check (#3396) 2022-11-28 16:43:31 -08:00
Aliaksandr Valialkin
daa70e6560
lib/storage: follow-up for 790768f20b
- Document the bugfix at docs/CHANGELOG.md
- Simplify the bugfix a bit
2022-11-07 14:04:08 +02:00
Aliaksandr Valialkin
f9dc3da9e2
lib/storage: typo fix after 32d48f8dfbb03174858c00bdfe6d9d22431dc8d8 2022-11-07 13:58:27 +02:00
Aliaksandr Valialkin
dd88c628aa
lib/storage: remove unused isFull field from hourMetricIDs struct 2022-11-07 13:58:26 +02:00
Łukasz Marszał
790768f20b
Fix issue-3309 - currHourMetricIDs shouldn't contain metrics from prev hour (#3320)
* fix issue-3309 currHourMetricIDs shouldn't contain metrics from prev hour

* Update storage.go
2022-11-07 13:55:37 +02:00
Aliaksandr Valialkin
c4265322f4
lib/fs: add canOverwrite arg to WriteFileAtomically when it is allowed to overwrite the file atomically if it already exists 2022-10-26 01:07:34 +03:00
Aliaksandr Valialkin
8e998aa1a1
lib/storage: add support for retention filters (aka multiple retentions for distinct sets of time series)
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/143
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/289
2022-10-24 16:40:20 +03:00
Aliaksandr Valialkin
dba218a8ce
lib/storage: skip blocks outside the configured retention during search
Blocks outside the configured retention are eventually deleted during background merge.
But such blocks may reside in the storage for long time until background merge.
Previously VictoriaMetrics could spend additional CPU time on processing such blocks
during search queries. Now these blocks are skipped.
2022-10-24 02:52:44 +03:00
Aliaksandr Valialkin
e2f0b76ebf
lib/storage: do not pass retentionMsecs and isReadOnly args explicitly - access them via Storage arg
This makes code easier to read.

This is a follow-up after d2d30581a0
2022-10-24 01:31:04 +03:00
Aliaksandr Valialkin
89a1108b1a
lib/storage: small code cleanups 2022-10-24 01:17:47 +03:00
Aliaksandr Valialkin
05512fdd74
lib/storage: re-use newTestStorage() instead of manually initializing Storage mock
This is a follow-up for d2d30581a0
2022-10-23 16:24:00 +03:00
Aliaksandr Valialkin
d2d30581a0
lib/storage: pass Storage to table and partition instead of getDeletedMetricIDs callback
This improves code readability a bit.
2022-10-23 16:10:04 +03:00
Aliaksandr Valialkin
54f35c175c
lib/storage: small refactoring: move retentionDeadline to blockStreamMerger
This allows defining per-block retention in the future by updating the getRetentionDeadline function
2022-10-23 16:10:02 +03:00
Aliaksandr Valialkin
187e294a53
lib/storage: use a single reference to the currently merged block - bsm.Block during the block merge loop 2022-10-23 14:08:57 +03:00
Aliaksandr Valialkin
d0a9ca1bc2
lib/storage: properly pass uint64 constant to fmt.Errorf on 32-bit platforms 2022-10-23 12:48:00 +03:00
Aliaksandr Valialkin
5e4dfe50c6
lib/storage: subsitute searchTSIDs functions with more lightweight searchMetricIDs function
The searchTSIDs function was searching for metricIDs matching the the given tag filters
and then was locating the corresponding TSID entries for the found metricIDs.

The TSID entries aren't needed when searching for time series names (aka MetricName),
so this commit removes the uneeded TSID search from the implementation of /api/v1/series API.
This improves perfromance of /api/v1/series calls.

This commit also improves performance a bit for /api/v1/query and /api/v1/query_range calls,
since now these calls cache small metricIDs instead of big TSID entries
in the indexdb/tagFilters cache (now this cache is named indexdb/tagFiltersToMetricIDs)
without the need to compress the saved entries in order to save cache space.

This commit also removes concurrency limiter during searching for matching time series,
which was introduced in 8f16388428, since the concurrency
for all the read queries is already limited with -search.maxConcurrentRequests command-line flag.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2022-10-23 12:23:47 +03:00
Aliaksandr Valialkin
4128ad71e2
lib/storage: move common code to newRawRowsBlock() function 2022-10-21 14:46:55 +03:00
Aliaksandr Valialkin
b5674164c6
lib/storage: simplify code a bit after 3f5959c053 2022-10-21 14:39:27 +03:00
Aliaksandr Valialkin
fd7c86ae25
lib/{mergeset,storage}: simplify the code a bit after ae55ad8749 2022-10-21 14:33:03 +03:00
Aliaksandr Valialkin
99d67ac8ad
lib/storage: validate timestamps in the block only if they use encoding, which needs validation
This reduces CPU usage when there is no sense in validating timestamps.

This is a follow-up for 5fa9525498

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2998
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3011
2022-10-21 00:52:32 +03:00
Aliaksandr Valialkin
3f5959c053
lib/storage: try generating initial parts from inmemory rows with identical sizes under high ingestion rate
This should improve background merge rate under high load a bit
2022-10-20 23:28:24 +03:00
Aliaksandr Valialkin
150e99d403
lib/{mergeset,storage}: avoid unaligned 64-bit atomic operation panic on 32-bit platforms
The panic has been introduced in 68f3a02589

While at it, add padding to shard structs in order to avoid false sharing on mordern CPUs

This should improve scalability on systems with many CPU cores
2022-10-20 16:25:43 +03:00
Aliaksandr Valialkin
fb50730ba7
lib/storage: double the number of rawRows shards on multi-core systems
This should increase data ingestion scalability on multi-core systems at the cost of slightly higher memory usage
2022-10-17 18:19:51 +03:00
Aliaksandr Valialkin
ae55ad8749
lib/{storage,mergeset}: do not hold per-shard lock in fast path when adding per-shard items to the flush list 2022-10-17 18:01:26 +03:00
Aliaksandr Valialkin
db16759c68
lib/storage: optimize matching speed for non-trivial regexp filters
Wrap re.Match into bytesutil.FastStringMatcher.

This increases performance for `{foo=~"complex_regex_here"}` filters
by up to 4x.
2022-10-01 12:06:06 +03:00
Aliaksandr Valialkin
042a532f70
lib/storage: substitute remaining calls to fs.MustRemoveAll with fs.MustRemoveDirAtomic
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:38 +03:00
Aliaksandr Valialkin
68e32b0764
lib/storage: atomically remove parts inside partitions
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:38 +03:00
Aliaksandr Valialkin
340ada871d
lib/storage: atomically remove partitions, which went outside the configured retention
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:37 +03:00
Aliaksandr Valialkin
978dcb4574
lib/storage: properly remove cache directory contents if reset_cache_on_startup file is located there
Previously the cache directory was removed. This could result in error when the cache directory
is mounted to a separate filesystem.
2022-09-13 16:17:36 +03:00
Aliaksandr Valialkin
5f28ca1f42
lib/storage: atomically remove snapshot directories
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:36 +03:00
Aliaksandr Valialkin
5fa9525498
lib/storage: verify that timestamps in block are in the range specified by blockHeader.{Min,Max}Timestamp when upacking the block
This should reduce chances of unnoticed on-disk data corruption.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2998
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3011

This change modifies the format for data exported via /api/v1/export/native -
now this data contains MaxTimestamp and PrecisionBits fields from blockHeader.

This is OK, since the native export format is undocumented.
2022-09-06 13:08:09 +03:00
Aliaksandr Valialkin
0ad3bbadd3
lib/regexutil: add Simplify() function for simplifying the regular expression 2022-08-26 11:57:12 +03:00
Aliaksandr Valialkin
0d4ea03a73
lib/promrelabel: optimize action: {labeldrop,labelkeep,keep,drop} with regex containing alternate values
For example, the following relabeling rule must work much faster now:

- action: labeldrop
  regex: "foo|bar|baz"
2022-08-24 17:54:29 +03:00
Aliaksandr Valialkin
0d46e24af5
lib/storage: increase the maximum possible or values extracted from regexp from 20 to 100
This should improve time series search speed for regexp filters with big number of `or` values.
2022-08-24 17:15:25 +03:00
Aliaksandr Valialkin
fdbf5b5795
lib/storage: ignore start text and end text anchors in getOrValues(regexp) function
This is OK, since the anchors are implicitly applied to the whole regexp.
This optimization should improve the speed for regexp series filters with explicit $ and ^ anchors.
For example, `{label="^(foo|bar)$"}`
2022-08-24 17:12:52 +03:00
Aliaksandr Valialkin
796aa310c2
app/vmstorage: expose vm_{hourly,daily}_series_limit_{max,current}_series metrics if -storage.max{Hourly,Daily}Series limits are set
These metrics allow alerting when the number of unique series approach the limit.
For example, the following query alerts when the number of series reaches 90% of the configured limit:

    vm_hourly_series_limit_current_series / vm_hourly_series_limit_max_series > 0.9
2022-08-24 13:44:04 +03:00