VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-29 23:30:04 +01:00

Author	SHA1	Message	Date
Roman Khavronenko	27b958ba8b	lib/storage: check for free disk space before opening tables (#4035 ) * lib/storage: check for free disk space before opening tables We check for free disk space before call to `openTable`, so `Storage` can be set to ReadOnly before mergeWorkers start. Before the change, there was a chance that merges will start even if Storage has to start in ReadOnly mode because of `-storage.minFreeDiskSpaceBytes` limit. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4023 Signed-off-by: hagen1778 <roman@victoriametrics.com> * lib/storage: chore Signed-off-by: hagen1778 <roman@victoriametrics.com> * Update lib/storage/storage.go --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-03-31 23:50:27 -07:00
Aliaksandr Valialkin	c8f2febaa1	lib/storage: consistently use OS-independent separator in file paths This is needed for Windows support, which uses `\` instead of `/` as file separator Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-25 14:33:58 -07:00
Aliaksandr Valialkin	b14d96618c	all: follow-up after `34634ec357` - Use windows.FlushFileBuffers() instead of windows.Fsync() at streamTracker.adviseDontNeed() for consistency with implementations for other architectures. - Use filepath.Base() instead of filepath.Split(), since the dir part isn't used. This simplifies the code a bit. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-25 11:57:39 -07:00
Nikolay	34634ec357	lib/fs: adds memory map for windows (#3988 ) This is a follow-up for `43b24164ef` * lib/fs: adds memory map for windows it should improve performance for file reading * lib/storage: replace '/' with os specific separator it must fix an errors for windows * lib/fs: mention windows fsync support * lib/filestream: adds fdatasync for windows writes Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-25 11:43:19 -07:00
Dmytro Kozlov	5c92022cc6	lib/storage: fix collect downsampling metrics (#489 ) * lib/storage: fix downsampling * lib/storage: update logic * lib/storage: fix comments, removed unneeded check	2023-03-19 23:34:46 -07:00
Aliaksandr Valialkin	43b24164ef	all: add Windows build for VictoriaMetrics This commit changes background merge algorithm, so it becomes compatible with Windows file semantics. The previous algorithm for background merge: 1. Merge source parts into a destination part inside tmp directory. 2. Create a file in txn directory with instructions on how to atomically swap source parts with the destination part. 3. Perform instructions from the file. 4. Delete the file with instructions. This algorithm guarantees that either source parts or destination part is visible in the partition after unclean shutdown at any step above, since the remaining files with instructions is replayed on the next restart, after that the remaining contents of the tmp directory is deleted. Unfortunately this algorithm doesn't work under Windows because it disallows removing and moving files, which are in use. So the new algorithm for background merge has been implemented: 1. Merge source parts into a destination part inside the partition directory itself. E.g. now the partition directory may contain both complete and incomplete parts. 2. Atomically update the parts.json file with the new list of parts after the merge, e.g. remove the source parts from the list and add the destination part to the list before storing it to parts.json file. 3. Remove the source parts from disk when they are no longer used. This algorithm guarantees that either source parts or destination part is visible in the partition after unclean shutdown at any step above, since incomplete partitions from step 1 or old source parts from step 3 are removed on the next startup by inspecting parts.json file. This algorithm should work under Windows, since it doesn't remove or move files in use. This algorithm has also the following benefits: - It should work better for NFS. - It fits object storage semantics. The new algorithm changes data storage format, so it is impossible to downgrade to the previous versions of VictoriaMetrics after upgrading to this algorithm. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3236 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3821 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-19 01:36:51 -07:00
Aliaksandr Valialkin	6460475e3b	lib/{mergeset,storage}: prevent from long wait time when creating a snapshot under high data ingestion rate Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3551 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3873	2023-03-19 00:15:30 -07:00
Aliaksandr Valialkin	a26c6628fd	lib/{fs,mergeset,storage}: substitute os.Open()+os.File.Readdir() with os.ReadDir() This simplifies code a bit	2023-03-17 21:03:37 -07:00
Zakhar Bessarab	6a5d236245	lib/storage: log original labels set when label value is truncated (#3952 ) lib/storage: log original labels set when label value is truncated	2023-03-14 10:59:40 +01:00
Nikolay	927d9da270	lib/storage: correctly handle io.EOF error for pre-fetched metrics (#3946 ) io.EOF shouldn't be returned from this function. It breaks all search API logic and may result in empty query results.	2023-03-11 23:29:43 -08:00
Nikolay	6bfe9cc733	lib{mergset,storage}: prevent possible race condition with logging st… (#3900 ) lib{mergset,storage}: prevent possible race condition with logging stats for merges Previously partwrapper could be release by background process and reference for part may be invalid during logging stats. It will lead to panic at vmstorage https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3897	2023-03-03 12:33:42 +01:00
Aliaksandr Valialkin	0d3f31f60e	lib/storage: follow-up for `39cdc546dd` - Use flag.Duration instead of flagutil.Duration for -snapshotCreateTimeout, since the flagutil.Duration is intended mostly for big durations, e.g. days, months and years, while the -snapshotCreateTimeout is usually smaller than one hour. - Add links to https://docs.victoriametrics.com/#how-to-work-with-snapshots in docs/CHANGELOG.md, so readers could easily find the corresponding docs when reading the changelog. - Properly remove all the created directories on unsuccessful attempt to create snapshot in Storage.CreateSnapshot(). Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3551	2023-02-27 13:07:38 -08:00
Zakhar Bessarab	39cdc546dd	lib/storage: enhancements for snapshots process (#3873 ) * lib/{fs,mergeset,storage}: skip `.must-remove.` dirs when creating snapshot (#3858) * lib/{mergeset,storage}: add timeout configuration for snapshots creation, remove incomplete snapshots from storage * docs: fix formatting * app/vmstorage: add metrics to track status of snapshots * app/vmstorage: use `vm_http_requests_total` metric for snapshot endpoints metrics, rename new flag to make name more clear Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vmstorage: update flag name in docs Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vmstorage: reflect new metrics names change in docs Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-02-27 12:12:03 -08:00
Zakhar Bessarab	d8eaa511b0	lib/{fs,mergeset,storage}: skip `.must-remove.` dirs when creating snapshot (#3858 ) (#3867 )	2023-02-24 12:38:42 -08:00
Oleksandr Redko	9fff48c3e3	app,lib: fix typos in comments (#3804 )	2023-02-13 13:27:13 +01:00
Aliaksandr Valialkin	3ec8a4dc80	lib/{mergeset,storage}: allow at least 3 concurrent flushes during background merges on systems with 1 or 2 CPU cores This should prevent from data ingestion slowdown and query performance degradation on systems with small number of CPU cores (1 or 2), when big merge is performed. This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3790 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337	2023-02-11 12:08:52 -08:00
Nikolay	9254e494f9	lib/storage: fixes finalDedup for backfilled data (#3737 ) previously historical data backfilling may trigger force merge for previous month every hour it consumes cpu, disk io and decrease cluster performance. Following commit fixes it by applying deduplication for InMemoryParts	2023-02-01 09:54:21 -08:00
Nikolay	465a285324	lib/storage: properly release parts inMerge lock (#3711 ) if storage doesn't have enough disk space, finalDedupWatcher holds inMerge lock for all parts and never release it until storage restart	2023-01-26 08:05:20 -08:00
Aliaksandr Valialkin	ba5a6c851c	lib/storage: use deterministic random generator in tests Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3683	2023-01-23 20:10:32 -08:00
Aliaksandr Valialkin	2ac530eb28	lib/{storage,mergeset}: wake up background merges as soon as there is a potential work for them Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3647	2023-01-18 01:10:18 -08:00
Aliaksandr Valialkin	b8409d6600	lib/{storage,mergeset}: do not run assisted merges when flushing pending samples to parts Assisted merges are intended to be performed by goroutines, which accept the incoming samples, in order to limit the data ingestion rate. The worker, which converts pending samples to parts, shouldn't be penalized by assisted merges, since this may result in increased number of pending rows as seen at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3647#issuecomment-1385039142 when the assisted merge takes too much time.	2023-01-18 00:20:58 -08:00
Aliaksandr Valialkin	1ac025bbc9	lib/storage: use better naming for a function returning new []rawRows - newRawRowsBlock() -> newRawRows()	2023-01-18 00:01:03 -08:00
Aliaksandr Valialkin	09d7fa2737	lib/{mergeset,storage}: do not slow down concurrently executed queries during assisted merges Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3647 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3641 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291	2023-01-16 14:31:52 -08:00
Aliaksandr Valialkin	2f3ddd4884	app/vmselect/promql: avoid memory allocations and copying from source timeseries to the returned result at timeseriesToResult()	2023-01-09 22:38:59 -08:00
Aliaksandr Valialkin	7afcca0c51	all: use metricsql.CompileRegexp instead of regexp.Compile for compiling regexps used in graphite queries This should speed up repeated queries, since metricsql.CompileRegexp returns regexps from the cache on subsequent calls for the same input regexp.	2023-01-09 21:43:08 -08:00
Aliaksandr Valialkin	41e00a0df7	lib/storage: simplify the fix from `488940502c` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3566	2023-01-07 01:04:43 -08:00
Dmytro Kozlov	488940502c	lib/storage: fix returning camelcase label names (#3608 ) * lib/storage: fix returning camelcase label names * doc: add change log * Update docs/CHANGELOG.md * Update docs/CHANGELOG.md Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-01-07 00:50:14 -08:00
Aliaksandr Valialkin	c63755c316	lib/writeconcurrencylimiter: improve the logic behind -maxConcurrentInserts limit Previously the -maxConcurrentInserts was limiting the number of established client connections, which write data to VictoriaMetrics. Some of these connections could be idle. Such connections do not consume big amounts of CPU and RAM, so there is a little sense in limiting the number of such connections. So now the -maxConcurrentInserts command-line option limits the number of concurrently executed insert requests, not including idle connections. It is recommended removing -maxConcurrentInserts command-line option, since the default value for this option should work good for most cases.	2023-01-06 22:20:19 -08:00
Roman Khavronenko	5bdd880142	vmstorage: add more context to the flock acquiring msg (#3584 ) See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3578 Signed-off-by: hagen1778 <roman@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-01-05 18:30:42 -08:00
Aliaksandr Valialkin	1b16118e17	lib/{storage,mergeset}: tune the threshold for assisted merge The https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425#issuecomment-1359117221 reveals that CPU usage for incoming queries may significantly increase when the number of in-memory parts becomes too big. This commit reduces the maximum number of in-memory parts before starting the assisted merge during data ingestion. This should reduce CPU usage for incoming queries, since they need to inspect lower number of in-memory parts. This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425	2022-12-28 14:39:24 -08:00
Aliaksandr Valialkin	4e55b67a44	lib/storage: clear the err if it is set to io.EOF when searching for the TSID by metricID This is expected error after when recently added indexdb data isn't available for search yet or wasn't flushed to disk after unclean shutdown of VictoriaMetrics. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3515	2022-12-20 14:05:29 -08:00
Aliaksandr Valialkin	944effca54	lib/storage: do not check for the result returned by db.doExtDB() where this isn't necessary This simplifies the code a bit	2022-12-19 13:23:13 -08:00
Aliaksandr Valialkin	6c98b56935	lib/storage: search for TSIDs for the given metricIDs in the previous indexdb if they aren't found in the current indexdb The issue triggers after the indexdb rotation for time series, which stop receiving new samples. This results in missing data for such time series in query responses. This commit should address the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3502 The issue has been introduced in `2dd93449d8`	2022-12-19 12:03:09 -08:00
Aliaksandr Valialkin	dc0b08efb0	lib/storage: optimize partSearch.searchBHS() for common case when the TSID for the current block header is bigger or equal to the current tsid This should help improving performance at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425	2022-12-19 10:28:03 -08:00
Aliaksandr Valialkin	057fb2120b	lib/storage: properly set buf capacity inside marshalMetricID Previously it was always set to 0. In theory this could result into incorrect marshaling of metricIDs. The issue has been introduced in `5e4dfe50c6`	2022-12-19 10:14:38 -08:00
Aliaksandr Valialkin	ad8852759d	lib/storage: skip missing tsids in the current block header by using binary search This improves performance by up to 10x when big number of the requested TSIDs are missing in the searched parts. This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425	2022-12-14 22:06:51 -08:00
Aliaksandr Valialkin	4de9d35458	lib/flagutil/bytes.go: properly handle values bigger than 2GiB on 32-bit architectures This fixes handling of values bigger than 2GiB for the following command-line flags: - -storage.minFreeDiskSpaceBytes - -remoteWrite.maxDiskUsagePerURL	2022-12-14 19:26:31 -08:00
Aliaksandr Valialkin	0d41d933e9	lib/mergeset: reduce the parts threshold before starting assisted merges This should improve query speed in general case. This is a follow-up for `d1af6046c7`	2022-12-13 09:13:49 -08:00
Aliaksandr Valialkin	d1af6046c7	lib/{mergeset,storage}: do not block small merges by pending big merges - assist with small merges instead Blocked small merges may result into big number of small parts, which, in turn, may result in increased CPU and memory usage during queries, since queries need to inspect all the existing small parts. The issue has been introduced in `8189770c50` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337	2022-12-12 17:00:50 -08:00
Aliaksandr Valialkin	5b9e6b9d24	lib/storage: follow-up after `7c0ae3a86a` - Update docs at https://docs.victoriametrics.com/#deduplication - Optimize the deduplication loop a bit Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3333	2022-12-08 18:16:57 -08:00
Roman Khavronenko	7c0ae3a86a	lib/storage: keep sample with the biggest value on timestamp conflict (#3421 ) The change leaves raw sample with the biggest value for identical timestamps per each `-dedup.minScrapeInterval` discrete interval when the deduplication is enabled. ``` benchstat old.txt new.txt name old time/op new time/op delta DeduplicateSamples/minScrapeInterval=1s-10 817ns ± 2% 832ns ± 3% ~ (p=0.052 n=10+10) DeduplicateSamples/minScrapeInterval=2s-10 1.56µs ± 1% 2.12µs ± 0% +35.19% (p=0.000 n=9+7) DeduplicateSamples/minScrapeInterval=5s-10 1.32µs ± 3% 1.65µs ± 2% +25.57% (p=0.000 n=10+10) DeduplicateSamples/minScrapeInterval=10s-10 1.13µs ± 2% 1.50µs ± 1% +32.85% (p=0.000 n=10+10) name old speed new speed delta DeduplicateSamples/minScrapeInterval=1s-10 10.0GB/s ± 2% 9.9GB/s ± 3% ~ (p=0.052 n=10+10) DeduplicateSamples/minScrapeInterval=2s-10 5.24GB/s ± 1% 3.87GB/s ± 0% -26.03% (p=0.000 n=9+7) DeduplicateSamples/minScrapeInterval=5s-10 6.22GB/s ± 3% 4.96GB/s ± 2% -20.37% (p=0.000 n=10+10) DeduplicateSamples/minScrapeInterval=10s-10 7.28GB/s ± 2% 5.48GB/s ± 1% -24.74% (p=0.000 n=10+10) ``` https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3333 Signed-off-by: hagen1778 <roman@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-12-08 18:06:11 -08:00
Aliaksandr Valialkin	d99d222f0a	lib/{storage,mergeset}: log the duration for flushing in-memory parts on graceful shutdown	2022-12-05 21:30:48 -08:00
Aliaksandr Valialkin	8189770c50	all: add `-inmemoryDataFlushInterval` command-line flag for controlling the frequency of saving in-memory data to disk The main purpose of this command-line flag is to increase the lifetime of low-end flash storage with the limited number of write operations it can perform. Such flash storage is usually installed on Raspberry PI or similar appliances. For example, `-inmemoryDataFlushInterval=1h` reduces the frequency of disk write operations to up to once per hour if the ingested one-hour worth of data fits the limit for in-memory data. The in-memory data is searchable in the same way as the data stored on disk. VictoriaMetrics automatically flushes the in-memory data to disk on graceful shutdown via SIGINT signal. The in-memory data is lost on unclean shutdown (hardware power loss, OOM crash, SIGKILL). Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337	2022-12-05 15:16:14 -08:00
Aliaksandr Valialkin	544ea89f91	lib/{mergeset,storage}: add start background workers via startBackgroundWorkers() function	2022-12-04 00:01:04 -08:00
Aliaksandr Valialkin	33dda2809b	lib/mergeset: panic when too long item is passed to Table.AddItems()	2022-12-03 23:32:16 -08:00
Aliaksandr Valialkin	932c1f90ae	lib/storage: remove duplicate logging for filepath on errors	2022-12-03 23:15:22 -08:00
Aliaksandr Valialkin	044a304adb	lib/storage: pass a single arg - rowsPerBlock - to getCompressLevel() function instead of two args	2022-12-03 23:10:16 -08:00
Aliaksandr Valialkin	cb44976716	lib/{storage,mergeset}: use a single sync.WaitGroup for all background workers This simplifies the code	2022-12-03 23:03:08 -08:00
Aliaksandr Valialkin	28e6d9e1ff	lib/storage: properly pass retentionMsecs to OpenStorage() at TestIndexDBRepopulateAfterRotation	2022-12-03 23:02:10 -08:00
Aliaksandr Valialkin	343c69fc15	lib/{mergeset,storage}: pass compressLevel to blockStreamWriter.InitFromInmemoryPart This allows packing in-memory blocks with different compression levels depending on its contents. This may save memory usage.	2022-12-03 22:46:48 -08:00

1 2 3 4 5 ...

620 Commits