VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-21 07:56:26 +01:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	53c2135d2a	lib/storage: tune the logic for pre-populating of the per-day inverted index for the next day - Postpone the pre-population to the last hour of the current day. This should reduce the number of useless entries in the next per-day index, which shouldn't be created there when the corresponding time series stop being pushed during the current day. - Make the pre-population more smooth in time by using the hash of MetricID instead of MetricID itself when calculating the need for the given MetricID pre-population. - Sync the logic for pre-population of the next day inverted index with the logic of pre-populating tsid cache after indexdb rotation. This should improve code maintainability. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/430 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401	2022-02-12 16:39:33 +02:00
Roman Khavronenko	d107f86fbc	lib/index: reduce read/write load after indexDB rotation (#2177) * lib/index: reduce read/write load after indexDB rotation IndexDB in VM is responsible for storing TSIDs, the IDs used for identifying time series. The index is stored on disk and used by both the ingestion and the read path. IndexDB is stored separately from data parts and is global for all stored data. It can't be deleted partially as VM deletes data parts. Instead, indexDB is rotated once per `retention` interval. The rotation procedure means that `current` indexDB becomes `previous`, and a freshly created indexDB struct becomes `current`. So at any time, VM holds indexDB for the current and previous retention periods. When a time series is ingested or queried, VM checks if its TSID is present in `current` indexDB. If it is missing, it checks the `previous` indexDB. If the TSID was found, it gets copied to the `current` indexDB. In this way `current` indexDB stores only series which were active during the retention period. To improve indexDB lookups, VM uses a cache layer called `tsidCache`. Both write and read path consult `tsidCache`, and on a miss the real lookup happens. When rotation happens, VM resets the `tsidCache`. This is needed for the ingestion path to trigger `current` indexDB re-population. Since index re-population requires additional resources, every index rotation event may cause some extra load on CPU and disk. While it may be unnoticeable in most cases, for systems with a very high number of unique series each rotation may lead to performance degradation for some period of time. This PR makes an attempt to smooth out resource usage after the rotation. The changes are the following: 1. `tsidCache` is no longer reset after the rotation; 2. Instead, each entry in `tsidCache` gains a notion of the indexDB to which it belongs; 3. On the ingestion path after the rotation we check if the requested TSID was found in `tsidCache`. Then we have 3 branches: 3.1 Fast path. It was found, and belongs to the `current` indexDB. Return TSID. 3.2 Slow path. It wasn't found, so we generate it from scratch, add it to `current` indexDB, add it to `tsidCache`. 3.3 Smooth path. It was found but does not belong to the `current` indexDB. In this case, we add it to the `current` indexDB with some probability. The probability is based on the time passed since the last rotation with some threshold. The more time has passed since the rotation, the higher the chance to re-populate `current` indexDB. The default re-population interval in this PR is set to `1h`, during which entries from the `previous` index are supposed to slowly re-populate the `current` index. The new metric `vm_timeseries_repopulated_total` was added to identify how many TSIDs were moved from `previous` indexDB to the `current` indexDB. This metric is supposed to grow only during the first `1h` after the last rotation. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 Signed-off-by: hagen1778 <roman@victoriametrics.com> * wip * wip Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-02-12 00:34:44 +02:00
Aliaksandr Valialkin	4e05298756	lib/storage: properly limit cardinality when ingesting multiple samples for the same time series in a single request	2022-01-21 12:38:22 +02:00
Nikolay	6cdc934c3d	adds restore.lock (#1988) * adds restore.lock it must prevent from running storage after incomplete restore process https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1958 * return back flock file deletion * Apply suggestions from code review * wip * docs/CHANGELOG.md: document https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1958 Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2021-12-22 13:10:56 +02:00
Aliaksandr Valialkin	727797a6fd	all: use logger.WithThrottler() where appropriate	2021-12-21 17:10:54 +02:00
Aliaksandr Valialkin	c922c7af9a	lib/storage: convert alternate regexps into Graphite wildcards inside `__graphite__` pseudo-label For example, `{__graphite__=~"foo.(bar\|baz)"}` is automatically converted to `{__graphite__=~"foo.{bar,baz}"}` before execution. This allows using multi-value Grafana template variables such as `{__graphite__=~"foo.($app)"}`.	2021-12-14 19:55:59 +02:00
Aliaksandr Valialkin	ab4be24397	app/vmstorage: export vm_cache_size_max_bytes metrics for determining capacity of various caches The vm_cache_size_max_bytes metric can be used for determining caches which reach their capacity via the following query: vm_cache_size_bytes / vm_cache_size_max_bytes > 0.9	2021-12-02 10:30:01 +02:00
Aliaksandr Valialkin	93511b4be7	lib/storage: log a warning when the -storageDataPath has less than -storage.minFreeDiskSpaceBytes This should improve the debuggability of the readonly feature. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1727	2021-10-19 23:58:09 +03:00
Aliaksandr Valialkin	a7a1305395	lib/storage: fix unaligned access on 32-bit architectures. The bug has been introduced at `a171916ef5`	2021-10-08 19:38:20 +03:00
Aliaksandr Valialkin	4fddcf4c83	app/{vminsert,vmstorage}: follow-up after `a171916ef5` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269	2021-10-08 14:09:51 +03:00
Nikolay	a171916ef5	Adds read-only mode for vmstorage node (#1680) * adds read-only mode for vmstorage https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269 * changes order a bit * moves isFreeDiskLimitReached var to storage struct renames functions to be consistent change protoparser api - with optional storage limit check for given opened storage * renames freeSpaceLimit to ReadOnly	2021-10-08 12:52:56 +03:00
Aliaksandr Valialkin	c4df601f43	lib/promscrape: add the ability to limit the number of unique series per each scrape target The number of series per target can be limited with the following options: * Global limit with `-promscrape.maxSeriesPerTarget` command-line option. * Per-target limit with `max_series: N` option in `scrape_config` section. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1561	2021-09-01 16:08:12 +03:00
Aliaksandr Valialkin	c1f81f08d4	all: add support for Prometheus staleness markers Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1509 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1530 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/845	2021-08-13 12:13:15 +03:00
Aliaksandr Valialkin	e992754e79	lib/storage: remove cache directory if it contains reset_cache_on_startup file See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1447	2021-07-13 17:59:51 +03:00
Aliaksandr Valialkin	3f705fe8d7	lib/storage: properly limit the size of `storage/date_metricID` cache	2021-07-12 14:25:28 +03:00
Aliaksandr Valialkin	74ace9340d	lib/storage: periodically reset prefetchedMetricIDs cache in order to limit its size under high churn rate	2021-07-07 10:59:39 +03:00
Aliaksandr Valialkin	51516b96e6	lib/storage: tune cache sizes according to production workload	2021-07-05 15:14:45 +03:00
Aliaksandr Valialkin	b84aea1e6e	lib/protoparser/clusternative: do not pool unmarshalWork structs, since they can occupy big amounts of memory (more than 100MB per each struct) This should reduce memory usage for vmstorage under high ingestion rate when the vmstorage runs on a system with big number of CPU cores	2021-06-23 15:45:08 +03:00
Aliaksandr Valialkin	b133de1e37	lib/storage: move deletedMetricIDs set from indexDB to Storage This makes the list of deleted metricIDs consistent when it is used from both the current indexDB and the previous indexDB (aka extDB). This should fix the issue, which could lead to storing new samples under deleted metricIDs after indexDB rotation. See more details at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347#issuecomment-861232136 . Thanks to @tangqipengleoo for the initial analysis and the pull request - https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 . This commit resolves the issue in more generic way compared to https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 . The downside of the commit is the deletedMetricIDs set isn't cleaned from the metricIDs outside the retention. It needs app restart. This should be OK in most cases.	2021-06-15 15:07:54 +03:00
Aliaksandr Valialkin	ce10bdc82a	lib/storage: reset cache on disk during series deletion and during indexdb rotation This should prevent from inconsistent behavior (aka partially missing data for some time series) after unclean shutdown. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347	2021-06-11 12:54:36 +03:00
Aliaksandr Valialkin	fc2565b4ee	lib/storage: reduce memory allocations when syncing dateMetricIDCache	2021-06-03 16:20:02 +03:00
Aliaksandr Valialkin	10b2855949	lib/storage: fix spelling typo: `borken->broken` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1336	2021-05-27 12:09:17 +03:00
Aliaksandr Valialkin	1c16cbacf5	lib/storage: do not stop data ingestion on the first error in Storage.AddRows Continue data ingestion for the rest of blocks.	2021-05-24 15:32:24 +03:00
Aliaksandr Valialkin	2601844de3	lib/storage: limit the number of rows per each block in Storage.AddRows() This should reduce memory usage when ingesting big blocks of rows.	2021-05-24 15:32:24 +03:00
Aliaksandr Valialkin	402a8ca710	lib/storage: do not populate MetricID->MetricName cache during data ingestion This cache isn't needed during data ingestion, so there is no need in spending RAM on it. This reduces RAM usage on data ingestion path by 30%	2021-05-24 03:06:40 +03:00
Aliaksandr Valialkin	165a9f9200	app/vmstorage: add ability to limit series cardinality via `-storage.maxHourlySeries` and `-storage.maxDailySeries` command-line flags	2021-05-20 15:31:57 +03:00
Aliaksandr Valialkin	2839055513	lib/storage: substitute GetTSDBStatusForDate with GetTSDBStatusWithFiltersForDate with nil tfss	2021-05-13 09:01:05 +03:00
Nikolay	be87be34a4	Adds tsdb match filters (#1282) * init work on filters * init propose for status filters * fixes tsdb status adds test * fix bug * removes checks from test	2021-05-12 17:16:58 +03:00
Aliaksandr Valialkin	4a5f45c77e	app/vminsert: add support for data ingestion via other vminsert nodes	2021-05-08 19:53:45 +03:00
Aliaksandr Valialkin	512addc608	app/{vminsert,vmagent}: add `-sortLabels` command-line option for sorting time series labels before ingesting them in the storage This option can be useful when samples for the same time series are ingested with distinct order of labels. For example, metric{k1="v1",k2="v2"} and metric{k2="v2",k1="v1"}.	2021-03-31 23:27:21 +03:00
Aliaksandr Valialkin	ae1c653d55	lib/storage: reduce memory usage when ingesting samples for the same time series with distinct order of labels	2021-03-31 21:22:40 +03:00
Aliaksandr Valialkin	9c2be144cf	app/vmselect: log the metric which trigger rollup result cache reset This should help finding the source of stale metrics	2021-03-25 21:32:28 +02:00
Aliaksandr Valialkin	12ca0efc19	lib/storage: respect the deadline passed to Storage.SearchMetricNames	2021-03-22 23:03:00 +02:00
Aliaksandr Valialkin	40e47935e7	lib/storage: improve Search.NextMetricBlock performance by using MetricID->MetricName cache	2021-03-22 23:02:59 +02:00
Aliaksandr Valialkin	7503111feb	lib/storage: small code simplification after `6cee5338b2`	2021-03-18 15:22:39 +02:00
Aliaksandr Valialkin	4443254fb9	lib/storage: prevent from infinite loop if `{__graphite__="..."}` filter matches a metric name with `*`, `[` or `{` chars The idea has been borrowed from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1137	2021-03-18 14:57:39 +02:00
John Belmonte	edf39aa225	spelling fix: adjacent (#1115)	2021-03-09 09:19:16 +02:00
Aliaksandr Valialkin	1a19702d92	lib/storage: make sure that nobody uses partitions when closing the table	2021-02-17 15:02:18 +02:00
Aliaksandr Valialkin	46e98ed490	vendor: update github.com/VictoriaMetrics/metrics from v1.13.1 to v1.14.0 The new version switches from log-linear histograms to log-based histograms, which provide up to 3.6 times better accuracy.	2021-02-15 15:11:15 +02:00
Aliaksandr Valialkin	93ff866e91	lib/storage: reduce the minimum supported retention for inverted index from one month to one day	2021-02-15 15:11:15 +02:00
Aliaksandr Valialkin	eeb92eb7fc	lib/storage: load metadata before loading indexdb, since indexdb depends on the metadata	2021-02-10 17:55:51 +02:00
Aliaksandr Valialkin	681dfb7485	lib/storage: fix inconsistencies in error logs	2021-02-10 16:32:21 +02:00
Aliaksandr Valialkin	148422bcba	lib/storage: disable composite index usage when querying old data	2021-02-10 14:57:58 +02:00
Aliaksandr Valialkin	5c9715a89a	lib/storage: reduce lock contention in dateMetricIDCache when registering new time series for the current day This should help systems with multiple CPU cores	2021-02-10 00:04:19 +02:00
Aliaksandr Valialkin	9ed7789fef	optimize Storage.updatePerDateData()	2021-02-09 02:59:53 +02:00
Aliaksandr Valialkin	2dbb12563b	lib/storage: optimize data ingestion in the beginning of every hour Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1046	2021-02-08 12:04:51 +02:00
Aliaksandr Valialkin	c6a7288109	lib/storage: check for prevHourMetricIDs cache before falling back to checking for (date, metricID) entries during data ingestion This should reduce possible CPU usage spikes at the beginning of every hour. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1046	2021-02-04 18:46:23 +02:00
Aliaksandr Valialkin	8249f13104	app/vmselect,lib/storage: properly parse Graphite selectors with inner wildcards Example: foo{bar{x,yz},a[b-c],*de}	2021-02-03 20:16:28 +02:00
Aliaksandr Valialkin	2976ec89b8	lib/storage: fix a bug, which breaks searching by Graphite wildcard filters	2021-02-03 20:15:50 +02:00
Aliaksandr Valialkin	4b930b9ffe	app/vmselect: add ability to set Graphite-compatible filter via `{__graphite__="foo.*.bar"}` syntax	2021-02-03 01:17:19 +02:00
Aliaksandr Valialkin	1a237c6903	all: properly handle CPU limits set on the host system/container This can reduce memory usage on systems with enabled CPU limits. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/946	2020-12-08 21:07:03 +02:00
Aliaksandr Valialkin	03002f1fe1	lib/storage: log metric name plus all its labels when the metric timestamp is outside the configured retention This should simplify debugging when the source of the metric with unexpected timestamp must be found.	2020-11-25 14:44:29 +02:00
Aliaksandr Valialkin	4848a05924	lib/storage: typo fix in error message: allowd->allowed	2020-11-25 14:15:54 +02:00
Aliaksandr Valialkin	a9287cf564	lib/storage: do not pass (accountID, projectID) to SearchTagNames(), since they are already passed via tfss	2020-11-16 18:04:30 +02:00
Aliaksandr Valialkin	eea1be0d5c	app/vmselect/graphite: add /tags/findSeries handler from Graphite Tags API See https://graphite.readthedocs.io/en/stable/tags.html#exploring-tags	2020-11-16 12:52:23 +02:00
Aliaksandr Valialkin	4be5b5733a	app/vminsert: add `/tags/tagSeries` and `/tags/tagMultiSeries` handlers from Graphite Tags API See https://graphite.readthedocs.io/en/stable/tags.html#adding-series-to-the-tagdb	2020-11-16 02:40:04 +02:00
immerrr again	1ec1a9f27f	app/vmstorage: add "/internal/force_flush" endpoint (#893)	2020-11-11 14:46:37 +02:00
Aliaksandr Valialkin	c5e6c5f5a6	app/vmselect: optimize querying for `/api/v1/labels` and `/api/v1/label/<name>/values` when `start` and `end` args are set	2020-11-05 01:19:29 +02:00
Aliaksandr Valialkin	f3a7e6f6e3	lib/storage: remove obsolete code	2020-11-02 19:17:30 +02:00
Aliaksandr Valialkin	9c5cd5a6c5	lib/storage: code cleanup after `5bfd4e6218`	2020-10-20 16:10:53 +03:00
Aliaksandr Valialkin	0db7c2b500	app/vmstorage: support for `-retentionPeriod` smaller than one month Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/173 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/17	2020-10-20 14:42:46 +03:00
Aliaksandr Valialkin	d2e917d1cb	app/vmstorage: add `vm_rows_added_to_storage_total` metric, which shows the total number of rows added to storage since app start	2020-10-09 13:36:17 +03:00
Aliaksandr Valialkin	fd7dd5064a	lib/storage: code cleanup after `10f2eedee0` Remove the code that uses metricIDs caches for the current and the previous hour during metricIDs search, since this code became unused after implementing per-day inverted index almost a year ago. While at it, fix a bug, which could prevent from finding time series with names containing dots (aka Graphite-like names such as `foo.bar.baz`).	2020-10-01 19:12:04 +03:00
Aliaksandr Valialkin	778ea183ca	lib/decimal: properly store Inf values Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/752	2020-09-18 19:08:53 +03:00
Aliaksandr Valialkin	d96858b921	lib/storage: add `/internal/force_merge` handler for running forced compactions on historical per-month partitions This may be useful for freeing up storage space after time series deletion. See https://victoriametrics.github.io/#force-merge for more details. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/686	2020-09-17 12:20:56 +03:00
Aliaksandr Valialkin	81c05f669b	lib/storage: do not store inf values, since they may lead to significant precision loss for previously stored values Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/752	2020-09-11 14:45:20 +03:00
Aliaksandr Valialkin	f307e6f432	app/vmselect: initial implementation of Graphite Metrics API See https://graphite-api.readthedocs.io/en/latest/api.html#the-metrics-api	2020-09-11 00:30:20 +03:00
Aliaksandr Valialkin	f5cb213ef9	lib/storage: reuse timestamp blocks for adjacent metric blocks with identical timestamps This should reduce disk space usage when scraping targets containing metrics with identical names such as `node_cpu_seconds_total`, histograms, quantiles, etc. Expose `vm_timestamps_blocks_merged_total` and `vm_timestamps_bytes_saved_total` metrics for monitoring the effectiveness of timestamp blocks merging.	2020-09-09 23:59:21 +03:00
Aliaksandr Valialkin	f92255e803	lib/storage: mention tag filters used in the query that led to error message This should improve detecting invalid or heavy queries that lead to errors.	2020-08-10 13:36:54 +03:00
Aliaksandr Valialkin	b3d4ff7ee2	app/vmstorage: improve error logging when the request times out	2020-08-10 13:17:24 +03:00
Aliaksandr Valialkin	307281e922	lib/storage: slow down concurrent searches when the number of concurrent inserts reaches the limit This should improve data ingestion performance when heavy searches are executed See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648 See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/618	2020-08-07 08:49:13 +03:00
Aliaksandr Valialkin	dd1d59f57a	lib/storage: properly check timeouts and pace limits Previously they were checked on every iteration for small number of iterations	2020-08-07 08:40:56 +03:00
Aliaksandr Valialkin	13f8644f8e	lib/storage: optimize prefetching metric names for the given metricIDs	2020-08-06 16:52:58 +03:00
Aliaksandr Valialkin	a3e91c593b	lib/storage: limit the number of concurrent calls to storage.searchTSIDs to GOMAXPROCS*2 This should limit the maximum memory usage and reduce CPU thrashing on vmstorage when multiple heavy queries are executed. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648	2020-08-05 18:27:21 +03:00
Aliaksandr Valialkin	96039dcb40	lib/storage: properly update `vm_slow_row_inserts_total` metric when importing multiple data points per time series at once Previously the `vm_slow_row_inserts_total` metric may be incremented multiple times for different data points per a single time series, while only a single increment is needed when inserting the first data point for this time series.	2020-07-30 16:17:19 +03:00
Aliaksandr Valialkin	94cc677b0c	lib/storage: slightly reduce code difference between single-node and cluster versions	2020-07-24 01:18:05 +03:00
Aliaksandr Valialkin	fb3d1380ac	lib/storage: respect `-search.maxQueryDuration` when searching for time series in inverted index Previously the time spent on inverted index search could exceed the configured `-search.maxQueryDuration`. This commit stops searching in inverted index on query timeout.	2020-07-23 21:22:05 +03:00
Aliaksandr Valialkin	dbf3038637	lib/storage: add more fine-grained pace limiting for search	2020-07-23 19:21:49 +03:00
Aliaksandr Valialkin	b8303afcd8	lib/storage: improve prioritizing of data ingestion over querying Prioritize also small merges over big merges. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648	2020-07-23 01:40:38 +03:00
Aliaksandr Valialkin	754eac676d	lib/storage: prevent possible race condition when all the goroutines exit Storage.AddRows before other goroutines are blocked on searchTSIDsCond inside Storage.searchTSIDs This condition may occur after the following sequence of events: 1) A goroutine enters the loop body when len(addRowsConcurrencyCh) == cap(addRowsConcurrencyCh) inside Storage.searchTSIDs. 2) All the goroutines return from Storage.AddRows. 3) The goroutine from step 1 blocks on searchTSIDsCond.Wait() inside the loop body. The goroutine remains blocked until the next call to Storage.AddRows, which calls searchTSIDsCond.Signal(). This may take indefinite time.	2020-07-22 21:52:42 +03:00
Aliaksandr Valialkin	67be79a0bc	lib/uint64set: optimize adding items to the set via Set.AddMulti	2020-07-21 20:57:05 +03:00
Aliaksandr Valialkin	be0ab4fbfe	lib/storage: reset `MetricName->TSID` cache after marking metricIDs as deleted This is a follow-up commit after `12b16077c4` , which didn't reset the `tsidCache` in all the required places. This could result in indefinite errors like: missing metricName by metricID ...; this could be the case after unclean shutdown; deleting the metricID, so it could be re-created next time Fix this by resetting the cache inside deleteMetricIDs function.	2020-07-14 14:05:19 +03:00
Aliaksandr Valialkin	7335743d57	lib/storage: limit the maximum concurrency for data ingestion to GOMAXPROCS Previously the concurrency has been limited to GOMAXPROCS*2. This had little sense, since every call to Storage.AddRows is bound to CPU, so the maximum ingestion bandwidth is achieved when the number of concurrent calls to Storage.AddRows is limited to the number of CPUs, i.e. to GOMAXPROCS.	2020-07-08 17:34:27 +03:00
Aliaksandr Valialkin	fad008df7e	lib/storage: clarify `out of retention period` error message by mentioning `-retentionPeriod` command-line flag	2020-07-08 13:54:13 +03:00
Aliaksandr Valialkin	fe58462bef	lib/storage: reset MetricName->TSID cache after deleting time series This should prevent from adding new data points to deleted time series without the need to check for the deleted time series. This improves ingestion performance a bit when the `deleted time series ids` aka `dmis` set contains big number of time series. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/596 Based on the idea from @n4mine at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/604	2020-07-06 22:01:24 +03:00
Aliaksandr Valialkin	0bff96fe4b	lib/storage: prioritize data ingestion over heavy queries Heavy queries could result in the lack of CPU resources for processing the current data ingestion stream. Prevent this by delaying queries' execution until free resources are available for data ingestion. Expose `vm_search_delays_total` metric, which may be used for alerting when there are not enough CPU resources for data ingestion and/or for executing heavy queries. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291	2020-07-05 19:44:04 +03:00
Aliaksandr Valialkin	4cb3e7595c	app/vmstorage: add `-denyQueriesOutsideRetention` command-line flag for denying queries outside the configured retention	2020-07-01 00:58:42 +03:00
Aliaksandr Valialkin	d962568e93	all: use %w instead of %s for wrapping errors in `fmt.Errorf` This will simplify examining the returned errors such as httpserver.ErrorWithStatusCode . See https://blog.golang.org/go1.13-errors for details.	2020-06-30 23:33:46 +03:00
Aliaksandr Valialkin	2a8f1e6931	lib/storage: do not increment `vm_slow_metric_name_loads_total` counter for metric_ids which shouldn't be prefetched, since this may mislead users	2020-05-16 10:23:39 +03:00
Aliaksandr Valialkin	1e5c1d7eaa	app/vmstorage: add `vm_slow_metric_name_loads_total` metric, which could be used as an indicator when more RAM is needed for improving query performance	2020-05-15 14:12:24 +03:00
Aliaksandr Valialkin	d6b9a49481	app/vmstorage: add `vm_slow_row_inserts_total` and `vm_slow_per_day_index_inserts_total` metrics for determining whether VictoriaMetrics required more RAM for the current number of active time series	2020-05-15 13:46:57 +03:00
Aliaksandr Valialkin	67e331ac62	lib/storage: optimize ingestion performance for new time series	2020-05-15 12:12:19 +03:00
Aliaksandr Valialkin	1b5d272e07	lib/storage: reduce indentation in Storage.add	2020-05-14 23:23:56 +03:00
Aliaksandr Valialkin	71d29a8fa1	lib/storage: return the first error instead of the last error, since the first error usually points to the root cause	2020-05-14 23:18:59 +03:00
Aliaksandr Valialkin	3845420a8f	lib: extract common code for returning fast unix timestamp into lib/fasttime	2020-05-14 23:06:50 +03:00
Aliaksandr Valialkin	f7753b1469	lib/storage: gradually pre-populate per-day inverted index for the next day This should prevent from CPU usage spikes at 00:00 UTC every day when inverted index for new day must be quickly created for all the active time series. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/430	2020-05-12 12:13:32 +03:00
Aliaksandr Valialkin	f9526809e5	app/vmselect: add `/api/v1/status/tsdb` page with useful stats for locating root cause for high cardinality issues See https://prometheus.io/docs/prometheus/latest/querying/api/#tsdb-stats Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/425 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/268	2020-04-22 22:03:23 +03:00
Aliaksandr Valialkin	7cdac6634c	lib/storage: serialize snapshot creation process with mutex This guarantees that the snapshot contains all the recently added data from inmemory buffers when multiple concurrent calls to Storage.CreateSnapshot are performed.	2020-03-24 22:27:28 +02:00
Aliaksandr Valialkin	cf9aee4ec3	all: properly split `vm_deduplicated_samples_total` among cluster components Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/345	2020-02-27 23:47:51 +02:00
Aliaksandr Valialkin	a2b81b71b9	lib/storage: typo fix	2020-02-16 15:53:48 +02:00
Aliaksandr Valialkin	ad4cb9f3ca	lib/storage: prevent from clobbering non-nil lastError in Storage.add	2020-02-16 15:51:35 +02:00
Aliaksandr Valialkin	347aaba79d	lib/{storage,mergeset}: use time.Ticker instead of time.Timer where appropriate It turned out that time.Timer was used in places where time.Ticker must be used instead. This could result in blocked goroutines as in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/316 .	2020-02-13 13:21:48 +02:00
Aliaksandr Valialkin	a45f25699c	lib/storage: re-use indexSearch inside Storage.prefetchMetricNames	2020-01-31 01:18:53 +02:00
Aliaksandr Valialkin	1332ddc15e	lib/storage: pass missing AccountID and ProjectID to searchMetricName	2020-01-30 15:16:16 +02:00
Aliaksandr Valialkin	4ed5e9a7ce	lib/storage: pre-fetch metricNames for the found metricIDs in Search.Init This should speed up Search.NextMetricBlock loop for big number of found time series.	2020-01-30 15:16:16 +02:00
Aliaksandr Valialkin	ea53a21b02	all: consistently log durations in seconds with millisecond precision This should improve logs readability	2020-01-22 18:35:24 +02:00
Aliaksandr Valialkin	7d429e2806	lib/uint64set: reduce memory usage in Union, Intersect and Subtract methods Iterate items with newly added Set.ForEach method instead of allocating `[]uint64` slice for all the items before the iteration.	2020-01-15 12:15:48 +02:00
Aliaksandr Valialkin	8d79412b26	lib/storage: optimize bulk import performance when multiple data points are inserted for the same time series This should speed up `/api/v1/import` and make it more scalable on multi-core systems.	2019-12-19 15:13:36 +02:00
Aliaksandr Valialkin	5a62415bec	lib/storage: protect from time drift during indexdb rotation Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/248	2019-12-02 14:43:11 +02:00
Aliaksandr Valialkin	d297b65089	lib/storage: add `vm_cache_size_bytes{type="storage/hour_metric_ids"}` metric	2019-11-13 20:26:05 +02:00
Aliaksandr Valialkin	494ad0fdb3	lib/storage: remove inmemory index for recent hour, since it uses too much memory Production workload shows that the index requires ~4Kb of RAM per active time series. This is too much for high number of active time series, so let's delete this index. Now the queries should fall back to the index for the current day instead of the index for the recent hour. The query performance for the current day index should be good enough given the 100M rows/sec scan speed per CPU core.	2019-11-13 18:08:58 +02:00
Aliaksandr Valialkin	633dd81bb5	lib/storage: add `-disableRecentHourIndex` flag for disabling inmemory index for recent hour This may be useful for saving RAM on high number of time series aka high cardinality	2019-11-13 15:10:12 +02:00
Aliaksandr Valialkin	f1620ba7c0	lib/storage: fix inmemory inverted index issues found in v1.29 Issues fixed: - Slow startup times. Now the index is loaded from cache during start. - High memory usage related to superfluous index copies every 10 seconds.	2019-11-13 13:35:38 +02:00
Oleg Kovalov	74ba42d111	fix misspelled words (#229)	2019-11-12 00:18:24 +02:00
Aliaksandr Valialkin	c48e39eea9	lib/storage: add tests for dateMetricIDCache	2019-11-11 13:21:05 +02:00
Aliaksandr Valialkin	6bdde0d6d4	lib/storage: eliminate data race when updating lastSyncTime in dateMetricIDCache.Has	2019-11-10 22:04:23 +02:00
Aliaksandr Valialkin	9ea2bd822e	lib/storage: implement per-day inverted index	2019-11-10 00:20:32 +02:00
Aliaksandr Valialkin	dea2f3efed	lib/storage: use specialized cache for (date, metricID) entries This improves ingestion performance.	2019-11-09 23:09:18 +02:00
Aliaksandr Valialkin	0063c857f5	lib/storage: add inmemory inverted index for the last hour It should improve performance for `last N hours` dashboards with update intervals smaller than 1 hour.	2019-11-08 19:37:46 +02:00
Aliaksandr Valialkin	6a22727676	lib/storage: optimize getMetricIDsForRecentHours for per-tenant lookups	2019-10-31 15:51:09 +02:00
Aliaksandr Valialkin	ca480915ca	lib/storage: small cleanup in Storage.add	2019-10-31 14:30:22 +02:00
hanzai	52778da1f3	warns during rows addition (#214)	2019-10-20 23:38:51 +03:00
Aliaksandr Valialkin	5b01b7fb01	all: add support for GOARCH=386 and fix all the issues related to 32-bit architectures such as GOARCH=arm Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212	2019-10-17 18:27:49 +03:00
Aliaksandr Valialkin	de0e4eee2c	lib/storage: create and use `lib/uint64set` instead of `map[uint64]struct{}` This should improve inverted index search performance for filters matching big number of time series, since `lib/uint64set.Set` is faster than `map[uint64]struct{}` for both `Add` and `Has` calls. See the corresponding benchmarks in `lib/uint64set`.	2019-09-24 21:18:04 +03:00
Aliaksandr Valialkin	7734fc8012	lib/storage: share tsids across all the partSearch instances This should reduce memory usage when big number of time series matches the given query.	2019-09-23 22:36:16 +03:00
Aliaksandr Valialkin	e2eac858b5	lib/storage: calculate the maximum number of rows per small part from `-memory.allowedPercent` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/159 This simplifies error detection additionally to the `vm_rows_ignored_total` counters.	2019-08-25 15:29:09 +03:00
Aliaksandr Valialkin	8c2158af24	all: use workingsetcache instead of fastcache This should reduce the amount of RAM required for processing time series with non-zero churn rate. The previous cache behavior can be restored with `-cache.oldBehavior` command-line flag.	2019-08-13 21:40:28 +03:00
Aliaksandr Valialkin	39f3f3a517	lib: move common code for creating flock.lock file into fs.CreateFlockFile	2019-08-13 01:46:20 +03:00
Aliaksandr Valialkin	f56c1298ad	app/vmstorage: add `vm_concurrent_addrows_*` metrics for tracking concurrency for Storage.AddRows calls Track also the number of dropped rows due to the exceeded timeout on concurrency limit for Storage.AddRows. This number is tracked in `vm_concurrent_addrows_dropped_rows_total`	2019-08-06 15:08:43 +03:00
Aliaksandr Valialkin	c6bec48927	lib/storage: add metrics for calculating skipped rows outside the retention The metrics are: - vm_too_big_timestamp_rows_total - vm_too_small_timestamp_rows_total	2019-07-26 14:11:56 +03:00
Aliaksandr Valialkin	4ca66344ee	lib/storage: do not pollute inverted index with data for samples outside the retention period	2019-07-11 17:11:33 +03:00
Aliaksandr Valialkin	ba8195c58e	all: consistency renaming: bytesSize -> sizeBytes	2019-07-10 00:47:42 +03:00
Aliaksandr Valialkin	a0c22a6830	app/vmstorage: add `vm_cache_entries{type="storage/hour_metric_ids"}` metric for tracking active time series count	2019-06-19 18:37:38 +03:00
Aliaksandr Valialkin	f9e1d32168	lib/storage: persist metric ids for the current and the previous hour on graceful shutdown This should improve performance after restart when the db contains a lot of time series with high time series churn (i.e. metrics from Kubernetes with many pods and frequent deployments)	2019-06-14 07:55:09 +03:00
Aliaksandr Valialkin	18d6f293f7	lib/fs: consolidate RemoveAll funcs into a single MustRemoveAll func The func syncs parent dir in order to persist directory removal in the event of power loss	2019-06-12 01:55:18 +03:00
Aliaksandr Valialkin	51e2e255a6	lib/fs: consistency renaming SyncPath -> MustSyncPath, since it doesn't return error	2019-06-11 23:13:45 +03:00
Aliaksandr Valialkin	3437c30180	all: try hard removing directory with contents Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/61	2019-06-11 01:58:08 +03:00
Aliaksandr Valialkin	75a0acf72d	app/vmselect: add `/api/v1/labels/count` handler for quick detection of labels with the maximum number of distinct values	2019-06-10 19:54:55 +03:00
Aliaksandr Valialkin	d882afa905	lib/storage: optimize time series lookup for recent hours when the db contains many millions of time series with high churn rate (aka frequent deployments in Kubernetes)	2019-06-09 19:14:04 +03:00
Aliaksandr Valialkin	a2986cde70	lib/storage: tune updating a map with today's metric ids - Increase update interval from 1s to 10s. This should reduce CPU usage for large amounts of metric ids with constant churn. - Reduce pendingTodayMetricIDsLock lock duration during the update.	2019-06-02 22:00:13 +03:00
Aliaksandr Valialkin	e27fd5148a	lib/storage: speed up checking metricID existence in the list for the current date	2019-06-02 18:34:20 +03:00
Aliaksandr Valialkin	a6d02ff275	lib/timerpool: use timer pool in concurrency limiters This should reduce the number of memory allocations in highly loaded system	2019-05-28 17:30:10 +03:00
Aliaksandr Valialkin	bdf696ef18	all: fix misspellings	2019-05-25 21:51:24 +03:00
Aliaksandr Valialkin	24578b4bb1	all: open-sourcing cluster version	2019-05-23 00:25:38 +03:00
Aliaksandr Valialkin	1836c415e6	all: open-sourcing single-node version	2019-05-23 00:18:06 +03:00

245 Commits