Change the return values for these functions: they now return the unmarshaled result plus
the size of the unmarshaled result in bytes, so the caller can re-slice src for further unmarshaling.
This slightly improves the performance of these functions in hot loops of VictoriaLogs.
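For illustration, a minimal sketch of the new calling convention; unmarshalUint64 is a hypothetical stand-in for the real functions, not the actual API:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// unmarshalUint64 returns the decoded value plus the number of bytes read,
// so the caller can re-slice src without extra bookkeeping.
func unmarshalUint64(src []byte) (uint64, int, error) {
	if len(src) < 8 {
		return 0, 0, fmt.Errorf("src is too short: got %d bytes; want at least 8", len(src))
	}
	return binary.BigEndian.Uint64(src), 8, nil
}

func main() {
	src := []byte{0, 0, 0, 0, 0, 0, 0, 42, 0, 0, 0, 0, 0, 0, 0, 7}
	for len(src) > 0 {
		v, n, err := unmarshalUint64(src)
		if err != nil {
			panic(err)
		}
		fmt.Println(v)
		src = src[n:] // re-slice past the consumed bytes for further unmarshaling
	}
}
```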
- Consistently return the first `limit` log entries if the total size of the found log entries doesn't exceed 1MB.
See app/vlselect/logsql/sort_writer.go and the sketch below. Previously random log entries could be returned with each request.
- Document the change at docs/VictoriaLogs/CHANGELOG.md
- Document the `limit` query arg at docs/VictoriaLogs/querying/README.md
- Make the change less intrusive.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5674
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5778
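A rough sketch of the buffer-then-sort idea behind sort_writer.go; the names and the threshold handling are illustrative, not the actual implementation:

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

const maxBufLen = 1024 * 1024 // 1MB

type sortWriter struct {
	buf      []byte
	overflow bool // set once the buffered entries exceed maxBufLen
}

func (sw *sortWriter) writeEntry(p []byte) {
	if sw.overflow {
		return // too much data to sort in memory; entries pass through unsorted
	}
	if len(sw.buf)+len(p) > maxBufLen {
		sw.overflow = true
		return
	}
	sw.buf = append(sw.buf, p...)
}

// finalize sorts the buffered newline-delimited entries, so the first
// `limit` entries are deterministic across identical requests.
func (sw *sortWriter) finalize(limit int) [][]byte {
	lines := bytes.Split(bytes.TrimSuffix(sw.buf, []byte("\n")), []byte("\n"))
	sort.Slice(lines, func(i, j int) bool {
		return bytes.Compare(lines[i], lines[j]) < 0
	})
	if limit < len(lines) {
		lines = lines[:limit]
	}
	return lines
}

func main() {
	var sw sortWriter
	sw.writeEntry([]byte("b\n"))
	sw.writeEntry([]byte("a\n"))
	for _, line := range sw.finalize(1) {
		fmt.Println(string(line))
	}
}
```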
This should smooth out CPU and RAM usage spikes related to these periodic tasks
by reducing the probability that multiple periodic tasks run at the same time.
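A minimal sketch of the jitter idea, assuming a simple sleep-based loop; the interval and the ±10% jitter fraction are illustrative:

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

func runPeriodically(interval time.Duration, task func()) {
	for {
		// Sleep for interval ± up to 10%, so tasks started at the same
		// moment drift apart instead of repeatedly firing together.
		jitter := time.Duration(rand.Int63n(int64(interval) / 5))
		time.Sleep(interval - interval/10 + jitter)
		task()
	}
}

func main() {
	go runPeriodically(time.Second, func() { fmt.Println("periodic flush") })
	go runPeriodically(time.Second, func() { fmt.Println("periodic merge") })
	time.Sleep(3 * time.Second)
}
```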
- Move uniqueFields from rows to blockStreamMerger struct.
This allows localizing all the references to uniqueFields inside blockStreamMerger.mustWriteBlock(),
which should improve readability and maintainability of the code.
- Remove logging of the event when blocks cannot be merged because they contain more than maxColumnsPerBlock columns,
since the logged message didn't offer a solution to the problem of too many columns.
I couldn't figure out a proper solution that would be helpful for the end user,
so I decided to remove the logging until a solution is found.
This commit also contains the following additional changes:
- It truncates field names longer than 128 chars during log ingestion (see the sketch after this list).
This should prevent the ingestion of bogus field names.
It should also prevent too big columnsHeader blocks,
which could negatively affect search query performance,
since the columnsHeader is read on every scan of the corresponding data block.
- It limits the maximum length of a const column value to 256.
Longer values are stored in ordinary columns.
This helps limit the size of columnsHeader blocks
and improves search query performance by avoiding
reading too long const column values on every scan of the corresponding data block.
- It deduplicates columns with identical names during data ingestion
and background merging. Previously columns with duplicate names could be passed
to block.mustInitFromRows(), and they were stored as-is in the block.
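A hedged sketch of the field-name truncation and column deduplication described above; the constants and helper names mirror the description, not the actual lib/logstorage code:

```go
package main

import "fmt"

const maxFieldNameLen = 128

// truncateFieldName keeps field names within maxFieldNameLen chars,
// preventing bogus names from bloating columnsHeader blocks.
func truncateFieldName(name string) string {
	if len(name) > maxFieldNameLen {
		return name[:maxFieldNameLen]
	}
	return name
}

type Field struct {
	Name  string
	Value string
}

// dedupFields drops fields with duplicate names (keeping the first occurrence
// here; the real tie-breaking rule may differ), so a block never stores two
// columns with the same name.
func dedupFields(fields []Field) []Field {
	seen := make(map[string]struct{}, len(fields))
	result := fields[:0]
	for _, f := range fields {
		if _, ok := seen[f.Name]; ok {
			continue
		}
		seen[f.Name] = struct{}{}
		result = append(result, f)
	}
	return result
}

func main() {
	fields := dedupFields([]Field{{"host", "a"}, {"host", "b"}})
	fmt.Println(len(fields), truncateFieldName(fields[0].Name))
}
```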
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4762
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4969
- Compare the actual free disk space to the value provided via -storage.minFreeDiskSpaceBytes
directly inside Storage.IsReadOnly(). This should be fast in most cases.
This simplifies the logic at lib/storage.
- Do not take -storage.minFreeDiskSpaceBytes into account during background merges, since
doing so results in uncontrolled growth of small parts when the free disk space approaches -storage.minFreeDiskSpaceBytes.
The background merge logic uses another mechanism for determining whether there is enough
disk space for the merge: it reserves the needed disk space before the merge
and releases it afterwards (see the sketch after this list). This prevents out-of-disk-space errors during background merges.
- Properly handle corner cases for flushing in-memory data to disk when the storage
enters read-only mode. This is better than losing the in-memory data.
- Bring back Storage.MustAddRows() instead of Storage.AddRows(),
since the only case when AddRows() can return an error is when the storage is in read-only mode.
This case must be handled by the caller by calling Storage.IsReadOnly()
before adding rows to the storage.
This simplifies the code a bit, since the caller of Storage.MustAddRows() no longer needs to handle
errors returned by Storage.AddRows().
- Properly store parsed logs in the Storage if parts of the request contain invalid log lines.
Previously the parsed logs could be lost in this case.
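The following sketch illustrates the two disk-space mechanisms described above; the flag handling and function names are illustrative stand-ins for the lib/storage internals:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

var minFreeDiskSpaceBytes uint64 = 10 * 1024 * 1024 // -storage.minFreeDiskSpaceBytes

var reservedDiskSpace atomic.Uint64

// isReadOnly compares the actual free disk space to the configured limit
// directly, without a background watcher goroutine.
func isReadOnly(freeSpace uint64) bool {
	return freeSpace < minFreeDiskSpaceBytes
}

// tryReserveDiskSpace is what background merges use instead of consulting
// -storage.minFreeDiskSpaceBytes: the needed space is reserved before the
// merge and released after it, so merges cannot run the disk out of space.
func tryReserveDiskSpace(freeSpace, n uint64) bool {
	reserved := reservedDiskSpace.Add(n)
	if freeSpace < reserved {
		reservedDiskSpace.Add(^(n - 1)) // roll back the reservation
		return false
	}
	return true
}

func releaseDiskSpace(n uint64) {
	reservedDiskSpace.Add(^(n - 1)) // atomic subtract
}

func main() {
	freeSpace := uint64(100 * 1024 * 1024)
	if tryReserveDiskSpace(freeSpace, 50*1024*1024) {
		defer releaseDiskSpace(50 * 1024 * 1024)
		fmt.Println("enough disk space; performing the merge")
	}
	fmt.Println("storage read-only:", isReadOnly(freeSpace))
}
```

On the ingestion side, the matching caller pattern is to consult Storage.IsReadOnly() first and only then call Storage.MustAddRows(), which no longer returns an error.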
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4737
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4945
* lib/logstorage: prevent panic during background merge
Fixes a panic during background merge when the resulting block would contain more columns than maxColumnsPerBlock.
The buffered data is flushed instead, and the next block starts fresh; see the sketch below.
See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4762
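A rough, self-contained sketch of the guard, using a tiny illustrative column limit; the real maxColumnsPerBlock value and merge code differ:

```go
package main

import "fmt"

const maxColumnsPerBlock = 4 // tiny illustrative limit

type Field struct{ Name, Value string }

type blockMerger struct {
	columns map[string]struct{} // unique column names in the pending block
	rows    [][]Field
}

func newBlockMerger() *blockMerger {
	return &blockMerger{columns: map[string]struct{}{}}
}

// wouldExceedLimit reports whether appending fields would push the pending
// block past maxColumnsPerBlock unique columns.
func (bm *blockMerger) wouldExceedLimit(fields []Field) bool {
	n := len(bm.columns)
	for _, f := range fields {
		if _, ok := bm.columns[f.Name]; !ok {
			n++
		}
	}
	return n > maxColumnsPerBlock
}

func (bm *blockMerger) mustFlushPendingBlock() {
	fmt.Printf("flushing block with %d rows, %d columns\n", len(bm.rows), len(bm.columns))
	bm.rows = nil
	bm.columns = map[string]struct{}{}
}

// appendRow flushes the buffered block first when needed, so the merged
// block never exceeds the column limit and the merge cannot panic.
func (bm *blockMerger) appendRow(fields []Field) {
	if bm.wouldExceedLimit(fields) {
		bm.mustFlushPendingBlock()
	}
	bm.rows = append(bm.rows, fields)
	for _, f := range fields {
		bm.columns[f.Name] = struct{}{}
	}
}

func main() {
	bm := newBlockMerger()
	bm.appendRow([]Field{{"a", "1"}, {"b", "2"}, {"c", "3"}})
	bm.appendRow([]Field{{"d", "4"}, {"e", "5"}}) // would exceed the limit -> flush first
	bm.mustFlushPendingBlock()
}
```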
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
* lib/logstorage: clarify field description and comment
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
---------
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
* lib/logstorage: switch to read-only mode when running out of disk space
Added support for the `--storage.minFreeDiskSpaceBytes` command-line flag to allow graceful handling of running out of disk space at `--storageDataPath`.
See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4737
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
* lib/logstorage: fix error handling logic during merge
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
* lib/logstorage: fix log level
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
---------
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
It was added in order to limit the number of goroutines performing assisted merges during ingestion.
It turned out that blocking ingestion goroutines lowers ingestion performance and caps overall ingestion at around 40k items per second because of lock contention.
Removing the parts merge sync.Cond eliminates lock contention on the write path and significantly improves write performance; see the sketch below.
See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4775
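A simplified sketch of the removed pattern, showing how a sync.Cond gate serializes ingestion goroutines; the names and the limit are illustrative:

```go
package main

import "sync"

const maxAssistedMerges = 4

var (
	mu             sync.Mutex
	mergesCond     = sync.NewCond(&mu)
	assistedMerges int
)

// assistMerge was called from the ingestion path. Every caller contends on
// mu and may sleep on the cond, which serializes writers and caps ingestion
// throughput; removing this gate eliminates the lock contention.
func assistMerge(merge func()) {
	mu.Lock()
	for assistedMerges >= maxAssistedMerges {
		mergesCond.Wait()
	}
	assistedMerges++
	mu.Unlock()

	merge()

	mu.Lock()
	assistedMerges--
	mergesCond.Signal()
	mu.Unlock()
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 16; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			assistMerge(func() {})
		}()
	}
	wg.Wait()
}
```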
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
* lib/storage/partition: add check to ensure parts exist on disk
If a part exists in parts.json but is missing on disk, a misleading error similar to "unexpected number of substrings in the part name" is returned.
This change forces verification of part existence and returns a correct error when the part is missing on disk; see the sketch below.
Such an issue can be the result of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5005 or disk corruption.
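A hedged sketch of the verification; partNames would come from parts.json, and the function name and error text are illustrative:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// mustCheckPartsExist verifies that every part listed in parts.json has a
// directory on disk, so a missing part produces an actionable error instead
// of a misleading "unexpected number of substrings in the part name".
func mustCheckPartsExist(partitionDir string, partNames []string) error {
	for _, name := range partNames {
		partPath := filepath.Join(partitionDir, name) // filepath.Join, not string concatenation
		if _, err := os.Stat(partPath); os.IsNotExist(err) {
			return fmt.Errorf("part %q is listed in parts.json but is missing on disk at %q; "+
				"this may be the result of an unclean shutdown or disk corruption", name, partPath)
		}
	}
	return nil
}

func main() {
	if err := mustCheckPartsExist("/tmp/partition", []string{"part1"}); err != nil {
		fmt.Println(err)
	}
}
```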
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
* lib/storage/partition: use filepath.Join instead of string concatenation
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
* lib/storage/partition: add action points for error message
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
* all: add a check for missing part in lib/mergeset and lib/logstorage
---------
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
The previous code could result in the following data race:
1. The s.ptwHot partition is marked to be deleted.
2. ptw.decRef() is called on it.
3. ptw.pt is set to nil.
4. s.ptwHot.pt is accessed from a concurrent goroutine, which leads to a panic.
The change clears s.ptwHot under s.partitionsLock in order to prevent the data race; see the sketch below.
This is a follow-up for 8d50032dd6
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4895
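A minimal sketch of the fix under assumed type names; the real lib/storage types differ:

```go
package main

import "sync"

type partition struct{ name string }

type partitionWrapper struct {
	pt *partition
}

type storage struct {
	partitionsLock sync.Mutex
	ptwHot         *partitionWrapper
}

// dropHotPartition clears s.ptwHot under s.partitionsLock before the wrapped
// partition is released, so concurrent readers never observe a partition
// whose inner state has already been set to nil.
func (s *storage) dropHotPartition(ptw *partitionWrapper) {
	s.partitionsLock.Lock()
	if s.ptwHot == ptw {
		s.ptwHot = nil
	}
	s.partitionsLock.Unlock()

	// Only now is it safe to release the partition.
	ptw.pt = nil
}

func main() {
	s := &storage{}
	ptw := &partitionWrapper{pt: &partition{name: "2024_01"}}
	s.ptwHot = ptw
	s.dropHotPartition(ptw)
}
```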
- Parse protobuf if Content-Type isn't set to `application/json` - this behavior is documented at https://grafana.com/docs/loki/latest/api/#push-log-entries-to-loki
- Properly handle gzipped JSON requests. The `gzip` marker must be read from the `Content-Encoding` header instead of the `Content-Type` header (see the sketch after this list)
- Properly flush all the parsed logs with an explicit call to vlstorage.MustAddRows() at the end of the query handler
- Check JSON field types more strictly.
- Allow parsing a Loki timestamp as a floating-point number. Such timestamps can be generated by some clients,
which store timestamps in float64 instead of int64.
- Optimize parsing of Loki labels in Prometheus text exposition format.
- Simplify tests.
- Remove lib/slicesutil, since there are no more users for it.
- Update docs with missing info and fix various typos. For example, it should be enough to have `instance` and `job` labels
as stream fields in most Loki setups.
- Allow empty or missing timestamps in the ingested logs.
The current timestamp at the VictoriaLogs side is then used for such logs.
This simplifies debugging and testing of the provided HTTP-based data ingestion APIs.
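Two of the points above in sketch form: gzip detection via the Content-Encoding header, and timestamp parsing that accepts both integer nanoseconds and floating-point values. The handler wiring and names are illustrative:

```go
package main

import (
	"compress/gzip"
	"fmt"
	"io"
	"net/http"
	"strconv"
)

func readBody(r *http.Request) ([]byte, error) {
	reader := io.Reader(r.Body)
	// The `gzip` marker lives in Content-Encoding; Content-Type carries the
	// payload format (e.g. application/json) instead.
	if r.Header.Get("Content-Encoding") == "gzip" {
		zr, err := gzip.NewReader(r.Body)
		if err != nil {
			return nil, err
		}
		defer zr.Close()
		reader = zr
	}
	return io.ReadAll(reader)
}

// parseLokiTimestamp accepts both int64 nanoseconds and float64 values,
// since some clients store timestamps in float64 instead of int64.
func parseLokiTimestamp(s string) (int64, error) {
	if n, err := strconv.ParseInt(s, 10, 64); err == nil {
		return n, nil
	}
	f, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return 0, err
	}
	return int64(f), nil
}

func main() {
	ts, _ := parseLokiTimestamp("1700000000123456789")
	fmt.Println(ts)
}
```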
The remaining MAJOR issue, which needs to be addressed: the victoria-logs binary size increased from 13MB to 22MB
after adding support for the Loki data ingestion protocol at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4482 .
This is because of bloated protobuf dependencies. They must be replaced with another protobuf implementation
similar to the one used at lib/prompb or lib/prompbmarshal .