VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-24 03:06:48 +01:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	1000ae437c	lib/logstorage: optimize performance for `top` pipe when it is applied to a field with millions of unique values - Use parallel merge of per-CPU shard results. This improves merge performance on multi-CPU systems. - Use topN heap sort of per-shard results. This improves performance when results contain millions of entries. (cherry picked from commit `192c07f76a`)	2024-10-18 11:42:15 +02:00
Aliaksandr Valialkin	54ccf09fdd	lib/logstorage: follow-up for `72941eac36` - Allow dropping metrics if the query result contains at least a single metric. - Allow copying by(...) fields. - Disallow overriding by(...) fields via `math` pipe. - Allow using `format` pipe in stats query. This is useful for constructing some labels from the existing by(...) fields. - Add more tests. - Remove the check for time range in the query filter according to https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7254/files#r1803405826 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7254	2024-10-17 11:09:16 -03:00
Hui Wang	21864de527	victorialogs: add more checks for stats query APIs (#7254 ) 1. Verify if field in [fields pipe](https://docs.victoriametrics.com/victorialogs/logsql/#fields-pipe) exists. If not, it generates a metric with illegal float value "" for prometheus metrics protocol. 2. check if multiple time range filters produce conflicted query time range, for instance: ``` query: _time: 5m \| stats count(), start:2024-10-08T10:00:00.806Z, end: 2024-10-08T12:00:00.806Z, time: 2024-10-10T10:02:59.806Z ``` must give no result due to invalid final time range. --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-10-17 11:09:16 -03:00
Aliaksandr Valialkin	3346576a3a	lib/logstorage: refactor storage format to be more efficient for querying wide events It has been appeared that VictoriaLogs is frequently used for collecting logs with tens of fields. For example, standard Kuberntes setup on top of Filebeat generates more than 20 fields per each log. Such logs are also known as "wide events". The previous storage format was optimized for logs with a few fields. When at least a single field was referenced in the query, then the all the meta-information about all the log fields was unpacked and parsed per each scanned block during the query. This could require a lot of additional disk IO and CPU time when logs contain many fields. Resolve this issue by providing an (field -> metainfo_offset) index per each field in every data block. This index allows reading and extracting only the needed metainfo for fields used in the query. This index is stored in columnsHeaderIndexFilename ( columns_header_index.bin ). This allows increasing performance for queries over wide events by 10x and more. Another issue was that the data for bloom filters and field values across all the log fields except of _msg was intermixed in two files - fieldBloomFilename ( field_bloom.bin ) and fieldValuesFilename ( field_values.bin ). This could result in huge disk read IO overhead when some small field was referred in the query, since the Operating System usually reads more data than requested. It reads the data from disk in at least 4KiB blocks (usually the block size is much bigger in the range 64KiB - 512KiB). So, if 512-byte bloom filter or values' block is read from the file, then the Operating System reads up to 512KiB of data from disk, which results in 1000x disk read IO overhead. This overhead isn't visible for recently accessed data, since this data is usually stored in RAM (aka Operating System page cache), but this overhead may become very annoying when performing the query over large volumes of data which isn't present in OS page cache. The solution for this issue is to split bloom filters and field values across multiple shards. This reduces the worst-case disk read IO overhead by at least Nx where N is the number of shards, while the disk read IO overhead is completely removed in best case when the number of columns doesn't exceed N. Currently the number of shards is 8 - see bloomValuesShardsCount . This solution increases performance for queries over large volumes of newly ingested data by up to 1000x. The new storage format is versioned as v1, while the old storage format is version as v0. It is stored in the partHeader.FormatVersion. Parts with the old storage format are converted into parts with the new storage format during background merge. It is possible to force merge by querying /internal/force_merge HTTP endpoint - see https://docs.victoriametrics.com/victorialogs/#forced-merge .	2024-10-17 11:09:16 -03:00
Aliaksandr Valialkin	0881e5fd5c	app/vlselect: do not show empty fields in query results Empty fields are treated as non-existing fields by VictoriaLogs data model. So there is no sense in returning empty fields in query results, since they may mislead and confuse users. (cherry picked from commit `bac193e50b`)	2024-10-15 11:49:32 +02:00
Aliaksandr Valialkin	f627d7f686	app/vlstorage: add support for forced merge via /internal/force_merge HTTP endpoint (cherry picked from commit `3c73dbbacc`)	2024-10-15 11:49:31 +02:00
Aliaksandr Valialkin	ac2b6e8704	lib/logstorage: make a copy of s.partitions slice when performing queries over the selected partitions s.partitions can be changed when new partition is registered or when old partition is dropped. This could lead to data races and panics when s.partitions slice is accessed by concurrently executed queries. The fix is to make a copy of the selected partitions under s.partitionsLock before performing the query. (cherry picked from commit `b4b79a4961`)	2024-10-15 11:49:31 +02:00
Aliaksandr Valialkin	b694ca4952	lib/logstorage: move getConstColumnValue() and getColumnHeader() methods from columnsHeader to blockSearch This localizes blockSearch.getColumnsHeader() call at block_search.go . This call is going to be optimized in the next commits in order to avoid unmarshaling of header data for unneeded columns, which weren't requested by getConstColumnValue() / getColumnHeader(). (cherry picked from commit `507b206a7d`)	2024-10-15 11:49:30 +02:00
Aliaksandr Valialkin	beeb80e4f8	lib/logstorage: avoid redundant copying of column names and column values for dictionary-encoded columns during querying Refer the original byte slice with the marshaled columnsHeader for columns names and dictionary-encoded column values. This improves query performance a bit when big number of blocks with big number of columns are scanned during the query. (cherry picked from commit `279e25e7c8`)	2024-10-15 11:49:30 +02:00
Aliaksandr Valialkin	afe5158443	lib/logstorage: avoid calling columnsHeader.initFromBlockHeader() multiple times for the same blockSearch This should improve performance when blockSearch.getColumnsHeader() is called multiple times from different places of the code. (cherry picked from commit `9e48074b59`)	2024-10-15 11:49:30 +02:00
Aliaksandr Valialkin	e581338b84	lib/logstorage: make sure that bs.br is non-nil before checking br.bs.bsw.bh.rowsCount there br.bs may be nil when br contains the block with additional filters applied during pipe calculations. For example, `* \| count() if (error) errors`. (cherry picked from commit `867f671cc4`)	2024-10-15 11:49:29 +02:00
Aliaksandr Valialkin	b3bbf94310	lib/logstorage: disallow using pipe names as the first unquoted words in `filter` pipe Improperly written pipes could be silently parsed as filter pipe. For example, the following query: * \| by (x) was silently parsed to: * \| filter "by" x It is better to return error, so the user could identify and fix invalid pipe instead of silently executing invalid query with `filter` pipe. (cherry picked from commit `7b475ed95d`)	2024-10-11 14:27:46 +02:00
Aliaksandr Valialkin	834e2ad855	lib/logstorage: disallow using by as the first word in log filters, since it frequently clashes with `stats by(...)` pipe where `stats` word is omitted (cherry picked from commit `6acf543b90`)	2024-10-11 14:27:46 +02:00
Aliaksandr Valialkin	0a61222627	lib/logstorage: quote logfmt strings only if they contain special chars, which could break logfmt parsing and/or reading Some checks are pending build / Build (push) Waiting to run Details CodeQL Go / Analyze (push) Waiting to run Details main / lint (push) Waiting to run Details main / test (test-full) (push) Blocked by required conditions Details main / test (test-full-386) (push) Blocked by required conditions Details main / test (test-pure) (push) Blocked by required conditions Details (cherry picked from commit `462b7cd597`)	2024-10-07 14:47:22 +02:00
Aliaksandr Valialkin	7a44614e0b	lib/logstorage: add `len` pipe for calculating byte length of log field values (cherry picked from commit `364f084b43`)	2024-10-04 10:42:51 +02:00
Aliaksandr Valialkin	81f3e07e1e	lib/logstorage: do not count dictionary values which have no matching logs in `count_uniq` stats function Create blockResultColumn.forEachDictValue* helper functions for visiting matching dictionary values. These helper functions should prevent from counting dictionary values without matching logs in the future. This is a follow-up for `0c0f013a60` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7152	2024-10-01 13:36:27 +02:00
Aliaksandr Valialkin	8c55b699f4	app/vlogscli: add interactive command-line tool for querying VictoriaLogs	2024-10-01 12:24:53 +02:00
Aliaksandr Valialkin	dbcf06cd85	lib/logstorage: skip values with zero hits for 'uniq', 'top' and 'field_values' pipes See https://github.com/VictoriaMetrics/victorialogs-datasource/issues/72#issuecomment-2352078483	2024-09-30 14:16:21 +02:00
Aliaksandr Valialkin	7456cbc653	lib/logstorage: allow using `!` in unescaped phrase Previously the phrase filter with `!` was treated unexpectedly. For example, `foo!bar` filter was treated at `foo AND NOT bar`, while most users expect that it matches "foo!bar" phrase. This commit aligns with users' expectations.	2024-09-29 11:18:04 +02:00
Aliaksandr Valialkin	b7a3d575da	lib/logstorage: allow using `-` instead of `!` in front of `(...)`	2024-09-29 11:18:04 +02:00
Aliaksandr Valialkin	fce6fc745a	lib/logstorage: return the expected `hits` results from `uniq` pipe when the number of unique values reaches the specified limit Previously `uniq` pipe could return zero `hits` if the number of found unique values equals the specified limit. This wasn't expected in most cases.	2024-09-29 10:53:44 +02:00
Aliaksandr Valialkin	58d1e517de	lib/logstorage: clear hits slice obtained from encoding.GetUint64s() before updating it with hits for valueTypeDict column encoding.GetUint64s() returns uninitialized slice, which may contain arbitrary values. So values in this slice must be reset to zero before using it for counting hits in `uniq` and `top` pipes.	2024-09-29 10:29:50 +02:00
Aliaksandr Valialkin	b5d94f06f5	lib/logstorage: postpone initialization of per-shard stateSizeBudget until the first call to pipeProcessor.writeBlock() This simplifies pipeProcessor initialization logic a bit. This also doesn't mangle the original maxStateSize value, which is used in error messages when the state size exceeds maxStateSize.	2024-09-29 10:29:49 +02:00
Aliaksandr Valialkin	7f8b1300a9	lib/logstorage: add non-empty `if (...)` condition to automatically generated result names in `stats` pipe This allows executing queries with `stats` pipe, which calculate multiple results with the same functions, but with different `if (...)` conditions. For example: _time:5m \| count(), count() if (error) Previously such queries couldn't be executed becasue automatically generated name for the second result didn't include `if (error)`, so names for both results were identical - `count(*)`.	2024-09-29 09:52:19 +02:00
Aliaksandr Valialkin	04c73d54d4	lib/logstorage: support `order` alias for `sort` pipe Now the following queries are equivalents: _time:5s \| sort by (_time) _time:5s \| order by (_time) This is needed for convenience, since `order by` is commonly used in other query languages such as SQL.	2024-09-29 09:52:18 +02:00
Aliaksandr Valialkin	1a6313ca68	lib/logstorage: allow using `-` instead of `!` as a shorthand for `NOT` operator in LogsQL	2024-09-27 13:15:55 +02:00
Aliaksandr Valialkin	b60cb98377	lib/logstorage: support skipping _stream: prefix for stream filters '_stream:{...}' can be written as '{...}' This simplifies writing queries with stream filters, and makes them more familier to Loki users.	2024-09-27 13:15:55 +02:00
Aliaksandr Valialkin	bc0bb0c36a	lib/logstorage: consistently sort stream contexts belonging to different streams by the minimum time seen in the matching logs This should simplify debugging of stream_context output, since it remains stable over repeated requests.	2024-09-27 11:21:28 +02:00
Aliaksandr Valialkin	bce56d430d	lib/logstorage: add _msg="---" delimiter between different log streams in stream_context output This should help investigating contexts, which belong to different log streams.	2024-09-27 11:21:27 +02:00
Aliaksandr Valialkin	e4e14697fa	lib/logstorage: improve performance for stream_context pipe over streams with big number of log entries Do not read timestamps for blocks, which cannot contain surrounding logs. This should improve peformance for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6730 . Also optimize min(_time) and max(_time) calculations a bit by avoiding conversion of timestamp to string when it isn't needed. This should improve performance for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7070 .	2024-09-26 22:31:05 +02:00
Aliaksandr Valialkin	f5dfe1cacd	lib/logstorage: properly return surrounding logs outside the selected time range by stream_context pipe Previously only logs inside the selected time range could be returned by stream_context pipe. For example, the following query could return up to 10 surrounding logs only for the last 5 minutes, while most users expect this query should return up to 10 surrounding logs without restrictions on the time range. _time:5m panic \| stream_context before 10 This enables the ability to implement stream context feature at VictoriaLogs web UI: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7063 . Reduce memory usage when returning stream context over big log streams with millions of entries. The new logic scans over all the log messages for the selected log stream, while keeping in memory only the given number of surrounding logs. Previously all the logs for the given log stream on the selected time range were loaded in memory before selecting the needed surrounding logs. This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6730 . Reduce the scan performance for big log streams by fetching only the requested fields. For example, the following query should be executed much faster than before if logs contain many fields other than _stream, _msg and _time: panic \| stream_context after 30 \| fields _stream, _msg, _time	2024-09-26 17:04:39 +02:00
Aliaksandr Valialkin	4d27933041	app/vlinsert: support `_time` field without timezone information during data ingestion Use local timezone of the host server in this case. The timezone can be overridden with TZ environment variable if needed. While at it, allow using whitespace instead of T as a delimiter between data and time in the ingested _time field. For example, '2024-09-20 10:20:30' is now accepted during data ingestion. This is valid ISO8601 format, which is used by some log shippers, so it should be supported. This format is also known as SQL datetime format. Also assume local time zone when time without timezone information is passed to querying APIs. Previously such a time was parsed in UTC timezone. Add `Z` to the end of the time string if the old behaviour is preferred. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6721	2024-09-26 12:50:14 +02:00
Aliaksandr Valialkin	3a556bd15a	app/vlselect/logsql: clone the query with the current timestamp when performing live tailing requests in the loop Previously the original timestamp was used in the copied query, so _time:duration filters were applied to the original time range: (timestamp-duration ... timestamp]. This resulted in stopped live tailing, since new logs have timestamps bigger than the original time range. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7028	2024-09-26 08:57:48 +02:00
Aliaksandr Valialkin	55ecf4f766	lib/logstorage: add `blocks_count` pipe This pipe is useful for debugging purposes when the number of processed blocks must be calculated for the given query: <query> \| blocks_count This helps detecting the root cause of query performance slowdown in cases like https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7070	2024-09-25 19:18:38 +02:00
Aliaksandr Valialkin	66d6514e2e	lib/logstorage: lazily read column headers metadata during queries This improves performance for analytical queries, which do not need column headers metadata. For example, the following query doesn't need column headers metadata, since _stream and min(_time) are stored in block header, which is read separately from colum headers metadata: _time:1w \| stats by (_stream) min(_time) min_time This commit significantly improves the performance for this query. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7070	2024-09-25 19:18:37 +02:00
Aliaksandr Valialkin	246c339e3d	lib/logstorage: read timestamps column when it is really needed during query execution Previously timestamps column was read unconditionally on every query. This could significantly slow down queries, which do not need reading this column like in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7070 .	2024-09-25 19:18:37 +02:00
Aliaksandr Valialkin	180137a377	lib/logstorage: improve the performance of obtaining _stream column value Substitute global streamTagsCache with per-blockSearch cache for ((stream.id) -> (_stream value)) entries. This improves scalability of obtaining _stream values on a machine with many CPU cores, since every CPU has its own blockSearch instance. This also should reduce memory usage when querying logs over big number of streams, since per-blockSearch cache of ((stream.id) -> (_stream value)) entries is limited in size, and its lifetime is bounded by a single query.	2024-09-24 20:57:39 +02:00
Aliaksandr Valialkin	9d11a21541	lib/logstorage/consts.go: document that it isn't recommended setting maxColumnsPerBlock constant to too big values This should help avoiding cases like this one - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6425#issuecomment-2337446083	2024-09-24 18:52:54 +02:00
Aliaksandr Valialkin	1264350566	lib/logstorage: improve performance for streamID.marshalString() by more than 2x The streamID.marshalString() is executed in hot path if the query selects _stream_id field. Command to run the benchmark: go test ./lib/logstorage/ -run=NONE -bench=BenchmarkStreamIDMarshalString -benchtime=5s Results before the commit: BenchmarkStreamIDMarshalString-16 438480714 14.04 ns/op 71.23 MB/s 0 B/op 0 allocs/op Results after the commit: BenchmarkStreamIDMarshalString-16 982459660 6.049 ns/op 165.30 MB/s 0 B/op 0 allocs/op	2024-09-24 18:38:21 +02:00
Aliaksandr Valialkin	d944c162da	lib/logstorage: add benchmark for streamID.marshalString	2024-09-24 18:38:21 +02:00
Aliaksandr Valialkin	472b6b326e	lib/logstorage: make sure that getCommonTokens returns common tokens in the original order of tokens inside tokenSets arg This fixes flaky test TestGetCommonTokensForOrFilters: filter_or_test.go:143: unexpected tokens for field "_msg"; got ["foo" "bar"]; want ["bar" "foo"]	2024-09-19 16:00:21 +02:00
Aliaksandr Valialkin	cad236003b	app/vlselect: consistently reuse the original query timestamp when executing /select/logsql/query with positive limit=N query arg Previously the query could return incorrect results, since the query timestamp was updated with every Query.Clone() call during iterative search for the time range with up to limit=N rows. While at it, optimize queries, which find low number of matching logs, while spend a lot of CPU time for searching across big number of logs. The optimization reduces the upper bound of the time range to search if the current time range contains zero matching rows. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6785	2024-09-08 14:34:46 +02:00
Aliaksandr Valialkin	297301e8c0	lib/logstorage: preserve the order of tokens to check against bloom filters in AND filters Previously tokens from AND filters were extracted in random order. This could slow down checking them agains bloom filters if the most specific tokens go at the beginning of the AND filters. Preserve the original order of tokens when matching them against bloom filters, so the user could control the performance of the query by putting the most specific AND filters at the beginning of the query. While at it, add tests for getCommonTokensForAndFilters() and getCommonTokensForOrFilters(). Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6554 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6556	2024-09-08 12:28:34 +02:00
Aliaksandr Valialkin	4b49b62a58	lib/logstorage: improve error logging for incorrect queries passed to /select/logsql/stats_query and /select/logsql/stats_query_range functions	2024-09-08 12:28:33 +02:00
Aliaksandr Valialkin	edb1afe804	lib/logstorage: properly extract common tokens from unsupported OR filters Previously the following query could miss rows matching !bar if these rows do not contain foo: foo OR !bar This is because of incorrect detection of common tokens for OR filters - all the unsupported filters were skipped (including the NOT filter (aka `!`)), while in this case zero common tokens must be returned. While at it, move repetiteve code in TestFilterAnd and TestFilterOr into f function. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6554 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6556	2024-09-08 12:28:33 +02:00
Aliaksandr Valialkin	c448189f69	app/vlselect: add /select/logsql/stats_query_range endpoint for building time series panels in VictoriaLogs plugin for Grafana Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6943 Updates https://github.com/VictoriaMetrics/victorialogs-datasource/issues/61	2024-09-07 00:44:34 +02:00
Aliaksandr Valialkin	01c8e12370	app/vlselect: add /select/logsql/stats_query endpoint, which is going to be used by vmalert Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6942 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6706	2024-09-06 23:00:58 +02:00
Aliaksandr Valialkin	c5badeea08	lib/logstorage: substitute `\|` operator with `or` operator at `math` pipe This is needed for avoiding confusion between the `\|` operator at `math` pipe and `\|` pipe delimiter. For example, the following query was parsed unexpectedly: * \| math foo / bar \| fields x as * \| math foo / (bar \| fields) as x Substituting `\|` with `or` inside `math` pipe fixes this ambiguity.	2024-09-06 22:43:29 +02:00
Aliaksandr Valialkin	08fe7949d1	lib/logstorage: consistently use nsecsPerDay constant and remove nsecPerDay constant	2024-09-06 16:18:15 +02:00
Aliaksandr Valialkin	7dcce1ca02	lib/logstorage: pre-calculate hashes from tokens used in bloom filter search Previously per-token hashes for per-block bloom filters were re-calculated on every scanned block. This could be slow when the number of tokens is big or when the number of blocks to scan is big. Pre-calculate hashes for bloom filters and then use them for searching in bloom filters. This improves performance by 2.5x for in(...) filters with many values to search inside `in()`.	2024-09-05 19:44:42 +02:00

1 2 3

146 Commits