VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-16 00:41:24 +01:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	e51190a34c	Revert "app/vmselect: make vmselect resilient to absence of cache folder (#5987)" This reverts commit `cb23685681`. Reason for revert: the "fix" may hide programming bugs related to incorrect creation of folders before their use. This may complicate detecting and fixing such bugs in the future. There are the following fixes for the issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5985 : - Configure the OS so that it does not drop data from the system-wide temporary directory (aka /tmp). - Run VictoriaMetrics with the -cacheDataPath command-line flag pointing to a directory that cannot be removed automatically by the OS. The case when the user accidentally deletes the directory with some files created by VictoriaMetrics shouldn't be considered as expected, so VictoriaMetrics shouldn't try resolving this case automatically. From the operations and debuggability PoV, it is much better to crash with the clear `directory doesn't exist` error in this case.	2024-04-03 02:44:00 +03:00
Aliaksandr Valialkin	7edb5f77f1	app/vmagent: properly shutdown when -maxIngestionRate limit is reached The remotewrite.Stop() expects that there are no pending calls to TryPush(). This means that the ingestionRateLimiter.Register() must be unblocked inside TryPush() when calling remotewrite.Stop(). Provide remotewrite.StopIngestionRateLimiter() function for unblocking the rate limiter before calling the remotewrite.Stop(). While at it, move the rate limiter into lib/ratelimiter package, since it has two users. Also move the description of the feature to the correct place at docs/CHANGELOG.md. Also cross-reference -remoteWrite.rateLimit and -maxIngestionRate command-line flags. This is a follow-up for `02bccd1eb9` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5900	2024-04-03 02:41:11 +03:00
Zakhar Bessarab	3d15a31c6d	lib/storage: add ability to use downsampling for the given series filter (#733) * lib/storage: add ability to use downsampling for the given series filter Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs: add information about downsampling filters Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs: fix MetricsQL filter Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/downsampling: treat missing downsampling filter as a bug Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/part_header: verify correctness of downsampling filters when opening partition Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/downsampling: save only applicable rules in part metadata Filter and save only rules which are applicable to partition based on MinTimestamp of stored data. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/downsampling: update log messages for final dedup Properly specify the reason for re-running deduplication for partition. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage: consistently use MaxTimestamp to determine deduplication/downsampling rules Using MinTimestamp leads to applying downsampling to parts which are only partially covered by downsampling rule. For example, partition covers range [1000-2000]. At t=2100 and rule offset 500 data with t=2100-500 => 1600 must be downsampled. The range check against MinTimestamp evaluates to true even though partition contains range which must not be downsampled - [1600:2000]. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * Follow-up - Apply the first matching downsampling period if multiple filters match the given time series. This allows fine-tuning the downsampling config for the specific needs. - Take into account downsampling filters during search queries. - Reduce the difference between community and enterprise branches. This should simplify further maintenance of these branches. - Properly parse series filters with colons inside them. - Document the feature at docs/CHANGELOG.md. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4960 --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-04-03 02:38:37 +03:00
Aliaksandr Valialkin	ae6190f0b5	lib/storage/table.go: reduce the difference with enterprise branch	2024-04-03 02:37:05 +03:00
Aliaksandr Valialkin	b6d1d6982e	lib/storage/partition.go: reduce code difference a bit with enterprise branch	2024-04-03 02:36:49 +03:00
Nikolay	c457f7de69	lib/storage: adds metrics for downsampling (#382) * lib/storage: adds metrics for downsampling vm_downsampling_partitions_scheduled - shows the number of parts that must be downsampled vm_downsampling_partitions_scheduled_size_bytes - shows the total size in bytes of the parts that must be downsampled These two metrics answer the questions - is downsampling running? how many parts are scheduled for downsampling and how many of them are currently downsampled? How much storage space do they occupy? https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2612 * wip Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-04-03 02:36:05 +03:00
Andrii Chubatiuk	0799834aaa	opentelemetry: added cmd flag to sanitize metric names (#6035)	2024-04-03 02:31:39 +03:00
Aliaksandr Valialkin	b8d37ad747	lib/storage: follow-up for `76f00cea6b` Store the deadline when the metricID entries must be deleted from indexdb if metricID->metricName entry isn't found after the deadline. This should make the code clearer compared to the previous version, where the timestamp of the first metricID->metricName lookup miss was stored in missingMetricIDs. Remove the misleading comment about the importance of the order for creating entries in the inverted index when registering new time series. The order doesn't matter, since any subset of the created entries can become visible for search before any other subset after registering in indexdb. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5948 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5959	2024-04-02 23:46:21 +03:00
Zakhar Bessarab	7c1ee69205	lib/storage/table: wait for merges to be completed when closing a table (#5965 ) * lib/storage/table: properly wait for force merges to be completed during shutdown Properly keep track of running background merges and wait for merges completion when closing the table. Previously, force merge was not in sync with overall storage shutdown which could lead to holding ptw ref. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs: add changelog entry Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2024-04-02 21:25:30 +03:00
Andrii Chubatiuk	914b23f1e8	app/{vmagent,vminsert}: fixed firehose response (#6016)	2024-04-02 18:03:12 +03:00
Roman Khavronenko	548bf31dd2	app/vmselect: make vmselect resilient to absence of cache folder (#5987) vmselect uses a cache folder in file system for two purposes: 1. Storing rollup cache results on shutdown; 2. Storing temporary search results from vmstorage during query executions. It could happen that the cache folder is deleted accidentally by the user, or by the OS during cleanup routines. This would cause vmselect to: 1. panic on /metrics call, because `MustGetFreeSpace` will fail; 2. return a query error to the user, as it won't be able to store temporary search results. The changes in this commit are the following: 1. Make `MustGetFreeSpace` try re-creating the cache folder if it is missing; 2. Make vmselect try re-creating the cache folder if it can't persist tmp search results. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5985 Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Nikolay <nik@victoriametrics.com> (cherry picked from commit `cb23685681`)	2024-03-26 15:27:32 +01:00
hagen1778	c0659800d5	lib/promauth: follow-up `b577413d3b` Convert test result expectations to canonical form. Starting from `b577413d3b` specified header keys are forced into canonical form https://pkg.go.dev/net/http#CanonicalHeaderKey Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `e6dd52b04c`)	2024-03-25 15:42:22 +01:00
Aliaksandr Valialkin	cd222d6502	lib/streamaggr: ignore out of order samples for `last` output This is a follow-up for `6a465f6e29` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5931	2024-03-18 01:03:58 +02:00
Aliaksandr Valialkin	eecc5e8463	lib/storage: wait for up to 60 seconds before deciding to delete metricID entries from indexdb if metricID->metricName entry is missing during search The metricID->metricName entry can remain invisible for search for some time after registering new metricName. This is expected condition. So wait for up to 60 seconds in the hope that the metricID->metricName entry will become visible before deleting all the entries from indexdb, which are associated with the given metricID. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5959 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5948 See also `20812008a7`	2024-03-18 00:37:11 +02:00
Aliaksandr Valialkin	05236e7cd4	lib/httputils: rename CAFile -> caFile in order to be consistent with local var naming in Go This is a follow-up for `83e55456e2`	2024-03-17 23:31:53 +02:00
Aliaksandr Valialkin	111d0aa2bf	app/{vmagent,vminsert}: add an ability to ignore input samples outside the current aggregation interval for stream aggregation See https://docs.victoriametrics.com/stream-aggregation.html#ignoring-old-samples	2024-03-17 23:30:46 +02:00
Aliaksandr Valialkin	e70b644f1f	lib/streamaggr: ignore out of order samples when calculating increase, increase_prometheus, total and total_prometheus outputs Out of order samples may result in unexpected spikes for these outputs. So it is better to ignore such samples. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5931	2024-03-17 23:24:14 +02:00
Aliaksandr Valialkin	1f753a049a	lib/streamaggr: follow-up for `15e33d56f1` - Properly set pushSample.timestamp when flushing de-duplicated samples to stream aggregation This is needed for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5931 - Re-classify this change as feature instead of bugfix at docs/CHANGELOG.md - Verify de-duplication logic for samples with different timestamps Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5643 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5939	2024-03-17 23:23:57 +02:00
Aliaksandr Valialkin	7c4d7dc6dd	lib/promauth: properly set `Host` header in requests to scrape targets. The `Host` header must be set via net/http.Request.Host field, since net/http.Client ignores this header if it is set via Request.Header.Set(). Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5969 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5970	2024-03-17 23:22:54 +02:00
Andrii Chubatiuk	a58a81d80b	lib/streamaggr: pick sample with bigger timestamp or value on deduplicator (#5939 ) Apply the same deduplication logic as in https://docs.victoriametrics.com/#deduplication This would require more memory for deduplication, since we need to track timestamp for each record. However, deduplication should become more consistent. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5643 --------- Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2024-03-17 23:06:37 +02:00
Aliaksandr Valialkin	f92d4609e2	lib/storage: optimize /api/v1/labels and /api/v1/label/.../values when match[] contains metric name Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2978 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5055	2024-03-12 03:01:47 +02:00
Aliaksandr Valialkin	540f65cc49	lib/storage: move the conversion of tag filters to composite tag filters into indexSearch.searchMetricIDsInternal This makes the code less fragile - it is harder to skip the convertToCompositeTagFilterss() call now. While at it, call indexSearch.containsTimeRange() inside indexSearch.searchMetricIDsInternal() in order to quickly terminate search of time series in the old indexdb for new time ranges. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5055 This is a follow-up for `2d31fd7855`	2024-03-12 02:59:04 +02:00
Aliaksandr Valialkin	293f03f2dd	lib/storage: use composite indexes (metricName, label=value) when searching for matching time series at /api/v1/labels, /api/v1/label/.../values and /api/v1/status/tsdb This should improve query performance when match[], extra_filters[] or extra_label args are passed to these APIs Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5055	2024-03-12 02:56:35 +02:00
Aliaksandr Valialkin	d9e3670627	lib/promauth: set the Host header to tlsServerName if it isn't empty If tlsServerName isn't empty, then it is likely the https request is sent to IP instead of hostname. In this case the request will fail, since Go automatically sets the Host header to the IP instead of the desired hostname at tlsServerName. So set the Host header to tlsServerName if it isn't empty. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5802	2024-03-07 01:23:40 +02:00
Aliaksandr Valialkin	5817bf1c59	lib/streamaggr: add tests for keep_metric_names and drop_input_labels options	2024-03-06 20:06:23 +02:00
Aliaksandr Valialkin	1d7efc4c64	app/vmagent/remotewrite: clarify the reason behind the default value for -remoteWrite.queues in the same way as the reason for -maxConcurrentInserts is defined at `73f5fb0f0c`	2024-03-06 13:57:53 +02:00
hagen1778	a364de3cfe	lib/writeconcurrencylimiter: mention dependency on CPU cores for `-maxConcurrentInserts` flag The change also removes misleading `default` value from README for `maxConcurrentInserts` cmd-line flag. Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `73f5fb0f0c`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-03-05 18:56:38 +01:00
Aliaksandr Valialkin	27b9e8ed3e	app/{vmagent,vminsert}: add `-streamAggr.dropInputLabels` command-line flag for dropping the specified labels from input samples before deduplication and streaming aggregation	2024-03-05 02:27:27 +02:00
Aliaksandr Valialkin	c38c45d71f	app/{vminsert,vmagent}: allow using -streamAggr.dedupInterval without -streamAggr.config This allows performing online de-duplication of incoming samples	2024-03-05 00:47:23 +02:00
Aliaksandr Valialkin	4352544d61	lib/streamaggr: do not reset aggregation state after the aggregation took longer than the configured interval From the user's PoV it is better to preserve this state until the next flush	2024-03-04 20:03:45 +02:00
Aliaksandr Valialkin	9f81450b38	lib/streamaggr: add missing "s" suffix in the warning message when the de-duplication or aggregation couldn't be finished in a timely manner	2024-03-04 19:38:39 +02:00
Aliaksandr Valialkin	d10932bd99	lib/streamaggr: benchmark only flush routines in BenchmarkDedupAggrFlushSerial and BenchmarkAggregatorsFlushSerial	2024-03-04 19:13:50 +02:00
Aliaksandr Valialkin	36ee08cad4	Revert "lib/streamaggr: do not flush dedup shards in parallel" This reverts commit `eb40395a1c`. Reason for revert: it turned out that the performance gain on multiple CPU cores wasn't visible because the benchmark was generating incorrect pushSample.key. See a207e0bf687d65f5198207477248d70c69284296	2024-03-04 19:13:50 +02:00
Aliaksandr Valialkin	9728aaf5d9	lib/streamaggr: properly generate pushSample.key in benchmarks	2024-03-04 19:13:49 +02:00
Aliaksandr Valialkin	93a057e4e6	lib/streamaggr: reduce the number of pointers at "total" aggregation state This should reduce load on GC when scanning heap objects.	2024-03-04 19:13:49 +02:00
Aliaksandr Valialkin	9e00d8ad60	lib/streamaggr: use multiple job label values in BenchmarkAggregatorsPush instead of single value This should make the benchmark closer to production cases	2024-03-04 19:13:48 +02:00
Aliaksandr Valialkin	9773ad200e	lib/streamaggr: use multiple job labels in BenchmarkAggregatorsPush	2024-03-04 19:13:48 +02:00
Aliaksandr Valialkin	482560a1f3	lib/streamaggr: do not flush dedup shards in parallel This significantly increases CPU usage on systems with many CPU cores, while it doesn't reduce flush latency much	2024-03-04 17:01:42 +02:00
Aliaksandr Valialkin	d7252fce79	lib/streamaggr: reduce memory allocations when registering new series in deduplication and aggregation structs	2024-03-04 17:01:41 +02:00
Aliaksandr Valialkin	402dc14ec0	lib/streamaggr: make aggregate.runFlusher() more robust and clear	2024-03-04 17:01:41 +02:00
Aliaksandr Valialkin	2ffef39bb3	lib/streamaggr: properly drop samples on the first incomplete interval Previously samples were dropped on the first incomplete interval and the next complete interval. Also make sure that the de-duplication is performed just before flushing the aggregate state. This should help the case when dedup_interval = interval.	2024-03-04 17:01:40 +02:00
Aliaksandr Valialkin	c2dae136b3	lib/streamaggr: explicitly call resetSeries after flushSeries This makes the code less fragile	2024-03-04 06:23:36 +02:00
Aliaksandr Valialkin	48a425898a	lib/streamaggr: align aggregate flushes to multiples of interval For example, if `interval: 1m`, then data flush occurs at the end of every minute, while `interval: 1h` leads to data flush at the end of every hour. Add `no_align_flush_to_interval` option, which can be used for disabling the alignment.	2024-03-04 06:23:35 +02:00
Aliaksandr Valialkin	d80deaeaf4	lib/streamaggr: ignore the first sample in new time series during staleness_interval seconds after the stream aggregation start for total and increase outputs	2024-03-04 03:04:58 +02:00
Aliaksandr Valialkin	5e9cbfd4db	lib/streamaggr: flush dedup state and aggregation state in parallel on all the available CPU cores This should reduce the time needed for aggregation state flush on systems with many CPU cores	2024-03-04 01:22:41 +02:00
Aliaksandr Valialkin	1e741ed6db	lib/streamaggr: add a benchmark for flushing dedup state	2024-03-04 01:22:40 +02:00
Aliaksandr Valialkin	5205972b83	lib/streamaggr: add a benchmark for measuring the performance of aggregator.flush	2024-03-04 01:22:40 +02:00
Aliaksandr Valialkin	8daf7a3f43	lib/streamaggr: add a benchmark for de-duplicating of 1M samples	2024-03-04 01:22:39 +02:00
Aliaksandr Valialkin	d4a425af87	lib/prompbmarshal: use clear() instead of a loop for clearing tss inside ResetTimeSeries()	2024-03-03 23:40:47 +02:00
Aliaksandr Valialkin	b958135677	lib/promutils: optimize LabelsCompressor.Decompress by using a specialized labelsMap struct instead of sync.Map The labelsMap struct employs the fact that label indexes are condensed around 0, so it stores the referred labels in a slice instead of map and uses slice index as label key. This allows increasing the LabelsCompressor.Decompress performance by up to 3x. This also reduces the latency of data flush in stream aggregation.	2024-03-03 23:25:27 +02:00
Aliaksandr Valialkin	0d5d46f9db	lib/streamaggr: huge pile of changes - Reduce memory usage by up to 5x when de-duplicating samples across big number of time series. - Reduce memory usage by up to 5x when aggregating across big number of output time series. - Add lib/promutils.LabelsCompressor, which is going to be used by other VictoriaMetrics components for reducing memory usage for marshaled []prompbmarshal.Label. - Add `dedup_interval` option at aggregation config, which allows setting individual deduplication intervals per each aggregation. - Add `keep_metric_names` option at aggregation config, which allows keeping the original metric names in the output samples. - Add `unique_samples` output, which counts the number of unique sample values. - Add `increase_prometheus` and `total_prometheus` outputs, which ignore the first sample per each newly encountered time series. - Use 64-bit hashes instead of marshaled labels as map keys when calculating `count_series` output. This makes obsolete https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5579 - Expose various metrics, which may help debugging stream aggregation: - vm_streamaggr_dedup_state_size_bytes - the size of data structures responsible for deduplication - vm_streamaggr_dedup_state_items_count - the number of items in the deduplication data structures - vm_streamaggr_labels_compressor_size_bytes - the size of labels compressor data structures - vm_streamaggr_labels_compressor_items_count - the number of entries in the labels compressor - vm_streamaggr_flush_duration_seconds - a histogram, which shows the duration of stream aggregation flushes - vm_streamaggr_dedup_flush_duration_seconds - a histogram, which shows the duration of deduplication flushes - vm_streamaggr_flush_timeouts_total - counter for timed out stream aggregation flushes, which took longer than the configured interval - vm_streamaggr_dedup_flush_timeouts_total - counter for timed out deduplication flushes, which took longer than the configured dedup_interval - Actualize docs/stream-aggregation.md The memory usage reduction increases CPU usage during stream aggregation by up to 30%. This commit is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5850 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5898	2024-03-02 03:15:43 +02:00
Aliaksandr Valialkin	31f0dc4b97	lib/streamaggr: allow one second aggregation interval	2024-03-01 21:35:43 +02:00
Aliaksandr Valialkin	7533070a52	lib/promrelabel: use clear() function inside CleanLabels()	2024-03-01 21:34:47 +02:00
Aliaksandr Valialkin	052f2177a4	lib/fs: fix GOOS=windows build after `f8baf29b6e`	2024-03-01 01:46:44 +02:00
Aliaksandr Valialkin	816202bca7	lib/protoparser/opentelemetry/firehose: verify that the full response is parsed properly in ProcessRequestBody This is a follow-up for `bf9cb84575` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5899	2024-03-01 00:39:47 +02:00
Andrii Chubatiuk	e575fb1aeb	opentelemetry: fix firehose message parsing (#5899) Co-authored-by: Andrii Chubatiuk <wachy@Andriis-MBP-2.lan>	2024-03-01 00:24:14 +02:00
Aliaksandr Valialkin	01d8bee14c	lib/mergeset: use unsafe.Slice and unsafe.String instead of deprecated reflect.SliceHeader with unsafe conversion from slice header to string header	2024-02-29 17:29:40 +02:00
Aliaksandr Valialkin	99269ea640	lib/bytesutil: use unsafe.String instead of unsafe conversion of slice header to string header	2024-02-29 17:28:04 +02:00
Aliaksandr Valialkin	ddc61e2309	lib/fs: properly handle the case when data=nil is passed to mUnmap	2024-02-29 17:26:26 +02:00
Aliaksandr Valialkin	22acd84019	lib/storage: use unsafe.Slice instead of deprecated reflect.SliceHeader	2024-02-29 17:24:44 +02:00
Aliaksandr Valialkin	a9fb2e91a6	lib/protoparser/csvimport: use unsafe.Slice instead of deprecated reflect.SliceHeader	2024-02-29 17:20:05 +02:00
Aliaksandr Valialkin	9bc4c51ceb	lib/fs: use unsafe.Slice instead of deprecated reflect.SliceHeader	2024-02-29 17:18:42 +02:00
Aliaksandr Valialkin	4b1a262475	lib/fastnum: use unsafe.Slice() instead of deprecated reflect.SliceHeader	2024-02-29 17:17:24 +02:00
Aliaksandr Valialkin	3383f73191	lib/bytesutil: make BenchmarkToUnsafeString and BenchmarkToUnsafeBytes more reliable This is needed for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5880	2024-02-29 17:12:30 +02:00
helen	74b0605232	Optimize ToUnsafeBytes to make it leaner, more standards-compliant and slightly faster (#5880)	2024-02-29 17:12:04 +02:00
XLONG96	88b9088499	lib/logstorage: avoid panic when parsing regex with stream filter (#5897)	2024-02-29 15:32:25 +02:00
Aliaksandr Valialkin	7832d0800e	app/{vminsert,vmagent}: follow-up after `67a55b89a4` - Document the ability to read OpenTelemetry data from Amazon Firehose at docs/CHANGELOG.md - Simplify parsing Firehose data. There is no need to try optimizing the parsing with fastjson and byte slice tricks, since OpenTelemetry protocol is really slooow because of over-engineering. It is better to write clear code for better maintainability in the future. - Move Firehose parser from /lib/protoparser/firehose to lib/protoparser/opentelemetry/firehose, since it is used only by opentelemetry parser. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5893	2024-02-29 14:47:20 +02:00
Andrii Chubatiuk	60cf0c9656	{vmagent,vminsert}: added firehose http destination opentelemetry data ingestion support (#5893 ) Co-authored-by: Andrii Chubatiuk <wachy@Andriis-MBP-2.lan> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-02-29 14:46:16 +02:00
Aliaksandr Valialkin	8187244153	lib/streamaggr: make the BenchmarkAggregatorsPushByJobAvg closer to production case with long list of labels per sample	2024-02-29 02:41:48 +02:00
Hui Wang	d6ecfffa17	chore: add actual request size in error message (#5889)	2024-02-29 02:40:57 +02:00
Aliaksandr Valialkin	d845edc24b	lib: consistently use atomic.* types instead of atomic.* functions See `ea9e2b19a5`	2024-02-24 02:10:04 +02:00
Aliaksandr Valialkin	61519f6c22	lib/backup/actions: expose vm_backups_downloaded_bytes_total metric in order to be consistent with vm_backups_uploaded_bytes_total metric	2024-02-24 01:14:57 +02:00
Aliaksandr Valialkin	510e3d9cda	lib/backup/actions: update vm_backups_uploaded_bytes_total metric along the file upload instead of after the file upload This solves two issues: 1. The vm_backups_uploaded_bytes_total metric will grow more smoothly 2. This prevents from int overflow at metrics.Counter.Add() when uploading files bigger than 2GiB	2024-02-24 01:08:34 +02:00
Aliaksandr Valialkin	0ac1c533dc	lib/backup/actions: consistently use atomic.* types instead of atomic.* functions See `ea9e2b19a5`	2024-02-24 01:02:37 +02:00
Aliaksandr Valialkin	6fd6d4c2de	lib/storage: replace the remaining atomic.* functions with atomic.* types for the sake of consistency See `ea9e2b19a5`	2024-02-24 00:51:03 +02:00
Aliaksandr Valialkin	a1baf25c2e	lib/storage: consistently use atomic.* types instead of atomic.* function calls on ordinary types See `ea9e2b19a5`	2024-02-24 00:33:07 +02:00
Aliaksandr Valialkin	ca1e78bd16	lib/logstorage: consistently use atomic.* types instead of atomic.* functions on regular types See `ea9e2b19a5`	2024-02-24 00:29:39 +02:00
Aliaksandr Valialkin	d0538d11d3	lib/mergeset: consistently use atomic.* types instead of atomic.* function calls on ordinary types See `ea9e2b19a5`	2024-02-24 00:29:12 +02:00
Aliaksandr Valialkin	92e098012a	lib/logstorage: consistently use atomic.* type for refCount and mustDrop fields in datadb and storage structs in the same way as it is used in lib/storage See `ea9e2b19a5` and `a204fd69f1`	2024-02-24 00:28:56 +02:00
Aliaksandr Valialkin	7fa700a41c	lib/mergeset: consistently use atomic.* type for refCount and mustDrop fields in table struct in the same way as it is used in lib/storage See `ea9e2b19a5` and `a204fd69f1`	2024-02-24 00:28:37 +02:00
Aliaksandr Valialkin	e7dfcdfff6	lib/storage: consistently use atomic.* type for refCount and mustDrop fields in indexDB, table and partition structs See `ea9e2b19a5`	2024-02-24 00:26:26 +02:00
Aliaksandr Valialkin	e2b0cc873b	lib/storage: convert dedupsDuringMerge from uint64 to atomic.Uint64 This should simplify code maintenance by gradually converting to atomic.* types instead of calling atomic.* functions on int and bool types. See `ea9e2b19a5`	2024-02-24 00:25:44 +02:00
Aliaksandr Valialkin	1eb3346ecc	lib/{storage,mergeset}: properly fix 'unaligned 64-bit atomic operation' panic on 32-bit architectures The issue has been introduced in `bace9a2501` The improper fix was in the `d4c0615dcd` , since it fixed the issue just by accident, because the Go compiler aligned the rawRowsShards field by 4-byte boundary inside partition struct. The proper fix is to use atomic.Int64 field - this guarantees that the access to this field won't result in unaligned 64-bit atomic operation. See https://github.com/golang/go/issues/50860 and https://github.com/golang/go/issues/19057	2024-02-24 00:25:08 +02:00
Aliaksandr Valialkin	dc5b1e4dc1	lib/httpserver: restore the default value for -http.connTimeout to 2 minutes It turned out that there are VictoriaMetrics users, who rely on the fact that VictoriaMetrics components were closing incoming connections to -httpListenAddr every 2 minutes by default. So let's restore this value by default in order to fix the breaking change made at `d8c1db7953` . See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1304#issuecomment-1961891450 .	2024-02-24 00:20:11 +02:00
hagen1778	ab4fae9dc2	lib/storage: cleanup after `d4c0615dcd` Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `c8d1d2ab72`)	2024-02-23 18:55:40 +01:00
Dmytro Kozlov	eb22083924	lib/storage: fix aligning (#5860) (cherry picked from commit `d4c0615dcd`)	2024-02-23 18:55:39 +01:00
Aliaksandr Valialkin	2a5c6e1cd5	app/vmstorage: deprecate -snapshotCreateTimeout command-line flag Creating snapshot shouldn't time out under normal conditions. The timeout was related to the bug, which has been fixed in `6460475e3b` . Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3551	2024-02-23 04:51:57 +02:00
Aliaksandr Valialkin	42437e05c7	lib/storage: do not drop (date, metricID) entries for the date older than 2 days if samples are ingested at this date Previously the (date, metricID) entries for dates older than the last 2 days were removed. This could lead to slow check for the (date, metricID) entry in the indexdb during ingesting historical data (aka backfilling). The issue has been introduced in `431aa16c8d`	2024-02-23 04:06:54 +02:00
Aliaksandr Valialkin	83217b7473	app/vmselect: add -search.maxLabelsAPIDuration and -search.maxLabelsAPISeries options for fine-tuning CPU and RAM usage for /api/v1/series , /api/v1/labels and /api/v1/label/.../values This commit restores the limits for these endpoints, which have been removed at `5d66ee88bd` , since it turned out that missing limits result in high CPU usage, while the introduced concurrency limiter results in failed lightweight requests to these endpoints because of timeout when heavyweight requests are executed. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5055	2024-02-23 02:56:58 +02:00
Aliaksandr Valialkin	21170e558c	lib/promutils: hide the math.Round() logic inside ParseTimeMsec() function This should prevent from bugs similar to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5801 in the future This is a follow-up for `ce3ec3ff2e`	2024-02-23 01:21:42 +02:00
Aliaksandr Valialkin	dfcbcf4368	lib/mergeset: run `go fmt` after `bace9a2501`	2024-02-23 01:21:31 +02:00
Aliaksandr Valialkin	19032f9913	lib/{mergeset,storage}: convert buffered items to searchable parts more optimally Do not convert shard items to part when a shard becomes full. Instead, collect multiple full shards and then convert them to a searchable part at once. This reduces the number of searchable parts, which, in turn, should increase query performance, since queries need to scan smaller number of parts.	2024-02-23 01:21:03 +02:00
Nikolay	22762d7a69	app/vmselect: change export/csv timestamp format for rfc3339 to respect milliseconds (#5853) * app/vmselect: adds milliseconds to the csv export response for rfc3339 * milliseconds is a standard precision for VictoriaMetrics query request responses https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5837 * app/victoria-metrics: adds tests for csv export/import follow-up after 3541a8d0cf96dd4f8563624c4aab6816615d0756 --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-02-23 01:16:08 +02:00
Aliaksandr Valialkin	08c5250a7b	lib/storage: handle common case when the number of rows passed to flushRowsToInmemoryParts() doesn't exceed maxRawRowsPerShard	2024-02-23 01:12:18 +02:00
Aliaksandr Valialkin	8669584e9f	lib/{storage,mergeset}: convert buffered items into searchable in-memory parts exactly once per the given flush interval Previously the interval between item addition and its conversion to searchable in-memory part could vary significantly because of too coarse per-second precision. Switch from fasttime.UnixTimestamp() to time.Now().UnixMilli() for millisecond precision. It is OK to use time.Now() for tracking the time when buffered items must be converted to searchable in-memory parts, since time.Now() calls aren't located in hot paths. Increase the flush interval for converting buffered samples to searchable in-memory parts from one second to two seconds. This should reduce the number of blocks, which are needed to be processed during high-frequency alerting queries. This, in turn, should reduce CPU usage. While at it, hardcode the maximum size of rawRows shard to 8Mb, since this size gives the optimal data ingestion performance according to load tests. This reduces memory usage and CPU usage on systems with big amounts of RAM under high data ingestion rate.	2024-02-23 01:11:57 +02:00
Aliaksandr Valialkin	5f1fa8e7f7	lib/storage: avoid superfluous copy of block header data	2024-02-23 01:11:31 +02:00
Aliaksandr Valialkin	a982ab6bfb	app/vmstorage: expose vm_snapshots metric, which shows the current number of snapshots While at it, refresh docs about snapshots - https://docs.victoriametrics.com/#how-to-work-with-snapshots	2024-02-23 01:07:04 +02:00
Aliaksandr Valialkin	3f9022bc08	lib/storage: do not pool rawRowsBlock when flushing rawRows to in-memory blocks The pooled rawRowsBlock objects occupy big amounts of memory between flushes, and the flushes are relatively rare. So it is better not to use the pool and to allocate rawRow blocks on demand. This should reduce the average memory usage between flushes.	2024-02-23 01:06:28 +02:00
Aliaksandr Valialkin	bf07e2ac87	lib/storage: do not keep rawRows buffer across flush() calls The buffer can be quite big under high ingestion rate (e.g. more than 100MB). This leads to increased memory usage between buffer flushes. So it is better to re-create the buffer on every flush in order to reduce memory usage between buffer flushes.	2024-02-23 01:06:09 +02:00
Alexander Marshalov	8322425364	[lib/httputils] fixed floating-point error when parsing time in RFC3339 format (#5814 ) * [lib/promutils, lib/httputils] fixed floating-point error when parsing time in RFC3339 format (#5801) * fixed tests * fixed test * Revert "fixed test" This reverts commit `8a29764806`. * Revert "fixed tests" This reverts commit `9ce13d1042`. * Revert "[lib/promutils, lib/httputils] fixed floating-point error when parsing time in RFC3339 format (#5801)" This reverts commit `a7a04bd4` * [lib/httputils] fixed floating-point error when parsing time in RFC3339 format (#5801) --------- Co-authored-by: Nikolay <nik@victoriametrics.com>	2024-02-23 00:58:26 +02:00
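
Note on commit 7c4d7dc6dd above (lib/promauth: properly set `Host` header): the behaviour it describes can be reproduced with the Go standard library alone. The following is a minimal sketch, not the VictoriaMetrics code; the helper name `hostSeenByServer` and the hostname `scrape.example.com` are made up for illustration. It sends one request to a local test server and reports which Host value the server observed.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// hostSeenByServer is a hypothetical helper (not part of VictoriaMetrics).
// It sends a single GET request to a local test server and returns the Host
// value observed by the server. If viaField is true, the host is set via the
// Request.Host field; otherwise it is set via Request.Header.Set("Host", ...).
func hostSeenByServer(viaField bool, host string) (string, error) {
	seen := make(chan string, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		seen <- r.Host
	}))
	defer srv.Close()

	req, err := http.NewRequest(http.MethodGet, srv.URL, nil)
	if err != nil {
		return "", err
	}
	if viaField {
		// Honoured by net/http.Client when writing the request.
		req.Host = host
	} else {
		// Ignored by net/http.Client: the Host line is taken from req.Host or req.URL.Host.
		req.Header.Set("Host", host)
	}

	resp, err := srv.Client().Do(req)
	if err != nil {
		return "", err
	}
	resp.Body.Close()
	return <-seen, nil
}

func main() {
	for _, viaField := range []bool{true, false} {
		got, err := hostSeenByServer(viaField, "scrape.example.com")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("viaField=%v server saw Host=%q\n", viaField, got)
	}
}
```

With `viaField=true` the server sees `scrape.example.com`; with `viaField=false` it sees the test server's own listen address, which matches the reason given in the commit message.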

2503 Commits