VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-22 16:36:27 +01:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	cf64597878	all: add support for specifying multiple -httpListenAddr options	2024-02-09 03:22:49 +02:00
Aliaksandr Valialkin	ae7da12280	lib/httpserver: do not close client connections every 2 minutes by default Closing client connections every 2 minutes doesn't help load balancing - this just leads to "jumpy" connections between multiple backend servers, i.e. the load isn't spread evenly among backend servers, and instead jumps between the servers every 2 minutes. It is still possible to close client connections periodically by specifying a non-zero -http.connTimeout command-line flag. This should help with https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1304#issuecomment-1636997037 This is a follow-up for `d387da142e`	2024-02-08 21:10:54 +02:00
Khushi Jain	a076cb4a93	app/vmbackup: support client-side TLS configuration for create/delete snapshot API (#5738 ) (cherry picked from commit `83e55456e2`)	2024-02-08 15:58:34 +01:00
Aliaksandr Valialkin	1856c9fcc1	lib/mergeset: add a test for too long item passed to Table.AddItems()	2024-02-08 14:14:23 +02:00
Aliaksandr Valialkin	d2a846eddd	lib/mergeset: typo fix: indexdb/indexBlock -> indexdb/indexBlocks	2024-02-08 14:14:23 +02:00
Aliaksandr Valialkin	950b126a09	lib/{storage,mergeset}: do not create index blocks with sizes exceeding 64Kb in common case This should reduce memory fragmentation and memory usage for indexdb/indexBlocks and storage/indexBlocks caches	2024-02-08 14:14:22 +02:00
Aliaksandr Valialkin	1c3eac5c1e	lib/mergeset: verify that the index block for in-memory part doesn't exceed 3*maxIndexBlockSize	2024-02-08 14:14:22 +02:00
Aliaksandr Valialkin	9a3a88b321	lib/mergeset: do not store commonPrefix in blockHeader if the block contains only a single item There is no sense in storing commonPrefix for blockHeader containing only a single item, since this only increases blockHeader size without any benefits.	2024-02-08 14:14:22 +02:00
Aliaksandr Valialkin	ae2a9c8195	lib/mergeset: prevent a possible `too big indexBlockSize` panic This panic could occur when samples with too long label values are ingested into VictoriaMetrics. This could result in too long firstItem and commonPrefix values at blockHeader (up to 64kb each). This may inflate the maximum index block size by 4 * maxIndexBlockSize.	2024-02-08 12:55:58 +02:00
Aliaksandr Valialkin	ec02e9ba19	lib/protoparser/datadogsketches: use math.RoundToEven() for calculating the rank The original code uses this function - see `48d52eeea6/pkg/quantile/sparse.go (L138)` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5775	2024-02-07 21:45:05 +02:00
Aliaksandr Valialkin	28fffdfcc7	lib/protoparser/datadogsketches: add more permalinks to the original source code These permalinks should help verifying the correctness of the code This is a follow-up after `07213f4e0c` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5775	2024-02-07 21:45:05 +02:00
Andrii Chubatiuk	3aa439a618	added ddsketch permalink (#5775 ) Co-authored-by: Andrew Chubatiuk <andrew.chubatiuk@motional.com>	2024-02-07 21:45:04 +02:00
Aliaksandr Valialkin	5d9e0ab71e	docs/CHANGELOG.md: support empty command-line flag values in short array notation For example, -fooDuration=',10s,' is now supported - it sets three command-line flag values: - the first and the last one are set to the default value for `-fooDuration` - the second one is set to 10s	2024-02-07 20:55:01 +02:00
Aliaksandr Valialkin	82f4e4e070	app/{vmagent,vminsert}: follow-up after `a1d1ccd6f2` - Document the change at docs/CHANGELOG.md - Copy changes from docs/Single-server-VictoriaMetrics.md to README.md - Add missing handler for processing multitenant requests ( https://docs.victoriametrics.com/vmagent/#multitenancy ) - Substitute github.com/stretchr/testify dependency with 3 lines of code in the added tests - Comment unclear code at lib/protoparser/datadogsketches/parser.go , so @AndrewChubatiuk could update it and add permalinks to the original source code there. - Various code cleanups Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5584 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3091	2024-02-07 01:31:52 +02:00
Andrii Chubatiuk	c634859c4f	support datadog /api/beta/sketches API (#5584 ) Co-authored-by: Andrew Chubatiuk <andrew.chubatiuk@motional.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-02-07 01:30:00 +02:00
Aliaksandr Valialkin	293617028d	lib/storage: move fixupTimestamps() call to Block.Init() This is a follow-up for `0bf7921721`	2024-02-06 22:44:09 +02:00
Zakhar Bessarab	fdbc44d813	lib/storage/raw_row: properly initialize TS for tmp blocks (#5762 ) Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2024-02-06 22:44:08 +02:00
Aliaksandr Valialkin	e19b53748a	lib/fs: lazily open the file at ReaderAt on the first access This should significantly reduce the number of open ReaderAt files on VictoriaMetrics and VictoriaLogs startup. The open files can be tracked via vm_fs_readers metric	2024-02-06 21:10:00 +02:00
Aliaksandr Valialkin	bace92fab6	lib/httpserver: add support for mTLS for requests to -httpListenAddr	2024-02-06 17:47:27 +02:00
Aliaksandr Valialkin	f222cf9200	lib/cgroup: remove SetGOGC() function GOGC can be already set via environment variable. There is no need in adding new approaches for setting the GOGC (such as command-line flag), since they complicate operations.	2024-02-05 12:13:08 +02:00
Aliaksandr Valialkin	8148cc52c9	lib/prompbmarshal: code cleanup after `8aaa828ba3`	2024-02-01 21:41:10 +02:00
Aliaksandr Valialkin	7a9f0b32a2	app/vmselect/netstorage: prevent disk write IO when closing temporary files Remove temporary file before closing it in order to signal the OS that it shouldn't store the file contents from page cache to disk when the file is closed. Gracefully handle the case when the file cannot be removed before being closed - in this case remove the file after closing it. This allows working on Windows. Also remove superfluous opening of temporary file for reading - re-use already opened file handle for writing. This is a follow-up for `9b1e002287` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4020 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2024-02-01 19:54:48 +02:00
Dima Lazerka	d561f506cd	Improve docs on security http headers (#5262 ) * Improve docs on security http headers * Apply suggestions from code review --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-02-01 14:40:57 +02:00
noodles2hg	60a8e59366	lib/logstorage: proper exit during block search (#5400 )	2024-02-01 14:11:20 +02:00
Jiajing LU	9c75e3ee15	count inmemoryParts that have not been taken for merge (#5447 )	2024-02-01 14:07:13 +02:00
Aliaksandr Valialkin	6c56f49f9c	lib/prompbmarshal: return back custom protobuf marshaler for lib/prompbmarshal.WriteRequest The easyproto-based marshaler is 2x slower than the previous custom marshaler, so let's stick with it. This improves the performance for sending data to remote storage at vmagent and reduces CPU usage to pre-v1.97.0 levels.	2024-02-01 06:34:46 +02:00
Aliaksandr Valialkin	faeabfc730	lib/encoding: follow-up for `49e3665d6d` Improve performance for typical cases of varint marshaling / unmarshaling further. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5721	2024-02-01 05:38:58 +02:00
Fuchun Zhang	78af9b3e30	make encoding.MarshalVarInt64s faster (#5721 ) * make encoding.MarshalVarInt64s faster * add fast path for MarshalVarInt64s * make UnmarshalVarUint64s faster * remove comment	2024-02-01 03:33:59 +00:00
Aliaksandr Valialkin	eee210810e	lib/encoding: added benchmarks for marshaling / unmarshaling of varints This is needed for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5721	2024-02-01 05:11:35 +02:00
helen	99ea84f0fd	clean unused code (#5735 ) Signed-off-by: helen <haitao.zhang@daocloud.io>	2024-01-31 19:51:35 +02:00
Aliaksandr Valialkin	cc626ae3b5	lib/promauth: follow-up for `fca3b14b7b` - Simplify the code for handling BasicAuthConfig at lib/promauth/config.go - Move the description of the change into correct place at docs/CHANGELOG.md - Put tests for username in front of tests for password at lib/promauth/config_test.go Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5720 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5511	2024-01-31 19:47:53 +02:00
Nihal	bcd094ac8b	Support for username_file in scrape config (basic_auth) similar to Prometheus for having config compatibility (#5720 ) * adding support for username_file in basic_auth of scrape config Signed-off-by: Syed Nihal <syed.nihal@nokia.com> * adding support for username_file in basic_auth of scrape config. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5511 Signed-off-by: Syed Nihal <syed.nihal@nokia.com> * adding support for username_file in basic_auth of scrape config Signed-off-by: Syed Nihal <syed.nihal@nokia.com> * adding support for username_file in basic_auth of scrape config Signed-off-by: Syed Nihal <syed.nihal@nokia.com> * adding support for username_file in basic_auth of scrape config Signed-off-by: Syed Nihal <syed.nihal@nokia.com> --------- Signed-off-by: Syed Nihal <syed.nihal@nokia.com>	2024-01-31 19:47:50 +02:00
Aliaksandr Valialkin	09c388a8e4	lib/promscrape: use the standard net/http.Client instead of fasthttp.Client for scraping targets in non-streaming mode While fasthttp.Client uses less CPU and RAM when scraping targets with small responses (up to 10K metrics), it doesn't work well when scraping targets with big responses such as kube-state-metrics. In this case it could use large amounts of additional memory compared to net/http.Client, since fasthttp.Client reads the full response in memory and then tries re-using the large buffer for further scrapes. Additionally, fasthttp.Client-based scraping had various issues with proxying, redirects and scrape timeouts like the following ones: - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1945 - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5425 - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2794 - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1017 This should help reduce memory usage for the case when target returns big response and this response is scraped by fasthttp.Client at first before switching to stream parsing mode for subsequent scrapes. Now the switch to stream parsing mode is performed on the first scrape after reading the response body in memory and noticing that its size exceeds the value passed to -promscrape.minResponseSizeForStreamParse command-line flag. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5567 Overrides https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4931	2024-01-30 18:39:55 +02:00
Aliaksandr Valialkin	645365b2d1	lib/promscrape: fix BenchmarkScrapeWorkScrapeInternal, which has been broken by the commit `65bc460323`	2024-01-30 16:07:40 +02:00
Aliaksandr Valialkin	61562cdee9	lib/storage: keep (date, metricID) entries only for the last two dates Entries for the previous dates are usually not used, so there is little sense in keeping them in memory. This should reduce the size of storage/date_metricID cache, which can be monitored via vm_cache_entries{type="storage/date_metricID"} metric.	2024-01-29 18:44:27 +01:00
hagen1778	2ff94b2bfa	lib/streamaggr: fix incorrect err message for min `interval` value Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-29 17:27:23 +01:00
Aliaksandr Valialkin	f5559c038c	lib/storage: do not check the limit for -search.maxUniqueTimeseries when performing /api/v1/labels and /api/v1/label/.../values requests This limit makes little sense for these APIs, since: - These APIs frequently result in scanning of all the time series on the given time range. For example, if extra_filters={datacenter="some_dc"} . - Users expect these APIs shouldn't hit the -search.maxUniqueTimeseries limit, which is intended for limiting resource usage at /api/v1/query and /api/v1/query_range requests. Also limit the concurrency for /api/v1/labels, /api/v1/label/.../values and /api/v1/series requests in order to limit the maximum memory usage and CPU usage for these APIs. This limit shouldn't affect typical use cases for these APIs: - Grafana dashboard load when dashboard labels should be loaded - Auto-suggestion list load when editing the query in Grafana or vmui Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5055	2024-01-29 16:44:46 +01:00
Aliaksandr Valialkin	412f872597	lib/decimal: follow-up for `e6bad5174f` - Add a benchmark for CalbirateAndScale. - Reduce the decimal multipliers table size from 256Kb to 192bytes. - Use more clear naming for variables. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5672	2024-01-27 00:08:32 +01:00
Fuchun Zhang	e6bad5174f	Optimize the performance of data merge: decimal.CalibrateScale() (#5672 ) * Optimize the performance of data merge: decimal.CalibrateScale() from 49633 ns/op to 9146 ns/op * Optimize the performance of data merge: decimal.CalibrateScale()	2024-01-27 00:05:04 +01:00
Hui Wang	f579adf05f	add inserting comma inside value instruction to flag description (#5666 )	2024-01-26 22:47:33 +01:00
Roman Khavronenko	9e9f170fe7	lib/streamaggr: skip unfinished aggregation state on shutdown by default (#5689 ) Sending unfinished aggregate states tend to produce unexpected anomalies with lower values than expected. The old behavior can be restored by specifying `flush_on_shutdown: true` setting in streaming aggregation config Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-26 22:45:45 +01:00
Aliaksandr Valialkin	7a8b92b590	lib/{mergeset,storage}: make background merge more responsive and scalable - Maintain a separate worker pool per each part type (in-memory, file, big and small). Previously a shared pool was used for merging all the part types. A single merge worker could merge parts with mixed types at once. For example, it could merge simultaneously an in-memory part plus a big file part. Such a merge could take hours for big file part. During the duration of this merge the in-memory part was pinned in memory and couldn't be persisted to disk under the configured -inmemoryDataFlushInterval . Another common issue, which could happen when parts with mixed types are merged, is uncontrolled growth of in-memory parts or small parts when all the merge workers were busy with merging big files. Such growth could lead to significant performance degradation for queries, since every query needs to check ever growing list of parts. This could also slow down the registration of new time series, since VictoriaMetrics searches for the internal series_id in the indexdb for every new time series. The third issue is graceful shutdown duration, which could be very long when a background merge is running on in-memory parts plus big file parts. This merge couldn't be interrupted, since it merges in-memory parts. A separate pool of merge workers per every part type elegantly resolves these issues: - In-memory parts are merged to file-based parts in a timely manner, since the maximum size of in-memory parts is limited. - Long-running merges for big parts do not block merges for in-memory parts and small parts. - Graceful shutdown duration is now limited by the time needed for flushing in-memory parts to files. Merging for file parts is instantly canceled on graceful shutdown now. - Deprecate -smallMergeConcurrency command-line flag, since the new background merge algorithm should automatically self-tune according to the number of available CPU cores. - Deprecate -finalMergeDelay command-line flag, since it wasn't working correctly. It is better to run forced merge when needed - https://docs.victoriametrics.com/#forced-merge - Tune the number of shards for pending rows and items before the data goes to in-memory parts and becomes visible for search. This improves the maximum data ingestion rate and the maximum rate for registration of new time series. This should reduce the duration of data ingestion slowdown in VictoriaMetrics cluster on e.g. re-routing events, when some of vmstorage nodes become temporarily unavailable. - Prevent a possible "sync: WaitGroup misuse" panic on graceful shutdown. This is a follow-up for `fa566c68a6` . Thanks to @misutoth for the inspiration at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5212 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5190 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3790 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3551 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3647 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3641 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291	2024-01-26 22:19:52 +01:00
Aliaksandr Valialkin	c067f3f288	lib/mergeset: remove inmemoryBlock pooling, since it wasn't effective This should reduce memory usage a bit when new time series are ingested at high rate (aka high churn rate)	2024-01-26 21:34:22 +01:00
Aliaksandr Valialkin	230ef43a32	lib/logstorage: make sure that WaitGroup.Add isn't called after stopCh is closed and WaitGroup.Wait is called This protects from a rare panic, which may occur during graceful shutdown of VictoriaLogs	2024-01-26 21:18:07 +01:00
Aliaksandr Valialkin	0715f1efcd	lib/storage: rename AssistedMerges to AssistedMergesCount in order to make these field names less misleading These fields are counters, not gauges, so adding Count suffix to them makes it easier to understand this while reading the code	2024-01-25 10:21:13 +02:00
Aliaksandr Valialkin	1cdef56d84	lib/mergeset: start assisted merge for file parts only if the number of file parts is bigger than maxFileParts The maxFileParts usage has been accidentally removed in `fa566c68a6` While at it, add Count suffix to *AssistedMerges counter names in order to make them less misleading. Previously their names were falsely suggesting that these are gauges, which show the number of concurrently executed assisted merges.	2024-01-24 15:10:48 +02:00
Aliaksandr Valialkin	b8c7f0d3bc	lib/promscrape/discovery/kubernetes: typo fix in the comment for ContainerStateTerminated struct This is a follow-up for `ef12598ad4`	2024-01-24 15:10:47 +02:00
Aliaksandr Valialkin	1e364c992d	lib/promscrape/discovery/kubernetes: do not generate targets for already terminated pods and containers Already terminated pods and containers cannot be scraped and will never resurrect, so there is zero sense in creating scrape targets for them.	2024-01-24 14:58:51 +02:00
Aliaksandr Valialkin	e6e5b97e1e	lib/streamaggr: expand `%{ENV}` placeholders in stream aggregation configs	2024-01-24 12:31:42 +02:00
Aliaksandr Valialkin	12698b9136	lib/mergeset: really limit the number of in-memory parts to 15 It turned out that the registration of new time series slows down linearly with the number of indexdb parts, since VictoriaMetrics needs to check every indexdb part when it searches for TSID by newly ingested metric name. The number of in-memory parts grows when new time series are registered at high rate. The number of in-memory parts grows faster on systems with big number of CPU cores, because the mergeset maintains per-CPU buffers with newly added entries for the indexdb, and every such entry is transformed eventually into a separate in-memory part. The solution has been suggested in https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5212 by @misutoth - to limit the number of in-memory parts with buffered channel. This solution is implemented in this commit. Additionally, this commit merges per-CPU parts into a single part before adding it to the list of in-memory parts. This reduces CPU load when searching for TSID by newly ingested metric name. The https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5212 recommends setting the limit on the number of in-memory parts to 100, but my internal testing shows that much lower limit 15 works with the same efficiency on a system with 16 CPU cores while reducing memory usage for `indexdb/dataBlocks` cache by up to 50%. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5190	2024-01-24 03:41:19 +02:00
Aliaksandr Valialkin	8dd73574ca	lib/encoding: remove unneeded re-slicing of byte slice before passing it to binary.BigEndian.Uint*	2024-01-23 22:50:11 +02:00
Aliaksandr Valialkin	5a97668ad6	lib/handshake: substitute time.Now() with fasttime.UnixTimestamp(), since profiling shows time.Now() is slow	2024-01-23 18:39:28 +02:00
Aliaksandr Valialkin	3199558da9	lib/{storage,mergeset}: reduce the maximum compression level for the stored data This reduces CPU usage a bit without increasing the resulting file sizes according to synthetic tests.	2024-01-23 17:47:40 +02:00
Aliaksandr Valialkin	68d76b1436	lib/storage: compress metricIDs, which match the given filters, before storing them in tagFiltersToMetricIDsCache This allows reducing the indexdb/tagFiltersToMetricIDs cache size by 8 on average. The cache size can be checked via vm_cache_size_bytes{type="indexdb/tagFiltersToMetricIDs"} metric exposed at /metrics page.	2024-01-23 16:13:25 +02:00
Aliaksandr Valialkin	9b3217db61	lib/storage: do not sort metricIDs passed to Storage.prefetchMetricNames, since the caller is responsible for the sorting	2024-01-23 16:13:19 +02:00
Aliaksandr Valialkin	7ed7eb95b4	lib/filestream: do not measure read / write duration from / to in-memory buffers Measuring read / write duration from / to in-memory buffers has little sense, since it will be always fast. It is better to measure read / write duration from / to real files at vm_filestream_write_duration_seconds_total and vm_filestream_read_duration_seconds_total metrics. This also reduces overhead on time.Now() and Histogram.UpdateDuration() calls per each filestream.Reader.Read() and filestream.Writer.Write() call when the data is read / written from / to in-memory buffers. This is a follow-up for `2f63dec2e3`	2024-01-23 14:53:35 +02:00
Roman Khavronenko	8461add541	lib/promscrape: respect `0` value for `series_limit` param (#5663 ) * lib/promscrape: respect `0` value for `series_limit` param Respect `0` value for `series_limit` param in `scrape_config` even if global limit was set via `-promscrape.seriesLimitPerTarget`. Previously, `0` value was ignored in favor of `-promscrape.seriesLimitPerTarget`. This behavior aligns with the possibility to override `series_limit` value via relabeling with `__series_limit__` label. Signed-off-by: hagen1778 <roman@victoriametrics.com> * Update docs/CHANGELOG.md --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-23 13:09:36 +02:00
Aliaksandr Valialkin	c2927053ee	lib/mergeset: make sure that the first and the last items are in the original range after prepareBlock() Previously the checks were too strict by requiring prepareBlock() to leave the same first and last items Thanks to @ahfuzhang for the suggestion at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5655	2024-01-23 12:59:04 +02:00
Aliaksandr Valialkin	389159767d	lib/mergeset: skip comparison for every item in the block during merge if the last item in the block is smaller than the first item in the next block Thanks to @ahfuzhang for the suggestion at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5651	2024-01-23 03:16:30 +02:00
Zakhar Bessarab	60ef978ffc	lib/storage: print tenant ID in log when discarding or truncating labels (#5658 ) Previously, it was not possible to determine which tenant sends metrics with excessive amount of labels or label values. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-23 02:27:59 +02:00
Aliaksandr Valialkin	d52fd73f18	all: add up to 10% random jitter to the interval between periodic tasks performed by various components This should smooth CPU and RAM usage spikes related to these periodic tasks, by reducing the probability that multiple concurrent periodic tasks are performed at the same time.	2024-01-22 18:39:16 +02:00
Aliaksandr Valialkin	64e615e6cc	lib/storage: reduce the contention on dateMetricIDCache mutex when new time series are registered at high rate The dateMetricIDCache puts recently registered (date, metricID) entries into mutable cache protected by the mutex. The dateMetricIDCache.Has() checks for the entry in the mutable cache when it isn't found in the immutable cache. Access to the mutable cache is protected by the mutex. This means this access is slow on systems with many CPU cores. The mutable cache was merged into immutable cache every 10 seconds in order to avoid slow access to mutable cache. This means that ingestion of new time series to VictoriaMetrics could result in significant slowdown for up to 10 seconds because of bottleneck at the mutex. Fix this by merging the mutable cache into immutable cache after len(cacheItems) / 2 cache hits under the mutex, e.g. when the entry is found in the mutable cache. This should automatically adjust intervals between merges depending on the addition rate for new time series (aka churn rate): - The interval will be much smaller than 10 seconds under high churn rate. This should reduce the mutex contention for mutable cache. - The interval will be bigger than 10 seconds under low churn rate. This should reduce the unneeded work on merging of mutable cache into immutable cache.	2024-01-22 18:14:30 +02:00
Aliaksandr Valialkin	c6f6f094c5	Revert "lib/promscrape: do not store last scrape response when stale markers … (#5577 )" This reverts commit `cfec258803`. Reason for revert: the original code already doesn't store the last scrape response when stale markers are disabled. The scrapeWork.areIdenticalSeries() function always returns true if stale markers are disabled. This prevents from storing the last response at scrapeWork.processScrapedData(). It looks like the reverted commit could also return back the issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3660 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5577	2024-01-22 01:46:12 +02:00
Aliaksandr Valialkin	d4a1a28543	app/vmselect: handle negative time range start in a generic manner inside NewSearchQuery() This is a follow-up for `cf03e11d89` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5553 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5630	2024-01-22 01:39:27 +02:00
Hui Wang	49fa92c1d0	lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice (#5557 ) * lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice Previously the groupWatcher could be mistakenly stopped when requests for pod or services resources take too long. * remove misleading comment * docs/sd_configs.md: mention -promscrape.kubernetes.attachNodeMetadataAll flag in the description for attach_metadata section Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4640 * wip * lib/promscrape/kubernetes: prevent from stopping groupWatcher when there are in-flight apiWatcher.mustStart() calls groupWatcher is stopped if it has zero registered apiWatchers for 14 seconds. But such a groupWatcher can be still in use if apiWatcher for `role: endpoints` or `role: endpointslice` is being registered and the discovery of the associated `pod` and/or `service` objects takes longer than 14 seconds - see the beginning of groupWatcher.startWatchersForRole() function for details. Track the number of in-flight calls to apiWatcher.mustStart() and prevent from stopping the associated groupWatcher if the number of in-flight calls is non-zero. P.S. postponing the discovery of `pod` and/or `service` objects associated with `endpoints` or `endpointslice` roles isn't the best solution, since it slows down initial discovery of `endpoints` and `endpointslice` targets. * typo fix --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-22 01:33:17 +02:00
Aliaksandr Valialkin	885ee160c2	all: allow dynamically reading *AuthKey flag values from files and urls Examples: 1) -metricsAuthKey=file:///abs/path/to/file - reads flag value from the given absolute filepath 2) -metricsAuthKey=file://./relative/path/to/file - reads flag value from the given relative filepath 3) -metricsAuthKey=http://some-host/some/path?query_arg=abc - reads flag value from the given url The flag value is automatically updated when the file contents changes.	2024-01-22 01:23:23 +02:00
Aliaksandr Valialkin	5f5fcab217	all: call atomic.Load* in front of atomic.CompareAndSwap* at places where the atomic.CompareAndSwap* returns false most of the time This allows avoiding slow inter-CPU synchronization induced by atomic.CompareAndSwap*	2024-01-22 01:13:41 +02:00
Aliaksandr Valialkin	be5faef552	lib/promscrape: code cleanup: send stale markers immediately after generating automatic metrics This cleanup has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5557/files#diff-6b205cf6637d7b65a5c45d9417d08822d4efad94227268cb196f61aa2a0fc0f7	2024-01-22 01:12:56 +02:00
Aliaksandr Valialkin	e15f07d989	all: consistently clear prompbmarshal.Label by assigning an empty struct instead of zeroing Name and Value individually	2024-01-22 01:11:59 +02:00
Aliaksandr Valialkin	2f94bef59c	lib/storage/partition.go: remove misleading comment, which falsely states that inmemoryParts isn't visible to search Thanks to @satjd for raising attention to this comment at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5410	2024-01-22 01:11:36 +02:00
Aliaksandr Valialkin	2c7c812a9d	lib/promscrape/discovery/kubernetes: add -promscrape.kubernetes.attachNodeMetadataAll command-line flag This flag allows setting attach_metadata.node=true for all the kubernetes_sd_configs defined at -promscrape.config Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4640 Thanks to wasim-nihal for the initial implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5593	2024-01-22 01:08:52 +02:00
Nikolay	e196c61e36	app/vmselect: abort streaming connections for vmselect (#5650 ) * app/vmselect: abort streaming connections for vmselect Due to the streaming nature of export APIs, curl and similar tools cannot detect errors that happened after http.Header with status 200 was written. This PR tracks whether the body write has already started and closes the connection in that case. This allows the client to detect an unexpected chunk sequence and return an error to the caller. Mostly it affects vmselect at cluster version https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5645 * wip Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5645 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5650 --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-22 00:54:32 +02:00
Aliaksandr Valialkin	c05982bfa7	lib/promscrape/discovery/hetzner: follow-up after `03a97dc678` - docs/sd_configs.md: moved hetzner_sd_configs docs to the correct place according to alphabetical order of SD names, document missing __meta_hetzner_role label. - lib/promscrape/config.go: added missing MustStop() call for Hetzner SD, and moved the code to the correct place according to alphabetical order of SD names. - lib/promscrape/discovery/hetzner: properly handle pagination for hcloud API responses, populate missing __meta_hetzner_role label like Prometheus does. - Properly populate __meta_hetzner_public_ipv6_network label like Prometheus does. - Remove unused SDConfig.Token. - Remove "omitempty" annotation from SDConfig.Role field, since this field is mandatory. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5550 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3154	2024-01-22 00:53:23 +02:00
Hui Wang	66eb013b54	lib/promscrape: do not store last scrape response when stale markers … (#5577) * lib/promscrape: do not store last scrape response when stale markers are disabled * update changelog	2024-01-22 00:52:25 +02:00
Aliaksandr Valialkin	41d6c8a7dd	lib/storage: do not prefetch metric names for small number of metricIDs This eliminates prefetchedMetricIDsLock lock contention for queries, which return less than 500 time series. This is a follow-up for `9d886a2eb0`	2024-01-17 13:50:01 +02:00
Aliaksandr Valialkin	09f23b0296	lib/promscrape: cosmetic changes after `3ac44baebe` - Rename mustLoadScrapeConfigFiles() to loadScrapeConfigFiles(), since now it may return error. - Split too long line with the error message into two lines in order to improve readability a bit. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5508 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5560	2024-01-17 01:07:16 +02:00
Aliaksandr Valialkin	75c58ab306	lib/httputils: handle step=undefined query arg as an empty value This is needed for Grafana, which may send step=undefined when working with alerting rules and instant queries.	2024-01-17 00:13:04 +02:00
Aliaksandr Valialkin	f673039e86	lib/storage: follow-up for `4b8088e377` - Clarify the bugfix description at docs/CHANGELOG.md - Simplify the code by accessing prefetchedMetricIDs struct under the lock instead of using lockless access to immutable struct. This shouldn't worsen code scalability too much on busy systems with many CPU cores, since the code executed under the lock is quite small and fast. This allows removing cloning of prefetchedMetricIDs struct every time new metric names are pre-fetched. This should reduce load on Go GC, since the cloning of uint64set.Set struct allocates many new objects.	2024-01-16 22:38:57 +02:00
Hui Wang	2f40ed3aac	exit vmagent if there is config syntax error in `scrape_config_files` when `-promscrape.config.strictParse=true` (#5560)	2024-01-16 22:35:18 +02:00
Aliaksandr Valialkin	6ba2fd3312	app/vmselect/promql: follow-up for `ce4f26db02` - Document the bugfix at docs/CHANGELOG.md - Filter out NaN values before sorting as suggested at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5509#discussion_r1447369218 - Revert unrelated changes in lib/filestream and lib/fs - Use simpler test at app/vmselect/promql/exec_test.go Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5509 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5506	2024-01-16 22:13:13 +02:00
Zongyang	cb37df5723	FIX bottomk doesn't return any data when there are no time range overlap between timeseries (#5509 ) * FIX sort order in bottomk * Add lessWithNaNsReversed for bottomk * Add ut for TopK * Move lt from loop * FIX lint * FIX lint * FIX lint * Mod log format --------- Co-authored-by: xiaozongyang <xiaozngyang@kanyun.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-16 22:12:49 +02:00
Aliaksandr Valialkin	724223fad4	lib/prompbmarshal: move WriteRequest proto definition to the correct place	2024-01-16 21:57:03 +02:00
Aliaksandr Valialkin	0196902b2e	lib/promscrape/discovery/hetzner: fix golangci-lint warnings after `03a97dc678` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5550	2024-01-16 21:51:48 +02:00
Aliaksandr Valialkin	9e5e514faf	lib/pushmetrics: wait until the background goroutines, which push metrics, are stopped at pushmetrics.Stop() Previously there was a race condition when the background goroutine still could try collecting metrics from already stopped resources after returning from pushmetrics.Stop(). Now the pushmetrics.Stop() waits until the background goroutine is stopped before returning. This is a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5549 and the commit `fe2d9f6646`. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5548	2024-01-16 21:18:22 +02:00
Aleksandr Stepanov	3a6e3adc7d	vmagent: added hetzner sd config (#5550) * added hetzner robot and hetzner cloud sd configs * remove gettoken fun and update docs * Updated CHANGELOG and vmagent docs * Updated CHANGELOG and vmagent docs --------- Co-authored-by: Nikolay <nik@victoriametrics.com>	2024-01-16 21:13:20 +02:00
Roman Khavronenko	d562d772a8	lib/storage: properly check for `storage/prefetchedMetricIDs` cache expiration deadline (#5607) Before, this cache was limited only by size. Cache invalidation by time happens with jitter to prevent thundering herd problem. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-16 21:08:59 +02:00
Aliaksandr Valialkin	d566aa7d78	lib/prompbmarshal: switch to github.com/VictoriaMetrics/easyproto	2024-01-16 20:48:30 +02:00
Aliaksandr Valialkin	f7b589e38a	lib/prompb: switch to github.com/VictoriaMetrics/easyproto	2024-01-16 20:43:09 +02:00
Aliaksandr Valialkin	7d40506744	lib/prompb: change type of Label.Name and Label.Value from []byte to string This makes it more consistent with lib/prompbmarshal.Label	2024-01-16 20:41:37 +02:00
Aliaksandr Valialkin	8cb138e8df	lib/protoparser/datadogv2: simplify code for parsing protobuf messages after `0597718435` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5094 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4451	2024-01-16 20:35:17 +02:00
Aliaksandr Valialkin	f8ae2abd88	lib/protoparser/opentelemetry: use github.com/VictoriaMetrics/easyproto for protobuf message unmarshaling and marshaling This reduces VictoriaMetrics binary size by 100KB. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2570 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2424	2024-01-16 20:34:18 +02:00
Aliaksandr Valialkin	9eef72bce9	lib/protoparser/datadogv2: add support for reading protobuf-encoded requests at /api/v2/series endpoint Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4451 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5094	2024-01-16 20:32:15 +02:00
Artem Navoiev	e1005209ba	docs: mention staleNaN handling during deduplication See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5587 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-16 20:11:45 +02:00
hagen1778	f301dc5cfb	lib/uint64: remove accidentally added test Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-09 13:32:22 +01:00
hagen1778	2a7207f38a	app/all: follow-up after `84d710beab` https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5548 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-09 13:17:09 +01:00
zhdd99	84d710beab	lib/pushmetrics: fix a panic caused by pushing metrics during the graceful shutdown process of vmstorage nodes. (#5549) Co-authored-by: zhangdongdong <zhangdongdong@kuaishou.com> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2024-01-09 13:01:03 +01:00
Aliaksandr Valialkin	12de0d39eb	lib/protoparser/datadogv2: take into account source_type_name field, since it contains useful value such as kubernetes, docker, system, etc.	2023-12-21 23:05:52 +02:00
Aliaksandr Valialkin	6feef14095	lib/protoparser: add missing /datadog/ prefix to the /api/v2/series path in the description for -datadog.maxInsertRequestSize command-line flag	2023-12-21 21:05:24 +02:00
Aliaksandr Valialkin	62a105d9e9	app/{vminsert,vmagent}: preliminary support for /api/v2/series ingestion from new versions of DataDog Agent This commit adds only JSON support - https://docs.datadoghq.com/api/latest/metrics/#submit-metrics , while recent versions of DataDog Agent send data to /api/v2/series in undocumented Protobuf format. The support for this format will be added later. Thanks to @AndrewChubatiuk for the initial implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5094 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4451	2023-12-21 20:50:27 +02:00
Aliaksandr Valialkin	426a451435	lib/promauth: add more context to errors returned by Options.NewConfig() in order to simplify troubleshooting	2023-12-20 21:58:19 +02:00

2385 Commits