VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-11-23 20:37:12 +01:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	7575f5c501	lib/protoparser/datadogv2: take into account source_type_name field, since it contains useful value such as kubernetes, docker, system, etc.	2023-12-21 23:05:41 +02:00
Aliaksandr Valialkin	b4ba8d0d76	lib/protoparser: add missing /datadog/ prefix to the /api/v2/series path in the description for -datadog.maxInsertRequestSize command-line flag	2023-12-21 21:04:53 +02:00
Aliaksandr Valialkin	fb90a56de2	app/{vminsert,vmagent}: preliminary support for /api/v2/series ingestion from new versions of DataDog Agent This commit adds only JSON support - https://docs.datadoghq.com/api/latest/metrics/#submit-metrics , while recent versions of DataDog Agent send data to /api/v2/series in undocumented Protobuf format. The support for this format will be added later. Thanks to @AndrewChubatiuk for the initial implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5094 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4451	2023-12-21 20:50:55 +02:00
Aliaksandr Valialkin	01f9edda64	lib/promauth: add more context to errors returned by Options.NewConfig() in order to simplify troubleshooting	2023-12-20 21:58:12 +02:00
Aliaksandr Valialkin	160cc9debd	app/{vmagent,vmalert}: add the ability to set OAuth2 endpoint params via the corresponding *.oauth2.endpointParams command-line flags This is a follow-up for `5ebd5a0d7b` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5427	2023-12-20 21:35:28 +02:00
Morgan	5ebd5a0d7b	Expose OAuth2 Endpoint Parameters to cli (#5427) The user may wish to control the endpoint parameters, for instance to set the audience when requesting an access token. Exposing the parameters as a map allows for additional use cases without requiring modification.	2023-12-20 20:16:43 +02:00
Nikolay	7cfde237ec	lib/awsapi: properly assume role with webIdentity token (#5495 ) * lib/awsapi: properly assume role with webIdentity token introduce new irsaRoleArn param for config. It's only needed for authorization with webIdentity token. First credentials obtained with irsa role and the next sts assume call for an actual roleArn made with those credentials. Common use case for it - cross AWS accounts authorization https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3822 * wip --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-12-20 19:05:39 +02:00
Aliaksandr Valialkin	5a88bc973f	all: use Gauge instead of Counter for `*_config_last_reload_successful` metrics This allows exposing the correct TYPE metadata for these labels when the app runs with -metrics.exposeMetadata command-line flag. See https://github.com/VictoriaMetrics/metrics/pull/61#issuecomment-1860085508 for more details. This is follow-up for `326a77c697`	2023-12-20 14:23:42 +02:00
Aliaksandr Valialkin	326a77c697	all: add -metrics.exposeMetadata command-line flag, which can be used for adding TYPE and HELP metadata for metrics exposed at /metrics page This may be needed for systems, which require this metadata such as Google Cloud Managed Prometheus. See https://cloud.google.com/stackdriver/docs/managed-prometheus/troubleshooting#missing-metric-type	2023-12-19 03:20:40 +02:00
Aliaksandr Valialkin	4b529562ce	lib/pushmetrics: add -pushmetrics.header and -pushmetrics.disableCompression command-line flags	2023-12-17 19:56:46 +02:00
Aliaksandr Valialkin	0379a0eb82	lib/protoparser/opentelemetry: allow ingesting metrics without resource labels Some clients may ingest samples via OpenTelemetry protocol without Resource labels. Previously VictoriaMetrics was silently dropping such samples. The commit `317834f876` added vm_protoparser_rows_dropped_total{type="opentelemetry",reason="resource_not_set"} counter for tracking of such dropped samples. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5459 It is better from usability PoV to accept such samples instead of dropping them and incrementing the corresponding counter.	2023-12-17 19:12:58 +02:00
Zakhar Bessarab	317834f876	lib/protoparser/opentelemetry: add metric to track skipped rows without resource (#5459) Currently, it is impossible to understand why metrics are not ingested when resource is not set by OTEL exporter. Adding a metric should simplify debugging. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2023-12-15 11:16:25 +01:00
Aliaksandr Valialkin	72dbd24b22	lib/fs: remove unused IsEmptyDir() This function became unused after the commit `43b24164ef` The unused function has been found with deadcode tool - https://go.dev/blog/deadcode	2023-12-14 19:38:53 +02:00
Aliaksandr Valialkin	0f91f83639	app/vmselect: add support for vmstorage groups with independent -replicationFactor per group Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5197 See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#vmstorage-groups-at-vmselect Thanks to @zekker6 for the initial pull request at https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/pull/718	2023-12-13 00:14:45 +02:00
hagen1778	e0fc5ef140	lib/promscrape: cosmetic changes after `e373bb84d5` * fix typos in docs * add `shard-` prefix to generated links when `-promscrape.cluster.memberURLTemplate` is enabled Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-12-12 11:28:18 +01:00
Aliaksandr Valialkin	51df2248f0	vendor: run `make vendor-update`	2023-12-11 10:48:36 +02:00
Aliaksandr Valialkin	042267541f	app/vmauth: add support for `hot standby` mode via `first_available` load balancing policy vmauth in `hot standby` mode sends requests to the first url_prefix while it is available. If the first url_prefix becomes unavailable, then vmauth falls back to the next url_prefix. This allows building highly available setup as described at https://docs.victoriametrics.com/vmauth.html#high-availability Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4893 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4792	2023-12-08 23:31:07 +02:00
Aliaksandr Valialkin	b05e1512d4	lib/promscrape: add a warning when the /service-discovery page contains incomplete list of dropped targets	2023-12-08 19:03:51 +02:00
noodles2hg	8efe694160	lib/streamaggr/streamaggr.go: fix link in error message (#5439 )	2023-12-08 16:55:05 +03:00
Aliaksandr Valialkin	e373bb84d5	lib/promscrape: add `-promscrape.cluster.memberURLTemplate` command-line flag for creating direct links to vmagent instances at /service-discovery page See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4018#issuecomment-1843811569	2023-12-07 16:04:21 +02:00
Aliaksandr Valialkin	7cb8ed8271	lib/promscrape: show -promscrape.cluster.memberNum values for vmagent instances, which scrape the given dropped target at /service-discovery page The /service-discovery page contains the list of all the discovered targets after the commit `487f6380d0` on all the vmagent instances in cluster mode ( https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets ). This commit improves debuggability of targets in cluster mode by providing a list of -promscrape.cluster.memberNum values per each target at /service-discovery page, which has been dropped because of sharding, e.g. if this target is scraped by other vmagent instances in the cluster. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5389 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4018	2023-12-07 00:05:32 +02:00
Aliaksandr Valialkin	67468a0c46	lib/promscrape: show `never scraped` message for never scraped targets at /targets page	2023-12-06 22:33:39 +02:00
Aliaksandr Valialkin	65bc460323	lib/promscrape: follow-up for `97373b7786` Substitute O(N^2) algorithm for exposing the `vm_promscrape_scrape_pool_targets` metric with O(N) algorithm, where N is the number of scrape jobs. The previous algorithm could slow down /metrics exposition significantly when -promscrape.config contains thousands of scrape jobs. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5311 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5335	2023-12-06 17:35:50 +02:00
Hui Wang	97373b7786	vmagent: add `vm_promscrape_scrape_pool_targets` for scrape jobs like… (#5335 ) * vmagent: export `vm_promscrape_scrape_pool_targets` metric to track the number of targets that each scrape_job discovers * add extra panel for new metric	2023-12-06 15:44:39 +08:00
Aliaksandr Valialkin	06c73df55a	Revert "add datadog /api/v2/series and /api/beta/sketches support (#5094 )" This reverts commit `543f218fe9`. Reason for revert: https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5094#issuecomment-1839789080	2023-12-05 02:26:22 +02:00
Aliaksandr Valialkin	bc550e22d7	Revert "lib/protoparser/datadog: follow-up after 543f218fe96574b9b2189c8350bb09afa349e3bb" This reverts commit `98d0f81f21`. Reason for revert: see https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5094#issuecomment-1839789080	2023-12-05 02:19:29 +02:00
Aliaksandr Valialkin	0160435802	app/vmagent: code cleanup for Kafka and Google PubSub consumers / producers - Add links to relevant docs into descriptions for every -kafka.* and -gcp.pubsub.* command-line flags. - Wait until message processing goroutines are stopped before returning from gcppubsub.Stop(). - Prevent from multiple calls to Init() without Stop(). - Drop message if tenantID cannot be parsed properly. - Take into account tenantID for all the supported message formats. - Support gzip-compressed messages for graphite format. - Use exponential backoff sleep when the message cannot be pushed to remote storage systems because of disabled on-disk persistence - https://docs.victoriametrics.com/vmagent.html#disabling-on-disk-persistence - Unblock from sleep as soon as Stop() is called. Previously the sleep could take up to 2 seconds after Stop() is called. - Remove unused globalCtx and initContext from app/vmagent/remotewrite/gcppubsub - Mention Google PubSub support at docs/enterprise.md - Make Google PubSub docs more clear at docs/vmagent.md This is a follow-up for commits 115245924a5f096c5a3383d6cc8e8b6fbd421984 and e6eab781ce42285a6a1750dc01eba6801dd35516 . Updates https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/pull/717 Updates https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/pull/713	2023-12-04 22:46:28 +02:00
Aliaksandr Valialkin	f5c4fcc250	lib/backup: consistently use path.Join() when constructing paths for s3, gs and azblob E.g. replace `fs.Dir + filePath` with `path.Join(fs.Dir, filePath)` The fs.Dir is guaranteed to end with slash - see Init() functions. The filePath may start with slash. If it starts with slash, then `fs.Dir + filePath` constructs an incorrect path with double slashes. path.Join() properly substitutes duplicate slashes with a single slash in this case. While at it, also substitute incorrect usage of filepath.Join() with path.Join() for constructing paths to object storage systems, which expect forward slashes in paths. filepath.Join() substitutes forward slashes with backslashes on Windows, so this may break creating or managing backups from Windows. This is a follow-up for 0399367be602b577baf6a872ca81bf0f99ba401b Updates https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/pull/719	2023-12-04 10:34:39 +02:00
Aliaksandr Valialkin	487f6380d0	lib/promscrape: show dropped targets because of sharding at /service-discovery page Previously the /service-discovery page didn't show targets dropped because of sharding ( https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets ). Show also the reason why every target is dropped at /service-discovery page. This should simplify debugging why particular targets are dropped. While at it, do not remove dropped targets from the list at /service-discovery page until the total number of targets exceeds the limit passed to -promscrape.maxDroppedTargets . Previously the list was cleaned up every 10 minutes from the entries, which weren't updated for the last minute. This could complicate debugging of dropped targets. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5389	2023-12-01 16:48:48 +02:00
Aliaksandr Valialkin	ac65c6b178	lib/promrelabel: add `keep_if_contains` and `drop_if_contains` relabeling actions	2023-11-29 12:22:43 +02:00
Nikolay	41f7940f97	lib/streamaggr: properly reference slice with labels (#5406) * lib/streamaggr: properly reference slice with labels by limiting slice capacity. This fixes issues with slice modification: on append, a new slice is allocated instead of modifying the referenced slice https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5402 * Reduce memory allocations when output_relabel_configs adds new labels to output samples --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-11-29 10:03:04 +02:00
hagen1778	98d0f81f21	lib/protoparser/datadog: follow-up after `543f218fe9` * prevent /api/v1 from panic on parsing rows * add tests for Extract function for v1 and v2 api's * separate request types in different pools to prevent different objects mixing * add changelog line `543f218fe9` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-11-28 15:04:15 +01:00
Andrii Chubatiuk	543f218fe9	add datadog /api/v2/series and /api/beta/sketches support (#5094 ) Co-authored-by: Andrew Chubatiuk <andrew.chubatiuk@motional.com> Co-authored-by: Nikolay <https://github.com/f41gh7> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2023-11-28 14:52:29 +01:00
Aliaksandr Valialkin	5034aa0773	app/vmagent: follow-up for `090cb2c9de` - Add Try* prefix to functions, which return bool result in order to improve readability and reduce the probability of missing check for the result returned from these functions. - Call the adjustSampleValues() only once on input samples. Previously it was called on every attempt to flush data to persistent queue. - Properly restore the initial state of WriteRequest passed to tryPushWriteRequest() before returning from this function after unsuccessful push to persistent queue. Previously a part of WriteRequest samples may be lost in such case. - Add -remoteWrite.dropSamplesOnOverload command-line flag, which can be used for dropping incoming samples instead of returning 429 Too Many Requests error to the client when -remoteWrite.disableOnDiskQueue is set and the remote storage cannot keep up with the data ingestion rate. - Add vmagent_remotewrite_samples_dropped_total metric, which counts the number of dropped samples. - Add vmagent_remotewrite_push_failures_total metric, which counts the number of unsuccessful attempts to push data to persistent queue when -remoteWrite.disableOnDiskQueue is set. - Remove vmagent_remotewrite_aggregation_metrics_dropped_total and vm_promscrape_push_samples_dropped_total metrics, because they are replaced with vmagent_remotewrite_samples_dropped_total metric. - Update 'Disabling on-disk persistence' docs at docs/vmagent.md - Update stale comments in the code Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5088 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110	2023-11-25 12:09:44 +02:00
Nikolay	090cb2c9de	app/vmagent: allow disabling on-disk persistence (#5088) * app/vmagent: allow disabling on-disk queue Previously, it wasn't possible to build a data processing pipeline with a chain of vmagents. When remoteWrite for the last vmagent in the chain wasn't accessible, it persisted data only while it had enough disk capacity. Once the disk queue was full, it started to silently drop ingested metrics. New flags allow disabling on-disk persistence and immediately returning an error if remoteWrite is not accessible anymore. This blocks any writes and notifies the client that data ingestion isn't possible. Main use case for this feature - use external queue such as kafka for data persistence. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110 * adds test, updates readme * apply review suggestions * update docs for vmagent * makes linter happy --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-11-24 13:42:11 +01:00
Roman Khavronenko	0cf55ded34	lib/protoparser: decrease `import.maxLineLen` from 100MB to 10MB (#5364) Tests showed that importing a single line with 70MB size takes 5.3GiB RSS memory for VictoriaMetrics single-node. In the scenario when user exports and imports data from one VM to another, it could possibly lead to OOM exception for destination VM. Importing a single line with 16MB size takes 1.3GiB RSS memory. Hence, the limit for `import.maxLineLen` was decreased from 100MB to 10MB to improve reliability of VictoriaMetrics during imports. Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-11-24 12:53:04 +02:00
hagen1778	d493da562e	lib/storage: fix typo Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-11-21 11:20:43 +01:00
hagen1778	e96b4410a1	lib/storage: fix typo Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-11-21 10:52:53 +01:00
Hui Wang	ae3107153c	lib/protoparser/promremotewrite: fall back to zstd decoding if Snappy-decoding fails (#5344 ) This case is possible after the following steps: 1. vmagent successfully performed handshake with the -remoteWrite.url and the remote storage supports zstd-compressed data. 2. remote storage became unavailable or slow to ingest data, vmagent compressed the collected data into blocks with zstd and puts these blocks to persistent queue on disk. 3. vmagent restarts and the remote storage is unavailable during the handshake, then vmagent falls back to Snappy compression. 4. vmagent starts sending zstd-compressed data from persistent queue to the remote storage, while falsely advertizing it sends Snappy-compressed data. 5. The remote storage receives zstd-compressed data and fails unpacking it with Snappy. The solution is the same as `12cd32fd75`, just fall back to zstd decompression if Snappy decompression fails.	2023-11-17 15:51:09 +01:00
Aliaksandr Valialkin	d9a7dea9a1	lib/querytracer: add missing blank comment line after `3121d76bee`	2023-11-15 16:10:43 +01:00
Aliaksandr Valialkin	3076c1f400	lib/ingestserver: properly log the number of closed connections Previously there was off-by-one error, which resulted in logging len(conns-1) connections instead of len(conns) Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4922	2023-11-14 21:53:24 +01:00
Nikolay	3121d76bee	lib/querytracer: makes package concurrent safe to use (#5322 ) * lib/querytracer: makes package concurrent safe to use it must fix various issues with concurrent code usage. Especially, when it's not reasonable to wait for all goroutines to be finished * wip --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-11-14 20:59:08 +01:00
Aliaksandr Valialkin	cb106bdf39	lib/logger: increase default -loggerMaxArgLen command-line flag value from 500 to 1000 The 500 chars limit for the maximum arg lengths during logging appeared to be too low for some cases	2023-11-14 19:52:27 +01:00
Aliaksandr Valialkin	f9bd265249	lib/ingestserver: typo fix after `f7834767c1` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4922	2023-11-14 03:26:26 +01:00
Zakhar Bessarab	37997abd14	vmcluster: re-routing enhancement (#5293) * app/vmstorage: close vminsert connections gradually before stopping storage Implements graceful shutdown approach suggested here - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4922#issuecomment-1768146878 Test results for this can be found here - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4922#issuecomment-1790640274 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vmstorage: update graceful shutdown logic - close connections from vminsert in deterministic order - update flag description - lower default timeout to 25 seconds. 25 seconds value was chosen because the lowest default value used in default configuration deployments is 30s (default value in Kubernetes and ansible-playbooks). Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs/cluster: add information about re-routing enhancement during restart Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs/changelog: add entry for new command-line flag Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * {app/vmstorage,lib/ingestserver}: address review feedback Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs/cluster: add note to update workload scheduler timeout Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * wip --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-11-14 01:03:44 +01:00
Aliaksandr Valialkin	cef7a39ba3	lib/logstorage: always check the previous indexBlockHeader for blocks with matching tenantID and/or streamID The previous indexBlockHeader may contain blocks for the matching tenantID and/or streamID, so it must be scanned unconditionally during the search. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5295 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4856 This is a follow-up for `89dcbc2fe7`	2023-11-13 23:13:53 +01:00
XLONG96	89dcbc2fe7	lib/logstorage: fix streamID and tenantID search (#4856 ) (#5295 )	2023-11-13 23:09:39 +01:00
Aliaksandr Valialkin	0feaeca3c1	lib/protoparser/promremotewrite: fall back to Snappy decoding if zstd decoding fails This case is possible after the following steps: 1. vmagent tries to perform handshake with the -remoteWrite.url in order to determine whether the remote storage supports zstd-compressed data. 2. The remote storage is unavailable during the handshake. In this case vmagent falls back to Snappy compression for the data sent to the remote storage. 3. vmagent compresses the collected data into blocks with Snappy and puts these blocks to persistent queue on disk. 4. The remote storage becomes available. 5. vmagent restarts, performs the handshake with the remote storage and detects that it supports zstd-compressed data. 6. vmagent starts sending Snappy-compressed data from persistent queue to the remote storage, while falsely advertizing it sends zstd-compressed data. 7. The remote storage receives Snappy-compressed data and fails unpacking it with zstd. The solution is to just fall back to Snappy decompression if zstd decompression fails. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5301	2023-11-13 21:19:08 +01:00
Aliaksandr Valialkin	8af56ea2ed	lib/htmlcomponents: use relative links for the top page and for favicon.ico This allows hiding VictoriaMetrics components behind proxies with arbitrary path prefixes. For example, vmagent HTTP handlers can be served via /vmagent/ path prefix: - http://proxy/vmagent/targets - http://proxy/vmagent/service-discovery The path prefix can be arbitrary. For example, below are vmagent urls for /tenantID/vmagent/ path prefix: - http://proxy/tenantID/vmagent/targets - http://proxy/tenantID/vmagent/service-discovery While at it, consistently serve favicon.ico from any path directory. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5306 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5307	2023-11-13 20:29:05 +01:00
Aliaksandr Valialkin	cf23dc6480	all: cleanup: remove `// +build ...` lines, since they are no longer needed after Go1.17, and the minimum supported Go version for VictoriaMetrics source code is Go1.20	2023-11-13 19:12:51 +01:00
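
The path handling described in commit f5c4fcc250 can be illustrated with a minimal Go sketch. It is not the code from lib/backup: the helper name `objectKey` and the sample directory and file names are hypothetical, chosen only to show why `path.Join` is preferable to plain concatenation when the directory ends with a slash and the file path starts with one.

```go
package main

import (
	"fmt"
	"path"
)

// objectKey is a hypothetical helper (not part of VictoriaMetrics).
// It builds an object storage key with path.Join, which always uses
// forward slashes and collapses duplicate slashes.
func objectKey(dir, filePath string) string {
	return path.Join(dir, filePath)
}

func main() {
	dir := "backups/daily/"      // assumed to end with a slash
	filePath := "/data/part.bin" // may start with a slash

	// Plain concatenation keeps both slashes.
	fmt.Println(dir + filePath) // backups/daily//data/part.bin

	// path.Join cleans the result.
	fmt.Println(objectKey(dir, filePath)) // backups/daily/data/part.bin
}
```

Unlike `filepath.Join`, the `path` package does not depend on the operating system's separator, so the resulting key stays the same when the program runs on Windows.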


2187 Commits
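
Commit 41f7940f97 mentions limiting slice capacity so that `append` allocates a new slice instead of modifying the referenced one. A minimal Go sketch of that general technique follows. It is an illustration under assumptions, not the lib/streamaggr code: the helper name `appendLabelSafe` and the use of plain strings as labels are hypothetical.

```go
package main

import "fmt"

// appendLabelSafe is a hypothetical helper (not from lib/streamaggr).
// The full slice expression labels[:len:len] sets the capacity equal to
// the length, so append cannot write into the shared backing array and
// must allocate a new one.
func appendLabelSafe(labels []string, extra string) []string {
	labels = labels[:len(labels):len(labels)]
	return append(labels, extra)
}

func main() {
	backing := []string{"job", "instance", "env"}
	view := backing[:2] // shares the backing array, capacity is 3

	out := appendLabelSafe(view, "output")

	fmt.Println(backing) // [job instance env]
	fmt.Println(out)     // [job instance output]
}
```

Without the capacity limit, `append(view, "output")` would reuse the spare capacity and overwrite `backing[2]`.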