Commit Graph

550 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
71a170d404
lib/promscrape: follow-up for 393876e52a
- Document the change in docs/CHANGELOG.md
- Reduce memory usage when sending stale markers even more by parsing the response in stream parsing mode
- Update the TestSendStaleSeries

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3668
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3675
2023-01-23 21:56:18 -08:00
Roman Khavronenko
8e2a8a6ae2
lib/promscrape: limit number of sent stale series at once (#3686)
Stale series are sent when there is a difference between current
and previous scrapes. Those series which disappeared in the current scrape
are marked as stale and sent to the remote storage.

Sending stale series requires memory allocation and in case when too many
series disappear in the same it could result in noticeable memory spike.
For example, re-deploy of a big fleet of service can result into
excessive memory usage for vmagent, because all the series with old
pod name will be marked as stale and sent to the remote write storage.

This change limits the number of stale series which can be sent at once,
so memory usage remains steady.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3668
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3675
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-01-23 21:56:17 -08:00
Aliaksandr Valialkin
95d4db0506
lib/promscrape: properly log the actual response size after c4229a1bba 2023-01-23 21:13:06 -08:00
Aliaksandr Valialkin
a844b97942
lib/promscrape: follow-up for d79f1b106c
- Document the fix at docs/CHANGELOG.md
- Limit the concurrency for sendStaleMarkers() function in order to limit its memory usage
  when big number of targets disappear and staleness markers are sent
  for all the metrics exposed by these targets.
- Make sure that the writeRequestCtx is returned to the pool
  when there is no need to send staleness markers.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3668
2023-01-17 23:13:08 -08:00
lzfhust
5ac0f18ca8
using writeRequestCtxPool when delete kubernetes clusters from kubernetes_sd_configs (#3669) 2023-01-17 23:12:59 -08:00
Zakhar Bessarab
40d524edb8
discovery/{consul,nomad}: fix cancelling serviceWatcher in-flight requests (#3658)
* lib/promscrape/discovery/{consul,nomad}: fix background service update watches not canceling requests on serviceWatcher stop

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/promscrape/discovery/{consul,nomad}: fix closing serviseWatcher during scrape job restart

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* wip

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3468

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-01-17 21:47:51 -08:00
Aliaksandr Valialkin
c33728befb
lib/promscrape: properly apply series limit
Fix the following issues:

- Series limit wasn't applied when staleness tracking was disabled.
- Series limit didn't prevent from sending staleness markers for new series exceeding the limit.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3660

Thanks to @hagen1778 for the initial attempt to fix the issue
at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3665
2023-01-17 10:30:16 -08:00
Nikolay
43d1f2d0c4
/lib/promscrape: use correct err logger for scrape unmarshalling (#3645)
/lib/promscrape: use correct err logger for scrape unmarshalling
It correctly suppresses scrape errors and adds correct context for err msg
2023-01-12 09:00:06 -08:00
Aliaksandr Valialkin
a819e30ddf
lib/promscrape: log the number of unsuccessful scrapes during the last -promscrape.suppressScrapeErrorsDelay
This commit is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3413
Thanks to @jelmd for the pull request.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2575
2023-01-12 01:12:22 -08:00
Aliaksandr Valialkin
2e018aebf3
lib/promscrape/discovery: missing changes after b4ad3a3b4c 2023-01-11 23:03:14 -08:00
Aliaksandr Valialkin
434f22f871
lib/promscrape: follow-up for 8537533beb
- Add a comment describing the purpose of the `role` field inside `apiConfig` struct
- Revert changes at lib/promscrape/discovery/dockerswarm/dockerswarm.go ,
  since they reduce code readability. E.g. the reader needs to look up the named string constants
  in order to get their values.
2023-01-11 22:56:48 -08:00
Zakhar Bessarab
ae5b85966a
lib/promscrape/discovery/dockerswarm: fix discovery filters being applied to all objects (#3632)
* lib/promscrape/discovery/dockerswarm: fix discovery filters being applied to all objects

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* Update docs/CHANGELOG.md

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-01-11 22:56:40 -08:00
Aliaksandr Valialkin
ab318660cd
lib/promscrape/discovery/gce: follow-up for b2ccdaaa2f
- Use promutils.Labels.GetLabels() instead of comparing promutils.Labels.Labels to nil.
  This make the code more consistent with other places.

- Mention the release where the issue has been introduced at docs/CHANGELOG.md.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3624
2023-01-10 13:51:57 -08:00
Zakhar Bessarab
02f5c16433
lib/promscrape/discovery/gce: fix crash in case instance does not have any labels set (#3625)
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-01-10 13:51:35 -08:00
Aliaksandr Valialkin
43a4dcdaf8
lib/promscrape/discovery/nomad: sync nomad_sd_configs fields with the Prometheus implementation
See the list of configs supported by Prometheus at f88a0a7d83/discovery/nomad/nomad.go (L76-L84)

- Removed "token" option. In can be set either via NOMAD_TOKEN env var or via `bearer_token` config option.
- Removed "scheme" option. It is automatically detected depending on whether the `tls_config` is set.
- Removed "services" and "tags" options, since they aren't supported by Prometheus.
- Added "region" option. If it is missing, then the region is read from NOMAD_REGION env var.
  If this var is empty, then it is set to "global" in the same way as Nomad client does.
  See 865ee8d37c/api/api.go (L297)
  and 865ee8d37c/api/api.go (L555-L556)
- If the "server" option is missing, then it is read from NOMAD_ADDR in the same way
  as Nomad client does - see 865ee8d37c/api/api.go (L294-L296)

This is a follow-up for 8aee209c53

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3367
2023-01-09 21:30:19 -08:00
Roman Khavronenko
ca5136a0ee
lib/promscrape: remove datacenter field from nomad_sd_config (#3612)
Looks like `datacenter` field isn't part of `/v1/services` API.
See https://developer.hashicorp.com/nomad/api-docs/services#list-services
and https://developer.hashicorp.com/nomad/api-docs/services#read-service

Related issues:
https://github.com/traefik/traefik/issues/9109
https://github.com/prometheus/prometheus/issues/11776

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-01-09 21:24:46 -08:00
Aliaksandr Valialkin
7792ba3272
lib/promscrape/discoveryutils: cleanup after 5df9fddaf2
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3468
2023-01-07 01:27:16 -08:00
Zakhar Bessarab
e8624fd781
lib/promscrape/discoveryutils: use correct timeout for blocking requests (#3609)
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-01-07 01:27:10 -08:00
Aliaksandr Valialkin
0a14b7bb82
lib/promscrape: reduce the number of concurrently executed processScrapedData calls from 2x of the number of CPUs to the number of CPUs
This should reduce the maximum memory usage for processScrapedData() function by 2x.
The only part, which can be IO-bound in the processScrapedData() is pushData() call,
when it buffers data to persistent queue if the remote storage cannot keep up
with the data ingestion speed. In this case it is OK if the scrape pace will be limited.
2023-01-07 00:17:52 -08:00
Aliaksandr Valialkin
7fb02f536a
lib/promscrape: limit the concurrency during parsing and relabeling the scraped samples
This should reduce memory usage when scraping big number of targets,
since this limits the summary memory usage during concurrent parsing and relabeling
by the number of available CPU cores.
2023-01-06 23:01:18 -08:00
Aliaksandr Valialkin
ec7a3b79ab
lib/promscrape/discovery/{consul,nomad}: wait until the deleted serviceWatchers are stopped inside updateServices() call
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3468
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3367
2023-01-05 21:53:08 -08:00
Aliaksandr Valialkin
54410bf51b
lib/promscrape: follow-up after bced9fb978
- Document the bugfix at docs/CHANGELOG.md
- Wait until all the worker goroutines are done in consulWatcher.mustStop()
- Do not log `context canceled` errors when discovering consul serviceNames
- Removed explicit handling of gzipped responses at lib/promscrape/discoveryutils.Client,
  since this handling is automatically performed by net/http.Transport.
  See DisableCompression option at https://pkg.go.dev/net/http#Transport .
- Remove explicit handling of the proxyURL, since it is automatically handled
  by net/http.Transport. See Proxy option at https://pkg.go.dev/net/http#Transport .
- Expliticly set MaxIdleConnsPerHost, since its default value equals to 2.
  Such a small value may result in excess tcp connection churn
  when more than 2 concurrent requests are processed by lib/promscrape/discoveryutils.Client.
- Do not set explicitly the `Host` request header, since it is automatically set by net/http.Client.
- Backport the bugfix to the recently added nomad_sd_configs - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3367

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3468
2023-01-05 21:23:21 -08:00
Zakhar Bessarab
de5aad2cde
lib/promscrape/discoveryutils: switch to native http client from fasthttp (#3568) 2023-01-05 21:23:15 -08:00
Aliaksandr Valialkin
750d309f63
lib/promscrape/discovery/nomad: follow-up after 48f371a46c
- Remove undocumented `username` and `password` config options from `nomad_sd_config`.
  TODO: probably, remove these options from `consul_sd_config` too?
  These options exist there for backwards compatibility purposes.

- Add __meta_nomad_service_alloc_id and __meta_nomad_service_job_id meta-labels
  These labels contain AllocID and JobID fields for the discovered Nomad services.

- Various typo fixes.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3367
2023-01-05 18:09:23 -08:00
Karan Sharma
8f42c5a024
lib/promscrape: add Prometheus-compatible service discovery for Nomad (#3549)
Add nomad_sd_config support for service discovery
2023-01-05 18:07:02 -08:00
Aliaksandr Valialkin
6eda4c6da2
lib/promscrape: use strconv.Atoi instead of strconv.ParseInt for parsing -promscrape.cluster.memberNum
In this case there is no need in converting int64 to int
2023-01-05 16:46:03 -08:00
Aliaksandr Valialkin
1af6e0b233
lib/promrelabel: pass query args via query string at /metric-relabel-debug and /target-relabel-debug pages if their length doesnt exceed 1000
This allows copy-n-pasting the url to another browser window and seeing the same result.

The limit in 1000 chars is selected in order to prevent from potential issues with systems
which limit the url length such as Internet Explorer - see https://stackoverflow.com/questions/812925/what-is-the-maximum-possible-length-of-a-query-string

If the limit is exceeded, then query args are sent via POST method and aren't visible in the url.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3580
2023-01-05 16:45:42 -08:00
Zakhar Bessarab
99f9b02283
lib/promscrape/discovery/dockerswarm: fix query encoding of filters (#3586)
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-01-05 03:35:18 -08:00
Aliaksandr Valialkin
1d16cc9349
lib/promscrape: pre-fetch metric_relabel_configs rules when debugging metric relabeling for a particular target
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3407
2023-01-05 03:28:14 -08:00
Aliaksandr Valialkin
634e24e685
lib/promscrape: follow-up for a7e29c38bc
- Document the bugfix at docs/CHANGELOG.md
- Make the fix more durable against future changes when droppedTargetsMap.Register may be called from other places.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3580
2023-01-05 02:51:45 -08:00
Zakhar Bessarab
52226c392f
lib/promscrape/targetstatus: fix crash during droppedTarget registration (#3595)
* lib/promscrape/targetstatus: fix crash during droppedTarget registration in case original labels are not present

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/promscrape/targetstatus: address review comment

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-01-05 02:48:39 -08:00
Aliaksandr Valialkin
ebb8aeb0cf
app/vmselect: remove dependency on lib/promscrape from app/vmselect 2023-01-03 23:27:36 -08:00
Aliaksandr Valialkin
d3f8298739
lib/bytesutil: add InternBytes() function as a shortcut to InternString(ToUnsafeString(..)) 2023-01-03 22:15:49 -08:00
Aliaksandr Valialkin
8e9548f050
lib/promscrape: log the actual response size in the error message when the response size exceeds -promscrape.maxScrapeSize
This is a follow-up for 7ad9fff7e5
2022-12-28 14:42:45 -08:00
Clément Nussbaumer
04d536c15a
fix(promscrape): check MaxScrapeSize after gzip decompression (#3550) 2022-12-28 14:42:45 -08:00
Aliaksandr Valialkin
c2fc996e01
lib/promscrape/discovery/azure: typo fix 2022-12-21 21:25:25 -08:00
Aliaksandr Valialkin
9330da3195
lib/promscrape/discovery/consul: expose service tags in individual labels __meta_consul_tag_<tagname>
This simplifies copying service tags to target labels with the following relabeling rule:

- action: labelmap
  regex: __meta_consul_tag_(.+)

See https://stackoverflow.com/questions/44339461/relabeling-in-prometheus
2022-12-19 13:02:56 -08:00
Aliaksandr Valialkin
2e597555ec
lib/promscrape: stop dropping metric name if relabeling rules do not instruct to do this on the /metric-relabel-debug page 2022-12-16 16:44:12 -08:00
Aliaksandr Valialkin
1a88fe5b1f
lib/flagutil/bytes.go: properly handle values bigger than 2GiB on 32-bit architectures
This fixes handling of values bigger than 2GiB for the following command-line flags:

- -storage.minFreeDiskSpaceBytes
- -remoteWrite.maxDiskUsagePerURL
2022-12-14 19:29:57 -08:00
Aliaksandr Valialkin
a521135b7b
lib/promscrape: allow editing relabeling configs and labels at /target-relabel-debug page
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3407
2022-12-10 12:47:47 -08:00
Aliaksandr Valialkin
97b41e727c
lib/promscrape: implement target-level and metric-level relabel debugging
Target-level debugging is performed by clicking the 'debug' link at the corresponding target
on either http://vmagent:8429/targets page or on http://vmagent:8428/service-discovery page.

Metric-level debugging is perfromed at http://vmagent:8429/metric-relabel-debug page.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3407

See https://docs.victoriametrics.com/vmagent.html#relabel-debug
2022-12-10 02:25:56 -08:00
Aliaksandr Valialkin
04abd5e113
docs/CHANGELOG.md: document the bugfix at 05b42601c3
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3247
2022-12-08 18:35:47 -08:00
Zakhar Bessarab
c939a8e8a2
lib/promscrape/discovery/azure: remove API server from URL returned by azure (#3403)
* lib/promscrape/discovery/azure: remove API server from URL returned by azure

* lib/promscrape/discovery/azure: validate nextLink contains same URL as apiServer
2022-12-08 18:35:46 -08:00
Aliaksandr Valialkin
a13f16e48a
lib/promscrape: allow using sample_limit and series_limit options in stream parsing mode
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3458
2022-12-08 17:04:12 -08:00
Aliaksandr Valialkin
6910b1de2e
all: typo fix: the the -> the 2022-12-03 21:53:07 -08:00
Aliaksandr Valialkin
be6da5053f
lib/promscrape: optimize service discovery speed
- Return meta-labels for the discovered targets via promutils.Labels
  instead of map[string]string. This improves the speed of generating
  meta-labels for discovered targets by up to 5x.

- Remove memory allocations in hot paths during ScrapeWork generation.
  The ScrapeWork contains scrape settings for a single discovered target.
  This improves the service discovery speed by up to 2x.
2022-11-29 21:26:23 -08:00
Aliaksandr Valialkin
2524d94fe1
lib/promscrape/discovery: add a benchmark for measuring the performance of creating pod meta-labels 2022-11-29 21:11:03 -08:00
Aliaksandr Valialkin
8ce5b095b7
lib/promscrape: add exported_ prefix to metric names exported by scrape targets if they clash with automatically generated metrics
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3406
2022-11-28 18:37:34 -08:00
Roman Khavronenko
d1169c1559
vmagent: expose metrics for tracking config state (#3375)
Expose `vm_relabel_config_*` and `vm_promscrape_config_*` metrics
for tracking relabel and scrape configuration hot-reloads.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3345
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-22 00:48:12 +02:00
Aliaksandr Valialkin
c33bcae457
lib/promscrape/discovery/gce: do not pass filter arg when discovering zones
The filter arg isn't supported by zones API in GCE.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3202
2022-11-21 22:32:45 +02:00