Commit Graph

87 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
def014eb75 lib/promscrape/discovery/kubernetes: remove debug lines left after the commit 133b288681 2021-03-12 11:22:33 +02:00
Aliaksandr Valialkin
133b288681 lib/promscrape/discovery/kubernetes: use a single watcher per apiURL
Previously multiple scrape jobs could create multiple watchers for the same apiURL. Now only a single watcher is used.
This should reduce load on Kubernetes API server when many scrape job configs use Kubernetes service discovery.
2021-03-11 16:43:04 +02:00
Aliaksandr Valialkin
bebcb8130c lib/promscrape/discovery/kubernetes: localize Bookmark parsing code
This is a follow-up for e772d1c920
2021-03-11 13:08:08 +02:00
Aliaksandr Valialkin
e772d1c920 lib/promscrape/discovery/kubernetes: reduce load on Kubernetes API server by using watch bookmarks
This allows continuing object watch from the last bookbark instead of reloading all the objects
on watch errors or timeouts.

See https://kubernetes.io/docs/reference/using-api/api-concepts/#watch-bookmarks
2021-03-10 15:06:35 +02:00
Aliaksandr Valialkin
4b18a4f026 lib/promscrape/discovery/kubernetes: remove too verbose logs about starting and stopping the watchers
Log the number of objects loaded per each watch url

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
2021-03-09 15:05:14 +02:00
Aliaksandr Valialkin
ab4f090c63 lib/promscrape/discovery/kubernetes: reduce memory usage further when big number of scrape jobs are configured for the same kubernetes_sd_config role
Serialize reloading per-role objects, so they don't occupy too much memory when objects for many scrape jobs are simultaneously refreshed.
Do not reload per-role objects if they were already refreshed by concurrent goroutines. This should reduce load on Kubernetes API server
when big number of scrape jobs are configured for the same Kubernetes role.

This is a follow-up for 17b87725ed

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
2021-03-07 19:51:03 +02:00
Aliaksandr Valialkin
17b87725ed lib/promscrape/discovery/kubernetes: reduce memory usage when Kubernetes service discovery is configured on a big number of scrape jobs
Previously vmagent was creating a separate Kubernetes object cache per each scrape job.
This could result in increased memory usage when monitoring a Kubernetes cluster with big number of objects (pods / nodes / services, etc.)
as seen at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113

Now it uses a shared map of scrape objects across multiple scrape jobs.
2021-03-05 17:29:55 +02:00
Aliaksandr Valialkin
c9a25e931b lib/promscrape/discovery/kubernetes: move apiWatcher code to a separate file 2021-03-05 12:36:05 +02:00
Aliaksandr Valialkin
9a48c1b53d lib/promscrape/discovery/kubernetes: fix tests after e154f4a644 2021-03-03 22:41:30 +02:00
Nikolay
e154f4a644
Fix ingress discovery api (#1110) 2021-03-03 10:43:39 +02:00
Aliaksandr Valialkin
7906316741 lib/promscrape/discovery/kubernetes: properly check for nil pointer inside interface
See https://mangatmodi.medium.com/go-check-nil-interface-the-right-way-d142776edef1

This fixes a panic when the ScrapeWork is filtered out in swcFunc.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1108
2021-03-03 10:33:17 +02:00
Aliaksandr Valialkin
d1d34664b5 lib/promscrape/discovery: properly track vm_promscrape_discovery_kubernetes_objects_removed_total metric 2021-03-02 18:32:54 +02:00
Aliaksandr Valialkin
6a7ef768ff lib/promscrape/discovery/kubernetes: cache ScrapeWork objects as soon as the corresponding k8s objects are changed
This should reduce CPU usage and memory usage when Kubernetes contains tens of thousands of objects
2021-03-02 16:42:55 +02:00
Aliaksandr Valialkin
22b1941cfc lib/promscrape/discovery/ec2: follow-up after f6114345de 2021-03-02 13:46:26 +02:00
Nikolay
f6114345de
Adds webIndentity token for aws (#1099)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1080
2021-03-02 13:27:09 +02:00
Aliaksandr Valialkin
1a3689af9a lib/promscrape/discovery/kubernetes: deflake tests; a follow-up for 05fb08713c 2021-03-01 14:32:12 +02:00
Aliaksandr Valialkin
62ebf5c88e lib/promscrape: explicitly stop and cleanup service discovery routines when new config is read from -promscrape.config
This should reduce memory usage when `-promscrape.config` file frequently changes
2021-03-01 14:14:00 +02:00
Aliaksandr Valialkin
e02d1ef93c lib/promscrape/discovery/kubernetes: properly account the number of objects when watcher is stopped
A follow-up for b21b110b7a
2021-02-28 17:06:02 +02:00
Aliaksandr Valialkin
b21b110b7a lib/promscrape/discovery/kubernetes: add vm_promscrape_discovery_kubernetes_* metrics for monitoring internal state of k8s service discovery 2021-02-28 16:57:40 +02:00
Aliaksandr Valialkin
c459600346 lib/promscrape/discovery/kubernetes: remove resourceVersionMatch=NotOlderThan query arg when watching for k8s object changes, since it cannot be used when watch=1 query arg is passed 2021-02-28 16:07:14 +02:00
Aliaksandr Valialkin
68a0f5ce12 lib/promscrape/discovery/kubernetes: fix deadlock in startWatcherForURL
reloadObjects must be called without holding aw.mu lock
2021-02-28 15:26:30 +02:00
Aliaksandr Valialkin
b523e0369c lib/promscrape/discovery/kubernetes: typo fix after 241ffd1f3b 2021-02-28 15:12:17 +02:00
Aliaksandr Valialkin
241ffd1f3b lib/promscrape/discovery/kubernetes: pre-populate labelsByKey in reloadObject() 2021-02-28 15:09:49 +02:00
Aliaksandr Valialkin
05fb08713c lib/promscrape/discovery/kubernetes: compare sorted sets of labels in tests
This should deflake tests where the order of labels isn't stable
2021-02-28 14:10:19 +02:00
Aliaksandr Valialkin
af8b7e8391 lib/promscrape: add missing startWatchersForRole() call at the beginning of apiWatcher.getLabelsForRole 2021-02-28 14:00:17 +02:00
Aliaksandr Valialkin
422b31de40 lib/promscrape/discovery/kubernetes: reload k8s resources on every error
This is needed for obtaining fresh resourceVersion
2021-02-27 01:47:27 +02:00
Aliaksandr Valialkin
ed8441ec52 lib/promscrape: cache ScrapeWork
This should reduce the time needed for updating big number of scrape targets.
2021-02-26 21:43:22 +02:00
Aliaksandr Valialkin
815666e6a6 lib/promscrape/discovery/kubernetes: cache target labels
This should reduce CPU usage on repeated SDConfig.GetLabels() calls.
2021-02-26 20:23:28 +02:00
Aliaksandr Valialkin
19712fc2bd lib/promscrape/discovery/kubernetes: errcheck fix 2021-02-26 17:00:08 +02:00
Aliaksandr Valialkin
c8f2f9b2e8 lib/promscrape: cleanup after 9b2246c29b
Main points:

* Revert changes outside lib/promscrape/discovery/kuberntes . These changes can be applied later in a separate commit
* Minimize changes in lib/promscrape/discovery/kubernetes compared to a93e644001
* Corner case fixes.
2021-02-26 16:54:05 +02:00
Nikolay
9b2246c29b
vmagent kubernetes watch stream discovery. (#1082)
* started work on sd for k8s

* continue work on watch sd

* fixes

* continue work

* continue work on sd k8s

* disable gzip

* fixes typos

* log errror

* minor fix

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-02-26 16:46:13 +02:00
Aliaksandr Valialkin
a93e644001 lib/promscrape: remove duplicate code a bit 2021-02-26 16:39:56 +02:00
Aliaksandr Valialkin
38d7e96602 lib/promscrape/discovery/kubernetes: add __meta_kubernetes_endpoints_label_* and __meta_kuberntes_endpoints_annotation_* labels to role: endpoints
This syncs kubernetes SD with Prometheus 2.25
See 617c56f55a
2021-02-15 02:51:16 +02:00
Nikolay
48c8c5093b
fixes dockerswarm (#1053)
fixes improper usage of host network services
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1028
2021-02-04 15:56:42 +02:00
Nikolay
b21e16ad0c
fixes kubernetes_sd (#983)
* fixes kubernetes_sd,
adds missing service metadata for pod ports without endpoint
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/982

* fix test
2020-12-24 11:26:14 +02:00
Aliaksandr Valialkin
820669da69 lib/promscrape: code prettifying for 8dd03ecf19 2020-12-24 10:56:10 +02:00
Nikolay
8dd03ecf19
adds proxy_url support, (#980)
* adds proxy_url support,
adds proxy_url to the dockerswarm, eureka, kubernetes and consul service discovery,
adds proxy_url to the scrape_config for targets scrapping,
http based proxy is supported atm,
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/503

* fixes imports
2020-12-24 10:52:37 +02:00
Aliaksandr Valialkin
5f9d88a3cb lib/promscrape/discovery/consul: reduce load on Consul API server by increasing timeout for blocking requests from 50 seconds to 9 minutes
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574
2020-12-11 17:24:13 +02:00
Aliaksandr Valialkin
fd9fd191b9 lib/promscrape/discovery/consul: properly pass Datacenter filter to Consul API server
Previously it has been passed as `sdc` query arg, while it should be passed as `dc` query arg.
See https://www.consul.io/api-docs/health#list-nodes-for-service for details.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574#issuecomment-740454170
2020-12-08 21:52:42 +02:00
Aliaksandr Valialkin
de0643fab5 lib/promscrape/discovery/consul: log the time needed for stoppig Consul service watcher 2020-12-03 20:14:55 +02:00
Aliaksandr Valialkin
9cd8eb92f1 lib/promscrape/discovery/consul: make sure that block response contains X-Consul-Index header 2020-12-03 20:05:23 +02:00
Aliaksandr Valialkin
5009b25a03 lib/promscrape: code cleanup after c6dee6c52d
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574
2020-12-03 19:50:53 +02:00
Nikolay
c6dee6c52d
Changes consul discovery api (#921)
* adds consul watch api,
it must reduce load on consul service with blocking wait requests,
changed discoveryClient api with fetchResponseMeta callback.

* small fix

* fix after master merge

* adds watch client at discovery utils

* fixes consul watcher,
changes namings,
fixes data race

* small typo fix

* sanity fix

* fix naming and service node update
2020-12-03 19:47:40 +02:00
Aliaksandr Valialkin
aa90b93778 lib/promscrape: expose __meta_ec2_ipv6_addresses label for ec2_sd_config like Prometheus will do in the next release 2020-11-23 16:56:42 +02:00
Aliaksandr Valialkin
de523c81b9 lib/promscrape: add filters option to dockerswarm_sd_config like Prometheus did in v2.23.0 2020-11-23 16:27:40 +02:00
Aliaksandr Valialkin
d7932775cc lib/promscrape/discovery/eureka: follow-up after eec76718e9 2020-11-20 14:00:12 +02:00
Nikolay
eec76718e9
Adds eureka service discovery (#913)
* Adds eureka service discovery
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/851
Netflix service discovery for AWS

* Apply suggestions from code review

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-11-20 13:38:12 +02:00
Vasily
6fcbd17bdd Add omitempty for DisableCompression and DisableKeepAlive fields in ScrapeConfig (#796)
* Add omitempty for DisableCompression and DisableKeepAlive fields in ScrapeConfig

* Add omitempty annotation to all the default/optional values

* Fix annotations after review
2020-11-13 16:19:05 +02:00
Aliaksandr Valialkin
9e83335ca9 lib/promscrape/discovery/kubernetes: go fmt 2020-11-07 13:03:49 +02:00
Aliaksandr Valialkin
5407eed2f6 lib/promscrape/discovery/kubernetes: reduce memory usage for labels when discovering big number of scrape targets by using string concatenation instead of fmt.Sprintf
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825
2020-11-07 13:03:08 +02:00