VictoriaMetrics/lib/promscrape
Hui Wang 49fa92c1d0
lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice (#5557)
* lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice

Previously, the groupWatcher could be mistakenly stopped when requests for pod or service resources took too long.

* remove misleading comment

* docs/sd_configs.md: mention the -promscrape.kubernetes.attachNodeMetadataAll flag in the description of the attach_metadata section

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4640

* wip

* lib/promscrape/kubernetes: prevent groupWatcher from being stopped while there are in-flight apiWatcher.mustStart() calls

A groupWatcher is stopped if it has had zero registered apiWatchers for 14 seconds.
But such a groupWatcher can still be in use if an apiWatcher for `role: endpoints` or `role: endpointslice`
is being registered and the discovery of the associated `pod` and/or `service` objects takes longer
than 14 seconds. See the beginning of the groupWatcher.startWatchersForRole() function for details.

Track the number of in-flight calls to apiWatcher.mustStart() and do not stop the associated groupWatcher
while that number is non-zero.
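
The idea can be illustrated with a minimal Go sketch. It is not the code from this commit: the struct fields, the `canStop()` helper and the `discover` callback are hypothetical names chosen for illustration, and only `groupWatcher` and `mustStart` are borrowed from the description above. The sketch assumes a single mutex guards both counters.

```go
package main

import (
	"fmt"
	"sync"
)

// groupWatcher is a simplified, hypothetical stand-in for the real type.
type groupWatcher struct {
	mu             sync.Mutex
	apiWatchers    int // number of registered apiWatchers
	inFlightStarts int // number of mustStart() calls currently in progress
}

// mustStart registers one apiWatcher. The discover callback stands in for
// the possibly slow discovery of the associated pod and/or service objects.
func (gw *groupWatcher) mustStart(discover func()) {
	gw.mu.Lock()
	gw.inFlightStarts++
	gw.mu.Unlock()

	discover() // may take longer than the idle timeout

	gw.mu.Lock()
	gw.inFlightStarts--
	gw.apiWatchers++
	gw.mu.Unlock()
}

// canStop reports whether an idle groupWatcher may be stopped: it must have
// no registered apiWatchers and no in-flight mustStart() calls.
func (gw *groupWatcher) canStop() bool {
	gw.mu.Lock()
	defer gw.mu.Unlock()
	return gw.apiWatchers == 0 && gw.inFlightStarts == 0
}

func main() {
	gw := &groupWatcher{}
	fmt.Println("idle, may stop:", gw.canStop())

	started := make(chan struct{})
	release := make(chan struct{})
	done := make(chan struct{})
	go func() {
		defer close(done)
		gw.mustStart(func() {
			close(started) // signal that the start is now in flight
			<-release      // simulate slow discovery
		})
	}()

	<-started
	fmt.Println("start in flight, may stop:", gw.canStop())
	close(release)
	<-done
	fmt.Println("registered, may stop:", gw.canStop())
}
```

In this sketch the periodic idle check would call `canStop()` before stopping the watcher, so a slow registration keeps the groupWatcher alive even though no apiWatcher is registered yet.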

P.S. Postponing the discovery of `pod` and/or `service` objects associated with `endpoints` or `endpointslice` roles
isn't the best solution, since it slows down the initial discovery of `endpoints` and `endpointslice` targets.

* typo fix

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-01-22 01:33:17 +02:00
..
discovery lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice (#5557) 2024-01-22 01:33:17 +02:00
discoveryutils lib/promauth: follow-up for e16d3f5639 2023-10-26 09:55:47 +02:00
testdata lib/promscrape: disable support for service discovery and metrics scrape via http2 2023-07-06 16:04:31 -07:00
client.go lib/promauth: follow-up for e16d3f5639 2023-10-26 09:55:47 +02:00
config_test.go lib/promscrape: show -promscrape.cluster.memberNum values for vmagent instances, which scrape the given dropped target at /service-discovery page 2023-12-07 00:11:30 +02:00
config_timing_test.go lib/promscrape: optimize service discovery speed 2022-11-29 21:26:23 -08:00
config.go all: allow dynamically reading *AuthKey flag values from files and urls 2024-01-22 01:23:23 +02:00
relabel_debug.go app/vmselect: small cleanup after 4f3f9950d0 2023-05-09 22:45:02 -07:00
scraper.go lib/promscrape/discovery/hetzner: follow-up after 03a97dc678 2024-01-22 00:53:23 +02:00
scrapework_test.go lib/promscrape: follow-up for 97373b7786 2023-12-06 17:36:48 +02:00
scrapework_timing_test.go lib/promscrape: add exported_ prefix to metric names exported by scrape targets if they clash with automatically generated metrics 2022-11-28 18:37:34 -08:00
scrapework.go lib/promscrape: code cleanup: send stale markers immediately after generating automatic metrics 2024-01-22 01:12:56 +02:00
statconn_test.go lib/promscrape: do not add a suggestion for enabling TCP6 in error message when the dial address is TCPv4 2023-10-27 14:06:49 +02:00
statconn.go lib/promscrape: do not add a suggestion for enabling TCP6 in error message when the dial address is TCPv4 2023-10-27 14:06:49 +02:00
targetstatus.go lib/promscrape: add a warning when the /service-discovery page contains incomplete list of dropped targets 2023-12-08 19:04:29 +02:00
targetstatus.qtpl lib/promscrape: cosmetic changes after e373bb84d5 2023-12-12 13:45:34 +01:00
targetstatus.qtpl.go lib/promscrape: cosmetic changes after e373bb84d5 2023-12-12 13:45:34 +01:00