Aliaksandr Valialkin
c04505e585
lib/promscrape/discovery/kubernetes: reduce memory usage further when big number of scrape jobs are configured for the same kubernetes_sd_config
role
...
Serialize reloading per-role objects, so they don't occupy too much memory when objects for many scrape jobs are simultaneously refreshed.
Do not reload per-role objects if they were already refreshed by concurrent goroutines. This should reduce load on Kubernetes API server
when big number of scrape jobs are configured for the same Kubernetes role.
This is a follow-up for 17b87725ed
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
2021-03-07 20:03:22 +02:00
Aliaksandr Valialkin
5807ff57f3
lib/promscrape/discovery/kubernetes: reduce memory usage when Kubernetes service discovery is configured on a big number of scrape jobs
...
Previously vmagent was creating a separate Kubernetes object cache per each scrape job.
This could result in increased memory usage when monitoring a Kubernetes cluster with big number of objects (pods / nodes / services, etc.)
as seen at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
Now it uses a shared map of scrape objects across multiple scrape jobs.
2021-03-05 17:32:33 +02:00
Aliaksandr Valialkin
92ddb8f197
lib/promscrape/discovery/kubernetes: move apiWatcher code to a separate file
2021-03-05 17:32:32 +02:00
Aliaksandr Valialkin
02c0959380
lib/promscrape: make cluster membership calculations consistent across 32-bit and 64-bit architectures
2021-03-05 09:06:08 +02:00
Aliaksandr Valialkin
133fb9fc00
lib/promscrape: add -promscrape.cluster.replicationFactor
command-line flag for replicating scrape targets among vmagent
instances in the cluster
2021-03-04 10:21:27 +02:00
Aliaksandr Valialkin
bae7a1b47a
lib/promscrape/discovery/kubernetes: fix tests after e154f4a644
2021-03-03 22:42:04 +02:00
Nikolay
7d92ef3acd
Fix ingress discovery api ( #1110 )
2021-03-03 10:45:50 +02:00
Aliaksandr Valialkin
25f453ce1a
lib/promscrape/discovery/kubernetes: properly check for nil pointer inside interface
...
See https://mangatmodi.medium.com/go-check-nil-interface-the-right-way-d142776edef1
This fixes a panic when the ScrapeWork is filtered out in swcFunc.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1108
2021-03-03 10:42:54 +02:00
Aliaksandr Valialkin
fea27b9845
lib/promscrape: go fmt
2021-03-02 21:20:23 +02:00
Aliaksandr Valialkin
3fbe2bf1c8
lib/promscrape: pre-allocate space for a map in mergeLabels
...
This should reduce the number of memory allocations when discovering big number of targets
2021-03-02 18:42:44 +02:00
Aliaksandr Valialkin
ac5c47a9f5
lib/promscrape/discovery: properly track vm_promscrape_discovery_kubernetes_objects_removed_total metric
2021-03-02 18:33:29 +02:00
Aliaksandr Valialkin
f9c1fe3852
lib/promscrape/discovery/kubernetes: cache ScrapeWork objects as soon as the corresponding k8s objects are changed
...
This should reduce CPU usage and memory usage when Kubernetes contains tens of thousands of objects
2021-03-02 16:44:19 +02:00
Aliaksandr Valialkin
f686174329
lib/promscrape/discovery/ec2: follow-up after f6114345de
2021-03-02 13:47:35 +02:00
Nikolay
fd8ca7df50
Adds webIndentity token for aws ( #1099 )
...
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1080
2021-03-02 13:47:34 +02:00
Aliaksandr Valialkin
b89a4fac2f
lib/promscrape/discovery/kubernetes: deflake tests; a follow-up for 05fb08713c
2021-03-01 14:31:44 +02:00
Aliaksandr Valialkin
c3bf72992f
lib/promscrape: explicitly stop and cleanup service discovery routines when new config is read from -promscrape.config
...
This should reduce memory usage when `-promscrape.config` file frequently changes
2021-03-01 14:15:16 +02:00
Aliaksandr Valialkin
8af9370bf2
lib/promscrape: use target arg in ScrapeWork cache
2021-03-01 12:31:11 +02:00
Aliaksandr Valialkin
5328506c3e
lib/promscrape: typo fix, which prevented from caching ScrapeWork entries
2021-03-01 12:13:13 +02:00
Aliaksandr Valialkin
d5058caccc
lib/promscrape: add vm_promscrape_scrapework_cache_* metrics for tracking ScrapeWork cache effectiveness
2021-03-01 12:05:58 +02:00
Aliaksandr Valialkin
109bfaadad
lib/promscrape: reduce CPU usage an memory allocations when constructing scrapeWorkKey
2021-02-28 22:30:36 +02:00
Aliaksandr Valialkin
9a2bf65134
lib/promscrape: add ability to spread scrape targets among multiple vmagent instances
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1084
2021-02-28 18:40:42 +02:00
Aliaksandr Valialkin
3e44d9947e
lib/promscrape/discovery/kubernetes: properly account the number of objects when watcher is stopped
...
A follow-up for b21b110b7a
2021-02-28 17:06:49 +02:00
Aliaksandr Valialkin
0ef7a94056
lib/promscrape/discovery/kubernetes: add vm_promscrape_discovery_kubernetes_*
metrics for monitoring internal state of k8s service discovery
2021-02-28 16:58:45 +02:00
Aliaksandr Valialkin
f52bdbe2a3
lib/promscrape/discovery/kubernetes: remove resourceVersionMatch=NotOlderThan
query arg when watching for k8s object changes, since it cannot be used when watch=1
query arg is passed
2021-02-28 16:08:43 +02:00
Aliaksandr Valialkin
6f8866a40a
lib/promscrape: fix possible deadlock in parallel execution of target relabeling
2021-02-28 16:05:32 +02:00
Aliaksandr Valialkin
5c9e657808
lib/promscrape/discovery/kubernetes: fix deadlock in startWatcherForURL
...
reloadObjects must be called without holding aw.mu lock
2021-02-28 15:25:33 +02:00
Aliaksandr Valialkin
e77f2f8630
lib/promscrape/discovery/kubernetes: typo fix after 241ffd1f3b
2021-02-28 15:15:27 +02:00
Aliaksandr Valialkin
82441537ff
lib/promscrape/discovery/kubernetes: pre-populate labelsByKey in reloadObject()
2021-02-28 15:09:43 +02:00
Aliaksandr Valialkin
e003453941
lib/promscrape/discovery/kubernetes: compare sorted sets of labels in tests
...
This should deflake tests where the order of labels isn't stable
2021-02-28 14:12:32 +02:00
Aliaksandr Valialkin
6a21ef87b7
lib/promscrape: add missing startWatchersForRole()
call at the beginning of apiWatcher.getLabelsForRole
2021-02-28 14:00:00 +02:00
Aliaksandr Valialkin
6d0e7fb8b0
lib/promscrape/discovery/kubernetes: reload k8s resources on every error
...
This is needed for obtaining fresh resourceVersion
2021-02-27 01:46:59 +02:00
Aliaksandr Valialkin
44975e28fe
lib/promscrape: yet another typo fix after ed8441ec52
2021-02-26 23:36:34 +02:00
Aliaksandr Valialkin
50e74c439c
lib/promscrape: typo fix after ed8441ec52
2021-02-26 23:04:40 +02:00
Aliaksandr Valialkin
fa3ce450fb
lib/promscrape: cache ScrapeWork
...
This should reduce the time needed for updating big number of scrape targets.
2021-02-26 21:43:41 +02:00
Aliaksandr Valialkin
efcdf613c2
lib/promscrape/discovery/kubernetes: cache target labels
...
This should reduce CPU usage on repeated SDConfig.GetLabels() calls.
2021-02-26 20:24:29 +02:00
Aliaksandr Valialkin
22822feea3
lib/promscrape/discovery/kubernetes: errcheck fix
2021-02-26 19:09:12 +02:00
Aliaksandr Valialkin
dc8c045378
lib/promscrape: cleanup after 9b2246c29b
...
Main points:
* Revert changes outside lib/promscrape/discovery/kuberntes . These changes can be applied later in a separate commit
* Minimize changes in lib/promscrape/discovery/kubernetes compared to a93e644001
* Corner case fixes.
2021-02-26 19:09:12 +02:00
Nikolay
cf9262b01f
vmagent kubernetes watch stream discovery. ( #1082 )
...
* started work on sd for k8s
* continue work on watch sd
* fixes
* continue work
* continue work on sd k8s
* disable gzip
* fixes typos
* log errror
* minor fix
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-02-26 19:09:12 +02:00
Aliaksandr Valialkin
4cfac70cde
lib/promscrape: remove duplicate code a bit
2021-02-26 15:54:07 +02:00
Aliaksandr Valialkin
55eb983193
lib/promscrape: reduce processing time for big number of discovered targets by processing them in parallel
2021-02-26 12:48:48 +02:00
Aliaksandr Valialkin
197ecca426
lib/promrelabel: add more optimizations for relabeling for common cases
2021-02-22 16:36:54 +02:00
Aliaksandr Valialkin
8b87398333
lib/promscrape: export vm_promscrape_target_relabel_duration_seconds metric
2021-02-21 23:21:35 +02:00
Aliaksandr Valialkin
f09b46b2b1
lib/promscrape: typo fix after the commit f26162ec99
2021-02-19 00:34:18 +02:00
Aliaksandr Valialkin
502d0e2524
lib/promscrape: add scrape_align_interval config option into scrape config
...
This option allows aligning scrapes to a particular intervals.
2021-02-18 23:53:04 +02:00
Aliaksandr Valialkin
fccb481de2
lib/promscrape/discovery/kubernetes: add __meta_kubernetes_endpoints_label_*
and __meta_kuberntes_endpoints_annotation_*
labels to role: endpoints
...
This syncs kubernetes SD with Prometheus 2.25
See 617c56f55a
2021-02-15 02:51:36 +02:00
Aliaksandr Valialkin
18c2075159
lib/promscrape: remove vm_promscrape_scrapes_failed_per_url_total and vm_promscrape_scrapes_skipped_by_sample_limit_per_url_total metrics
...
These metrics may result in big number of time series when vmagent scrapes thousands of targets and these targets constantly changes.
* It is better using `up == 0` query for determining failing targets.
* It is better using the following query for determining targets with exceeded limit on the number of metrics:
scrape_samples_scraped > 0 if up == 0
2021-02-12 05:23:27 +02:00
Nikolay
6fc2a8e544
fixes dockerswarm ( #1053 )
...
fixes improper usage of host network services
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1028
2021-02-04 15:57:41 +02:00
Aliaksandr Valialkin
755b0998ce
lib/promscrape: add vm_promscrape_service_discovery_duration_seconds metric
2021-02-02 16:16:51 +02:00
Aliaksandr Valialkin
4c59dbc127
lib/promscrape: add vm_promscrape_scrape_retries_total
, vm_promscrape_discovery_retries_total
and vm_promscrape_discovery_requests_total
metrics
2021-02-01 20:06:16 +02:00
Aliaksandr Valialkin
ffec5131ae
lib/promscrape: export vm_promscrape_scrapes_failed_per_url_total
and vm_promscrape_scrapes_skipped_by_sample_limit_per_url_total
metrics
...
These metrics could be useful for determining imporperly working scrape targets.
Note that these metrics are exported only for failing scrape targets. They aren't exposed for normally working targets.
2021-01-27 00:40:39 +02:00