Ween
b42cf33c4d
Fix Auto metrics relabeled errors ( #593 )
...
* Fix Auto metrics relabeled errors
* Finalize auto-genenated Labels
* Fix Test Errors
Co-authored-by: xinyulong <xinyulong@kuaishou.com>
2020-06-29 22:39:39 +03:00
Aliaksandr Valialkin
de7e585ac8
lib/promscrape: preserve the previously discovered targets on discovery errors per each job_name
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/582
2020-06-23 15:42:46 +03:00
Aliaksandr Valialkin
a80e852aab
lib/promscrape: retry performing the request to the server for up to 3 times before giving up when it closes keep-alive connections
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/580
2020-06-23 12:34:12 +03:00
Aliaksandr Valialkin
50aa34bcbe
lib/promscrape/discovery/consul: reduce load on Consul when discovering big number of targets by using background caching
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574
2020-06-20 18:20:07 +03:00
Aliaksandr Valialkin
62e1908986
lib/promscrape: reduce default value for -promscrape.discovery.concurrency
from 500 to 100
...
This should reduce load on Kubernetes API server and Consul when big number of targets are discovered
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574
2020-06-20 17:53:48 +03:00
Aliaksandr Valialkin
1f2826bae2
lib/promscrape/discovery/ec2: expose __meta_ec2_ami
like the next Prometheus release will do
...
See b5d61fb66c
for details
2020-06-20 17:45:30 +03:00
Aliaksandr Valialkin
2e5b6220a4
lib/promrelabel: allows regex capture groups in target_label like Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/569
2020-06-19 02:20:58 +03:00
Aliaksandr Valialkin
b747362936
lib/promscrape: mention about -promscrape.maxScrapeSize in the error message when target returns too big response
2020-05-24 14:41:24 +03:00
Aliaksandr Valialkin
b59e089ac7
app/vmagent: add -dryRun
option for checking all the configs mentioned in command-line flags without running vmagent
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/362
2020-05-21 15:23:18 +03:00
Aliaksandr Valialkin
482bae8466
lib/promscrape: add -promscrape.config.dryRun
flag for checking -promscrape.config
for errors or unsupported options
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/508
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/362
2020-05-21 14:54:32 +03:00
Aliaksandr Valialkin
73ec5cf460
lib/promscrape: add -promscrape.discovery.concurrency
and -promscrape.discovery.concurrentWaitTime
flags for tuning the number of concurrent requests to autodiscovery API servers at Consul or Kubernetes
2020-05-19 17:35:59 +03:00
Aliaksandr Valialkin
3845420a8f
lib: extract common code for returning fast unix timestamp into lib/fasttime
2020-05-14 23:06:50 +03:00
Aliaksandr Valialkin
d54a93fc81
app/vmagent: fix scraping mTLS targets, which has been broken in v1.35.1
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/470
2020-05-12 17:23:43 +03:00
Aliaksandr Valialkin
405cf44aed
app/vmagent,lib/promscrape: do not set HostClient.DialDualStack, since it isnt used if HostClient.Dial is set
2020-05-12 15:24:53 +03:00
Aliaksandr Valialkin
9f39e618ed
lib/promscrape/discovery/gce: discover per-zone instances for gce_sd_config
in parallel. This should reduce discovery latency
2020-05-06 15:00:23 +03:00
Aliaksandr Valialkin
8ab5e47b5c
lib/promscrape: add Prometheus-compatible DNS-based service discovery aka dns_sd_configs
2020-05-06 00:02:41 +03:00
Aliaksandr Valialkin
42d563934b
lib/promscrape: properly connect to TCP6 addresses if -enableTCP6
is set
2020-05-06 00:02:40 +03:00
Aliaksandr Valialkin
1c8e97c8a0
lib/procutil: add NewSighupChan function, which returns a channel, which is triggered on every SIGHUP
2020-05-05 10:56:15 +03:00
Aliaksandr Valialkin
054457d1f4
lib/promscrape: allow explicitly setting empty token via token: ""
in consul_sd_config
2020-05-05 07:49:54 +03:00
Aliaksandr Valialkin
89aa6dbf56
lib/promscrape: add Prometheus-compatible service discovery for Consul aka consul_sd_configs
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/330
2020-05-04 20:53:06 +03:00
Aliaksandr Valialkin
ed91fe1d9b
lib/promscrape: move common code for discovery api config map handling into discoveryutils
2020-05-04 20:52:58 +03:00
Aliaksandr Valialkin
c50fd219dc
lib/promscrape/discovery/kubernetes/: unify apiConfig creation
2020-05-04 20:52:53 +03:00
Aliaksandr Valialkin
a5880f17af
lib/promscrape: remove debug line left after the commit e4aac6ea40
2020-05-03 17:16:19 +03:00
Aliaksandr Valialkin
1f0e8fdc0d
lib/promscrape: fix tests after the commit 658a8742ac
...
The original commit copies `__address__` label to `instance` label when generating per-target labels as Prometheus does.
See https://www.robustperception.io/life-of-a-label for details.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/453
2020-05-03 16:59:29 +03:00
DexterZhang
317688f144
fix(vmagent): different behavior as how prometheus deal with labels. [Issue#453] ( #454 )
2020-05-03 16:59:28 +03:00
Aliaksandr Valialkin
ab1e6a76bb
lib/promscrape: make consistent scrape time offsets across reloads for the same ScrapeURL and Labels
...
This should make consistent intervals between data points for scrape targets across reloads.
Previously these intervals were random.
2020-05-03 14:31:22 +03:00
Aliaksandr Valialkin
f25416984b
lib/promscrape: fix TestGetFileSDScrapeWorkSuccess after 3b234d82e5
2020-05-03 14:31:20 +03:00
Aliaksandr Valialkin
f422203e10
lib/promscrape: reload only modified scrapers on config changes
...
This should improve scrape stability when big number of targets are scraped and these targets are frequently changed.
Thanks to @xbsura for the idea and initial implementation attempts at the following pull requests:
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/449
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/458
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/459
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/460
2020-05-03 12:47:16 +03:00
Aliaksandr Valialkin
de5f923476
lib/promscrape: set 30 seconds timeout for discovery api requests
...
Previously such requests could hang for long time. This could make debugging harder.
2020-04-29 17:29:03 +03:00
Aliaksandr Valialkin
b6d88bac04
vendor: use github.com/VictoriaMetrics/fasthttp instead of github.com/fasthttp/fasthttp
...
The upstream fasthttp may contain issues like 996610f021
,
plus a code that isn't used by VictoriaMetrics. So let's use a private copy under our control instead.
2020-04-29 16:43:09 +03:00
Aliaksandr Valialkin
53740d0026
lib/promscrape: handle connection reset when targets responds with http redirect
2020-04-28 02:14:32 +03:00
肖贝贝
3e6f29f462
fix: vmagent not follow 301/302 redirect bug ( #445 )
...
Co-authored-by: xiaobeibei <xiaobeibei@bigo.sg>
2020-04-28 02:14:31 +03:00
Aliaksandr Valialkin
424068f804
lib/promscrape: handle connection reset when targets responds with http redirect
2020-04-28 02:14:26 +03:00
肖贝贝
7d045bf2ca
fix: vmagent not follow 301/302 redirect bug ( #445 )
...
Co-authored-by: xiaobeibei <xiaobeibei@bigo.sg>
2020-04-28 02:14:25 +03:00
Aliaksandr Valialkin
a603a15757
lib/promscrape/discovery/gce: make golangci-lint happy
2020-04-27 19:29:42 +03:00
Aliaksandr Valialkin
86a1d9cb0c
lib/promscrape: add initial support for Prometheus-compatible service discovery for Amazon EC2 aka ec2_sd_configs
2020-04-27 19:29:22 +03:00
Aliaksandr Valialkin
1acb6eb25a
lib/promscrape/discovery/gce: properly set filter
query arg in api url
2020-04-27 16:01:53 +03:00
Aliaksandr Valialkin
0daa37fa02
lib/promscrape/discovery/gce: allow empty project and zone for gce_sd_config
2020-04-27 11:45:45 +03:00
Aliaksandr Valialkin
31861c5b8e
lib/promscrape/discovery/gce: allow empty zone
arg in gce_sd_config
- in this case zones for the given project are automatically discovered
2020-04-26 14:37:38 +03:00
Aliaksandr Valialkin
7c74efd640
lib/promscrape/discovery/gce: make golint happy by ignoring resp.Body.Close() result
2020-04-24 18:13:26 +03:00
Aliaksandr Valialkin
069690e3bd
lib/promscrape: initial implementation for gce_sd_configs
aga Prometheus-compatible service discovery for Google Compute Engine
2020-04-24 17:53:43 +03:00
Aliaksandr Valialkin
de991551f5
lib/promscrape: query /api/v1/namespaces/*
for the configured namespaces in kubernetes_sd_config
...
This should fix authroization issues described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/432
2020-04-24 14:42:02 +03:00
Aliaksandr Valialkin
387a21c96d
lib/promscrape: add -promscrape.configCheckInterval
command-line flag for automating config checking
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/431
2020-04-23 23:41:26 +03:00
Aliaksandr Valialkin
83e4c8427e
lib/promscrape: access Config entries by reference, so they can be compared by addresses
2020-04-23 14:38:29 +03:00
Aliaksandr Valialkin
e220f3eeb6
lib/promscrape: move KubernetesSDConfig to lib/promscrape/discovery/kubernetes
2020-04-23 11:34:30 +03:00
Aliaksandr Valialkin
1187494c8f
lib/promscrape/discovery/kubernetes: hide role switch logic behind GetLabels function
2020-04-22 22:16:18 +03:00
Aliaksandr Valialkin
81481abaa9
lib/promscrape/discovery/kubernetes: reuse a client for empty api_server
inside different jobs
2020-04-20 17:07:37 +03:00
Aliaksandr Valialkin
6764efde39
lib/promscrape/discovery/kubernetes: update stale comments
2020-04-17 14:06:26 +03:00
Aliaksandr Valialkin
d86640d609
lib/promscrape: suppress scrape errors if -promscrape.suppressScrapeErrors
flag is set
2020-04-16 23:41:52 +03:00
Aliaksandr Valialkin
70104f3fb1
lib/promscrape: print all the labels for the target on error message for failed scrape
...
This should improve debuggability.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/420
2020-04-16 23:35:10 +03:00
Aliaksandr Valialkin
266bbec52d
lib/promscrape: retry target scraping when the target closes previously established keep-alive connection to it
...
This should fix the following error:
the server closed connection before returning the first response byte. Make sure the server returns 'Connection: close' response header before closing the connection
2020-04-16 23:25:34 +03:00
Aliaksandr Valialkin
99f0cb1f5f
lib/promscrape: code cleanup in runScraper func
2020-04-15 11:36:35 +03:00
Aliaksandr Valialkin
6ec582acb9
lib/promscrape: show information on improperly configured scrape targets at the bottom of /targets
page
...
This is a common error whith improperly configured target autodiscovery and/or relabeling.
This error leads to duplicate scraping of the same targets with the same set of labels, which leads
to duplicate samples in time series.
2020-04-14 14:55:13 +03:00
Aliaksandr Valialkin
391fb0903e
lib/promscrape/discovery/kubernetes: remove only unused client for API server during cleaning
2020-04-14 14:19:26 +03:00
Aliaksandr Valialkin
636e1578de
lib/promscrape: add promrelabel.GetLabelValueByName helper function
2020-04-14 14:12:15 +03:00
Aliaksandr Valialkin
3945bf9dec
lib/promscrape: mention job name in error messages when target cannot be scraped
...
This should improve debuggability
2020-04-14 13:33:18 +03:00
Aliaksandr Valialkin
66da177fe9
lib/promscrape: reset ScrapeWork.ID in tests
2020-04-14 13:31:37 +03:00
Aliaksandr Valialkin
88366cad15
lib/promscrape: properly expose statuses for targets with duplicate scrape urls at /targets
page
...
Previously targets with duplicate scrape urls were merged into a single line on the page.
Now each target with duplicate scrape url is displayed on a separate line.
2020-04-14 13:10:06 +03:00
Aliaksandr Valialkin
09f796e2ab
lib/promscrape: remove labels starting with __meta_
after applying relabel_configs
as Prometheus does
...
This should reduce CPU load during scraping when target discovery generates
big number of `__meta_*` labels (for instance, k8s discovery).
See https://www.robustperception.io/life-of-a-label for details.
2020-04-14 12:23:30 +03:00
Aliaksandr Valialkin
f58d15f27c
lib/promscrape: rename 'scrape_config->scrape_limit' to 'scrape_config->sample_limit'
...
`scrape_config` block from Prometheus config contains `sample_limit` field,
while in `vmagent` this field was mistakenly named as `scrape_limit`.
2020-04-14 12:00:03 +03:00
Aliaksandr Valialkin
7c4fb038e3
lib/promscrape: add initial support for kubernetes_sd_config
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/334
2020-04-13 21:03:53 +03:00
Aliaksandr Valialkin
4017163393
lib/promscrape: add -promscrape.config.strictParse
flag for detecting errors in -promscrape.config
file
2020-04-13 13:15:52 +03:00
Aliaksandr Valialkin
7fbfef2aee
lib/promscrape: extract common auth code to lib/promauth
2020-04-13 12:59:22 +03:00
Aliaksandr Valialkin
c189104be7
lib/promscrape: reduce timestamp jitter when scraping targets
...
This should improve compression for timestamps
2020-04-01 16:13:01 +03:00
Aliaksandr Valialkin
bf1869d33d
lib/promscrape: allow overriding external_labels as Prometheus does
...
Prometheus docs at https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config say:
> In communication with external systems, they are always applied only
> when a time series does not have a given label yet and are ignored otherwise.
Though this may result in consistency chaos when scrape targets override `external_labels`,
let's stick with Prometheus behavior for the sake of backwards compatibility.
There is last resort in vmagent with `-remoteWrite.label`, which consistently
sets the configured labels to all the metrics before sending them to remote storage.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/366
2020-03-12 20:25:27 +02:00
Aliaksandr Valialkin
375d5483fa
lib/promscrape: remove possible races when registering and de-registering scrape workers for /targets
page
2020-03-11 16:30:43 +02:00
Aliaksandr Valialkin
187fd89c70
lib/promscrape: consistently update /targets
page after SIGHUP
2020-03-11 03:20:38 +02:00
Aliaksandr Valialkin
7545784a49
all: fix golangci-lint
issues
2020-03-10 19:40:03 +02:00
Aliaksandr Valialkin
d39dd8aa69
lib/promscrape: do not retry idempotent requests when scraping targets
...
This should prevent from the following unexpected side-effects of idempotent request retries:
- increased actual timeout when scraping the target comparing to the configured scrape_timeout
- increased load on the target
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/357
2020-03-09 13:31:20 +02:00
Aliaksandr Valialkin
12789a4621
app/vmagent: do not allow non-supported fields in -remoteWrite.relabelConfig
and file_sd_configs
...
This should reduce possible confusion like in the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/363
2020-03-06 20:19:34 +02:00
Aliaksandr Valialkin
31a76a7b3a
lib/promscrape: consistency renaming: stopCh -> globalStopCh
2020-03-03 20:08:22 +02:00
Aliaksandr Valialkin
e6a481ab11
lib/promscrape: properly reload new configs on SIGHUP
2020-02-26 13:54:24 +02:00
Aliaksandr Valialkin
fa6815712f
lib/promscrape: go fmt
2020-02-26 13:24:40 +02:00
Aliaksandr Valialkin
f2a6948a14
lib/promscrape: do not add missing port to __address__ label in order to be consistent with Prometheus behavior
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/331
2020-02-25 20:50:18 +02:00
Aliaksandr Valialkin
7ee7614e90
app/vmagent: initial implementation for vmagent
2020-02-23 17:31:54 +02:00