jelmd
2fe045e2a4
new feature: debug relabeling ( #1344 )
...
* new feature: relabel logging
Use scrape_configs[x].relabel_debug = true to log metric names inkl.
labels before and after relabeling. After relabeling related metrics
get dropped, i.e. not submitted to servers.
* vminsert wants relabel logging, too.
2021-06-04 17:50:23 +03:00
Hason Chan
6f19bb23a1
fix eureka_sd_configs HTTPClientConfig incorrect parsing ( #1350 )
2021-06-04 11:47:17 +03:00
Aliaksandr Valialkin
2d8bd41f8a
lib/storage: reduce memory allocations when syncing dateMetricIDCache
2021-06-03 16:20:42 +03:00
Nikolay
ddc8022702
fixes solaris build ( #1345 )
2021-05-31 09:21:23 +03:00
Aliaksandr Valialkin
a52a20659a
lib/promscrape: fix tests after f0c21b6300
2021-05-28 01:32:50 +03:00
Aliaksandr Valialkin
d088923aef
Revert "lib/mergeset: remove a pool for inmemoryBlock structs"
...
This reverts commit 793fe39921
.
Reason to revert: production testing revealed possible slowdown when registering big number of new time series
2021-05-28 01:09:32 +03:00
Aliaksandr Valialkin
793fe39921
lib/mergeset: remove a pool for inmemoryBlock structs
...
The pool for inmemoryBlock struct doesn't give any performance gains in production workloads,
while it may result in excess memory usage for inmemoryBlock structs inside the pool during
background merge of indexdb.
2021-05-27 21:57:33 +03:00
Aliaksandr Valialkin
60341722d5
docs: document f0c21b6300
2021-05-27 15:03:30 +03:00
faceair
f0c21b6300
lib/promscrape: apply body size & sample limit to stream parse ( #1331 )
...
* lib/promscrape: apply body size limit to stream parse
Signed-off-by: faceair <git@faceair.me>
* lib/promscrape: apply sample limit to stream parse
Signed-off-by: faceair <git@faceair.me>
2021-05-27 14:52:44 +03:00
Aliaksandr Valialkin
2233d6ed8a
lib/uint64set: store pointers to bucket16 instead of bucket16 objects in bucket32
...
This speeds up bucket32.addBucketAtPos() when bucket32.buckets contains big number of items,
since the copying of bucket16 pointers is much faster than the copying of bucket16 objects.
This is a cpu profile for copying bucket16 objects:
10ms 13.43s (flat, cum) 32.01% of Total
10ms 120ms 650: b.b16his = append(b.b16his[:pos+1], b.b16his[pos:]...)
. . 651: b.b16his[pos] = hi
. 13.31s 652: b.buckets = append(b.buckets[:pos+1], b.buckets[pos:]...)
. . 653: b16 := &b.buckets[pos]
. . 654: *b16 = bucket16{}
. . 655: return b16
. . 656:}
This is a cpu profile for copying pointers to bucket16:
10ms 1.14s (flat, cum) 2.19% of Total
. 100ms 647: b.b16his = append(b.b16his[:pos+1], b.b16his[pos:]...)
. . 648: b.b16his[pos] = hi
10ms 700ms 649: b.buckets = append(b.buckets[:pos+1], b.buckets[pos:]...)
. 330ms 650: b16 := &bucket16{}
. . 651: b.buckets[pos] = b16
. . 652: return b16
. . 653:}
2021-05-25 14:20:52 +03:00
Aliaksandr Valialkin
39ef1e7a51
lib/storage: do not stop data ingestion on the first error in Storage.AddRows
...
Continue data ingestion for the rest of blocks.
2021-05-24 15:32:47 +03:00
Aliaksandr Valialkin
4b01c9fb2e
lib/storage: limit the number of rows per each block in Storage.AddRows()
...
This should reduce memory usage when ingesting big blocks or rows.
2021-05-24 15:24:07 +03:00
Aliaksandr Valialkin
a4ff4b8e65
lib/storage: allow filling all the rows up to their capacity in rawRowsShard.addRows
...
This should reduce memory usage a bit on data ingestion path
2021-05-24 15:22:59 +03:00
Aliaksandr Valialkin
a46551245c
lib/bloomfilter: fix TestLimiterConcurrent
2021-05-24 05:17:36 +03:00
Aliaksandr Valialkin
93d81b486d
lib/fs: do not pass done callback to tryRemoveAll() func
...
This improves code readability a bit.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1313
2021-05-24 04:51:57 +03:00
Aliaksandr Valialkin
f54133b200
lib/storage: do not populate MetricID->MetricName cache during data ingestion
...
This cache isn't needed during data ingestion, so there is no need in spending RAM on it.
This reduces RAM usage on data ingestion path by 30%
2021-05-24 03:02:46 +03:00
Aliaksandr Valialkin
ec79abc382
lib/{mergeset,storage}: reduce the number of IFNO log messages like merged ... items across ... blocks in ... seconds
...
Log these messages if the merge takes more than 30 seconds instead of 10 seconds.
2021-05-23 14:03:21 +03:00
Aliaksandr Valialkin
78dddfb98f
lib/promauth: follow-up after 5b8176c68e
2021-05-22 18:01:11 +03:00
Nikolay
5b8176c68e
basic OAuth2 support for remoteWrite and scrape targets ( #1316 )
...
* adds OAuth2 support for remoteWrite and scrapping
* adds tests
changes init
2021-05-22 16:20:18 +03:00
Aliaksandr Valialkin
e05dd475f0
lib/fs: concurrently remove up to 1024 blocked NFS directories
...
Previously the blocked directories were removed sequentially by a single goroutine.
This can be not enough for highly loaded VictoriaMetrics that accepts millions of sample per second,
when big number of LSM parts are created and removed at high rate.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1313
2021-05-21 17:57:46 +03:00
Aliaksandr Valialkin
8e2985b53d
lib/fs: wait for a while before giving up on NFS file removal if the removal queue is full
...
This should reduce the probability of the panic on a highly loaded VictoriaMetrics
accepting millions of samples per second.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1313
2021-05-21 17:21:00 +03:00
Aliaksandr Valialkin
c54bb73867
all: do not skip SIGHUP signal during service initialization
...
This can lead to stale or incomplete configs like in the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-21 16:34:06 +03:00
Aliaksandr Valialkin
4c7bb75fa2
Makefile: update golangci-lint from v1.29.0 to v1.40.1
2021-05-20 18:27:10 +03:00
Aliaksandr Valialkin
e394ff6466
app/vmagent/remotewrite: expose metrics with the current number of active series per day and per hour
...
These numbers are exposed via the following metrics:
- vmagent_hourly_series_limit_current_series
- vmagent_daily_series_limit_current_series
Expose also the limits via the following metrics:
- vmagent_hourly_series_limit_max_series
- vmagent_daily_series_limit_max_series
2021-05-20 15:28:09 +03:00
Aliaksandr Valialkin
ad73f226ff
app/vmstorage: add ability to limit series cardinality via -storage.maxHourlySeries
and -storage.maxDailySeries
command-line flags
2021-05-20 14:15:19 +03:00
Aliaksandr Valialkin
7e526effaa
app/vmagent: add ability to limit series cardinality on a per-hour and per-day basis
2021-05-20 13:13:40 +03:00
Aliaksandr Valialkin
22585531ad
lib/promscrape/discovery/kubernetes: make golangci-lint
happy by removing empty branches
2021-05-20 12:00:29 +03:00
Aliaksandr Valialkin
009e136d88
lib/storage: remove possible data race when logging dropped labels
2021-05-20 02:47:22 +03:00
Aliaksandr Valialkin
eb8093ca6b
lib/promscrape/discovery/kubernetes: reload objects on object parse error
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-18 23:25:48 +03:00
Aliaksandr Valialkin
f4719889da
lib/httpserver: typo fix in -http.shutdownDelay
command-line flag description: servier -> server
2021-05-18 16:26:16 +03:00
Aliaksandr Valialkin
0cf1fe19e6
lib/promscrape/discovery/kubernetes: simplify the reload logic for urlWatcher.objectsByKey
2021-05-18 15:40:46 +03:00
Aliaksandr Valialkin
bfb61de606
lib/promscrape/discovery/kubernetes: properly update vm_promscrape_discovery_kubernetes_scrape_works metric
...
Previously it wasn't descreased during config update.
2021-05-18 15:33:45 +03:00
Aliaksandr Valialkin
3166994244
lib/promscrape/discovery/kubernetes: log errors and stop service discovery when unexpected updates are received from Kubernetes API server
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-18 15:10:48 +03:00
Aliaksandr Valialkin
6c944b86d8
docs: dealay -> delay
...
Thanks to @jelmd . See 0b7e3510c8 (r50884991)
2021-05-18 01:07:52 +03:00
Aliaksandr Valialkin
ec6a284978
lib/promrelabel: add tests for conditional removal of label on another label match
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1294
2021-05-18 00:23:03 +03:00
Aliaksandr Valialkin
2eed410466
lib/promscrape/discovery/kubernetes: key ScrapeWork objects by urlWatcher instead of namespace
...
This makes the code less fragile if urlWatcher would depend on additional to namepsace properties.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-17 23:47:08 +03:00
Aliaksandr Valialkin
733706e6c6
lib/promscrape: reload auth tokens from files every second
...
Previously auth tokens were loaded at startup and couldn't be updated without vmagent restart.
Now there is no need in vmagent restart.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1297
2021-05-14 20:00:08 +03:00
Aliaksandr Valialkin
10a47af631
app/{vmalert,vmauth}: explicitly set MaxIdleConnsPerHost in net/http.Client.Transport
...
By default MaxIdleConnsPerHost is set to 2. This limits the possibility to re-use http keep-alive connections.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1300
2021-05-14 18:12:24 +03:00
Aliaksandr Valialkin
fc3519fa26
lib/promscrape: limit scrape_timeout
by scrape_interval
like Prometheus does
2021-05-13 16:09:45 +03:00
匠心零度
b4f5be8bd8
fix vagent imbalance problem ( #1292 )
...
/path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=0 -promscrape.config=/path/to/config.yml ...
/path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=1 -promscrape.config=/path/to/config.yml ...
/path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=2 -promscrape.config=/path/to/config.yml ...
Co-authored-by: lirenzuo <lirenzuo@shein.com>
2021-05-13 11:14:51 +03:00
Aliaksandr Valialkin
e6fda03e8f
vendor: update github.com/VictoriaMetrics/fasthttp from v1.0.14 to v1.0.15
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1289
2021-05-13 10:44:14 +03:00
Aliaksandr Valialkin
f6a641de62
lib/promscrape: exponentially increase retry interval on unsuccesful requests to scrape targets or to service discovery services
...
This should reduce CPU load at vmagent and at remote side when the remote side doesn't accept HTTP requests.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1289
2021-05-13 10:38:50 +03:00
Aliaksandr Valialkin
c0ec541559
lib/cgroup: document the ability to detect cgroup v2 memory and cpu limits. This is follow-up for b50024812e
2021-05-13 09:26:20 +03:00
Aliaksandr Valialkin
d7be2753c0
lib/storage: substitute GetTSDBStatusForDate with GetTSDBStatusWithFiltersForDate with nil tfss
2021-05-13 09:02:33 +03:00
Nikolay
b50024812e
adds cgroupsv2 support ( #1283 )
...
* adds cgroupv2 limits support
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1269
* small fix
* changes Atoi to ParseUint
2021-05-13 09:02:13 +03:00
Aliaksandr Valialkin
a22a17dc66
lib/storage: merge getTSDBStatusForDate with getTSDBStatusWithFiltersForDate
...
These functions are non-trivial, while their code has minimal differences.
It is better from maintainability PoV to merge these functions into a single function.
2021-05-12 17:56:53 +03:00
Aliaksandr Valialkin
832651c6c2
app/vmselect: follow up after 8a0678678b
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1168
2021-05-12 17:18:30 +03:00
Nikolay
8a0678678b
Adds tsdb match filters ( #1282 )
...
* init work on filters
* init propose for status filters
* fixes tsdb status
adds test
* fix bug
* removes checks from test
2021-05-12 15:18:45 +03:00
Aliaksandr Valialkin
cfd6aa28e1
lib/promscrape/discovery/kubernetes: refresh endpoints and endpointslices scrape targets every 5 seconds, since they may depend on changed service and pod objects
...
This should make endpoints and endpointslices scrape targets eventually consistent with the maximum delay of 5 seconds after the related service or pod object changes.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-12 14:10:34 +03:00
Aliaksandr Valialkin
afec68ad13
lib/httpserver: add new X-Server-Hostname header instead of overwriting already exsiting header
...
This makes possible tracking origins of chained requests over multiple hops.
2021-05-11 23:48:59 +03:00