Commit Graph

1165 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
97d1ccfc8e lib/promscrape: split docker and dockerswarm service discovery code bases, since they have very little in common
This is a follow up after c85a5b7fcb
2021-06-25 13:22:16 +03:00
Aliaksandr Valialkin
4461e20e7d lib/promscrape: consistently sort service discovery routines
This should simplify further maintenance of the code
2021-06-25 13:22:16 +03:00
Lu Jiajing
12b4cbb68f Support Docker ServiceDiscovery (#1402)
* add docker discovery

* add test

* add labels test and add scrape work

* remove TODO

* refactor to merge apiConfig and sdConfig

* apply suggestion
2021-06-25 13:22:16 +03:00
Nikolay
501429c3ff adds missing MustStop call to do and http sd (#1404) 2021-06-25 11:43:32 +03:00
Aliaksandr Valialkin
b84aea1e6e lib/protoparser/clusternative: do not pool unmarshalWork structs, since they can occupy big amounts of memory (more than 100MB per each struct)
This should reduce memory usage for vmstorage under high ingestion rate when the vmstorage runs on a system with big number of CPU cores
2021-06-23 15:45:08 +03:00
Aliaksandr Valialkin
a22f37599b lib/storage: tune tag filters search logic
Tune the logic according to the logs provided at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338#issuecomment-864293624

The previous logic had a race when multiple concurrent queries execute the same tag filter without prior stats.
This could result in incorrectly stored stats for such tag filter, which then could result in non-optimal sorting of tag filters
for further queries.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
2021-06-23 13:30:36 +03:00
Aliaksandr Valialkin
f10fa0d1d7 lib/promscrape/discovery/consul: properly pass namespace to Consul watcher
Follow-up for 58a2989fe7
2021-06-22 17:43:20 +03:00
Aliaksandr Valialkin
4adf6c9766 lib/promscrape/discovery/http: follow up after e307bbb29a 2021-06-22 13:42:10 +03:00
Nikolay
e03a3d3a36 adds http_sd (#1399)
* adds http_sd

* adds X-Prometheus-Refresh-Interval-Seconds header

* Update lib/promscrape/discovery/http/api.go

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-06-22 13:42:09 +03:00
Aliaksandr Valialkin
3ab3902f17 lib/promscrape/discovery: support generic auth configs in Consul service discovery in the same way as Prometheus 2.28 does 2021-06-22 13:18:51 +03:00
Nikolay
827a2396d2 adds consul enterprise namespace support (#1400)
* adds consul enterprise namespace support

* Update lib/promscrape/discovery/consul/consul.go

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-06-22 12:56:11 +03:00
Aliaksandr Valialkin
f9069ba32a lib/promscrape: show jobs with empty scrape targets on /targets page 2021-06-18 10:54:12 +03:00
Nikolay
9ea1dca3dd fixes DO service discovery labels (#1389)
adds test for digitalocean sd
2021-06-17 17:21:10 +03:00
Aliaksandr Valialkin
a207be3ffb lib/storage: fix infinite loop introduced in aa9b56a046
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
2021-06-17 14:27:30 +03:00
Aliaksandr Valialkin
0efd37cec1 lib/{mergeset,storage}: reduce the number of fsync calls on data ingestion path on systems with many cpu cores
VictoriaMetrics maintains a buffer per CPU core for the ingested data. These buffers are flushed to disk every second.
These buffers are flushed to disk in parallel starting from the commit 56b6b893ce .
This resulted in increased write disk IO usage on systems with many cpu cores
as described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338#issuecomment-863046999 .

This commit merges the per-CPU buffers into bigger in-memory buffers before flushing them to disk.
This should reduce the rate of fsync syscalls and, consequently, the write disk IO on systems with many CPU cores.

This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
See also https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
2021-06-17 13:51:42 +03:00
Aliaksandr Valialkin
b133de1e37 lib/storage: move deletedMetricIDs set from indexDB to Storage
This makes consitent the list of deleted metricIDs when it is used from both the current indexDB and the previous indexDB (aka extDB).
This should fix the issue, which could lead to storing new samples under deleted metricIDs after indexDB rotation.
See more details at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347#issuecomment-861232136 .

Thanks to @tangqipengleoo for the initial analysis and the pull request - https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 .

This commit resolves the issue in more generic way compared to https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 .

The downside of the commit is the deletedMetricIDs set isn't cleaned from the metricIDs outside the retention. It needs app restart.
This should be OK in most cases.
2021-06-15 15:07:54 +03:00
Aliaksandr Valialkin
ebaf68bcb0 lib/protoparser: stop reading the input stream as soon as the callback provided by the caller returns error
This is a follow-up for af90c3c43b
2021-06-14 15:20:38 +03:00
faceair
2ea187e801 lib/protoparser: stop read when callback error (#1380) 2021-06-14 15:20:37 +03:00
Aliaksandr Valialkin
5f91a701fa lib/promscrape: show the number of samples collected during the last scrape at /targets and /api/v1/targets pages
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1377
2021-06-14 14:04:35 +03:00
Nikolay
e42da47608 adds digital ocean sd (#1376)
* adds digital ocean sd config

* adds digital ocean sd
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1367

* typo fix
2021-06-14 13:19:29 +03:00
Aliaksandr Valialkin
df057177a0 lib/promscrape: increase the duration for reading the full response in stream parsing mode
Increase the duration from 10x to 30x of the configured `scrape_interval'.

This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1365
2021-06-14 12:29:46 +03:00
Aliaksandr Valialkin
074b11fa69 lib/protoparser: measure the duration for reading the whole block of data instead of a single read operation
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1365
2021-06-14 12:29:45 +03:00
Aliaksandr Valialkin
87d221f78a lib/protoparser/common: log the duration for reading a block of data in ReadLinesBlockExt on error
This may help debugging issues like https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1365
2021-06-14 12:21:21 +03:00
Aliaksandr Valialkin
0672cfffa2 app/vmauth: properly handle http.ErrAbortHandler panic
This panic can be raised by the reverseProxy on aborted request to the backend.
So handle it (e.g. suppress) at reverseProxy.ServeHTTP call.

Do not suppress the panic at lib/httpserver generic HTTP handler,
since it may result in an inconsistent state left after the panicking handler.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1353
2021-06-11 12:54:37 +03:00
Aliaksandr Valialkin
ce10bdc82a lib/storage: reset cache on disk during series deletion and during indexdb rotation
This should prevent from inconsistent behavior (aka partially missing data for some time series) after unclean shutdown.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347
2021-06-11 12:54:36 +03:00
Aliaksandr Valialkin
eb335d2c29 lib/storage: consistency renaming: getMaxRawRowsPerPartition -> getMaxRawRowsPerShard 2021-06-11 10:52:31 +03:00
Aliaksandr Valialkin
d06c0e7a94 lib/storage: reduce the amounts of memory which can be occupied by rawRow items during data ingestion on a system with many CPU cores 2021-06-11 10:49:02 +03:00
Nikolay
2c1611d316 disables panic for net/httpAbortHandler (#1355) 2021-06-09 12:12:45 +03:00
Aliaksandr Valialkin
1e4a64844d lib/storage: properly account the number of loops spent when matching for or suffixes
This may help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
2021-06-08 13:07:14 +03:00
Aliaksandr Valialkin
e7d353ee6a lib/promrelabel: add tests for labelsToString() function 2021-06-04 20:42:14 +03:00
Aliaksandr Valialkin
269e35d676 app/{vmagent,vminsert}: follow-up after 2fe045e2a4
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1343
2021-06-04 20:33:22 +03:00
jelmd
d8b46908db new feature: debug relabeling (#1344)
* new feature: relabel logging

Use scrape_configs[x].relabel_debug = true to log metric names inkl.
labels before and after relabeling. After relabeling related metrics
get dropped, i.e. not submitted to servers.

* vminsert wants relabel logging, too.
2021-06-04 20:33:21 +03:00
Nikolay
3d89c01d07 fixes solaris build (#1345) 2021-06-04 11:56:06 +03:00
Hason Chan
439c2ed510 fix eureka_sd_configs HTTPClientConfig incorrect parsing (#1350) 2021-06-04 11:56:06 +03:00
Aliaksandr Valialkin
fc2565b4ee lib/storage: reduce memory allocations when syncing dateMetricIDCache 2021-06-03 16:20:02 +03:00
Aliaksandr Valialkin
0b9f0de0a1 lib/promscrape: fix tests after f0c21b6300 2021-05-28 01:33:28 +03:00
Aliaksandr Valialkin
6865f3b497 Revert "lib/mergeset: remove a pool for inmemoryBlock structs"
This reverts commit 793fe39921.

Reason to revert: production testing revealed possible slowdown when registering big number of new time series
2021-05-28 01:11:22 +03:00
Aliaksandr Valialkin
7b33bc67a1 lib/mergeset: remove a pool for inmemoryBlock structs
The pool for inmemoryBlock struct doesn't give any performance gains in production workloads,
while it may result in excess memory usage for inmemoryBlock structs inside the pool during
background merge of indexdb.
2021-05-27 22:00:50 +03:00
Aliaksandr Valialkin
97de72054e docs: document f0c21b6300 2021-05-27 15:04:13 +03:00
faceair
b801b299f0 lib/promscrape: apply body size & sample limit to stream parse (#1331)
* lib/promscrape: apply body size limit to stream parse

Signed-off-by: faceair <git@faceair.me>

* lib/promscrape: apply sample limit to stream parse

Signed-off-by: faceair <git@faceair.me>
2021-05-27 15:04:11 +03:00
Aliaksandr Valialkin
49490ae5a7 lib/protoparser/clusternative: remove duplicate cannot read packet size phrase from the log message
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1336
2021-05-27 12:09:17 +03:00
Aliaksandr Valialkin
c85084b659 lib/handshake: pass io.EOF unmodified to the caller for BufferedConn.Read, so it could properly detect the end of stream 2021-05-27 12:09:17 +03:00
Aliaksandr Valialkin
10b2855949 lib/storage: fix spelling typo: borken->broken
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1336
2021-05-27 12:09:17 +03:00
Aliaksandr Valialkin
6b90570ed3 lib/uint64set: store pointers to bucket16 instead of bucket16 objects in bucket32
This speeds up bucket32.addBucketAtPos() when bucket32.buckets contains big number of items,
since the copying of bucket16 pointers is much faster than the copying of bucket16 objects.

This is a cpu profile for copying bucket16 objects:

      10ms     13.43s (flat, cum) 32.01% of Total
      10ms      120ms    650:	b.b16his = append(b.b16his[:pos+1], b.b16his[pos:]...)
         .          .    651:	b.b16his[pos] = hi
         .     13.31s    652:	b.buckets = append(b.buckets[:pos+1], b.buckets[pos:]...)
         .          .    653:	b16 := &b.buckets[pos]
         .          .    654:	*b16 = bucket16{}
         .          .    655:	return b16
         .          .    656:}

This is a cpu profile for copying pointers to bucket16:

      10ms      1.14s (flat, cum)  2.19% of Total
         .      100ms    647:	b.b16his = append(b.b16his[:pos+1], b.b16his[pos:]...)
         .          .    648:	b.b16his[pos] = hi
      10ms      700ms    649:	b.buckets = append(b.buckets[:pos+1], b.buckets[pos:]...)
         .      330ms    650:	b16 := &bucket16{}
         .          .    651:	b.buckets[pos] = b16
         .          .    652:	return b16
         .          .    653:}
2021-05-25 14:27:52 +03:00
Aliaksandr Valialkin
1c16cbacf5 lib/storage: do not stop data ingestion on the first error in Storage.AddRows
Continue data ingestion for the rest of blocks.
2021-05-24 15:32:24 +03:00
Aliaksandr Valialkin
2601844de3 lib/storage: limit the number of rows per each block in Storage.AddRows()
This should reduce memory usage when ingesting big blocks or rows.
2021-05-24 15:32:24 +03:00
Aliaksandr Valialkin
95b735a883 lib/storage: allow filling all the rows up to their capacity in rawRowsShard.addRows
This should reduce memory usage a bit on data ingestion path
2021-05-24 15:32:24 +03:00
Aliaksandr Valialkin
0f84503880 lib/bloomfilter: fix TestLimiterConcurrent 2021-05-24 05:18:29 +03:00
Aliaksandr Valialkin
745eda9e87 lib/fs: do not pass done callback to tryRemoveAll() func
This improves code readability a bit.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1313
2021-05-24 05:00:53 +03:00
Aliaksandr Valialkin
402a8ca710 lib/storage: do not populate MetricID->MetricName cache during data ingestion
This cache isn't needed during data ingestion, so there is no need in spending RAM on it.

This reduces RAM usage on data ingestion path by 30%
2021-05-24 03:06:40 +03:00
Aliaksandr Valialkin
0fc857d363 lib/{mergeset,storage}: reduce the number of IFNO log messages like merged ... items across ... blocks in ... seconds
Log these messages if the merge takes more than 30 seconds instead of 10 seconds.
2021-05-23 14:15:49 +03:00
Aliaksandr Valialkin
71ff7ee18d lib/promauth: follow-up after 5b8176c68e 2021-05-22 18:02:03 +03:00
Nikolay
2780d6dbcd basic OAuth2 support for remoteWrite and scrape targets (#1316)
* adds OAuth2 support for remoteWrite and scrapping

* adds tests
changes init
2021-05-22 18:02:01 +03:00
Aliaksandr Valialkin
89e1a45cdb lib/fs: concurrently remove up to 1024 blocked NFS directories
Previously the blocked directories were removed sequentially by a single goroutine.
This can be not enough for highly loaded VictoriaMetrics that accepts millions of sample per second,
when big number of LSM parts are created and removed at high rate.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1313
2021-05-21 17:58:08 +03:00
Aliaksandr Valialkin
23355ca34c lib/fs: wait for a while before giving up on NFS file removal if the removal queue is full
This should reduce the probability of the panic on a highly loaded VictoriaMetrics
accepting millions of samples per second.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1313
2021-05-21 17:21:35 +03:00
Aliaksandr Valialkin
d77db9d813 all: do not skip SIGHUP signal during service initialization
This can lead to stale or incomplete configs like in the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-21 16:38:20 +03:00
Aliaksandr Valialkin
69e365cd48 Makefile: update golangci-lint from v1.29.0 to v1.40.1 2021-05-20 18:30:24 +03:00
Aliaksandr Valialkin
da0b32c31a app/vmagent/remotewrite: expose metrics with the current number of active series per day and per hour
These numbers are exposed via the following metrics:

- vmagent_hourly_series_limit_current_series
- vmagent_daily_series_limit_current_series

Expose also the limits via the following metrics:

- vmagent_hourly_series_limit_max_series
- vmagent_daily_series_limit_max_series
2021-05-20 15:31:57 +03:00
Aliaksandr Valialkin
165a9f9200 app/vmstorage: add ability to limit series cardinality via -storage.maxHourlySeries and -storage.maxDailySeries command-line flags 2021-05-20 15:31:57 +03:00
Aliaksandr Valialkin
7aad5c3f76 app/vmagent: add ability to limit series cardinality on a per-hour and per-day basis 2021-05-20 15:31:57 +03:00
Aliaksandr Valialkin
110a888e39 lib/promscrape/discovery/kubernetes: make golangci-lint happy by removing empty branches 2021-05-20 12:00:17 +03:00
Aliaksandr Valialkin
e228f479a5 lib/storage: remove possible data race when logging dropped labels 2021-05-20 11:54:06 +03:00
Aliaksandr Valialkin
9d97f44772 lib/promscrape/discovery/kubernetes: reload objects on object parse error
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-18 23:27:24 +03:00
Aliaksandr Valialkin
74ef40034c lib/httpserver: typo fix in -http.shutdownDelay command-line flag description: servier -> server 2021-05-18 16:25:27 +03:00
Aliaksandr Valialkin
c507faec0b lib/promscrape/discovery/kubernetes: simplify the reload logic for urlWatcher.objectsByKey 2021-05-18 15:41:51 +03:00
Aliaksandr Valialkin
0f54c0121b lib/promscrape/discovery/kubernetes: properly update vm_promscrape_discovery_kubernetes_scrape_works metric
Previously it wasn't descreased during config update.
2021-05-18 15:41:51 +03:00
Aliaksandr Valialkin
9f62d348db lib/promscrape/discovery/kubernetes: log errors and stop service discovery when unexpected updates are received from Kubernetes API server
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-18 15:41:51 +03:00
Aliaksandr Valialkin
6ea191d196 docs: dealay -> delay 2021-05-18 01:07:32 +03:00
Aliaksandr Valialkin
c4ed50ae54 lib/promrelabel: add tests for conditional removal of label on another label match
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1294
2021-05-18 00:23:23 +03:00
Aliaksandr Valialkin
8764b0ae21 lib/promscrape/discovery/kubernetes: key ScrapeWork objects by urlWatcher instead of namespace
This makes the code less fragile if urlWatcher would depend on additional to namepsace properties.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-17 23:49:48 +03:00
Aliaksandr Valialkin
e08287f017 lib/promscrape: reload auth tokens from files every second
Previously auth tokens were loaded at startup and couldn't be updated without vmagent restart.
Now there is no need in vmagent restart.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1297
2021-05-14 20:03:35 +03:00
Aliaksandr Valialkin
a6cb4f10a7 app/{vmalert,vmauth}: explicitly set MaxIdleConnsPerHost in net/http.Client.Transport
By default MaxIdleConnsPerHost is set to 2. This limits the possibility to re-use http keep-alive connections.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1300
2021-05-14 18:13:34 +03:00
Aliaksandr Valialkin
e3f61d540b lib/promscrape: limit scrape_timeout by scrape_interval like Prometheus does
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1281
2021-05-13 16:10:42 +03:00
匠心零度
d5285ecaf0 fix vagent imbalance problem (#1292)
/path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=0 -promscrape.config=/path/to/config.yml ...
/path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=1 -promscrape.config=/path/to/config.yml ...
/path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=2 -promscrape.config=/path/to/config.yml ...

Co-authored-by: lirenzuo <lirenzuo@shein.com>
2021-05-13 11:19:30 +03:00
Aliaksandr Valialkin
f13585dc5d vendor: update github.com/VictoriaMetrics/fasthttp from v1.0.14 to v1.0.15
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1289
2021-05-13 10:47:09 +03:00
Aliaksandr Valialkin
d13906bf1f lib/promscrape: exponentially increase retry interval on unsuccesful requests to scrape targets or to service discovery services
This should reduce CPU load at vmagent and at remote side when the remote side doesn't accept HTTP requests.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1289
2021-05-13 10:47:07 +03:00
Aliaksandr Valialkin
66c6976723 lib/cgroup: document the ability to detect cgroup v2 memory and cpu limits. This is follow-up for b50024812e 2021-05-13 09:27:35 +03:00
Nikolay
8743bf541f adds cgroupsv2 support (#1283)
* adds cgroupv2 limits support
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1269

* small fix

* changes Atoi to ParseUint
2021-05-13 09:27:33 +03:00
Aliaksandr Valialkin
2839055513 lib/storage: substitute GetTSDBStatusForDate with GetTSDBStatusWithFiltersForDate with nil tfss 2021-05-13 09:01:05 +03:00
Aliaksandr Valialkin
008ae25b3a lib/storage: merge getTSDBStatusForDate with getTSDBStatusWithFiltersForDate
These functions are non-trivial, while their code has minimal differences.
It is better from maintainability PoV to merge these functions into a single function.
2021-05-12 18:01:08 +03:00
Nikolay
be87be34a4 Adds tsdb match filters (#1282)
* init work on filters

* init propose for status filters

* fixes tsdb status
adds test

* fix bug

* removes checks from test
2021-05-12 17:16:58 +03:00
Aliaksandr Valialkin
027607db3e lib/promscrape/discovery/kubernetes: refresh endpoints and endpointslices scrape targets every 5 seconds, since they may depend on changed service and pod objects
This should make endpoints and endpointslices scrape targets eventually consistent with the maximum delay of 5 seconds after the related service or pod object changes.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-12 14:12:43 +03:00
Aliaksandr Valialkin
1d32b008c6 lib/httpserver: add new X-Server-Hostname header instead of overwriting already exsiting header
This makes possible tracking origins of chained requests over multiple hops.
2021-05-11 23:47:19 +03:00
Aliaksandr Valialkin
f1317f7c6c lib/httpserver: return X-Server-Hostname http header in all the responses for better debuggability 2021-05-11 22:04:41 +03:00
Aliaksandr Valialkin
4e59cf4380 lib/storage: properly apply time range when matching an empty filter
It must match all the time series on the given time range.
Previously it was matched to all the time series without the restriction on the given time range.
2021-05-11 01:09:35 +03:00
Aliaksandr Valialkin
326cf83eb4 lib/storage: remove dead code after the commit 3ccf7ea20c 2021-05-08 20:15:59 +03:00
Aliaksandr Valialkin
9c505d27dd lib/ingestserver: properly close incoming connections during graceful shutdown 2021-05-08 19:53:45 +03:00
Aliaksandr Valialkin
4a5f45c77e app/vminsert: add support for data ingestion via other vminsert nodes 2021-05-08 19:53:45 +03:00
Aliaksandr Valialkin
e6c19cb09d lib/promscrape/discovery/kubernetes: start watchers for pods and services before starting watchers for endpoints
This should eliminate possible race when an update on endpoints depends on pods and/or services, which are missing in the cache yet.
This could result in missing targets based on endpoints or endpointslices.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-05 12:23:16 +03:00
Aliaksandr Valialkin
43c52ff77a lib/storage: use WARNING instead of INFO level for logging dropped labels 2021-05-03 13:57:28 +03:00
Aliaksandr Valialkin
ec6becd3f5 lib/httpserver: stop the process on panics in request handlers
Panics may leave the process in inconsistent state. That's why it is better to stop the process after the panic
instead of recovering from the panic. Unfortunately, the standard net/http.Server recovers panics in request handlers.
See https://github.com/golang/go/issues/16542 . That's lib/httpserver must stop the process on itself after the panic.
2021-05-03 12:00:44 +03:00
Nikolay
62d58324dd adds stalePartsRemover (#1261)
for new created partitions
2021-05-03 11:34:33 +03:00
Aliaksandr Valialkin
60ffbcbb99 lib/promrelabel: add tests for removing the specified {label="value"} pair 2021-05-03 11:26:58 +03:00
Aliaksandr Valialkin
b43ba6d85f lib/storage: log dropped labels if the number of labels in a metric exceeds -maxLabelsPerTimeseries command-line flag value
This should improve debuggability for this case.
2021-05-01 09:29:56 +03:00
Aliaksandr Valialkin
8be1cb297b app/vmagent: list user-visible endpoints at http://vmagent:8429/
While at it, use common WriteAPIHelp function for the listing in vmagent, vmalert and victoria-metrics
2021-04-30 09:38:23 +03:00
Aliaksandr Valialkin
421a92983a lib/promscrape/discovery/kubernetes: remove a mutex at urlWatcher - use groupWatcher mutex for accessing all the urlWatcher children
This simplifies the code a bit and reduces the probability of improper mutex handling and deadlocks.
2021-04-29 10:17:45 +03:00
Nikolay
535b3ff618 vmagent kubernetes_sd tests (#1253)
* first part of tests for kubernetes sd

* makes linter happy

* added more test cases

* adds pub/sub for tests
2021-04-29 10:17:45 +03:00
Aliaksandr Valialkin
e37e1b1e34 lib/{storage,mergeset}: fix unaligned 64-bit atomic operation panic for 32-bit architectures
The panic has been introduced in 56b6b893ce
2021-04-27 16:42:19 +03:00
Aliaksandr Valialkin
2d1d60118d lib/mergeset: split rows ingestion among multiple shards
This improves rows ingestion on systems with many CPU cores by reducing lock contention.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244

Thanks to @waldoweng for the original idea and draft implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1243
2021-04-27 15:45:11 +03:00
Aliaksandr Valialkin
b3da457629 lib/promscrape/discovery/kubernetes: fix a deadlock introduced in eddba29664
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240

Thanks to @f41gh7 for providing the initial idea for deadlock fix at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1248
2021-04-27 14:59:56 +03:00