Aliaksandr Valialkin
815666e6a6
lib/promscrape/discovery/kubernetes: cache target labels
...
This should reduce CPU usage on repeated SDConfig.GetLabels() calls.
2021-02-26 20:23:28 +02:00
Aliaksandr Valialkin
19712fc2bd
lib/promscrape/discovery/kubernetes: errcheck fix
2021-02-26 17:00:08 +02:00
Aliaksandr Valialkin
c8f2f9b2e8
lib/promscrape: cleanup after 9b2246c29b
...
Main points:
* Revert changes outside lib/promscrape/discovery/kuberntes . These changes can be applied later in a separate commit
* Minimize changes in lib/promscrape/discovery/kubernetes compared to a93e644001
* Corner case fixes.
2021-02-26 16:54:05 +02:00
Nikolay
9b2246c29b
vmagent kubernetes watch stream discovery. ( #1082 )
...
* started work on sd for k8s
* continue work on watch sd
* fixes
* continue work
* continue work on sd k8s
* disable gzip
* fixes typos
* log errror
* minor fix
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-02-26 16:46:13 +02:00
Aliaksandr Valialkin
a93e644001
lib/promscrape: remove duplicate code a bit
2021-02-26 16:39:56 +02:00
Aliaksandr Valialkin
f7b242540b
lib/promscrape: reduce processing time for big number of discovered targets by processing them in parallel
2021-02-26 16:39:56 +02:00
Aliaksandr Valialkin
f7049e2af7
lib/promrelabel: optimize labeldrop
and labelkeep
relabeling for prefix.*
and prefix.+
regexps
2021-02-24 17:58:28 +02:00
Aliaksandr Valialkin
2c44178645
lib/storage: consistency renaming: durationsPerDateTagFilterCache -> loopsPerDateTagFilterCache
2021-02-23 15:47:19 +02:00
faceair
15d61c4879
lib/storage: correct tagfilter match cost ( #1079 )
2021-02-22 21:46:56 +02:00
Aliaksandr Valialkin
d136081040
lib/promrelabel: add more optimizations for relabeling for common cases
2021-02-22 16:33:55 +02:00
Aliaksandr Valialkin
dd1e53b119
lib/promrelabel: optimize relabeling performance for common cases
2021-02-22 00:51:13 +02:00
Aliaksandr Valialkin
ff5bbc4b88
lib/promscrape: export vm_promscrape_target_relabel_duration_seconds metric
2021-02-21 23:21:42 +02:00
Aliaksandr Valialkin
636c55b526
lib/mergeset: reduce memory usage for inmemoryBlock by using more compact items representation
...
This also should reduce CPU time spent by GC, since inmemoryBlock.items don't have pointers now,
so GC doesn't need visiting them.
2021-02-21 22:06:47 +02:00
Aliaksandr Valialkin
388cdb1980
lib/storage: do not re-calculate stats for heavy tag filters
...
This should reduce the number of slow queries when stats for heavy tag filters was recalculated.
2021-02-21 21:39:01 +02:00
Aliaksandr Valialkin
48656dcc38
lib/{mergeset,storage}: allow merging smaller number of small parts
...
While this may increase CPU and disk IO usage needed for background merge,
this also recudes CPU usage during queries in production. This is because
such queries tend to read recently added data and it is better to have lower number
of parts for such data in order to reduce CPU usage.
This partially reverts ebf8da3730
2021-02-21 21:28:36 +02:00
Aliaksandr Valialkin
cb311bb156
lib/{mergeset,storage}: do not use pools for indexBlock and inmemoryBlock during their caching, since this results in higher memory usage in production without any performance gains
2021-02-21 21:18:59 +02:00
Aliaksandr Valialkin
2cfb376945
lib/promscrape: typo fix after the commit f26162ec99
2021-02-19 00:33:37 +02:00
Aliaksandr Valialkin
c2678754e4
app/vmagent: properly perform graceful shutdown, which was broken in the commit 1d1ba889fe
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1065
2021-02-19 00:31:34 +02:00
Aliaksandr Valialkin
f26162ec99
lib/promscrape: add scrape_align_interval config option into scrape config
...
This option allows aligning scrapes to a particular intervals.
2021-02-18 23:53:44 +02:00
Aliaksandr Valialkin
f9084611bd
lib/storage: use composite index for a query with a name filter and negative filters
2021-02-18 18:57:23 +02:00
Aliaksandr Valialkin
a537c4f602
lib/storage: properly handle queries containing a filter on metric name plus any number of negative filters and zero non-negative filters
...
Example: `node_cpu_seconds_total{mode!="idle"}`
2021-02-18 18:46:36 +02:00
Aliaksandr Valialkin
e540c02014
lib/storage: prevent from running identical heavy tag filters in concurrent queries when measuring the number of loops for such tag filter.
...
This should reduce CPU usage spikes when measuring the number of loops needed for heavy tag filters
2021-02-18 13:58:18 +02:00
Aliaksandr Valialkin
711f8a5b8d
lib/storage: sort tag filters by the number of loops they need for the execution
...
This metric should work better than the filter execution duration, since it cannot be distorted
by concurrently running queries.
2021-02-18 12:47:38 +02:00
Aliaksandr Valialkin
ce99b48a9a
Revert "lib/mergeset: tune lifetime for entries inside block caches"
...
This reverts commit 458c89324d
.
Production testing revealed zero improvements for memory usage with reduced lifetime for entries in block caches.
2021-02-17 20:42:21 +02:00
Aliaksandr Valialkin
939d5ffc2b
lib/storage: move composite filters to the top during sorting
2021-02-17 20:26:51 +02:00
Aliaksandr Valialkin
faad6f84a4
lib/storage: return back filter arg to getMetricIDsForTagFilter function
...
The filter arg has been removed in the commit c7ee2fabb8
because it was preventing from caching the number of matching time series per each tf.
Now the cache contains duration for tf execution, so the filter shouldn't break such caching.
2021-02-17 19:33:22 +02:00
Aliaksandr Valialkin
d4849561ef
app/vmstorage: export vm_composite_filter_success_conversions_total and vm_composite_filter_missing_conversions_total metrics
2021-02-17 19:13:38 +02:00
Aliaksandr Valialkin
33806264ec
lib/storage: revert ecf132933e
, since negative filters require the same amount of work as non-negative filters
2021-02-17 18:55:04 +02:00
Aliaksandr Valialkin
63fc140624
lib/storage: tag filters sorting...
2021-02-17 17:55:29 +02:00
Aliaksandr Valialkin
74424b55ee
lib/storage: further tune tag filters sorting
2021-02-17 17:28:15 +02:00
Aliaksandr Valialkin
442fcfec5a
lib/storage: tune the logic for sorting tag filters according the their exeuction times
2021-02-17 15:00:08 +02:00
Aliaksandr Valialkin
4a07820048
lib/storage: make sure that nobody uses partitions when closing the table
2021-02-17 14:59:04 +02:00
Aliaksandr Valialkin
1256931aee
lib/httpserver: make errcheck happy
2021-02-16 22:05:32 +02:00
Aliaksandr Valialkin
d61f7b7279
lib/storage: more tuning for tag filters sorting according the time they take
2021-02-16 21:22:23 +02:00
Aliaksandr Valialkin
458c89324d
lib/mergeset: tune lifetime for entries inside block caches
...
This should reduce memory usage in general case without significant CPU usage increase
2021-02-16 18:11:51 +02:00
Aliaksandr Valialkin
2824856691
lib/mergeset: clarify comments in the code a bit
2021-02-16 18:02:57 +02:00
Aliaksandr Valialkin
1bf6cd814d
lib/uint64set: remove memory allocation in bucket16.appendTo when sorting smallPool
2021-02-16 15:31:49 +02:00
Aliaksandr Valialkin
8ec45ff335
lib/httpserver: cache /metrics
output for a second
...
This should reduce CPU load when `/metrics` output is scraped with a frequency exceeding a request per second
2021-02-16 14:56:36 +02:00
Aliaksandr Valialkin
b861a64510
lib/protoparser/influx: make sure that escaped whitespace can be put in measurement, tag names and field names
2021-02-16 13:59:18 +02:00
Aliaksandr Valialkin
7faa762021
lib/mergeset: remove unused code after a4140de9e6
2021-02-16 13:40:09 +02:00
Aliaksandr Valialkin
ca191696fe
lib/storage: tune sorting for tag filters
2021-02-16 13:04:49 +02:00
Aliaksandr Valialkin
ecf132933e
lib/storage: increase match cost for negative tag filters, since they need to scan all the label pairs
2021-02-15 16:34:23 +02:00
Aliaksandr Valialkin
4e39bf148c
vendor: update github.com/VictoriaMetrics/metrics from v1.13.1 to v1.14.0
...
The new version switches from log-linear histograms to log-based histograms,
which provide up to 3.6 times better accuracy.
2021-02-15 15:12:29 +02:00
Aliaksandr Valialkin
9f5ac603a7
lib/storage: reduce the minimum supported retention for inverted index from one month to one day
2021-02-15 15:12:29 +02:00
Aliaksandr Valialkin
2e30202dc7
lib/flagutil: prevent from integer overflow when parsing duration
2021-02-15 15:12:29 +02:00
Aliaksandr Valialkin
38d7e96602
lib/promscrape/discovery/kubernetes: add __meta_kubernetes_endpoints_label_*
and __meta_kuberntes_endpoints_annotation_*
labels to role: endpoints
...
This syncs kubernetes SD with Prometheus 2.25
See 617c56f55a
2021-02-15 02:51:16 +02:00
Aliaksandr Valialkin
74963f71c6
lib/logger: explicitly import "time/tzdata" package for embedding tzdata into the app
...
The approach with `timetzdata` build tag didn't work for GOARCH=arm and GOARCH=ppc64le
due to the issue https://github.com/golang/go/issues/44073#issuecomment-778854298
2021-02-15 01:00:01 +02:00
Aliaksandr Valialkin
71c417427c
lib/storage: sort tag filters by actual execution time instead of by the number of matching time series
...
This should improve query speed for queries with regexp filters matching small number of time series
on a label with big number of unique values.
2021-02-15 00:18:13 +02:00
Aliaksandr Valialkin
c727d2219b
lib/storage: properly hanle regexp tag filters with dots, which can be converted to full string match filters.
...
For example `{label=~"foo\.bar"}` should be converted to `{label="foo.bar"}`. Previously it has was mistakenly conveted to `{label="foo\.bar"}` .
This could result in missing time series for such tag filters.
2021-02-14 23:38:14 +02:00
Aliaksandr Valialkin
80dc74dbc1
lib/promscrape: remove vm_promscrape_scrapes_failed_per_url_total and vm_promscrape_scrapes_skipped_by_sample_limit_per_url_total metrics
...
These metrics may result in big number of time series when vmagent scrapes thousands of targets and these targets constantly changes.
* It is better using `up == 0` query for determining failing targets.
* It is better using the following query for determining targets with exceeded limit on the number of metrics:
scrape_samples_scraped > 0 if up == 0
2021-02-12 05:26:04 +02:00
Aliaksandr Valialkin
0e26b7168a
lib/storage: return back in-order applying of tag filters, since concurrently executing tag filters can result in CPU and RAM waste in common case
2021-02-10 22:41:04 +02:00
Aliaksandr Valialkin
b51c23dc5b
lib/storage: parallelize tag filters execution a bit
...
This should reduce execution time when a query contains multiple tag filters and each such filter matches big number of time series.
2021-02-10 18:12:25 +02:00
Aliaksandr Valialkin
c7ee2fabb8
lib/storage: remove filter arg from getMetricIDsForDateTagFilter function
...
The `filter` arg breaks the logic for sorting tag filters by the matching metrics,
which may result in non-optimal performance during time series search.
2021-02-10 18:12:20 +02:00
Aliaksandr Valialkin
57cac289e0
lib/storage: fix inconsistencies in error logs
2021-02-10 18:12:16 +02:00
Aliaksandr Valialkin
5d5f0b0627
lib/storage: load metadata before loading indexdb, since indexdb depends on the metadata
2021-02-10 17:55:40 +02:00
Aliaksandr Valialkin
cdecf83ce5
app/vmstorage: export vm_composite_index_min_timestamp metric
2021-02-10 17:14:08 +02:00
Aliaksandr Valialkin
553016ea99
lib/storage: disable composite index usage when querying old data
2021-02-10 14:57:50 +02:00
Aliaksandr Valialkin
fcb7655d1e
lib/storage: fix metric name match for composite filter
2021-02-10 01:27:45 +02:00
Aliaksandr Valialkin
c7dccebaef
lib/storage: optimize search by label filters matching big number of time series
2021-02-10 00:44:54 +02:00
Aliaksandr Valialkin
6b4e6c229c
lib/storage: reduce lock contention in dateMetricIDCache when registering new time series for the current day
...
This should help systems with multiple CPU cores
2021-02-10 00:01:13 +02:00
Aliaksandr Valialkin
31f6b9c977
lib/fs: remove the code for tracking whether the given memory region is in page cache
...
This code didn't give performance gains under production workload, so let's remove it in order to simplify the code.
2021-02-09 16:49:03 +02:00
Aliaksandr Valialkin
0a69122d81
lib/mergeset: remove dead code left after a4140de9e6
2021-02-09 16:33:52 +02:00
Aliaksandr Valialkin
d56390b925
optimize Storage.updatePerDateData()
2021-02-09 02:55:36 +02:00
Aliaksandr Valialkin
fda61e8e96
lib/storage: skip deduplication when creating inmemory data blocks
...
The deduplication will be performed later during merging such blocks.
2021-02-09 02:25:32 +02:00
Aliaksandr Valialkin
a4140de9e6
lib/mergeset: unconditionally cache indexdb blocks
...
Production workloads show that indexdb blocks must be cached unconditionally for reducing CPU usage.
This shouldn't increase memory usage too much, since unused blocks are removed from the cache every two minutes.
2021-02-09 00:47:50 +02:00
Aliaksandr Valialkin
cb96a1865b
app/vmstorage: export missing vm_cache_size_bytes
metrics for indexdb and data caches
2021-02-09 00:47:00 +02:00
Aliaksandr Valialkin
c5770600a2
lib/cgroup: follow-up after b9bf3cbe3e
2021-02-08 15:54:38 +02:00
Nikolay
b9bf3cbe3e
refactored cgroups limits, ( #1061 )
...
adds tests, remove os.Exec
2021-02-08 15:46:22 +02:00
Aliaksandr Valialkin
2242647a04
lib/storage: optimize data ingestion in the beginning of every hour
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1046
2021-02-08 12:01:12 +02:00
Aliaksandr Valialkin
8f28a578d3
lib/logger: exit the app if unsupported timezone value has been passed to -loggerTimezone
...
While at it, clarify descrption for `-loggerTimezone` command-line flag.
2021-02-07 23:35:37 +02:00
Aliaksandr Valialkin
83d3e582ab
lib/storage: check for prevHourMetricIDs cache before falling back to checking for (date, metricID) entries during data ingestion
...
This should reduce possible CPU usage spikes at the beginning of every hour.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1046
2021-02-04 18:48:13 +02:00
Aliaksandr Valialkin
9fb38569eb
lib/httpserver: expose process_open_fds
and process_max_fds
metrics
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/402
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1037
2021-02-04 16:40:50 +02:00
Nikolay
48c8c5093b
fixes dockerswarm ( #1053 )
...
fixes improper usage of host network services
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1028
2021-02-04 15:56:42 +02:00
Aliaksandr Valialkin
d16f22f3a1
app/vmselect,lib/storage: properly parse Graphite selectors with inner wildcards
...
Example: foo{bar{x,yz},a[b-c],*de}
2021-02-03 20:14:22 +02:00
Aliaksandr Valialkin
a5a1b9bd66
lib/storage: fix a bug, which breaks searching by Graphite wildcard filters
2021-02-03 20:14:22 +02:00
Aliaksandr Valialkin
6123aa3e75
sort orSuffixes in tagFilter.InitFromGraphiteQuery for faster seeks
2021-02-03 20:14:22 +02:00
Aliaksandr Valialkin
157c02622b
app/vmselect: add ability to set Graphite-compatible filter via {__graphite__="foo.*.bar"}
syntax
2021-02-03 01:21:54 +02:00
Aliaksandr Valialkin
4068f8d590
lib/promscrape: add vm_promscrape_service_discovery_duration_seconds metric
2021-02-02 16:15:25 +02:00
Aliaksandr Valialkin
bd11fd8f1d
lib/promscrape: add vm_promscrape_scrape_retries_total
, vm_promscrape_discovery_retries_total
and vm_promscrape_discovery_requests_total
metrics
2021-02-01 20:06:27 +02:00
Aliaksandr Valialkin
f0087f0dbb
lib/flagutil: typo fix in comment to ArrayInt.GetOptionalArgOrDefault() func
2021-02-01 14:35:39 +02:00
Aliaksandr Valialkin
b2aa80e74b
app/vmagent: add -remoteWrite.roundDigits command-line option for limiting the number of digits after the point for stored values
...
This commit also adds --vm-round-digits command-line option to vmctl tool.
2021-02-01 14:27:09 +02:00
Aliaksandr Valialkin
d6347a3e56
lib/logger: initialize timezone by UTC in order to fix failing tests
2021-01-27 00:59:12 +02:00
Aliaksandr Valialkin
fc5b26d856
lib/promscrape: export vm_promscrape_scrapes_failed_per_url_total
and vm_promscrape_scrapes_skipped_by_sample_limit_per_url_total
metrics
...
These metrics could be useful for determining imporperly working scrape targets.
Note that these metrics are exported only for failing scrape targets. They aren't exposed for normally working targets.
2021-01-27 00:39:26 +02:00
Aliaksandr Valialkin
de3c662e8a
all: consistently use timers from timerpool
2021-01-27 00:39:26 +02:00
Aliaksandr Valialkin
3149ac7a7e
lib/fs: properly initialize cleaner for pageCache bitmaps
...
Previously it wasnt working because the timer was fired only once
2021-01-27 00:39:26 +02:00
Aliaksandr Valialkin
3fe848cdd7
lib/logger: add -loggerTimezone
command-line flag for adjusting timezone for timestamps in log messages
2021-01-26 22:51:54 +02:00
Aliaksandr Valialkin
8cea3c3cc4
lib/promscrape: retry scrape and service discovery requests when the remote server closes http keep-alive connection
2021-01-22 13:22:33 +02:00
faceair
b638c1eed5
lib/mergeset: add missing shouldCacheBlock ( #1019 )
2021-01-15 11:46:01 +02:00
Aliaksandr Valialkin
bdd0a1cdb2
lib/backup: increase backup chunk size from 128MB to 1GB
...
This should reduce costs for object storage API calls by 8x. See https://cloud.google.com/storage/pricing#operations-pricing
2021-01-13 12:16:35 +02:00
Nikolay
7bf5d48315
bumps minimal tls version ( #1012 )
2021-01-13 00:35:47 +02:00
Aliaksandr Valialkin
31ec79eaf6
lib/storage: inline marshalTags function and remove the code for handling duplicate tags from here
...
This is a follow-up commit after c8ea697db8
2021-01-12 15:13:30 +02:00
Aliaksandr Valialkin
c8ea697db8
lib/storage: de-duplicate tags in MetricName.sortTags
...
Leave only the last tag among tags with duplicate keys. This is needed for reliable addition of extra_labels
during data ingestion. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1007 for details.
2021-01-12 15:03:42 +02:00
Nikolay
7976c22797
Fixes error handling for promscrape.streamParse ( #1009 )
...
properly return error if client cannot read data,
properly suppress scraper errors
2021-01-12 13:31:47 +02:00
Aliaksandr Valialkin
2c44f9989a
lib/promscrape: properly show scrape duration on /targets
page
...
Previously it has been shown as 0.000s for any scrape duration.
2021-01-11 21:14:46 +02:00
Aliaksandr Valialkin
24ffad74c1
all: use net.Dial
instead of fasthttp.Dial
, because fasthttp.Dial
limits the number of concurrent dials to 1000
2021-01-11 12:53:30 +02:00
Aliaksandr Valialkin
9dcb18e03d
app/vmstorage: disable final merge by default, since it may result in high disk IO and CPU usage without measurable benefits such as increased query performance and reduced disk space usage
2021-01-08 00:16:05 +02:00
Aliaksandr Valialkin
490c69c64e
lib/storage: wait for pending transactions before closing and dropping the partition
...
This deflakes `make test-full-386` test
2020-12-25 11:45:53 +02:00
Aliaksandr Valialkin
cab7e936a3
lib/storage: physically remove stale parts
...
Previously they were removed from partition struct, but the corresponding directories weren't removed.
This is a follow-up for 46dba00756
2020-12-24 16:51:36 +02:00
Nikolay
b21e16ad0c
fixes kubernetes_sd ( #983 )
...
* fixes kubernetes_sd,
adds missing service metadata for pod ports without endpoint
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/982
* fix test
2020-12-24 11:26:14 +02:00
Aliaksandr Valialkin
820669da69
lib/promscrape: code prettifying for 8dd03ecf19
2020-12-24 10:56:10 +02:00