Roman Khavronenko
e850bf0eff
vmalert: fix access to rules slice elements by the wrong index (#486)
...
During a group update, deleting rules mutated the slice
while the slice indexes were assumed to be unchanged.
This caused "slice bounds out of range" errors when multiple
rules were deleted sequentially.
2020-05-15 13:26:06 +03:00
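The class of bug fixed in e850bf0eff can be illustrated with a minimal sketch. It is not the vmalert code: `removeRules` and the string-based rule names are hypothetical. The point is that filtering in place avoids relying on indexes that shift after each deletion.

```go
package main

import "fmt"

// removeRules returns rules without the entries whose names are in stale.
// It filters in place instead of deleting by precomputed indexes, so an
// earlier deletion cannot shift the position of a later one.
// (Hypothetical helper; rules are modeled as plain names for brevity.)
func removeRules(rules []string, stale map[string]bool) []string {
	kept := rules[:0] // reuse the backing array
	for _, r := range rules {
		if !stale[r] {
			kept = append(kept, r)
		}
	}
	return kept
}

func main() {
	rules := []string{"a", "b", "c", "d"}
	rules = removeRules(rules, map[string]bool{"b": true, "c": true})
	fmt.Println(rules) // [a d]
}
```

Deleting with `append(rules[:i], rules[i+1:]...)` inside a loop over stale indexes is where such errors typically appear, because every deletion shortens the slice.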
hagen1778
d369450f27
vmalert: update README
2020-05-15 13:26:04 +03:00
Aliaksandr Valialkin
a72f18e821
lib/{storage,mergeset}: further tuning of compression levels depending on block size
...
This should improve performance for querying newly added data, since it can be unpacked faster.
2020-05-15 13:12:28 +03:00
Aliaksandr Valialkin
2cf2e9955b
lib/storage: wait for all the goroutines to finish in TestSearch in order to prevent racy behavior on test finish
2020-05-15 12:12:20 +03:00
Aliaksandr Valialkin
67e331ac62
lib/storage: optimize ingestion performance for new time series
2020-05-15 12:12:19 +03:00
Aliaksandr Valialkin
6838fa876c
lib/mergeset: tune compression levels in order to improve ingestion performance a bit
2020-05-15 12:12:15 +03:00
Aliaksandr Valialkin
1b5d272e07
lib/storage: reduce indentation in Storage.add
2020-05-14 23:23:56 +03:00
Aliaksandr Valialkin
71d29a8fa1
lib/storage: return the first error instead of the last error, since the first error usually points to the root cause
2020-05-14 23:18:59 +03:00
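The idea behind 71d29a8fa1 can be sketched as follows. This is an illustration under assumptions, not the lib/storage code: `runAll` and `demoSteps` are hypothetical names. Every step still runs, but only the first error is kept, since later errors are often consequences of it.

```go
package main

import (
	"errors"
	"fmt"
)

// runAll executes every step and returns the first error encountered.
// Later errors are dropped because they are frequently caused by the first.
func runAll(steps []func() error) error {
	var firstErr error
	for _, step := range steps {
		if err := step(); err != nil && firstErr == nil {
			firstErr = err
		}
	}
	return firstErr
}

// demoSteps returns sample steps: the second and third fail.
func demoSteps() []func() error {
	return []func() error{
		func() error { return nil },
		func() error { return errors.New("disk full") },
		func() error { return errors.New("cannot flush: nothing written") },
	}
}

func main() {
	fmt.Println(runAll(demoSteps())) // disk full
}
```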
Aliaksandr Valialkin
3845420a8f
lib: extract common code for returning fast unix timestamp into lib/fasttime
2020-05-14 23:06:50 +03:00
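A "fast unix timestamp", as mentioned in 3845420a8f, is usually a cached value that a background goroutine refreshes once per second, so hot paths avoid calling `time.Now()`. The sketch below shows that general technique; the names are assumptions and it is not necessarily how lib/fasttime is written.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// currentTimestamp caches the unix time in seconds.
var currentTimestamp = uint64(time.Now().Unix())

func init() {
	// Refresh the cached value once per second in the background.
	go func() {
		ticker := time.NewTicker(time.Second)
		defer ticker.Stop()
		for tm := range ticker.C {
			atomic.StoreUint64(&currentTimestamp, uint64(tm.Unix()))
		}
	}()
}

// unixTimestamp returns the cached unix time; it may lag by up to a second.
func unixTimestamp() uint64 {
	return atomic.LoadUint64(&currentTimestamp)
}

func main() {
	fmt.Println(unixTimestamp())
}
```

The trade-off is precision: callers get a value that can be up to one second stale in exchange for a single atomic load.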
Aliaksandr Valialkin
7e831741f9
lib/{storage,mergeset}: return dst on error from unmarshalBlockHeaders, so it could be reused
2020-05-14 15:32:23 +03:00
Aliaksandr Valialkin
2f42b85e0e
lib/storage: document that generateUniqueMetricID should return dense ids
2020-05-14 14:08:59 +03:00
Aliaksandr Valialkin
f442d81648
lib/{storage,mergeset}: cleanup: remove unused partSearch.indexBlockReuse
2020-05-14 14:03:15 +03:00
Aliaksandr Valialkin
4bc3d284fa
docs/vmalert.md: sync with app/vmalert/README.md
2020-05-13 22:57:29 +03:00
Roman Khavronenko
e208e76222
vmalert: check if remoteRead object was initialized before calling Restore (#473)
...
The check for non-nil remoteRead was mistakenly dropped
during refactoring, which caused panics when `vmalert`
wasn't configured with the `remoteRead` flag.
2020-05-13 22:57:26 +03:00
Roman Khavronenko
1523890742
vmalert: fix flag names and description in README (#475)
...
The change also adds a recommendation for handling the `remotewrite`
queue error.
2020-05-13 22:57:20 +03:00
肖贝贝
8c3e9adf7f
Feat/vmalert add max queue size (#472)
...
* feat: add remoteWrite.maxQueueSize to reduce queue full
* rename remote(write|read) flags to remote(Write|Read) for the sake of consistency
Co-authored-by: xiaobeibei <xiaobeibei@bigo.sg>
2020-05-13 22:57:16 +03:00
Aliaksandr Valialkin
bac9a684e8
docs/vmbackup.md: add a link to vmbackuper tool
2020-05-13 22:57:11 +03:00
Aliaksandr Valialkin
f3d9a5b0ec
app/vmselect/promql: suppress "SA4006: this value of dstValues is never used" error in golangci-lint
2020-05-13 11:46:05 +03:00
Aliaksandr Valialkin
8bb44a5d09
lib/storage: optimize label matching for regexp ending with literal suffix
...
For example, `{label=~"foo.*bar.+baz"}` contains literal suffix `baz`,
so it should work faster now.
2020-05-13 11:39:05 +03:00
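The literal-suffix optimization described in 8bb44a5d09 can be sketched like this: a cheap `strings.HasSuffix` check rejects most non-matching values before the regexp engine runs. The `suffixMatcher` type is hypothetical, and the caller passes the suffix explicitly here, whereas a real implementation would derive it from the expression.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// suffixMatcher matches values against a fully anchored regexp that is known
// to end with a literal suffix. (Hypothetical type for illustration.)
type suffixMatcher struct {
	re     *regexp.Regexp
	suffix string
}

// newSuffixMatcher anchors expr the way label matchers are anchored.
// The caller must guarantee that suffix is a literal suffix of expr.
func newSuffixMatcher(expr, suffix string) *suffixMatcher {
	return &suffixMatcher{
		re:     regexp.MustCompile("^(?:" + expr + ")$"),
		suffix: suffix,
	}
}

// Match runs the cheap suffix check first and the regexp only if it passes.
func (m *suffixMatcher) Match(s string) bool {
	if !strings.HasSuffix(s, m.suffix) {
		return false
	}
	return m.re.MatchString(s)
}

func main() {
	m := newSuffixMatcher("foo.*bar.+baz", "baz")
	fmt.Println(m.Match("foo-bar-x-baz")) // true
	fmt.Println(m.Match("foo-bar-x-qux")) // false, rejected by the suffix check
	fmt.Println(m.Match("foobarbaz"))     // false, ".+" needs at least one character
}
```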
Aliaksandr Valialkin
3b0f66a227
app/vmagent: fix a bug with improper relabeling when multiple -remoteWrite.urlRelabelConfig args are set
...
This bug could result in incorrect relabeling and dropped metrics.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/467
2020-05-12 22:03:45 +03:00
Aliaksandr Valialkin
18a0caee43
app/vmselect/promql: fix any(..) calculations - return all the data points instead of the first one
2020-05-12 20:36:49 +03:00
Aliaksandr Valialkin
3d3f41b961
app/vmstorage/transport: fix panic during server stop on 32-bit arches
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212
2020-05-12 20:21:40 +03:00
Aliaksandr Valialkin
c9ab6dc532
lib/fs: do not use mmap for 32-bit arches by default, since they cannot map files bigger than 4GB in RAM
2020-05-12 20:21:39 +03:00
Aliaksandr Valialkin
81b8811cf4
app/vmselect/promql: remove -search.maxPointsPerTimeseries command-line flag
...
Limit the estimated time series count after aggregation with grouping by the number of source time series.
2020-05-12 19:54:44 +03:00
Aliaksandr Valialkin
408ade27a9
app/vmselect/promql: add any(x) by (y) aggregate function, which returns any time series from x for each group y
2020-05-12 19:50:29 +03:00
Aliaksandr Valialkin
21c2982ac8
app/vmselect/promql: support for sum(x) by (y) limit N syntax in order to limit the number of output time series after aggregation
2020-05-12 19:50:12 +03:00
Aliaksandr Valialkin
f341c6fcc4
Revert "app/vmselect: add -search.estimatedSeriesCountAfterAggregation command-line flag for tuning the probability of OOMs or false-positive not enough memory errors"
...
This reverts commit fbb7986dd2380fce2fc8633b7eda8b67f419e74c.
Reason for revert: this commit has been removed from single-node version
2020-05-12 19:50:08 +03:00
Aliaksandr Valialkin
d54a93fc81
app/vmagent: fix scraping mTLS targets, which has been broken in v1.35.1
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/470
2020-05-12 17:23:43 +03:00
Aliaksandr Valialkin
405cf44aed
app/vmagent,lib/promscrape: do not set HostClient.DialDualStack, since it isn't used if HostClient.Dial is set
2020-05-12 15:24:53 +03:00
Aliaksandr Valialkin
da6a84e147
app/vmagent/remotewrite: properly dial TCP6 addresses set via -remoteWrite.url
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/469
2020-05-12 15:24:50 +03:00
Aliaksandr Valialkin
bd5f4e0344
lib/storage: properly initialize part struct before trying to close it on error
...
This should prevent the nil pointer dereference bug reported at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/468 .
2020-05-12 14:54:16 +03:00
Aliaksandr Valialkin
cc825c483b
vendor: make vendor-update
2020-05-12 14:26:29 +03:00
Aliaksandr Valialkin
ddd8c9d099
deployment/docker: omit http2 support in *-prod binaries
...
VictoriaMetrics doesn't use http/2.0, so disable it completely.
Use `nethttpomithttp2` tag defined in Go1.14 for this.
See 2566e21f24 for details.
2020-05-12 14:19:33 +03:00
Aliaksandr Valialkin
4e237b4670
app/vminsert/influx: support passing AccountID and ProjectID via plain TCP and UDP
...
Now `vminsert` accepts AccountID and ProjectID via `VictoriaMetrics_AccountID` and `VictoriaMetrics_ProjectID` tags
when reading Influx line protocol data via plain TCP or UDP (i.e. when `-influxListenAddr` is set).
2020-05-12 13:13:04 +03:00
Aliaksandr Valialkin
f7753b1469
lib/storage: gradually pre-populate per-day inverted index for the next day
...
This should prevent CPU usage spikes at 00:00 UTC every day, when
the inverted index for the new day must be quickly created for all the active time series.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/430
2020-05-12 12:13:32 +03:00
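One way to spread the work described in f7753b1469 is to give each series a deterministic slot within a window before midnight and register it in the next day's index once that slot has passed. The sketch below is an assumption-heavy illustration: the one-hour window, the `mix` hash and the function names are all hypothetical and are not taken from lib/storage.

```go
package main

import "fmt"

const (
	secsPerHour = 3600
	secsPerDay  = 24 * secsPerHour
)

// mix spreads metric ids over the uint64 range (simple multiplicative hash).
func mix(x uint64) uint64 {
	return (x * 0x9E3779B97F4A7C15) >> 32
}

// shouldPrepopulate reports whether the series with the given metricID should
// already be registered in the next day's index at unix time ts (UTC seconds).
// Series become due gradually during the last hour of the day (assumed window).
func shouldPrepopulate(metricID uint64, ts int64) bool {
	secOfDay := ts % secsPerDay
	windowStart := int64(secsPerDay - secsPerHour)
	if secOfDay < windowStart {
		return false // too early, the window has not started yet
	}
	elapsed := secOfDay - windowStart // 0..3599
	slot := int64(mix(metricID) % secsPerHour)
	return slot <= elapsed
}

func main() {
	// Count how many of 10000 series are due halfway through the window.
	due := 0
	for id := uint64(0); id < 10000; id++ {
		if shouldPrepopulate(id, 23*secsPerHour+1800) {
			due++
		}
	}
	fmt.Println("series due at 23:30 UTC:", due)
}
```

By the last second of the day every slot has passed, so all active series are registered before the day rolls over.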
Aliaksandr Valialkin
8c77cb436a
lib/storage: typo fixes in error messages: or -> of
2020-05-12 12:12:33 +03:00
Aliaksandr Valialkin
bbf06a4248
lib/storage: speed up matching for common regexps in label filters
...
The following regexps have been optimized:
* 'foo.+bar'
* 'foo.+bar.+baz'
This should improve performance for matching Graphite-like metrics.
2020-05-11 22:49:01 +03:00
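For an anchored pattern such as 'foo.+bar', the regexp engine can be bypassed entirely with plain string checks. The following sketch shows the idea under assumptions: `matchPrefixAnySuffix` is a hypothetical helper, and it treats `.` as matching any character, including newlines.

```go
package main

import (
	"fmt"
	"strings"
)

// matchPrefixAnySuffix reports whether s matches the anchored pattern
// prefix + ".+" + suffix, without using the regexp engine.
func matchPrefixAnySuffix(s, prefix, suffix string) bool {
	// ".+" needs at least one character between prefix and suffix.
	if len(s) < len(prefix)+len(suffix)+1 {
		return false
	}
	return strings.HasPrefix(s, prefix) && strings.HasSuffix(s, suffix)
}

func main() {
	fmt.Println(matchPrefixAnySuffix("foo.x.bar", "foo", "bar")) // true
	fmt.Println(matchPrefixAnySuffix("foobar", "foo", "bar"))    // false
	fmt.Println(matchPrefixAnySuffix("foo.x.baz", "foo", "bar")) // false
}
```

A three-part pattern such as 'foo.+bar.+baz' could be handled similarly by also searching for the middle literal between the prefix and suffix.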
Aliaksandr Valialkin
37254a139a
lib/storage: add a benchmark for Graphite-like regexps for metric names
2020-05-11 22:49:00 +03:00
Roman Khavronenko
0157566fdb
vmalert: cleanup and restructure of code to improve maintainability (#471)
...
The change introduces a new entity, `manager`, which replaces
`watchdog` and decouples requestHandler and groups. The manager
is supposed to control the life cycle of groups, rules and
config reloads.
Groups export an ID method which returns a hash
of the filename and group name. The ID is supposed to be a unique
identifier across all loaded groups.
Some tests were added to improve coverage.
Fixed a bug with a wrong annotation value when $value is used in
templates after metrics have been restored.
The Notifier interface was extended to accept a context.
A new set of metrics was introduced for config reloads.
2020-05-11 14:35:55 +03:00
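The group ID described in 0157566fdb is a hash of the filename and the group name. A minimal sketch of such an ID is shown below; the choice of FNV-1a, the separator byte and the `groupID` name are assumptions for illustration, not a statement of what vmalert uses.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// groupID derives an identifier from the rules file name and the group name.
// A separator byte keeps ("ab", "c") and ("a", "bc") from hashing the same input.
func groupID(file, name string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(file))
	h.Write([]byte{0xff})
	h.Write([]byte(name))
	return h.Sum64()
}

func main() {
	fmt.Println(groupID("alerts.yml", "node") == groupID("alerts.yml", "node")) // true
	fmt.Println(groupID("alerts.yml", "node") == groupID("other.yml", "node"))
}
```

Because the ID depends only on the two strings, it stays stable across config reloads as long as the file and group names do not change.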
Nikolay Khramchikhin
0e8c345ffb
vmalert config reload
...
Added config hot reload for vmalert via SIGHUP and an API call.
2020-05-11 14:35:50 +03:00
Aliaksandr Valialkin
6ce9f81d16
docs/CaseStudies.md: add CERN case study
2020-05-11 14:35:43 +03:00
Aliaksandr Valialkin
6c88e3523b
docs/Single-server-VictoriaMetrics.md: small updates for Monitoring and How to start VictoriaMetrics sections
2020-05-08 20:35:31 +03:00
Aliaksandr Valialkin
6646b380ef
docs/vmauth.md: fix a link to docker images
2020-05-08 14:11:10 +03:00
Aliaksandr Valialkin
0362bd220e
docs/Articles.md: add a link to CERN article at https://indico.cern.ch/event/877333/contributions/3696707/attachments/1972189/3281133/CMS_mon_RD_for_opInt.pdf
2020-05-08 01:25:17 +03:00
Aliaksandr Valialkin
657c3e3fc5
Makefile: suppress false positives for golangci-lint on nil pointer dereference
2020-05-07 19:41:11 +03:00
Aliaksandr Valialkin
28ad350a31
app/vmagent: return 200 from /-/reload endpoint as Prometheus does
2020-05-07 19:29:48 +03:00
Aliaksandr Valialkin
2f28e945b8
lib/httpserver: add -http.shutdownDelay flag for a grace period before http server shutdown
...
The http server returns a 503 error at the `/health` page during the grace period,
so load balancers in front of the http server could re-route incoming requests
to other servers.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/463
2020-05-07 15:25:51 +03:00
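The grace-period behavior from 2f28e945b8 can be sketched with a health handler that starts returning 503 once shutdown has been requested. This is an illustration only: the handler, the flag variable and the helpers are hypothetical names, not the lib/httpserver code.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// shuttingDown is set to 1 when the grace period starts.
var shuttingDown int32

// setShuttingDown toggles the grace-period state.
func setShuttingDown(v bool) {
	var n int32
	if v {
		n = 1
	}
	atomic.StoreInt32(&shuttingDown, n)
}

// healthHandler returns 503 during the grace period, so load balancers
// stop routing new requests to this server before it actually stops.
func healthHandler(w http.ResponseWriter, r *http.Request) {
	if atomic.LoadInt32(&shuttingDown) != 0 {
		http.Error(w, "shutting down", http.StatusServiceUnavailable)
		return
	}
	fmt.Fprintln(w, "OK")
}

// healthStatus calls the handler in-process and returns the status code.
func healthStatus() int {
	rec := httptest.NewRecorder()
	healthHandler(rec, httptest.NewRequest(http.MethodGet, "/health", nil))
	return rec.Code
}

func main() {
	fmt.Println(healthStatus()) // 200
	setShuttingDown(true)       // the grace period begins
	fmt.Println(healthStatus()) // 503
	// A real server would now wait for the configured delay and then
	// call (*http.Server).Shutdown.
}
```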
Aliaksandr Valialkin
3052b479b7
lib/httpserver: reduce typical duration for http server graceful shutdown
...
Previously graceful shutdown of the http server could take more than a minute
because of improperly set timeouts in setNetworkTimeout.
Now the typical duration of graceful shutdown should be less than 5 seconds.
2020-05-07 14:16:38 +03:00
Aliaksandr Valialkin
dc04040781
docs/{vmagent,vmauth}: small clarifications in the docs
2020-05-07 12:55:06 +03:00
Aliaksandr Valialkin
2b403d3f42
app/vmauth: prevent attacks that use .. in the path to access resources outside the configured url_prefix
2020-05-07 12:55:04 +03:00
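A common defense against the `..` attack addressed in 2b403d3f42 is to normalize the request path before appending it to the configured prefix. The sketch below uses `path.Clean` from the Go standard library; `sanitizePath`, `targetURL` and the backend URL are hypothetical and do not reproduce the vmauth implementation.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// sanitizePath makes the path rooted and resolves "." and ".." elements.
// For rooted paths, path.Clean drops any ".." that would climb above "/".
func sanitizePath(p string) string {
	if !strings.HasPrefix(p, "/") {
		p = "/" + p
	}
	return path.Clean(p)
}

// targetURL appends the sanitized request path to urlPrefix, so the result
// always stays under the prefix.
func targetURL(urlPrefix, reqPath string) string {
	return strings.TrimSuffix(urlPrefix, "/") + sanitizePath(reqPath)
}

func main() {
	prefix := "http://backend/prefix" // assumed example backend
	fmt.Println(targetURL(prefix, "/api/v1/query")) // http://backend/prefix/api/v1/query
	fmt.Println(targetURL(prefix, "/../../admin"))  // http://backend/prefix/admin
}
```

Note that `path.Clean` also strips trailing slashes, which may or may not be acceptable for a given backend.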