Commit Graph

8236 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
a970705d8e lib/procutil: prevent from app termination on SIGHUP signal, since this signal is frequently used for config reload 2020-04-30 02:18:06 +03:00
DexterZhang
ae215e5538 feat(vmagent): add promscrap config reload suppport via http (#450)
* feat(vmagent): add promscrap config reload suppport via http endpoint `/-/reload`

* fix: typo fix
2020-04-30 02:18:01 +03:00
Aliaksandr Valialkin
d99f48aa48 lib/httpserver: mention that -http.maxGracefulShutdownDuration command-line flag value can be increased on shutdown timeout 2020-04-30 01:37:02 +03:00
Aliaksandr Valialkin
fbfa6aa9f0 docs/Single-server-VictoriaMetrics.md: mention that it is better to increase CPU and RAM per vmselect node in order to achieve higher query performance 2020-04-30 00:53:14 +03:00
Aliaksandr Valialkin
c19f67a248 docs: add vmalert.md 2020-04-29 17:42:16 +03:00
Artem Navoiev
121f7e1d56 Update README.md 2020-04-29 17:41:04 +03:00
Aliaksandr Valialkin
15876c6425 docs/Single-server-VictoriaMetrics.md: update Alerting section 2020-04-29 17:39:56 +03:00
Aliaksandr Valialkin
de5f923476 lib/promscrape: set 30 seconds timeout for discovery api requests
Previously such requests could hang for long time. This could make debugging harder.
2020-04-29 17:29:03 +03:00
Aliaksandr Valialkin
b6d88bac04 vendor: use github.com/VictoriaMetrics/fasthttp instead of github.com/fasthttp/fasthttp
The upstream fasthttp may contain issues like 996610f021 ,
plus a code that isn't used by VictoriaMetrics. So let's use a private copy under our control instead.
2020-04-29 16:43:09 +03:00
Aliaksandr Valialkin
473188f4fd docs/Single-server-VictoriaMetrics.md: mention that basic downsampling could be made with the help of de-duplication 2020-04-28 16:39:06 +03:00
Aliaksandr Valialkin
9ed4951ec8 lib/metricsql: move it to a separate repository - github.com/VictoriaMetrics/metrics 2020-04-28 15:30:06 +03:00
Aliaksandr Valialkin
cd1145e5f4 app/vmselect: add -search.estimatedSeriesCountAfterAggregation command-line flag for tuning the probability of OOMs or false-positive not enough memory errors 2020-04-28 12:51:48 +03:00
Aliaksandr Valialkin
d78ed50edd lib/storage: recover when metricID->metricName entry is missing in the inverted index after unclean shutdown
Newly added index entries can be missing after unclean shutdown, since they didn't flush to persistent storage yet.
Log about this and delete the corresponding metricID, so it could be re-created next time.
2020-04-28 12:01:32 +03:00
Aliaksandr Valialkin
a858b7e393 app/vmalert: added missing comments for public entities 2020-04-28 11:19:48 +03:00
Aliaksandr Valialkin
716bbe79d4 app/vminsert/netstorage: increase timeout for waiting for ack message after sending big data block to vmstorage 2020-04-28 11:19:46 +03:00
Aliaksandr Valialkin
d435029d10 docs/Articles.md: add https://zerodha.tech/blog/infra-monitoring-at-zerodha/ 2020-04-28 02:24:36 +03:00
Aliaksandr Valialkin
53740d0026 lib/promscrape: handle connection reset when targets responds with http redirect 2020-04-28 02:14:32 +03:00
肖贝贝
3e6f29f462 fix: vmagent not follow 301/302 redirect bug (#445)
Co-authored-by: xiaobeibei <xiaobeibei@bigo.sg>
2020-04-28 02:14:31 +03:00
Aliaksandr Valialkin
424068f804 lib/promscrape: handle connection reset when targets responds with http redirect 2020-04-28 02:14:26 +03:00
肖贝贝
7d045bf2ca fix: vmagent not follow 301/302 redirect bug (#445)
Co-authored-by: xiaobeibei <xiaobeibei@bigo.sg>
2020-04-28 02:14:25 +03:00
Aliaksandr Valialkin
50af16baf2 app/vmalert: fix build 2020-04-28 00:34:01 +03:00
Aliaksandr Valialkin
e3db2c73a6 app/vmalert: sync with master branch 2020-04-28 00:19:42 +03:00
Aliaksandr Valialkin
7644f40763 app/vmalert: include it into the next release 2020-04-28 00:11:41 +03:00
Aliaksandr Valialkin
2aecf7c37c lib/{encoding,decimal}: typo fixes in tests: epxecting->expecting 2020-04-28 00:02:19 +03:00
Aliaksandr Valialkin
806dc73d8a lib/encoding: reduce possibility of failure in TestMarshalInt64ArraySize 2020-04-28 00:02:18 +03:00
Aliaksandr Valialkin
a603a15757 lib/promscrape/discovery/gce: make golangci-lint happy 2020-04-27 19:29:42 +03:00
Aliaksandr Valialkin
86a1d9cb0c lib/promscrape: add initial support for Prometheus-compatible service discovery for Amazon EC2 aka ec2_sd_configs 2020-04-27 19:29:22 +03:00
Aliaksandr Valialkin
1acb6eb25a lib/promscrape/discovery/gce: properly set filter query arg in api url 2020-04-27 16:01:53 +03:00
Aliaksandr Valialkin
0daa37fa02 lib/promscrape/discovery/gce: allow empty project and zone for gce_sd_config 2020-04-27 11:45:45 +03:00
Aliaksandr Valialkin
989d84cf3f app/{vminsert,vmstorage}: wait for ack from vmstorage after each packet sent to it from vminsert
This should protect from possible data loss when `vmstorage` is stopped while the packet is sent from `vminsert`.

This commit switches to new protocol between vminsert and vmstorage, which is incompatible
with the previous protocol. So it is required that both vminsert and vmstorage nodes are updated.
2020-04-27 09:53:26 +03:00
Aliaksandr Valialkin
e933cbac16 lib/storage: postpone reading data from blocks during search
This eliminates the need for storing block data into temporary files on a single-node VictoriaMetrics
during heavy queries, which touch big number of time series over long time ranges.

This improves single-node VM performance on heavy queries by up to 2x.
2020-04-27 08:44:01 +03:00
Aliaksandr Valialkin
23a310cc68 app/vmselect/netstorage: substitute sorting packedTimeseries with the natural order of the fetched blocks
This should minimize the number of disk seeks when reading data from temporary file.
2020-04-26 16:46:17 +03:00
Aliaksandr Valialkin
31861c5b8e lib/promscrape/discovery/gce: allow empty zone arg in gce_sd_config - in this case zones for the given project are automatically discovered 2020-04-26 14:37:38 +03:00
Aliaksandr Valialkin
b16e19c053 lib/storage/dedup.go: go fmt 2020-04-26 14:37:36 +03:00
Aliaksandr Valialkin
a0000c3a6e lib/storage: improve deduplication algorithm
Now it leaves only the first data point on each `-dedup.minScrapeInterval` interval.

Previously it may leave two data points on the interval. This could lead to unexpected results
for `histogram_quantile(phi, sum(rate(buckets)) by (le))` query.
2020-04-26 13:10:18 +03:00
Aliaksandr Valialkin
d9bdda408c docs/{vmbackup,vmrestore}.md: update -help output 2020-04-24 22:44:45 +03:00
Jason Gardner
7a6b2839b4 app/vmbackup: added ability to create and delete snapshots during backup (#428)
* app/vmbackup: added ability to create and delete snapshots during backup

Resolves: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/422

* Add snapshot create and delete url flags

* Fixed errcheck warnings in build
2020-04-24 22:35:50 +03:00
Aliaksandr Valialkin
13b4069c59 lib/storage: postpone label filters matching too many time series instead of giving up with error
This should reduce the frequency of the following errors:

    cannot find tag filter matching less than N time series; either increase -search.maxUniqueTimeseries or use more specific tag filters

    more than N time series found on the time range [...]; either increase -search.maxUniqueTimeseries or shrink the time range
2020-04-24 21:18:52 +03:00
Aliaksandr Valialkin
9b386e594f docs/Single-server-VictoriaMetrics.md: document -search.resetCacheAuthKey 2020-04-24 19:48:13 +03:00
Aliaksandr Valialkin
32b3f959fc app/vmselect: fix description for -search.resetCacheAuthKey 2020-04-24 19:44:35 +03:00
Aliaksandr Valialkin
7c74efd640 lib/promscrape/discovery/gce: make golint happy by ignoring resp.Body.Close() result 2020-04-24 18:13:26 +03:00
Aliaksandr Valialkin
987fcce93d .github/workflows: install dependencies before code checkout
Othwerise dependencies' install mangles go.mod
2020-04-24 17:55:53 +03:00
Aliaksandr Valialkin
069690e3bd lib/promscrape: initial implementation for gce_sd_configs aga Prometheus-compatible service discovery for Google Compute Engine 2020-04-24 17:53:43 +03:00
Aliaksandr Valialkin
cf68c5f66a .github/workflows: enable Go modules when installing dependencies
Disabled Go modules broke golangci-lint build
2020-04-24 17:40:43 +03:00
Aliaksandr Valialkin
c53fd515fe docs/Single-server-VictoriaMetrics.md: mention that -search.maxStalenessInterval can be useful for InfluxDB and TimescaleDB users 2020-04-24 16:23:33 +03:00
Aliaksandr Valialkin
48320cffe0 .github/workflows: install golangci-lint at Dependencies step 2020-04-24 15:37:55 +03:00
Aliaksandr Valialkin
de7887fbf4 .github/workflows: update Go version in actions/setup-go from v1.13 to v1.14 2020-04-24 15:31:12 +03:00
Aliaksandr Valialkin
c66daf1f0a vendor: make vendor-update 2020-04-24 15:28:37 +03:00
Aliaksandr Valialkin
8d76795be5 .github/workflows: use master branch for 'actions/setup-go' and 'actions/checkout' 2020-04-24 14:42:06 +03:00
Aliaksandr Valialkin
de991551f5 lib/promscrape: query /api/v1/namespaces/* for the configured namespaces in kubernetes_sd_config
This should fix authroization issues described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/432
2020-04-24 14:42:02 +03:00