Roman Khavronenko
1cb7037fc8
Vmalert metrics update ( #1580 )
...
* vmalert: remove `vmalert_execution_duration_seconds` metric
The summary for the `vmalert_execution_duration_seconds` metric adds no value
compared to the `vmalert_iteration_duration_seconds` metric.
* vmalert: update config reload success metric properly
Previously, if an unsuccessful attempt to reload the config was followed by a
rollback to the previous version, the metric remained set to 0.
* vmalert: add Grafana dashboard to overview application metrics
* docker: include vmalert target into list for scraping
* vmalert: extend notifier metrics with addr label
The change adds an `addr` label to metrics for alerts_sent and alerts_send_errors
to identify which exact address is having issues.
The corresponding change was made to the vmalert dashboard.
* vmalert: update documentation and docker environment for vmalert's dashboard
Mention Grafana's dashboard in vmalert's README in a new section #Monitoring.
Update docker-compose env to automatically add vmalert's dashboard.
Update docker-compose README with additional info about services.
2021-09-01 12:19:34 +03:00
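To illustrate the `addr` label change from #1580 above, here is a minimal sketch of per-address counters. It is not vmalert's actual code: the `counters` type, the `metricName` helper, and the bare metric names `alerts_sent` / `alerts_send_errors` are assumptions chosen to mirror the wording of the commit message.

```go
package main

import (
	"fmt"
	"sync"
)

// counters is a hypothetical in-memory stand-in for a metrics registry.
type counters struct {
	mu sync.Mutex
	m  map[string]uint64
}

// metricName builds a metric name that carries the notifier address
// as an `addr` label, so errors can be attributed to a specific address.
func metricName(base, addr string) string {
	return fmt.Sprintf("%s{addr=%q}", base, addr)
}

// inc increments the counter for the given base name and address.
func (c *counters) inc(base, addr string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.m == nil {
		c.m = make(map[string]uint64)
	}
	c.m[metricName(base, addr)]++
}

// get returns the current value of the counter for base and addr.
func (c *counters) get(base, addr string) uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.m[metricName(base, addr)]
}

func main() {
	var c counters
	c.inc("alerts_sent", "http://am-1:9093")
	c.inc("alerts_send_errors", "http://am-2:9093")
	fmt.Println(metricName("alerts_send_errors", "http://am-2:9093"),
		c.get("alerts_send_errors", "http://am-2:9093"))
}
```

With one series per address, a dashboard panel can show which notifier endpoint is failing instead of a single aggregate error count.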
Roman Khavronenko
434f33d04d
Cluster sync master changes ( #1592 )
...
* docker: add README for docker compose env
* docker: add vmalert Grafana dashboard
2021-09-01 10:25:07 +03:00
Aliaksandr Valialkin
0b5f30b4f9
vendor: make vendor-update
2021-08-31 12:00:06 +03:00
Aliaksandr Valialkin
9723ac30dd
docs/Single-server-VictoriaMetrics.md: remove outdated link to VictoriaMetrics wiki
...
VictoriaMetrics wiki became outdated after publishing all the docs at https://docs.victoriametrics.com
2021-08-31 11:53:38 +03:00
Aliaksandr Valialkin
b3d240f7b4
docs/CHANGELOG.md: add a link to Prometheus staleness tracking
2021-08-31 11:53:38 +03:00
dependabot[bot]
9a1e079f92
build(deps): bump github.com/aws/aws-sdk-go from 1.40.30 to 1.40.33 ( #1582 )
...
Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.40.30 to 1.40.33.
- [Release notes](https://github.com/aws/aws-sdk-go/releases)
- [Changelog](https://github.com/aws/aws-sdk-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-go/compare/v1.40.30...v1.40.33)
---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-08-31 11:53:38 +03:00
dependabot[bot]
7a53e2d093
build(deps): bump google.golang.org/api from 0.54.0 to 0.55.0 ( #1583 )
...
Bumps [google.golang.org/api](https://github.com/googleapis/google-api-go-client) from 0.54.0 to 0.55.0.
- [Release notes](https://github.com/googleapis/google-api-go-client/releases)
- [Changelog](https://github.com/googleapis/google-api-go-client/blob/master/CHANGES.md)
- [Commits](https://github.com/googleapis/google-api-go-client/compare/v0.54.0...v0.55.0)
---
updated-dependencies:
- dependency-name: google.golang.org/api
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-08-31 11:53:38 +03:00
dependabot[bot]
75596ad423
build(deps): bump github.com/klauspost/compress from 1.13.4 to 1.13.5 ( #1584 )
...
Bumps [github.com/klauspost/compress](https://github.com/klauspost/compress) from 1.13.4 to 1.13.5.
- [Release notes](https://github.com/klauspost/compress/releases)
- [Changelog](https://github.com/klauspost/compress/blob/master/.goreleaser.yml)
- [Commits](https://github.com/klauspost/compress/compare/v1.13.4...v1.13.5)
---
updated-dependencies:
- dependency-name: github.com/klauspost/compress
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-08-31 11:53:38 +03:00
Aliaksandr Valialkin
146c14d879
lib/promscrape/discovery/kubernetes: bring back support for `role: endpointslices`, since it is used by VictoriaMetrics operator
...
This is a follow up commit after 31b42b30b6
2021-08-29 12:37:36 +03:00
Aliaksandr Valialkin
3788d4f4eb
app/vmselect: show useful endpoints when the `/select/<accountID>/` page is requested
2021-08-29 12:05:14 +03:00
Aliaksandr Valialkin
18d7adf731
lib/protoparser/opentsdb: follow-up after 8ee75ca45a
2021-08-29 11:50:01 +03:00
envzhu
00dddfe02f
lib/protoparser/opentsdb: accept multiple consecutive spaces between fields as a delimiter ( #1575 )
2021-08-29 11:50:00 +03:00
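The behavior described in #1575 above can be sketched with the standard library. This is only an illustration, not the parser in `lib/protoparser/opentsdb`; the `splitFields` helper and the sample `put` line are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFields splits an OpenTSDB-style line into fields, treating any run
// of consecutive whitespace as a single delimiter.
func splitFields(line string) []string {
	return strings.Fields(line)
}

func main() {
	// Multiple spaces between fields yield the same fields as single spaces.
	fields := splitFields("put  cpu.load   1630000000  0.5 host=a")
	fmt.Printf("%q\n", fields)
}
```

Splitting on a single space instead would produce empty fields for every extra space, which is the kind of input the commit makes acceptable.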
Aliaksandr Valialkin
ca61d7c82b
lib/promscrape/discovery/kubernetes: rename `role: endpointslices` to `role: endpointslice` to be consistent with Prometheus
...
See 2ec6c7dbb8/discovery/kubernetes/kubernetes.go (L99)
2021-08-29 11:23:59 +03:00
Aliaksandr Valialkin
327034b54f
lib/promscrape/discovery/kubernetes: use v1 API instead of v1beta1 API for `role: ingress` and `role: endpointslices`
...
This should fix service discovery for these roles in Kubernetes v1.22 and newer versions.
See https://kubernetes.io/docs/reference/using-api/deprecation-guide/#ingress-v122
The corresponding change in Prometheus - https://github.com/prometheus/prometheus/pull/9205
2021-08-29 11:23:58 +03:00
Aliaksandr Valialkin
0e809bd4b6
docs/Single-server-VictoriaMetrics.md: mention that downsampling doesn't improve query performance under high churn rate
2021-08-27 18:51:28 +03:00
Aliaksandr Valialkin
39bb6bdd79
app/vmselect/promql: add `quantile("phiLabel", phi1, ..., phiN, q)` aggregate function to MetricsQL
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1573
2021-08-27 18:40:25 +03:00
Aliaksandr Valialkin
14d4b5aa83
app/vmselect: add `-search.disableAutoCacheReset` command-line option for disabling automatic cache reset when a sample with old timestamp outside `-search.cacheTimestampOffset` is inserted
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1570
2021-08-27 17:17:43 +03:00
Aliaksandr Valialkin
6e5864b014
docs/{vmgateway,vmbackupmanager}: mention that enterprise binaries are free for download and evaluation
2021-08-27 14:53:48 +03:00
Aliaksandr Valialkin
adf71a415e
docs: link to active time series, churn rate and high cardinality questions
2021-08-27 14:45:36 +03:00
Aliaksandr Valialkin
1a7646c142
docs/CHANGELOG.md: document the bugfix for possible timeout error in vmbackupmanager when making snapshots
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1571
2021-08-27 13:04:33 +03:00
Aliaksandr Valialkin
125a94e6ca
docs: mention that enterprise binaries can be downloaded and evaluated for free
2021-08-27 12:48:51 +03:00
Aliaksandr Valialkin
8845dc63db
vendor: make vendor-update
2021-08-26 09:43:01 +03:00
Aliaksandr Valialkin
102ab795f8
docs/vmagent.md: document the ability to load scrape configs from multiple files
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1559
2021-08-26 09:13:54 +03:00
Aliaksandr Valialkin
7fdb4db73d
lib/promscrape: add ability to load scrape configs from multiple files
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1559
2021-08-26 08:51:53 +03:00
Aliaksandr Valialkin
dd96792a43
vendor: make vendor-update
2021-08-25 13:42:34 +03:00
Aliaksandr Valialkin
8d843d0754
docs/CHANGELOG.md: document 48f33d098b
2021-08-25 13:32:05 +03:00
benclive
76816a0193
Remove trailing slash for URLPrefixes with specific path ( #1554 )
2021-08-25 13:32:04 +03:00
Aliaksandr Valialkin
874660a4ae
docs/Cluster-VictoriaMetrics.md: mention that the `-replicationFactor` at `vmselect` is an optional parameter
2021-08-25 13:10:31 +03:00
Aliaksandr Valialkin
4a2d7aec7f
lib/promscrape: expose promscrape_discovery_http_errors_total metric for tracking errors per each http_sd config
2021-08-25 13:05:29 +03:00
Aliaksandr Valialkin
b885bd9b7d
lib/{mergeset,storage}: improve the detection of the needed free space for background merge
...
This should prevent possible out-of-disk-space crashes during big merges.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1560
2021-08-25 10:01:09 +03:00
Aliaksandr Valialkin
ae8ec78c63
docs/FAQ.md: add more entries for frequently asked questions
...
The following topics are covered:
* Active time series
* High cardinality
* High churn rate
* Slow inserts
2021-08-24 11:34:31 +03:00
Aliaksandr Valialkin
9b39e078c0
docs/MetricsQL.md: typo fix: histogram_qunatile -> histogram_quantile
2021-08-23 23:08:28 +03:00
Aliaksandr Valialkin
335de30083
app/vmselect/promql: `make fmt` after 0078486ea7
2021-08-23 23:05:34 +03:00
Aliaksandr Valialkin
3eca49c4a6
docs/MetricsQL.md: fix the indentation for `median` function
2021-08-23 12:04:43 +03:00
Aliaksandr Valialkin
a4948d92b5
docs/MetricsQL.md: typo fix: convesions->conversions
2021-08-23 12:01:34 +03:00
Aliaksandr Valialkin
8b9dc45c3c
docs/MetricsQL.md: typo fixes
2021-08-23 12:00:17 +03:00
Aliaksandr Valialkin
5917c72ddd
docs/MetricsQL.md: overhaul the documentation on MetricsQL
...
* Document all the functions supported by MetricsQL, including PromQL functions
* Group functions by their type: rollup functions, transform functions, label manipulation functions and aggregate functions.
* Document implicit query transformations.
2021-08-23 11:46:30 +03:00
Aliaksandr Valialkin
40b06e84f8
app/vmselect/promql: rename `sign()` function to `sgn()` in order to be consistent with Prometheus
...
See https://github.com/prometheus/prometheus/pull/8457 for details.
2021-08-23 11:46:29 +03:00
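For readers unfamiliar with the renamed function, here is a minimal sketch of what a sign function computes. It is an illustration only, not the MetricsQL implementation; passing zero and NaN through unchanged is an assumption of this sketch.

```go
package main

import "fmt"

// sgn returns 1 for positive values and -1 for negative values.
// Zero and NaN are returned unchanged (an assumption of this sketch).
func sgn(v float64) float64 {
	switch {
	case v > 0:
		return 1
	case v < 0:
		return -1
	default:
		return v
	}
}

func main() {
	fmt.Println(sgn(3.5), sgn(-2), sgn(0))
}
```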
Aliaksandr Valialkin
8493159eed
deployment/docker: update Go builder from Go1.16.7 to Go1.17.0
...
This improves data ingestion and query performance by up to 5% according to benchmarks.
See https://go.dev/blog/go1.17
2021-08-21 22:22:31 +03:00
Aliaksandr Valialkin
67bc407747
lib/promscrape: reduce memory and CPU usage when Prometheus staleness tracking is enabled for metrics from deleted / disappeared scrape targets
...
Store the scraped response body instead of storing the parsed and relabeled metrics.
This should reduce memory usage, since the response body takes less memory than the parsed and relabeled metrics.
This is especially true for Kubernetes service discovery, which adds many long labels for all the scraped metrics.
This should also reduce CPU usage, since the marshaling of the parsed
and relabeled metrics has been substituted by response body copying.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526
2021-08-21 21:24:07 +03:00
Aliaksandr Valialkin
ff4c7c1a3d
docs/vmalert.md: run `make docs-sync` after 9ee3d0378f
2021-08-21 20:25:26 +03:00
Aliaksandr Valialkin
388e07b37f
docs/CHANGELOG.md: document 9ee3d0378f
2021-08-21 20:23:22 +03:00
Roman Khavronenko
0c2284b95f
vmalert: add flag `disableAlertgroupLabel` for disabling extra label added to series ( #1534 )
...
The new label added in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/611
may negatively impact deduplication in Alertmanager. The new flag is supposed to give
an option to disable adding this label.
To enable the flag, just add `-disableAlertgroupLabel` to the binary execution command.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1532
2021-08-21 20:23:22 +03:00
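How a boolean flag such as `-disableAlertgroupLabel` from #1534 above might gate an extra label can be sketched as follows. This is not vmalert's code: the `parseFlags` and `labelsFor` helpers, the `alertgroup` label name, and the sample alert name are assumptions made for illustration.

```go
package main

import (
	"flag"
	"fmt"
)

// parseFlags parses args with a dedicated FlagSet and reports whether
// the group label should be disabled.
func parseFlags(args []string) (bool, error) {
	fs := flag.NewFlagSet("vmalert-sketch", flag.ContinueOnError)
	disable := fs.Bool("disableAlertgroupLabel", false,
		"Whether to skip adding the group name as a label to generated series")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *disable, nil
}

// labelsFor returns the labels attached to a generated series.
// The "alertgroup" label name is an assumption of this sketch.
func labelsFor(group string, disableGroupLabel bool) map[string]string {
	labels := map[string]string{"alertname": "HighLatency"}
	if !disableGroupLabel {
		labels["alertgroup"] = group
	}
	return labels
}

func main() {
	disabled, err := parseFlags([]string{"-disableAlertgroupLabel"})
	if err != nil {
		panic(err)
	}
	fmt.Println(labelsFor("latency", disabled))
}
```

Omitting the label keeps alerts from differently named groups identical in their label sets, which is what deduplication in Alertmanager relies on.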
Alexander Rickardsson
9e2e9d83a5
vmalert: accept http.StatusOK for remotewrite ( #1550 )
2021-08-21 20:23:22 +03:00
Aliaksandr Valialkin
0ff3e04ed4
vendor: make vendor-update
2021-08-21 20:13:44 +03:00
Aliaksandr Valialkin
91534057a3
app/vmselect/prometheus: do not extend `[d]` to the detected interval between samples for `first_over_time(m[d])`
...
This is for the sake of consistency with similar change for the last_over_time(m[d]) at a724229b5d
2021-08-21 19:56:56 +03:00
dependabot[bot]
959a667aa5
build(deps): bump github.com/aws/aws-sdk-go from 1.40.25 to 1.40.26 ( #1551 )
...
Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.40.25 to 1.40.26.
- [Release notes](https://github.com/aws/aws-sdk-go/releases)
- [Changelog](https://github.com/aws/aws-sdk-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-go/compare/v1.40.25...v1.40.26)
---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-08-21 19:56:55 +03:00
Aliaksandr Valialkin
c3b24882a7
lib/promscrape: use scrapeTimestamp when storing stale markers for failed scrape
...
This will make timestamps for stale markers more consistent with timestamps for other samples
2021-08-19 14:19:54 +03:00
Aliaksandr Valialkin
3454f25e0f
docs/CHANGELOG.md: document b5d6a0e499
2021-08-19 14:07:00 +03:00
Roman Khavronenko
1ccb77904b
vmselect: update `vm_request_duration_seconds` value when request fails ( #1537 )
...
Before, the metric `vm_request_duration_seconds` was updated only on successful
attempts, which could be misleading. For example, timeout errors on netstorage
requests might not be accounted for in the metric and wouldn't be visible on dashboards.
Using a `defer` statement to update the metric after query arguments validation
may improve the situation.
2021-08-19 14:07:00 +03:00
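The `defer` approach described in #1537 above can be sketched as follows. This is a simplified illustration rather than the vmselect handler: the `durations` slice stands in for the real summary metric, and `handleQuery` with its `"fail"` trigger is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// durations is a hypothetical stand-in for a request duration summary.
var durations []time.Duration

// handleQuery validates its argument first, then registers a deferred
// update so the duration is recorded whether the request succeeds or fails.
func handleQuery(query string) error {
	if query == "" {
		// Validation failures return before the defer is registered,
		// so they are not recorded.
		return errors.New("missing query")
	}
	startTime := time.Now()
	defer func() {
		durations = append(durations, time.Since(startTime))
	}()
	if query == "fail" {
		// Simulated storage timeout: the deferred update still runs.
		return errors.New("simulated timeout")
	}
	return nil
}

func main() {
	err := handleQuery("fail")
	fmt.Println(err, len(durations))
}
```

Placing the `defer` after validation means malformed requests do not skew the summary, while failures during execution, such as timeouts, are still counted.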