VictoriaMetrics/docs/CHANGELOG.md
Aliaksandr Valialkin f21fad53b4 lib/promscrape: add ability to disable sending Prometheus staleness markers with -promscrape.disableStaleMarkers command-line flag
This option can be useful when vmagent consumes too much additional memory
for staleness markers functionality and when staleness markers aren't needed.
2021-08-18 13:58:05 +03:00


CHANGELOG

tip

  • FEATURE: add bitmap_and(q, mask), bitmap_or(q, mask) and bitmap_xor(q, mask) functions to MetricsQL. These functions allow performing bitwise operations over data points in time series. See this issue.

  • FEATURE: vmalert: add -remoteWrite.disablePathAppend command-line flag, which can be used when custom -remoteWrite.url must be specified. For example, ./vmalert -remoteWrite.disablePathAppend -remoteWrite.url='http://foo.bar/a/b/c?d=e' would write data to http://foo.bar/a/b/c?d=e instead of http://foo.bar/a/b/c?d=e/api/v1/write. See this pull request.

  • BUGFIX: vmagent: stop scrapers for deleted targets before starting scrapers for added targets. This should prevent from possible time series overlap when old targets are substituted by new targets (for example, during new deployment in Kubernetes). The overlap could lead to incorrect query results. See this issue.

  • BUGFIX: vmagent: add -promscrape.disableStaleMarkers command-line flag for disabling sending Prometheus stale markers for metrics from disappeared scrape targets. This option may be used for reducing memory usage when scraping big number of metrics with big number of labels and when stale markers aren't needed.
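
The bitmap_and, bitmap_or and bitmap_xor entry above can be illustrated with a minimal Python sketch. It is not the MetricsQL implementation: the helper names, the cast of each sample to an integer and the pass-through of NaN values are assumptions made for illustration only.

```python
import math

def _bitmap_apply(values, mask, op):
    """Apply a bitwise op between every sample and mask.

    Assumption: samples are truncated to integers before the operation
    and NaN samples (gaps) are passed through unchanged.
    """
    out = []
    for v in values:
        if math.isnan(v):
            out.append(v)
        else:
            out.append(float(op(int(v), mask)))
    return out

def bitmap_and(values, mask):
    return _bitmap_apply(values, mask, lambda a, b: a & b)

def bitmap_or(values, mask):
    return _bitmap_apply(values, mask, lambda a, b: a | b)

def bitmap_xor(values, mask):
    return _bitmap_apply(values, mask, lambda a, b: a ^ b)
```

For instance, a status-register metric could be filtered down to a single flag bit with bitmap_and(values, 4).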

v1.64.0

  • FEATURE: add support for Prometheus staleness markers. See this issue.

  • FEATURE: vmagent: automatically generate Prometheus staleness markers for the scraped metrics when scrape targets disappear in the same way as Prometheus does. See this issue.

  • FEATURE: add present_over_time(m[d]) function, which returns 1 if m has at least a single sample over the previous duration d. This function has been added also to Prometheus 2.29.

  • FEATURE: vmagent: support multitenant writes according to these docs. This allows using a single vmagent instance in front of VictoriaMetrics cluster for all the tenants. Thanks to @omarghader for the pull request. See this issue.

  • FEATURE: vmagent: add __meta_ec2_availability_zone_id label to discovered Amazon EC2 targets. This label is available in Prometheus starting from v2.29.

  • FEATURE: vmagent: add __meta_gce_interface_ipv4_<name> labels to discovered GCE targets. These labels are available in Prometheus starting from v2.29.

  • FEATURE: add -search.maxSamplesPerSeries command-line flag for limiting the number of raw samples a single query can process per each time series. This option can protect from out of memory errors when a query processes tens of millions of raw samples per series. See this issue.

  • FEATURE: add -search.maxSamplesPerQuery command-line flag for limiting the number of raw samples a single query can process across all the time series. This option can protect from heavy queries, which select too big number of raw samples. Thanks to @jiangxinlingdu for the initial pull request.

  • FEATURE: improve performance for queries that process big number of time series and/or samples on systems with big number of CPU cores.

  • FEATURE: vmalert: expose vmalert_alerting_rules_last_evaluation_samples and vmalert_recording_rules_last_evaluation_samples metrics. See this issue.

  • FEATURE: vminsert: expose vm_rpc_send_duration_seconds_total counter, which can be used for determining high saturation of every vminsert -> vmstorage link with an alerting query rate(vm_rpc_send_duration_seconds_total) > 0.9s. This query triggers when the link is saturated by more than 90%. This usually means that more vminsert or vmstorage nodes must be added to the cluster in order to increase the total number of vminsert -> vmstorage links.

  • FEATURE: vmagent: expose vmagent_remotewrite_send_duration_seconds_total counter, which can be used for determining high saturation of every connection to remote storage with an alerting query rate(vmagent_remotewrite_send_duration_seconds_total) > 0.9s. This query triggers when a connection is saturated by more than 90%. This usually means that -remoteWrite.queues command-line flag must be increased in order to increase the number of connections per each remote storage.

  • FEATURE: vmui: automatically fill Server URL field. See this issue.

  • BUGFIX: fix corner cases for queries on time ranges exceeding 40 days. Previously some series could be missing in query results. See this issue.

  • BUGFIX: vmselect: return dummy response at /rules page in the same way as for /api/v1/rules page. The /rules page is requested by Grafana 8. See this issue for details.

  • BUGFIX: vmbackup: automatically set default us-east-1 S3 region if it is missing. This should simplify using S3-compatible services such as MinIO for backups. See this issue.

  • BUGFIX: vmselect: prevent from possible deadlock when multiple target query args are passed to Graphite Render API.

  • BUGFIX: return series with a op b labels and N values for (a op b) default N if (a op b) returns series with all NaN values. Previously such series were removed.

  • BUGFIX: vmui: fix layout when the query selects more than 27 time series. See this issue.

  • BUGFIX: vmagent: restore highlighting in red for DOWN targets at /targets page. See this issue.
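
The saturation alerts from the vm_rpc_send_duration_seconds_total and vmagent_remotewrite_send_duration_seconds_total entries above rely on the fact that the per-second rate of a "seconds spent sending" counter is the fraction of wall-clock time the link was busy. The Python sketch below shows that arithmetic under simplifying assumptions: the function names are hypothetical, counter resets are ignored, and the rate is taken between the first and the last sample without the extrapolation a real query engine may apply.

```python
def counter_rate(samples):
    """Per-second increase of a counter.

    samples: list of (timestamp_seconds, counter_value) sorted by time.
    Simplification: no counter-reset handling and no extrapolation.
    """
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need at least two samples with increasing timestamps")
    return (v1 - v0) / (t1 - t0)

def is_saturated(samples, threshold=0.9):
    """True if the link spent more than `threshold` of the time sending data."""
    return counter_rate(samples) > threshold
```

A counter that grows by 57 seconds over a 60-second window gives a rate of 0.95, so the 0.9 threshold is exceeded.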

v1.63.0

  • FEATURE: reduce memory usage by up to 30% on production workloads.

  • FEATURE: vmselect: embed vmui into a single-node VictoriaMetrics and into vmselect component of cluster version. See this feature request. The web interface is available at the following paths:

    • /vmui/ for a single-node VictoriaMetrics
    • /select/<accountID>/vmui/ for vmselect at cluster version of VictoriaMetrics
  • FEATURE: support durations anywhere in MetricsQL queries. For example, sum_over_time(m[1h]) / 1h is a valid query, which is equivalent to sum_over_time(m[1h]) / 3600.

  • FEATURE: support durations without suffixes in MetricsQL queries. For example, rate(m[3600]) is a valid query, which is equivalent to rate(m[1h]).

  • FEATURE: export vmselect_request_duration_seconds and vminsert_request_duration_seconds VictoriaMetrics histograms at /metrics page. These histograms can be used for determining latency distribution and SLI/SLO for the served requests. For example, the following query would return the percent of queries that took less than 500ms during the last hour: histogram_share(500ms, sum(rate(vmselect_request_duration_seconds_bucket[1h])) by (vmrange)).

  • FEATURE: vmagent: dynamically reload client TLS certificates from disk on every mTLS connection. This should allow using vmagent with Istio service mesh. See this feature request.

  • FEATURE: log http request path plus all the query args on errors during request processing. Previously only http request path was logged without query args, so it could be hard debugging such errors.

  • FEATURE: add is_set label to flag metrics. This allows determining explicitly set command-line flags with the query flag{is_set="true"}.

  • FEATURE: add ability to remove caches stored inside <-storageDataPath>/cache on startup if reset_cache_on_startup file is present there. See this feature request.

  • BUGFIX: vmagent: remove { %space %} typo in /targets output. The typo has been introduced in v1.62.0. See this issue.

  • BUGFIX: vmagent: fix CSS styles on /targets page. See this issue.

  • BUGFIX: vmalert: accept Prometheus-like durations in interval config option inside group section. See this issue.

  • BUGFIX: properly update vm_merge_need_free_disk_space metric at /metrics page when there is not enough free disk space for performing optimal merges. See this issue.
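
The two duration entries above (durations anywhere in a query, and durations without suffixes) both come down to converting a duration literal into seconds. A minimal Python sketch of that conversion follows. The function name and the unit table are assumptions for illustration, and compound durations such as 1h30m are deliberately not handled.

```python
import re

# Seconds per unit; this table is an illustrative assumption.
_UNITS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600, "d": 86400, "w": 7 * 86400}

def duration_seconds(text):
    """Convert a single duration literal such as '1h' or '3600' to seconds.

    A literal without a suffix is interpreted as seconds.
    """
    m = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s|m|h|d|w)?", text)
    if m is None:
        raise ValueError(f"unsupported duration: {text!r}")
    number, unit = m.groups()
    return float(number) * _UNITS.get(unit, 1)
```

With this helper, duration_seconds("1h") and duration_seconds("3600") return the same number, which mirrors the equivalence of rate(m[1h]) and rate(m[3600]).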

v1.62.0

  • FEATURE: vmagent: add service discovery for Docker (aka docker_sd_config). See this pull request.

  • FEATURE: vmagent: add service discovery for DigitalOcean (aka digitalocean_sd_config). See this feature request.

  • FEATURE: vmagent: change the default value for -remoteWrite.queues from 4 to 2 * numCPUs. This should reduce scrape duration for highly loaded vmagent, which scrapes tens of thousands of targets. See this pull request.

  • FEATURE: vmagent: show the number of samples the target returns during the last scrape on /targets and /api/v1/targets pages. This should simplify debugging targets, which may return too big or too low number of samples. See this feature request.

  • FEATURE: vmagent: show jobs with zero discovered targets on /targets page. This should help debugging improperly configured scrape configs.

  • FEATURE: vmagent: support for http-based service discovery (aka http_sd_config), which has been added since Prometheus 2.28. See this feature request.

  • FEATURE: vmagent: support namespace in Consul service discovery in the same way as Prometheus 2.28 does. See this issue for details.

  • FEATURE: vmagent: support generic auth configs in consul_sd_configs in the same way as Prometheus 2.28 does. See this issue for details.

  • FEATURE: vmctl: limit the number of samples per each imported JSON line. This should limit the memory usage at VictoriaMetrics side when importing time series with big number of samples.

  • FEATURE: vmselect: log slow queries across all the /api/v1/* handlers (aka Prometheus query API) if their execution duration exceeds -search.logSlowQueryDuration. This should simplify debugging slow requests to such handlers as /api/v1/labels or /api/v1/series additionally to /api/v1/query and /api/v1/query_range, which were logged in the previous releases.

  • FEATURE: vminsert: sort the -storageNode list in order to guarantee the identical series -> vmstorage mapping across all the vminsert nodes. This should reduce resource usage (RAM, CPU and disk IO) at vmstorage nodes if vmstorage addresses are passed in random order to vminsert nodes.

  • FEATURE: vmstorage: reduce memory usage on a system with many CPU cores under high ingestion rate.

  • BUGFIX: prevent from adding new samples to deleted time series after the rotation of the inverted index (the rotation is performed once per -retentionPeriod). See this comment for details.

  • BUGFIX: vmstorage: reduce high disk write IO usage on systems with big number of CPU cores. The issue has been introduced in the release v1.59.0. See this commit and this comment for details.

  • BUGFIX: vmstorage: prevent from incorrect stats collection when multiple concurrent queries execute the same tag filter. This may help reducing CPU usage under certain workloads. See this issue.

  • BUGFIX: vmselect: return the last timestamp for the max / min value from tmax_over_time(m[d]) and tmin_over_time(m[d]) MetricsQL functions as most users expect. See also this issue.

  • BUGFIX: vmselect: return the expected value for increase_pure() MetricsQL function after a gap in a time series. Previously incorrect too big value could be returned after the gap from increase_pure().
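
The vminsert entry above about sorting the -storageNode list can be illustrated as follows. If a node is picked by position in the list, two vminsert nodes that receive the same addresses in a different order will route the same series to different vmstorage nodes. The Python sketch below uses a CRC32 hash modulo the list length purely as a stand-in; the actual hashing scheme used by vminsert is not described in this changelog, and the function names are hypothetical.

```python
import zlib

def pick_node(series_name, storage_nodes):
    """Pick a node by hashing the series name onto a list position.

    Assumption: CRC32 modulo list length stands in for the real routing.
    """
    idx = zlib.crc32(series_name.encode()) % len(storage_nodes)
    return storage_nodes[idx]

def pick_node_sorted(series_name, storage_nodes):
    """Same as pick_node, but independent of the order of storage_nodes."""
    return pick_node(series_name, sorted(storage_nodes))
```

Sorting makes the mapping depend only on the set of addresses, so every vminsert node agrees on where a given series lives.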

v1.61.1

  • BUGFIX: vmalert: fix recording rules, which were broken in v1.61.0. See this issue.
  • BUGFIX: reset the on-disk cache for mapping from the full metric name to an internal metric id (e.g. metric_name{labels} -> internal_metric_id) after deleting metrics via delete API. This should prevent from possible inconsistent state after unclean shutdown. See this issue.

v1.61.0

  • FEATURE: vmalert: add support for backfilling (aka replay) of recording and alerting rules. See these docs and this feature request.

  • FEATURE: vmalert: add a command-line flag -rule.configCheckInterval for automatic re-reading of -rule files without the need to send SIGHUP signal. See this issue.

  • FEATURE: vmagent: respect the sample_limit and -promscrape.maxScrapeSize values when scraping targets in stream parsing mode. See this pull request.

  • FEATURE: vmauth: add ability to specify multiple url_prefix entries for balancing the load among multiple vmselect and/or vminsert nodes in a cluster. See these docs.

  • FEATURE: vminsert: add -disableRerouting command-line flag for forcibly disabling the rerouting. This should help resolving this and this issues.

  • FEATURE: vminsert: reduce the probability of global re-routing storm if all the vmstorage nodes cannot keep up with the given ingestion rate for some time. This should improve cluster stability in such cases. See this and this issues.

  • FEATURE: allow building VictoriaMetrics components for Solaris / SmartOS. See this issue.

  • FEATURE: vmagent: add ability to debug relabeling rules. See these docs and this issue.

  • BUGFIX: reduce CPU usage by up to 2x during querying a database with big number of active daily time series. The issue has been introduced in v1.59.0.

  • BUGFIX: vmagent: properly apply auth and tls configs in eureka_sd_configs. See this pull request.

  • BUGFIX: vmauth: do not panic on aborted http requests. See this issue.

  • BUGFIX: properly generate target property for *Series(foo.*.bar) responses returned from Graphite Render API. Previously the target contained the expanded list of series for foo.*.bar, e.g. sumSeries(foo.a.bar,foo.b.bar,...foo.z.bar). Now VictoriaMetrics returns sumSeries(foo.*.bar) as a target in the same way as Graphite does.

v1.60.0

  • FEATURE: add ability to limit the number of unique time series, which can be added to storage per hour and per day. This can help dealing with high cardinality and high churn rate issues. See these docs.

  • FEATURE: vmagent: add ability to limit the number of unique time series, which can be sent to remote storage systems per hour and per day. This can help dealing with high cardinality and high churn rate issues. See these docs.

  • FEATURE: vmalert: add ability to run alerting and recording rules for multiple tenants. See this issue and these docs.

  • FEATURE: vminsert: add support for data ingestion via other vminsert nodes. This allows building multi-level data ingestion paths in VictoriaMetrics cluster by writing data from one level of vminsert nodes to another level of vminsert nodes. See these docs and this comment for details.

  • FEATURE: vmagent: reload bearer_token_file, credentials_file and password_file contents every second. This allows dynamically changing the contents of these files during target scraping and service discovery without the need to restart vmagent. See this issue.

  • FEATURE: vmalert: add a flag to control behaviour on startup for state restore errors. Such errors were returned and logged before as well. Now user can specify whether to just log these errors (-remoteRead.ignoreRestoreErrors=true) or to stop the process (-remoteRead.ignoreRestoreErrors=false). The latter is important when VM isn't ready yet to serve queries from vmalert and it needs to wait. See this issue.

  • FEATURE: vmalert: add ability to pass round_digits query arg to datasource via -datasource.roundDigits command-line flag. This can be used for limiting the number of decimal digits after the point in recording rule results. See this issue.

  • FEATURE: return X-Server-Hostname header in http responses of all the VictoriaMetrics components. This should simplify tracing the origin server behind a load balancer or behind auth proxy during troubleshooting.

  • FEATURE: vmselect: allow to use 2x more memory for query processing at vmselect nodes in VictoriaMetrics cluster. This should allow processing heavy queries without the need to increase RAM size at vmselect nodes.

  • FEATURE: add ability to filter /api/v1/status/tsdb output with arbitrary time series selectors passed via match[] query args. See these docs and this issue for details.

  • FEATURE: automatically detect memory and cpu limits for VictoriaMetrics components running under cgroup v2 environments such as HashiCorp Nomad. See this issue.

  • FEATURE: vmauth: allow -auth.config reloading via /-/reload http endpoint. See this issue.

  • FEATURE: add timezone_offset(tz) function. It returns offset in seconds for the given timezone tz relative to UTC. This can be useful when combining with datetime-related functions. For example, day_of_week(time()+timezone_offset("America/Los_Angeles")) would return weekdays for America/Los_Angeles time zone. Special Local time zone can be used for returning an offset for the time zone set on the host where VictoriaMetrics runs. See this issue and MetricsQL docs for more details.

  • FEATURE: vmagent: add support for OAuth2 authorization for scrape targets and service discovery in the same way as Prometheus does. See these docs.

  • FEATURE: vmagent: add support for OAuth2 authorization when writing data to -remoteWrite.url. See -remoteWrite.oauth2.* config params in /path/to/vmagent -help output.

  • FEATURE: vmalert: add ability to set extra_filter_labels at alerting and recording group configs. See these docs.

  • FEATURE: vmstorage: reduce memory usage by up to 30% when ingesting big number of active time series.

  • BUGFIX: vmagent: do not retry scraping targets, which don't support HTTP. This should reduce CPU load and network usage at vmagent and at scrape target. See this issue.

  • BUGFIX: vmagent: fix possible race when refreshing role: endpoints and role: endpointslices scrape targets in kubernetes_sd_config. Previously pod objects could be updated after the related endpoints object update. This could lead to missing scrape targets. See this issue.

  • BUGFIX: vmagent: properly spread scrape targets among vmagent replicas if -promscrape.cluster.replicationFactor exceeds 1. See this pull request.

  • BUGFIX: vmagent: limit scrape_timeout by scrape_interval. This guarantees that only a single sample is lost during the configured scrape_interval when scrape target responds slowly. See this comment for details.

  • BUGFIX: properly remove stale parts outside the configured retention if -retentionPeriod is smaller than one month. Previously stale parts could remain active for up to a month after they go outside the retention.

  • BUGFIX: stop the process on panic errors, since such errors may leave the process in inconsistent state. Previously panics could be recovered, which could result in unexpected hard-to-debug further behavior of running process.

  • BUGFIX: vminsert, vmagent: make sure data ingestion connections are closed before completing graceful shutdown. Previously the connection may remain open, which could result in trailing samples loss.

  • BUGFIX: vmauth, vmalert: properly re-use HTTP keep-alive connections to backends and datasources. Previously only 2 keep-alive connections per backend could be re-used. Other connections were closed after the first request. See this issue for details.

  • BUGFIX: vmalert: fix false positive error result contains metrics with the same labelset after applying rule labels, which could be triggered when recording rules generate unique metrics. See this issue.

  • BUGFIX: vmctl: properly import InfluxDB rows if they have a field and a tag with identical names. See this issue.

  • BUGFIX: properly reload configs if SIGHUP signal arrives during service initialization. Previously such SIGHUP signal could be ignored and configs weren't reloaded.

  • BUGFIX: vmalert: properly import default rules from OpenShift. See this issue.

  • BUGFIX: reduce the probability of the removal queue is full panic when highly loaded VictoriaMetrics stores data on NFS. See this issue.
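
The hourly limit on unique time series from the first entries of this release can be pictured with the Python sketch below. It is only an illustration of the idea: the class name is hypothetical, an exact set is used for tracking seen series, and samples for series over the limit are simply rejected. The real limiter may track series differently.

```python
class HourlySeriesLimiter:
    """Allow at most max_series unique series per calendar hour."""

    def __init__(self, max_series):
        self.max_series = max_series
        self.hour = None
        self.seen = set()

    def allow(self, series, timestamp):
        """Return True if a sample for `series` at `timestamp` (seconds) is accepted."""
        hour = int(timestamp // 3600)
        if hour != self.hour:
            # A new hour has started: forget the series seen previously.
            self.hour = hour
            self.seen = set()
        if series in self.seen:
            return True
        if len(self.seen) >= self.max_series:
            return False
        self.seen.add(series)
        return True
```

Already registered series keep flowing after the limit is hit; only new series are rejected until the next hour begins.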

v1.59.0

  • FEATURE: improved new time series registration speed on systems with many CPU cores. See this issue. Thanks to @waldoweng for the idea and draft implementation.

  • FEATURE: vmalert: use the same technique as Grafana for determining evaluation timestamps for recording rules. This should make consistent graphs for series generated by recording rules compared to graphs generated for queries from recording rules in Grafana. See this issue.

  • FEATURE: vmauth: add ability to set mandatory query args in url_prefix. For example, url_prefix: http://vm:8428/?extra_label=team=dev would add extra_label=team=dev query arg to all the incoming requests. See the example for more details.

  • FEATURE: vmctl: add OpenTSDB migration option. See more details here. Thanks to @johnseekins!

  • FEATURE: log metrics with dropped labels if the number of labels in the ingested metric exceeds -maxLabelsPerTimeseries. This should simplify debugging for this case.

  • FEATURE: vmagent: list user-visible endpoints at http://vmagent:8429/. See this issue.

  • BUGFIX: vmagent: properly update role: endpoints and role: endpointslices scrape targets if the underlying service objects are updated in kubernetes_sd_config. See this issue.

  • BUGFIX: vmagent: apply scrape_timeout on receiving the first response byte from stream_parse: true scrape targets. Previously it was applied to receiving and processing the full response stream. This could result in false timeout errors when scrape target exposes millions of metrics as described here.

  • BUGFIX: vmagent: eliminate possible data race when obtaining value for the metric vm_persistentqueue_bytes_pending. The data race could result in incorrect value for this metric.

  • BUGFIX: vmstorage: remove empty directories on startup. Such directories can be left after unclean shutdown on NFS storage. Previously such directories could lead to crashloop until manually removed. See this issue.
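
The vmauth entry above about mandatory query args in url_prefix can be sketched in Python as a URL merge: the query args from url_prefix are combined with the args of the incoming request before proxying. The function name, the ordering of args and the percent-encoding of the result are assumptions for illustration, not vmauth behavior confirmed by this changelog.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def build_backend_url(url_prefix, request_path, request_query=""):
    """Join url_prefix with the request path and merge both sets of query args.

    Assumption: args from url_prefix are placed before the request args.
    """
    prefix = urlsplit(url_prefix)
    path = prefix.path.rstrip("/") + request_path
    args = parse_qsl(prefix.query) + parse_qsl(request_query)
    return urlunsplit((prefix.scheme, prefix.netloc, path, urlencode(args), ""))
```

For example, build_backend_url("http://vm:8428/?extra_label=team=dev", "/api/v1/query", "query=up") yields a URL for /api/v1/query on vm:8428 that carries both extra_label and query args.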

v1.58.0

  • FEATURE: vminsert and vmagent: add -sortLabels command-line flag for sorting metric labels before pushing them to vmstorage. This should reduce the size of MetricName -> internal_series_id cache (aka vm_cache_size_bytes{type="storage/tsid"}) when ingesting samples for the same time series with distinct order of labels. For example, foo{k1="v1",k2="v2"} and foo{k2="v2",k1="v1"} represent a single time series. Labels sorting is disabled by default, since the majority of established exporters preserve the order of labels for the exported metrics.

  • FEATURE: allow specifying label value alongside label name for the others sum time series returned from topk_* and bottomk_* functions from MetricsQL. For example, topk_avg(3, max(process_resident_memory_bytes) by (instance), "instance=other_sum") would return top 3 series from max(process_resident_memory_bytes) by (instance) plus a series containing the sum of other series. The others sum series will have {instance="other_sum"} label.

  • FEATURE: do not delete dst_label when applying label_copy(q, "src_label", "dst_label") and label_move(q, "src_label", "dst_label") to series without src_label and with non-empty dst_label. See more details at MetricsQL docs.

  • FEATURE: update Go builder from v1.16.2 to v1.16.3. This should fix these issues.

  • FEATURE: vmagent: add support for follow_redirects option to scrape_configs section in the same way as Prometheus 2.26 does.

  • FEATURE: vmagent: add support for authorization section in -promscrape.config in the same way as Prometheus 2.26 does.

  • FEATURE: vmagent: add support for socks5 proxy in proxy_url config option. See this issue.

  • FEATURE: vmagent: add support for socks5 over tls proxy in proxy_url config option. It can be set up with the following config: proxy_url: "tls+socks5://proxy-addr:port".

  • FEATURE: vmagent: reduce memory usage when -remoteWrite.queues is set to a big value. See this issue.

  • FEATURE: vmagent: add AWS IAM roles for tasks support for EC2 service discovery according to these docs.

  • FEATURE: vmagent: add support for proxy_tls_config, proxy_authorization, proxy_basic_auth, proxy_bearer_token and proxy_bearer_token_file options in consul_sd_config, dockerswarm_sd_config and eureka_sd_config sections.

  • FEATURE: vmagent: pass X-Prometheus-Scrape-Timeout-Seconds header to scrape targets as Prometheus does. In this case scrape targets can limit the time needed for performing the scrape. See this comment for details.

  • FEATURE: vmagent: drop corrupted persistent queue files at -remoteWrite.tmpDataPath instead of throwing a fatal error. Corrupted files can appear after unclean shutdown of vmagent such as OOM kill or hardware reset. See this issue.

  • FEATURE: vmauth: add support for authorization via bearer token. See the docs for details.

  • FEATURE: publish arm64 and amd64 binaries for cluster version of VictoriaMetrics at releases page.

  • BUGFIX: properly handle /api/v1/labels and /api/v1/label/<label_name>/values queries on big start ... end time range. This should fix big resource usage when VictoriaMetrics is queried with Promxy v0.0.62 or newer versions.

  • BUGFIX: do not break sort order for series returned from topk*, bottomk* and outliersk MetricsQL functions. See this issue.

  • BUGFIX: vmagent: properly work with simple HTTP proxies which don't support CONNECT method. For example, PushProx. See this issue.

  • BUGFIX: vmagent: properly discover targets if multiple namespace selectors are put inside kubernetes_sd_config. See this issue.

  • BUGFIX: vmagent: properly discover role: endpoints and role: endpointslices targets in kubernetes_sd_config. See this issue.

  • BUGFIX: properly generate filename for *.tar.gz archive inside _checksums.txt file posted at releases page. See this issue.
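
The -sortLabels entry above notes that foo{k1="v1",k2="v2"} and foo{k2="v2",k1="v1"} are the same time series, yet they produce different cache keys when labels are kept in ingestion order. A minimal Python sketch of that effect follows; the function name and the textual key format are hypothetical and serve only to show why sorting collapses the two keys into one.

```python
def canonical_series_key(name, labels, sort_labels):
    """Build a textual key for a series.

    labels: list of (label_name, label_value) pairs in ingestion order.
    With sort_labels=True the key no longer depends on that order.
    """
    pairs = sorted(labels) if sort_labels else list(labels)
    body = ",".join(f'{k}="{v}"' for k, v in pairs)
    return f"{name}{{{body}}}"
```

Without sorting, each label ordering occupies its own cache entry; with sorting, all orderings share a single entry.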

v1.57.1

  • FEATURE: publish vmutils for GOOS=arm on releases page.

  • BUGFIX: prevent from possible incomplete query results after timed out query.

  • BUGFIX: vmselect: remove -search.storageTimeout command-line flag, since it has the same meaning as -search.maxQueryDuration. See this issue.

  • BUGFIX: vminsert: return back type label to per-tenant metric vm_tenant_inserted_rows_total. See this issue.

v1.57.0

  • FEATURE: optimize query performance by up to 10x on systems with many CPU cores. See this tweet.

  • FEATURE: add the following metrics at /metrics page for every VictoriaMetrics app:

    • process_resident_memory_anon_bytes - RSS share for memory allocated by the process itself. This share cannot be freed by the OS, so it must be taken into account by OOM killer.
    • process_resident_memory_file_bytes - RSS share for page cache memory (aka memory-mapped files). This share can be freed by the OS at any time, so it must be ignored by OOM killer.
    • process_resident_memory_shared_bytes - RSS share for memory shared with other processes (aka shared memory). This share can be freed by the OS at any time, so it must be ignored by OOM killer.
    • process_resident_memory_peak_bytes - peak RSS usage for the process.
    • process_virtual_memory_peak_bytes - peak virtual memory usage for the process.
  • FEATURE: accept and enforce extra_label=<label_name>=<label_value> query arg at Graphite APIs.

  • FEATURE: use Influx field as metric name if measurement is empty and -influxSkipSingleField command-line flag is set. See this issue.

  • FEATURE: vmagent: add -promscrape.consul.waitTime command-line flag for tuning the maximum wait time for Consul service discovery. See this issue.

  • FEATURE: vmagent: add vm_promscrape_discovery_kubernetes_stale_resource_versions_total metric for monitoring the frequency of too old resource version errors during Kubernetes service discovery.

  • FEATURE: single-node VictoriaMetrics: log metrics with timestamps older than -search.cacheTimestampOffset compared to the current time. See these docs for details.

  • BUGFIX: prevent from infinite loop on {__graphite__="..."} filters when a metric name contains *, { or [ chars.

  • BUGFIX: prevent from infinite loop in /metrics/find and /metrics/expand Graphite Metrics API handlers when they match metric names or labels with *, { or [ chars.

  • BUGFIX: do not merge duplicate time series during requests to /api/v1/query. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1141

  • BUGFIX: vmagent: properly handle too old resource version error messages from Kubernetes watch API. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1150

  • BUGFIX: vmagent: do not retry sending data blocks if remote storage returns 400 Bad Request error. The number of dropped blocks due to such errors can be monitored with vmagent_remotewrite_packets_dropped_total metrics. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1149

  • BUGFIX: properly calculate summarize and *Series functions in Graphite Render API.

v1.56.0

  • FEATURE: add the following functions to MetricsQL:

    • histogram_avg(buckets) - returns the average value for the given buckets.
    • histogram_stdvar(buckets) - returns standard variance for the given buckets.
    • histogram_stddev(buckets) - returns standard deviation for the given buckets.
  • FEATURE: export vm_available_memory_bytes and vm_available_cpu_cores metrics, which show the number of available RAM and available CPU cores for VictoriaMetrics apps.

  • FEATURE: export vm_index_search_duration_seconds histogram, which can be used for troubleshooting time series search performance.

  • FEATURE: vmagent: add ability to replicate scrape targets among vmagent instances in the cluster with -promscrape.cluster.replicationFactor command-line flag. See these docs.

  • FEATURE: vmagent: accept scrape_offset option at scrape_config. This option may be useful when scrapes must start at the specified offset of every scrape interval. See these docs for details.

  • FEATURE: vmagent: support proxy_tls_config, proxy_basic_auth, proxy_bearer_token and proxy_bearer_token_file options at scrape_config section for configuring proxies specified via proxy_url. See these docs.

  • FEATURE: vmauth: allow using regexp paths in url_map. See this issue for details.

  • FEATURE: accept round_digits query arg at /api/v1/query and /api/v1/query_range handlers. This option can be set at Prometheus datasource in Grafana for limiting the number of digits after the decimal point in response values.

  • FEATURE: add -influx.databaseNames command-line flag, which can be used for accepting data from some Telegraf plugins such as fluentd plugin. See this issue.

  • FEATURE: add -logNewSeries command-line flag, which can be used for debugging the source of time series churn rate.

  • FEATURE: publish Windows builds for vmagent, vmalert, vmauth and vmctl at vmutils-windows-*.zip archives at releases page.

  • FEATURE: listen for IPv6 UDP if -enableTCP6 command-line flag is passed to VictoriaMetrics. See this issue.

  • BUGFIX: vmagent: prevent from high CPU usage bug during failing scrapes with small scrape_timeout (less than a few seconds).

  • BUGFIX: vmagent: reduce memory usage when Kubernetes service discovery is used in big number of distinct scrape config jobs by sharing Kubernetes object cache. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113

  • BUGFIX: vmagent: apply sample_limit only after metric_relabel_configs are applied as Prometheus does. Previously the sample_limit was applied before metrics relabeling.

  • BUGFIX: vmagent: properly apply tls_config, basic_auth and bearer_token to proxy connections if proxy_url option is set. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1116

  • BUGFIX: vmagent: properly scrape targets via https proxy specified in proxy_url if insecure_skip_verify flag isn't set in tls_config section. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1116

  • BUGFIX: avoid duplicate time series error if prometheus_buckets() covers a time range with distinct set of buckets.

  • BUGFIX: prevent exponent overflow when processing extremely small values close to zero such as 2.964393875E-314. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1114

  • BUGFIX: do not include datapoints with a timestamp t-d when returning results from /api/v1/query?query=m[d]&time=t as Prometheus does.

  • BUGFIX: do not crash if a query contains histogram_over_time() function name with uppercase chars. For example, Histogram_Over_Time(m[5m]).
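
The entry above about /api/v1/query?query=m[d]&time=t describes a lookbehind window that is open on the left: a sample with timestamp exactly t-d is excluded, while a sample at t is included. A minimal Python sketch of that selection rule, with a hypothetical function name:

```python
def select_window(samples, t, d):
    """Return samples whose timestamps fall into the window (t-d, t].

    samples: list of (timestamp, value) pairs; t and d use the same time unit.
    """
    return [(ts, v) for ts, v in samples if t - d < ts <= t]
```

With samples at timestamps 0, 30 and 60, a query at t=60 with d=60 returns only the samples at 30 and 60.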

v1.55.1

v1.55.0

  • FEATURE: add sign(q) and clamp(q, min, max) functions, which are planned to be added in the upcoming Prometheus release. The last_over_time(m[d]) function is already supported in MetricsQL.

  • FEATURE: vmagent: add scrape_align_interval config option, which can be used for aligning scrapes to the beginning of the configured interval. See these docs for details.
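
One way to picture interval alignment is rounding the current time up to the next multiple of the interval. The sketch below is only an illustration under that assumption; `next_aligned` is a hypothetical helper and the exact semantics of scrape_align_interval are described in the linked docs.

```python
def next_aligned(now_s, interval_s):
    """Return the first timestamp >= now_s that is a multiple of interval_s.

    Integer ceiling division; both arguments are in whole seconds.
    """
    return -(-now_s // interval_s) * interval_s


# A scrape requested 5 seconds past the hour waits for the next hourly boundary.
start = next_aligned(3605, 3600)
```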

  • FEATURE: expose io-related metrics at /metrics page for every VictoriaMetrics component:

    • process_io_read_bytes_total - the number of bytes read via io syscalls such as read and pread
    • process_io_written_bytes_total - the number of bytes written via io syscalls such as write and pwrite
    • process_io_read_syscalls_total - the number of read syscalls such as read and pread
    • process_io_write_syscalls_total - the number of write syscalls such as write and pwrite
    • process_io_storage_read_bytes_total - the number of bytes read from storage layer
    • process_io_storage_written_bytes_total - the number of bytes written to storage layer
  • FEATURE: vmagent: add ability to spread scrape targets among multiple vmagent instances. See these docs and this issue for details.
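
A common way to spread targets among a fixed number of instances is to hash each target and take the result modulo the instance count. The sketch below assumes this approach for illustration only; the helper names and the choice of SHA-256 are hypothetical and are not taken from vmagent.

```python
import hashlib


def owner(target_key, members_count):
    """Map a target key to a member index in [0, members_count)."""
    digest = hashlib.sha256(target_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % members_count


def targets_for_member(targets, member_num, members_count):
    """Return the subset of targets owned by the given member."""
    return [t for t in targets if owner(t, members_count) == member_num]


targets = ["host-%d:9100" % i for i in range(20)]
parts = [targets_for_member(targets, m, 3) for m in range(3)]
```

Every target lands on exactly one member, so the union of all parts equals the original target list.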

  • FEATURE: vmagent: use watch API for Kubernetes service discovery. This should reduce load on Kubernetes API server when it tracks big number of objects (for example, 10K pods). This should also reduce the time needed for k8s targets discovery. See this issue for details.

  • FEATURE: vmagent: export vm_promscrape_target_relabel_duration_seconds metric, which can be used for monitoring the time spent on relabeling for discovered targets.

  • FEATURE: vmagent: optimize relabeling performance for common cases.

  • FEATURE: add increase_pure(m[d]) function to MetricsQL. It works the same as increase(m[d]) except for various edge cases. See this issue for details.

  • FEATURE: increase accuracy for buckets_limit(limit, buckets) results for small limit values. See MetricsQL docs for details.

  • FEATURE: vmagent: initial support for Windows build with CGO_ENABLED=0 GOOS=windows go build -mod=vendor ./app/vmagent. See this and this issue.

  • FEATURE: vmagent: support WebIdentityToken auth in EC2 service discovery. See this issue for details.

  • FEATURE: vmalert: properly process query params in -datasource.url and -remoteRead.url command-line flags. See this issue for details.

  • BUGFIX: vmagent: properly apply -remoteWrite.rateLimit when -remoteWrite.queues is greater than 1. Previously there was a data race, which could prevent proper rate limiting.

  • BUGFIX: vmagent: properly perform graceful shutdown on SIGINT and SIGTERM signals. The graceful shutdown has been broken in v1.54.0. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1065

  • BUGFIX: reduce the probability of duplicate time series errors when querying Kubernetes metrics.

  • BUGFIX: properly calculate histogram_quantile() over time series with only a single non-zero bucket with {le="+Inf"}. Previously NaN was returned, now the value for the last bucket before {le="+Inf"} is returned like Prometheus does.

  • BUGFIX: vmselect: do not cache partial query results on timeout when receiving data from vmstorage nodes. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1085

  • BUGFIX: properly handle stale NFS file handle error.

  • BUGFIX: properly cache query results when extra_label query arg is used. Previously the cached results could clash for different extra_label values. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1095

  • BUGFIX: fix http: superfluous response.WriteHeader call issue. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1078

  • BUGFIX: fix arm64 builds due to the issue in github.com/golang/snappy. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1074

  • BUGFIX: fix index out of range [1024819115206086200] with length 27 panic, which could occur when 1e-9 value is passed to VictoriaMetrics histogram. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1096

  • BUGFIX: fix parsing for Graphite line with empty tags such as foo; 123 456. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1100

  • BUGFIX: unescape only \\, \n and \" in label names when parsing Prometheus text exposition format as Prometheus does. Previously other escape sequences could be improperly unescaped.
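
The unescaping rule from the entry above can be sketched as follows. This is a minimal illustration rather than the actual parser; the function name `unescape_label` is hypothetical.

```python
# Only these three escape sequences are unescaped; anything else is kept verbatim.
_ESCAPES = {"\\\\": "\\", "\\n": "\n", '\\"': '"'}


def unescape_label(s):
    """Unescape backslash-backslash, backslash-n and backslash-quote only."""
    out = []
    i = 0
    while i < len(s):
        pair = s[i:i + 2]
        if pair in _ESCAPES:
            out.append(_ESCAPES[pair])
            i += 2
        else:
            # Unknown sequences such as backslash-t are copied as-is.
            out.append(s[i])
            i += 1
    return "".join(out)
```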

v1.54.1

  • BUGFIX: properly handle queries containing a filter on metric name plus any number of negative filters and zero non-negative filters. For example, node_cpu_seconds_total{mode!="idle"}. The bug was introduced in v1.54.0.

v1.54.0

  • FEATURE: optimize searching for matching metrics for metric{<label_filters>} queries if <label_filters> contains at least a single filter. For example, the query up{job="foobar"} should find the matching time series much faster than previously.

  • FEATURE: reduce execution times for q1 <binary_op> q2 queries by executing q1 and q2 in parallel.

  • FEATURE: switch from Go1.15 to Go1.16 for building prod binaries.

  • FEATURE: single-node VictoriaMetrics now accepts requests to handlers with /prometheus and /graphite prefixes such as /prometheus/api/v1/query. This improves compatibility with handlers from VictoriaMetrics cluster.

  • FEATURE: expose process_open_fds and process_max_fds metrics. These metrics can be used for alerting when process_open_fds reaches process_max_fds. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/402 and https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1037

  • FEATURE: vmalert: add -datasource.appendTypePrefix command-line option for querying both Prometheus and Graphite datasource in cluster version of VictoriaMetrics. See these docs for details.

  • FEATURE: vmauth: add ability to route requests from a single user to multiple destinations depending on the requested paths. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1064

  • FEATURE: remove dependency on external programs such as cat, grep and cut when detecting cpu and memory limits inside Docker or LXC container.

  • FEATURE: vmagent: add __meta_kubernetes_endpoints_label_*, __meta_kubernetes_endpoints_labelpresent_*, __meta_kubernetes_endpoints_annotation_* and __meta_kubernetes_endpoints_annotationpresent_* labels for role: endpoints in Kubernetes service discovery. These labels were added in Prometheus 2.25.

  • FEATURE: reduce the minimum supported retention period for inverted index (aka indexdb) from one month to one day. This should reduce disk space usage for <-storageDataPath>/indexdb folder if -retentionPeriod is set to values smaller than one month.

  • FEATURE: vmselect: export per-tenant metrics vm_vmselect_http_requests_total and vm_vmselect_http_requests_duration_ms_total. Other per-tenant metrics are available as a part of enterprise package. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/932 for details.

  • BUGFIX: properly convert regexp tag filters containing escaped dots to non-regexp tag filters. For example, {foo=~"bar\.baz"} should be converted to {foo="bar.baz"}. Previously it was incorrectly converted to {foo="bar\.baz"}, which could result in missing time series for this tag filter.
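
The conversion in the entry above can be sketched like this. The sketch handles only escaped dots and treats everything else containing regexp syntax as a real regexp; `regex_to_literal` is a hypothetical name, and the real optimization may cover more cases.

```python
# Characters that make a pattern a real regexp when they appear unescaped.
_META = set(".+*?()|[]{}^$")


def regex_to_literal(pattern):
    """Return the literal string the pattern matches, or None if it is a real regexp."""
    out = []
    i = 0
    while i < len(pattern):
        c = pattern[i]
        if c == "\\":
            # Only an escaped dot is converted; other escapes keep the regexp form.
            if i + 1 < len(pattern) and pattern[i + 1] == ".":
                out.append(".")
                i += 2
                continue
            return None
        if c in _META:
            return None
        out.append(c)
        i += 1
    return "".join(out)
```

For `bar\.baz` this yields `bar.baz`, so the filter can be rewritten from `=~` to `=`. An unescaped `bar.baz` stays a regexp, because there the dot matches any character.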

  • BUGFIX: do not spam error logs when discovering Docker Swarm targets without dedicated IP. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1028 .

  • BUGFIX: properly embed timezone data into VictoriaMetrics apps. This should fix -loggerTimezone usage inside Docker containers.

  • BUGFIX: properly build Docker images for non-amd64 architectures (arm, arm64, ppc64le, 386) on Docker hub. Previously these images were incorrectly based on amd64 base image, so they didn't work.

  • BUGFIX: vmagent: return back unsent block to the queue during graceful shutdown. Previously this block could be dropped if remote storage is unavailable during vmagent shutdown. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1065 .

v1.53.1

  • BUGFIX: vmselect: fix the bug preventing proper searching by Graphite filter with wildcards such as {__graphite__="foo.*.bar"}.

v1.53.0

  • FEATURE: added vmctl tool to VictoriaMetrics release process. Now it is packaged in vmutils-*.tar.gz archive on the releases page. Source code for vmctl tool has been moved from github.com/VictoriaMetrics/vmctl to github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl.

  • FEATURE: added -loggerTimezone command-line flag for adjusting time zone for timestamps in log messages. By default UTC is used.

  • FEATURE: added -search.maxStepForPointsAdjustment command-line flag, which can be used for disabling adjustment for points returned by /api/v1/query_range handler if such points have timestamps closer than -search.latencyOffset to the current time. Such points may contain incomplete data, so they are substituted by the previous values for step query args smaller than one minute by default.

  • FEATURE: vmselect: added ability to use Graphite-compatible filters in MetricsQL via {__graphite__="foo.*.bar"} syntax. This expression is equivalent to {__name__=~"foo[.][^.]*[.]bar"}, but it works faster and it is easier to use when migrating from Graphite to VictoriaMetrics. This feature deprecates the usage of -search.treatDotsAsIsInRegexps command-line flag.
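
The equivalence stated above can be illustrated with a small translation from a Graphite pattern to a regexp. The sketch covers only `.` and `*`, which is all the example in the entry needs; `graphite_to_regex` is a hypothetical helper, not part of VictoriaMetrics.

```python
import re


def graphite_to_regex(pattern):
    """Translate a Graphite pattern with dots and * wildcards into a regexp."""
    parts = []
    for c in pattern:
        if c == ".":
            parts.append("[.]")    # a literal dot separates path components
        elif c == "*":
            parts.append("[^.]*")  # a wildcard stays within one component
        else:
            parts.append(re.escape(c))
    return "".join(parts)


rx = graphite_to_regex("foo.*.bar")
```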

  • FEATURE: vmselect: added ability to set additional label filters, which must be applied during queries. Such label filters can be set via optional extra_label query arg, which is accepted by querying API handlers. For example, the request to /api/v1/query_range?extra_label=tenant_id=123&query=<query> adds {tenant_id="123"} label filter to the given <query>. It is expected that the extra_label query arg is automatically set by auth proxy sitting in front of VictoriaMetrics. Contact us if you need assistance with such a proxy. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1021 .
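
A proxy that injects the filter could build the request URL roughly as follows. This is a sketch under assumptions: the base address and the helper name `query_range_url` are hypothetical, and only the `/api/v1/query_range` path and the `extra_label` query arg come from the entry above.

```python
from urllib.parse import urlencode


def query_range_url(base, query, label_name, label_value):
    """Build a query_range URL that enforces a label filter via extra_label."""
    params = [
        ("extra_label", f"{label_name}={label_value}"),
        ("query", query),
    ]
    # urlencode percent-encodes the '=' inside the extra_label value.
    return f"{base}/api/v1/query_range?{urlencode(params)}"


url = query_range_url("http://localhost:8428", "up", "tenant_id", "123")
```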

  • FEATURE: vmalert: added -datasource.queryStep command-line flag for passing optional step query arg to /api/v1/query endpoint. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1025

  • FEATURE: vmalert: added ability to query Graphite datasource when evaluating alerting and recording rules. See these docs for details.

  • FEATURE: vmagent: added -remoteWrite.roundDigits command-line option for rounding metric values to the given number of decimal digits after the point before sending the metric to the corresponding -remoteWrite.url. This option can be used for improving data compression on the remote storage, because values with lower number of decimal digits can be compressed better than values with bigger number of decimal digits.
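
The effect of rounding to a fixed number of decimal digits can be sketched as below. The helper `round_digits` is hypothetical and uses Python's built-in rounding, which may differ from vmagent's behavior in tie-breaking details.

```python
def round_digits(value, digits):
    """Round value to the given number of decimal digits after the point."""
    return round(value, digits)


# Values that differ only in low-order digits collapse to the same rounded value.
rounded = [round_digits(v, 2) for v in (1.23456, 1.23012, 1.22999)]
```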

  • FEATURE: vmagent: added -remoteWrite.rateLimit command-line flag for limiting data transfer rate to -remoteWrite.url. This may be useful when a big amount of buffered data is sent after temporary unavailability of the remote storage. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1035

  • FEATURE: vmagent: export the following additional metrics, which may be useful during troubleshooting:

    • vm_promscrape_scrapes_failed_per_url_total
    • vm_promscrape_scrapes_skipped_by_sample_limit_per_url_total
    • vm_promscrape_discovery_requests_total
    • vm_promscrape_discovery_retries_total
    • vm_promscrape_scrape_retries_total
    • vm_promscrape_service_discovery_duration_seconds
  • FEATURE: vmselect: initial implementation for Graphite Render API.

  • BUGFIX: vmagent: reduce HTTP reconnection rate for scrape targets. Previously vmagent could erroneously close HTTP keep-alive connections more frequently than needed.

  • BUGFIX: vmagent: retry scrape and service discovery requests when the remote server closes HTTP keep-alive connection. Previously disable_keepalive: true option could be used under scrape_configs section when working with such servers.

v1.52.0

v1.51.0

v1.50.2

v1.50.1

  • FEATURE: vmagent: export vmagent_remotewrite_blocks_sent_total and vmagent_remotewrite_blocks_sent_total metrics for each -remoteWrite.url.

  • BUGFIX: vmagent: properly delete unregistered scrape targets from /targets and /api/v1/targets pages. They weren't deleted due to the bug in v1.50.0.

v1.50.0

  • FEATURE: automatically reset response cache when samples with timestamps older than now - search.cacheTimestampOffset are ingested to VictoriaMetrics. This makes unnecessary disabling response cache during data backfilling or resetting it after backfilling is complete as described in these docs. This feature applies only to single-node VictoriaMetrics. It doesn't apply to cluster version of VictoriaMetrics because vminsert nodes don't know about vmselect nodes where the response cache must be reset.
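
The reset condition from the entry above boils down to a timestamp comparison. A minimal sketch, with a hypothetical function name and example values that are not defaults taken from VictoriaMetrics:

```python
def needs_cache_reset(sample_ts_ms, now_ms, cache_timestamp_offset_ms):
    """Return True if an ingested sample is old enough to invalidate cached responses."""
    return sample_ts_ms < now_ms - cache_timestamp_offset_ms


now_ms = 1_600_000_000_000
offset_ms = 5 * 60 * 1000  # illustrative offset, chosen for this example only

# A backfilled sample from an hour ago triggers a reset; a fresh one does not.
backfilled = needs_cache_reset(now_ms - 3_600_000, now_ms, offset_ms)
fresh = needs_cache_reset(now_ms - 1_000, now_ms, offset_ms)
```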

  • FEATURE: vmalert: add query, first and value functions to alert templates. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/539

  • FEATURE: vmagent: return user-friendly HTML page when requesting /targets page from web browser. The page is returned in the old plaintext format when requesting via curl or similar tool.

  • FEATURE: allow multiple whitespace chars between measurements, fields and timestamp when parsing InfluxDB line protocol. Though InfluxDB line protocol denies multiple whitespace chars between these entities, some apps improperly put multiple whitespace chars. This workaround allows accepting data from such apps.

  • FEATURE: export vm_promscrape_active_scrapers{type="<sd_type>"} metric for tracking the number of active scrapers per each service discovery type.

  • FEATURE: export vm_promscrape_scrapers_started_total{type="<sd_type>"} and vm_promscrape_scrapers_stopped_total{type="<sd_type>"} metrics for tracking churn rate for scrapers per each service discovery type.

  • FEATURE: vmagent: allow setting per--remoteWrite.url command-line flags for -remoteWrite.sendTimeout and -remoteWrite.tlsInsecureSkipVerify.

  • BUGFIX: properly handle * and [...] inside curly braces in query passed to Graphite Metrics API. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/952

  • BUGFIX: vmagent: fix memory leak when big number of targets is discovered via service discovery.

  • BUGFIX: vmagent: properly pass datacenter filter to Consul API server. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574#issuecomment-740454170

  • BUGFIX: properly handle CPU limits set on the host system or host container. The bugfix may result in lower memory usage on systems with CPU limits. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/946

  • BUGFIX: prevent duplicate name tag from being returned from /tags/autoComplete/tags handler. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/942

  • BUGFIX: do not enable strict parsing for -promscrape.config if -promscrape.config.dryRun command-line flag is set. Strict parsing can be enabled with -promscrape.config.strictParse command-line flag. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/944

  • BUGFIX: vminsert: properly update vm_rpc_rerouted_rows_processed_total metric. Previously it wasn't updated. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/955

  • BUGFIX: vmagent: properly recover when opening incorrectly stored persistent queue. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/964

  • BUGFIX: vmagent: properly handle scrape errors when stream parsing is enabled with -promscrape.streamParse command-line flag or with stream_parse: true per-target config option. Previously such errors weren't reported at /targets page. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/967

  • BUGFIX: assume the previous value is 0 when calculating increase() for the first point on the graph if its value doesn't exceed 100 and the delta between two first points equals to 0. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/962

v1.49.0

  • FEATURE: optimize Consul service discovery speed when discovering big number of services. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574

  • FEATURE: add label_uppercase(q, label1, ... labelN) and label_lowercase(q, label1, ... labelN) functions to MetricsQL for uppercasing and lowercasing values for the given labels. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/936

  • FEATURE: add count_eq_over_time(m[d], N) and count_ne_over_time(m[d], N) for counting the number of samples for m over d that are equal / not equal to N.
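
The semantics of the two functions above can be sketched over a plain list of sample values that fall into the lookbehind window. The Python helpers are illustrative only and mirror the MetricsQL function names for readability.

```python
def count_eq_over_time(values, n):
    """Count the samples in the window that are equal to n."""
    return sum(1 for v in values if v == n)


def count_ne_over_time(values, n):
    """Count the samples in the window that are not equal to n."""
    return sum(1 for v in values if v != n)


window = [0, 1, 1, 0, 1]
```

For any window the two counts add up to the number of samples in it.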

  • FEATURE: do not print usage info for all the command-line flags when incorrect command-line flag is passed. Previously it could be hard reading the error message about incorrect command-line flag because of too big usage info for all the flags.

  • FEATURE: upgrade Go builder from v1.15.5 to v1.15.6. This fixes issues found in Go since v1.15.5.

  • BUGFIX: properly parse timestamps in OpenMetrics format - they are exposed as floating-point number in seconds instead of integer milliseconds unlike in Prometheus exposition format. See the docs.
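
The difference between the two timestamp formats can be sketched as a pair of conversions to integer milliseconds. The helper names are hypothetical and the sketch ignores error handling.

```python
def openmetrics_ts_to_ms(ts):
    """OpenMetrics: floating-point seconds, converted to integer milliseconds."""
    return round(float(ts) * 1000)


def prometheus_ts_to_ms(ts):
    """Prometheus exposition format: the timestamp is already integer milliseconds."""
    return int(ts)


# Both strings describe the same instant in their respective formats.
a = openmetrics_ts_to_ms("1608130000.123")
b = prometheus_ts_to_ms("1608130000123")
```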

  • BUGFIX: return nan for a >bool b query when a equals to nan like Prometheus does. Previously 0 was returned in this case. This applies to any comparison operation with bool modifier. See these docs for details.
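
The fixed behavior can be sketched as a comparison helper that propagates nan from the left operand. `cmp_bool` is a hypothetical name, and the handling of a nan right operand is not covered by the entry above, so the sketch leaves it to plain Python comparison.

```python
import math
import operator


def cmp_bool(a, b, op):
    """Evaluate `a <op> bool b`: nan if a is nan, else 1.0 or 0.0."""
    if math.isnan(a):
        return math.nan
    return 1.0 if op(a, b) else 0.0


result = cmp_bool(math.nan, 1.0, operator.gt)  # nan instead of the previous 0
```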

  • BUGFIX: properly parse hex numbers in MetricsQL. Previously hex numbers with non-decimal digits such as 0x3b couldn't be parsed.

  • BUGFIX: handle time() cmp_op metric like Prometheus does - i.e. return metric value if cmp_op comparison is true. Previously time() value was returned.

  • BUGFIX: return nan for minute(m) query when m equals to nan like Prometheus does. This applies to all the time-related functions such as day_of_month, day_of_week, days_in_month, hour, month and year.

v1.48.0

v1.47.0

v1.46.0

v1.45.0

v1.44.0

  • FEATURE: automatically add missing label filters to binary operands as described at https://utcc.utoronto.ca/~cks/space/blog/sysadmin/PrometheusLabelNonOptimization . This should improve performance for queries with missing label filters in binary operands. For example, the following query should work faster now, because it shouldn't fetch and discard time series for node_filesystem_files_free metric without matching labels for the left side of the expression:

       node_filesystem_files{ host="$host", mountpoint="/" } - node_filesystem_files_free
    
  • FEATURE: vmagent: add Docker Swarm service discovery (aka dockerswarm_sd_config). See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/656

  • FEATURE: add ability to export data in CSV format. See these docs for details.

  • FEATURE: vmagent: add -promscrape.suppressDuplicateScrapeTargetErrors command-line flag for suppressing duplicate scrape target errors. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/651 and https://docs.victoriametrics.com/vmagent.html#troubleshooting .

  • FEATURE: vmagent: show original labels before relabeling is applied on duplicate scrape target errors. This should simplify debugging for incorrect relabeling. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/651

  • FEATURE: vmagent: /targets page now accepts optional show_original_labels=1 query arg for displaying original labels for each target before relabeling is applied. This should simplify debugging for target relabeling configs. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/651

  • FEATURE: add -finalMergeDelay command-line flag for configuring the delay before final merge for per-month partitions. The final merge is started after no new data is ingested into per-month partition during -finalMergeDelay.

  • FEATURE: add vm_rows_added_to_storage_total metric, which shows the total number of rows added to storage since app start. The sum(rate(vm_rows_added_to_storage_total)) can be smaller than sum(rate(vm_rows_inserted_total)) if certain metrics are dropped due to relabeling. The sum(rate(vm_rows_added_to_storage_total)) can be bigger than sum(rate(vm_rows_inserted_total)) if replication is enabled.

  • FEATURE: keep metric name after applying MetricsQL functions, which don't change time series meaning. The list of such functions:

    • keep_last_value
    • keep_next_value
    • interpolate
    • running_min
    • running_max
    • running_avg
    • range_min
    • range_max
    • range_avg
    • range_first
    • range_last
    • range_quantile
    • smooth_exponential
    • ceil
    • floor
    • round
    • clamp_min
    • clamp_max
    • max_over_time
    • min_over_time
    • avg_over_time
    • quantile_over_time
    • mode_over_time
    • geomean_over_time
    • holt_winters
    • predict_linear

    See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/674
  • BUGFIX: properly handle stale time series after K8S deployment. Previously such time series could be double-counted. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748

  • BUGFIX: return a single time series at max from absent() function like Prometheus does.

  • BUGFIX: vmalert: accept days, weeks and years in for: part of config like Prometheus does. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/817

  • BUGFIX: fix mode_over_time(m[d]) calculations. Previously the function could return incorrect results.

v1.43.0

v1.42.0

Previous releases

See releases page.