From 4e8583bb022c1ff952117a9aa66aa6393a78ae58 Mon Sep 17 00:00:00 2001 From: Aliaksandr Valialkin Date: Wed, 18 Dec 2019 00:56:44 +0200 Subject: [PATCH] docs: renaming: `PromQL extensions -> MetricsQL` --- docs/ExtendedPromQL.md | 25 +++- docs/Home.md | 6 +- docs/Single-server-VictoriaMetrics.md | 14 +- docs/vmbackup.md | 181 ++++++++++++++++++++++++++ docs/vmrestore.md | 86 ++++++++++++ 5 files changed, 304 insertions(+), 8 deletions(-) create mode 100644 docs/vmbackup.md create mode 100644 docs/vmrestore.md diff --git a/docs/ExtendedPromQL.md b/docs/ExtendedPromQL.md index e9ac32fb64..81e9ed4712 100644 --- a/docs/ExtendedPromQL.md +++ b/docs/ExtendedPromQL.md @@ -1,9 +1,24 @@ -# PromQL extensions +# MetricsQL -VictoriaMetrics supports [standard PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/) -including [subqueries](https://prometheus.io/blog/2019/01/28/subquery-support/). -Additionally it supports useful extensions mentioned below. -Try these extensions on [an editable Grafana dashboard](http://play-grafana.victoriametrics.com:3000/d/4ome8yJmz/node-exporter-on-victoriametrics-demo). +VictoriaMetrics implements MetricsQL - query language inspired by [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/). +It is backwards compatible with PromQL, so Grafana dashboards backed by Prometheus datasource should work the same after switching from Prometheus to VictoriaMetrics. + +The following functionality is implemented differently in MetricsQL comparing to PromQL in order to improve user experience: +* MetricsQL takes into account the previous point before the window in square brackets for range functions such as `rate` and `increase`. + It also doesn't extrapolate range function results. This addresses [this issue from Prometheus](https://github.com/prometheus/prometheus/issues/3746). +* MetricsQL returns the expected non-empty responses for requests with `step` values smaller than scrape interval. 
This addresses [this issue from Grafana](https://github.com/grafana/grafana/issues/11451). +* MetricsQL treats `scalar` type the same as `instant vector` without labels, since users usually don't feel the difference between these types. + See [the corresponding Prometheus docs](https://prometheus.io/docs/prometheus/latest/querying/basics/#expression-language-data-types) for details. + +Other PromQL functionality should work the same in MetricsQL. [File an issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues) +if you notice discrepancies between PromQL and MetricsQL results other than those mentioned above. + +MetricsQL provides additional functionality mentioned below, which is aimed towards solving practical cases. +Feel free to [file a feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues) if you think MetricsQL misses certain useful functionality. + +*Note that the functionality mentioned below doesn't work in PromQL, so it is impossible to switch back to Prometheus after you start using it.* + +This functionality can be tried at [an editable Grafana dashboard](http://play-grafana.victoriametrics.com:3000/d/4ome8yJmz/node-exporter-on-victoriametrics-demo). - [`WITH` templates](https://play.victoriametrics.com/promql/expand-with-exprs). This feature simplifies writing and managing complex queries. Go to [`WITH` templates playground](https://victoriametrics.com/promql/expand-with-exprs) and try it. - Metric names and metric labels may contain escaped chars. For instance, `foo\-bar{baz\=aa="b"}` is valid expression. It returns time series with name `foo-bar` containing label `baz=aa` with value `b`. Additionally, `\xXX` escape sequence is supported, where `XX` is hexadecimal representation of escaped char.
diff --git a/docs/Home.md b/docs/Home.md index bcb28f8389..1cd7b74b5b 100644 --- a/docs/Home.md +++ b/docs/Home.md @@ -3,9 +3,11 @@ * [Quick start](Quick-Start) * [`WITH` templates playground](https://play.victoriametrics.com/promql/expand-with-exprs) * [Grafana playground](http://play-grafana.victoriametrics.com:3000/d/4ome8yJmz/node-exporter-on-victoriametrics-demo) -* [PromQL extensions](ExtendedPromQL) +* [MetricsQL](ExtendedPromQL) * [Single-node version](Single-server-VictoriaMetrics) * [FAQ](FAQ) * [Cluster version](Cluster-VictoriaMetrics) * [Articles](Articles) - +* [CaseStudies](Case studies) +* [vmbackup](vmbackup) +* [vmrestore](vmrestore) diff --git a/docs/Single-server-VictoriaMetrics.md b/docs/Single-server-VictoriaMetrics.md index 241af1b334..b311e19511 100644 --- a/docs/Single-server-VictoriaMetrics.md +++ b/docs/Single-server-VictoriaMetrics.md @@ -18,7 +18,7 @@ Cluster version is available [here](https://github.com/VictoriaMetrics/VictoriaM ## Prominent features * Supports [Prometheus querying API](https://prometheus.io/docs/prometheus/latest/querying/api/), so it can be used as Prometheus drop-in replacement in Grafana. - Additionally, VictoriaMetrics extends PromQL with opt-in [useful features](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/ExtendedPromQL). + VictoriaMetrics implements [MetricsQL](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/ExtendedPromQL) query language, which is inspired by PromQL. * Supports global query view. Multiple Prometheus instances may write data into VictoriaMetrics. Later this data may be used in a single query. * High performance and good scalability for both [inserts](https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b) and [selects](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4). 
@@ -395,6 +395,18 @@ VictoriaMetrics supports the following handlers from [Prometheus querying API](h These handlers can be queried from Prometheus-compatible clients such as Grafana or curl. +VictoriaMetrics accepts additional args for `/api/v1/labels` and `/api/v1/label/.../values` handlers. +See [this feature request](https://github.com/prometheus/prometheus/issues/6178) for details: + +* Any number of [time series selectors](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors) via `match[]` query arg. +* Optional `start` and `end` query args for limiting the time range for the selected labels or label values. + +Additionally VictoriaMetrics provides the following handlers: + +* `/api/v1/series/count` - it returns the total number of time series in the database. Note that this handler scans the entire inverted index, + so it can be slow if the database contains tens of millions of time series. +* `/api/v1/labels/count` - it returns a list of `label: values_count` entries. It can be used for determining labels with the maximum number of values. + ### How to build from sources diff --git a/docs/vmbackup.md b/docs/vmbackup.md new file mode 100644 index 0000000000..2b94d68b90 --- /dev/null +++ b/docs/vmbackup.md @@ -0,0 +1,181 @@ +## vmbackup + +`vmbackup` creates VictoriaMetrics data backups from [instant snapshots](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots). + +Supported storage systems for backups: + +* [GCS](https://cloud.google.com/storage/). Example: `gcs:///` +* [S3](https://aws.amazon.com/s3/). Example: `s3:///` +* Any S3-compatible storage such as [MinIO](https://github.com/minio/minio), [Ceph](https://docs.ceph.com/docs/mimic/radosgw/s3/) or [Swift](https://www.swiftstack.com/docs/admin/middleware/s3_middleware.html). See `-customS3Endpoint` command-line flag. +* Local filesystem. Example: `fs://` + +Incremental backups and full backups are supported.
Incremental backups are created automatically if the destination path already contains data from the previous backup. +Full backups can be sped up with `-origin` pointing to already existing backup on the same remote storage. In this case `vmbackup` makes server-side copy for the shared +data between the existing backup and new backup. This saves time and costs on data transfer. + +Backup process can be interrupted at any time. It is automatically resumed from the interruption point when restarting `vmbackup` with the same args. + +Backed up data can be restored with [vmrestore](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmrestore/README.md). + +See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883) for more details. + + +### Use cases + +#### Regular backups + +Regular backup can be performed with the following command: + +``` +vmbackup -storageDataPath= -snapshotName= -dst=gcs:/// +``` + +* `` - path to VictoriaMetrics data pointed by `-storageDataPath` command-line flag in single-node VictoriaMetrics or in cluster `vmstorage`. + There is no need to stop VictoriaMetrics for creating backups, since they are performed from immutable [instant snapshots](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots). +* `` is the snapshot to backup. See [how to create instant snapshots](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots). +* `` is already existing name for [GCS bucket](https://cloud.google.com/storage/docs/creating-buckets). +* `` is the destination path where new backup will be placed. 
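As a minimal sketch, the regular backup command line described above could be assembled programmatically. The helper name `build_vmbackup_args` and every example value (data path, snapshot name, bucket) are hypothetical; only the flag names `-storageDataPath`, `-snapshotName`, `-dst` and `-origin` come from this document.

```python
def build_vmbackup_args(storage_data_path, snapshot_name, dst, origin=None):
    """Build an argument list for a vmbackup invocation.

    Hypothetical helper: it only formats the flags documented here
    and does not run anything.
    """
    args = [
        "vmbackup",
        f"-storageDataPath={storage_data_path}",
        f"-snapshotName={snapshot_name}",
        f"-dst={dst}",
    ]
    # -origin is optional; it enables server-side copy from an existing backup.
    if origin is not None:
        args.append(f"-origin={origin}")
    return args


# Example with made-up values; replace them with real ones.
example_args = build_vmbackup_args(
    "/var/lib/victoria-metrics-data",
    "example-snapshot",
    "gcs://example-bucket/backups/latest",
)
print(" ".join(example_args))
```

The resulting list can be passed to `subprocess.run(args, check=True)` if the `vmbackup` binary is available in `PATH`.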
+ + +#### Regular backups with server-side copy from existing backup + +If the destination GCS bucket already contains the previous backup at `-origin` path, then new backup can be sped up +with the following command: + +``` +vmbackup -storageDataPath= -snapshotName= -dst=gcs:/// -origin=gcs:/// +``` + +This saves time and network bandwidth costs by performing server-side copy for the shared data from the `-origin` to `-dst`. + + +#### Incremental backups + +Incremental backups are performed if `-dst` points to already existing backup. In this case only new data is uploaded to remote storage. +This saves time and network bandwidth costs when working with big backups: + +``` +vmbackup -storageDataPath= -snapshotName= -dst=gcs:/// +``` + + +#### Smart backups + +Smart backups mean storing full daily backups into `YYYYMMDD` folders and creating incremental hourly backup into `latest` folder: + +* Run the following command every hour: + +``` +vmbackup -snapshotName= -dst=gcs:///latest +``` + +Where `` is the latest [snapshot](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots). +The command will upload only changed data to `gcs:///latest`. + +* Run the following command once a day: + +``` +vmbackup -snapshotName= -dst=gcs:/// -origin=gcs:///latest +``` + +Where `` is the snapshot for the last day ``. + + +This approach saves network bandwidth costs on hourly backups (since they are incremental) and allows recovering data from either the last hour (`latest` backup) +or from any day (`YYYYMMDD` backups). Note that hourly backup shouldn't run when creating daily backup. + +Do not forget to remove old snapshots and backups when they are no longer needed, in order to save storage costs. + + +### How does it work? + +The backup algorithm is the following: + +1. Collect information about files in the `-snapshotName`, in the `-dst` and in the `-origin`. +2.
Determine files in `-dst`, which are missing in `-snapshotName`, and delete them. These are usually small files, which are already merged into bigger files in the snapshot. +3. Determine files from `-snapshotName`, which are missing in `-dst`. These are usually small new files and bigger merged files. +4. Determine files from step 3, which exist in the `-origin`, and perform server-side copy of these files from `-origin` to `-dst`. + These are usually the biggest and the oldest files, which are shared between backups. +5. Upload the remaining files from step 3 from `-snapshotName` to `-dst`. + +The algorithm splits source files into 100MB chunks in the backup. Each chunk is stored as a separate file in the backup. +Such splitting minimizes the amounts of data to re-transfer after temporary errors. + +`vmbackup` relies on [instant snapshot](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282) properties: + +- All the files in the snapshot are immutable. +- Old files are periodically merged into new files. +- Smaller files have higher probability to be merged. +- Consecutive snapshots share many identical files. + +These properties allow performing fast and cheap incremental backups and server-side copying from `-origin` paths. +See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883) for more details. +`vmbackup` can work improperly or slowly when these properties are violated. + + +### Troubleshooting + +* If the backup is slow, then try setting higher value for `-concurrency` flag. This will increase the number of concurrent workers that upload data to backup storage. +* If `vmbackup` eats all the network bandwidth, then set `-maxBytesPerSecond` to the desired value. +* If `vmbackup` has been interrupted due to temporary error, then just restart it with the same args. It will resume the backup process.
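The five steps listed under "How does it work?" boil down to set operations over file listings. The following Python sketch illustrates that idea under simplifying assumptions: files are identified by name only, chunk splitting is ignored, and `plan_backup` is a hypothetical helper rather than part of `vmbackup`.

```python
def plan_backup(snapshot_files, dst_files, origin_files=()):
    """Return (to_delete, to_copy, to_upload) as sorted lists of file names.

    Simplified illustration of the backup steps; real backups may compare
    more than file names.
    """
    # Step 1: collect the file listings.
    snapshot = set(snapshot_files)
    dst = set(dst_files)
    origin = set(origin_files)

    # Step 2: files in dst that are missing in the snapshot get deleted.
    to_delete = dst - snapshot
    # Step 3: files in the snapshot that are missing in dst must be added.
    missing = snapshot - dst
    # Step 4: missing files that exist in origin are copied server-side.
    to_copy = missing & origin
    # Step 5: the remaining missing files are uploaded from the snapshot.
    to_upload = missing - origin

    return sorted(to_delete), sorted(to_copy), sorted(to_upload)


# Example with made-up file names.
to_delete, to_copy, to_upload = plan_backup(
    snapshot_files=["big_1", "big_2", "small_3"],
    dst_files=["big_1", "small_old"],
    origin_files=["big_2"],
)
print(to_delete, to_copy, to_upload)
```

In this example only `small_3` has to travel over the network, which is why incremental backups and `-origin` copies are cheap when consecutive snapshots share most of their files.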
+ + +### Advanced usage + +Run `vmbackup -help` in order to see all the available options: + +``` + -concurrency int + The number of concurrent workers. Higher concurrency may reduce backup duration (default 10) + -configFilePath string + Path to file with S3 configs. Configs are loaded from default location if not set. + See https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html + -configProfile string + Profile name for S3 configs (default "default") + -credsFilePath string + Path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set. + See https://cloud.google.com/iam/docs/creating-managing-service-account-keys and https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html + -customS3Endpoint string + Custom S3 endpoint for use with S3-compatible storages (e.g. MinIO). S3 is used if not set + -dst string + Where to put the backup on the remote storage. Example: gcs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir + -dst can point to the previous backup. In this case incremental backup is performed, i.e. only changed data is uploaded + -loggerLevel string + Minimum level of errors to log. Possible values: INFO, ERROR, FATAL, PANIC (default "INFO") + -maxBytesPerSecond int + The maximum upload speed. There is no limit if it is set to 0 + -memory.allowedPercent float + Allowed percent of system memory VictoriaMetrics caches may occupy (default 60) + -origin string + Optional origin directory on the remote storage with old backup for server-side copying when performing full backup. This speeds up full backups + -snapshotName string + Name for the snapshot to backup. See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots + -storageDataPath string + Path to VictoriaMetrics data. 
Must match -storageDataPath from VictoriaMetrics or vmstorage (default "victoria-metrics-data") + -version + Show VictoriaMetrics version +``` + + +### How to build from sources + +It is recommended using [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) - see `vmutils-*` archives there. + + +#### Development build + +1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.12. +2. Run `make vmbackup` from the root folder of the repository. + It builds `vmbackup` binary and puts it into the `bin` folder. + +#### Production build + +1. [Install docker](https://docs.docker.com/install/). +2. Run `make vmbackup-prod` from the root folder of the repository. + It builds `vmbackup-prod` binary and puts it into the `bin` folder. + +#### Building docker images + +Run `make package-vmbackup`. It builds `victoriametrics/vmbackup:` docker image locally. +`` is auto-generated image tag, which depends on source code in the repository. +The `` may be manually set via `PKG_TAG=foobar make package-vmbackup`. diff --git a/docs/vmrestore.md b/docs/vmrestore.md new file mode 100644 index 0000000000..1e62d143c7 --- /dev/null +++ b/docs/vmrestore.md @@ -0,0 +1,86 @@ +## vmrestore + +`vmrestore` restores data from backups created by [vmbackup](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmbackup/README.md). +VictoriaMetrics `v1.29.0` and newer versions must be used for working with the restored data. + +Restore process can be interrupted at any time. It is automatically resumed from the interruption point +when restarting `vmrestore` with the same args. + + +### Usage + +VictoriaMetrics must be stopped during the restore process. + +``` +vmrestore -src=gcs:/// -storageDataPath= + +``` + +* `` is [GCS bucket](https://cloud.google.com/storage/docs/creating-buckets) name.
+* `` is the path to backup made with [vmbackup](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmbackup/README.md) on GCS bucket. +* `` is the path to folder where data will be restored. This folder must be passed + to VictoriaMetrics in `-storageDataPath` command-line flag after the restore process is complete. + +The original `-storageDataPath` directory may contain old files. They will be substituted by the files from backup. + + +### Troubleshooting + +* If `vmrestore` eats all the network bandwidth, then set `-maxBytesPerSecond` to the desired value. +* If `vmrestore` has been interrupted due to temporary error, then just restart it with the same args. It will resume the restore process. + + +### Advanced usage + +Run `vmrestore -help` in order to see all the available options: + +``` + -concurrency int + The number of concurrent workers. Higher concurrency may reduce restore duration (default 10) + -configFilePath string + Path to file with S3 configs. Configs are loaded from default location if not set. + See https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html + -configProfile string + Profile name for S3 configs (default "default") + -credsFilePath string + Path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set. + See https://cloud.google.com/iam/docs/creating-managing-service-account-keys and https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html + -customS3Endpoint string + Custom S3 endpoint for use with S3-compatible storages (e.g. MinIO). S3 is used if not set + -loggerLevel string + Minimum level of errors to log. Possible values: INFO, ERROR, FATAL, PANIC (default "INFO") + -maxBytesPerSecond int + The maximum download speed. There is no limit if it is set to 0 + -memory.allowedPercent float + Allowed percent of system memory VictoriaMetrics caches may occupy (default 60) + -src string + Source path with backup on the remote storage.
Example: gcs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir + -storageDataPath string + Destination path where backup must be restored. VictoriaMetrics must be stopped when restoring from backup. -storageDataPath dir can be non-empty. In this case only missing data is downloaded from backup (default "victoria-metrics-data") + -version + Show VictoriaMetrics version +``` + + +### How to build from sources + +It is recommended using [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) - see `vmutils-*` archives there. + + +#### Development build + +1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.12. +2. Run `make vmrestore` from the root folder of the repository. + It builds `vmrestore` binary and puts it into the `bin` folder. + +#### Production build + +1. [Install docker](https://docs.docker.com/install/). +2. Run `make vmrestore-prod` from the root folder of the repository. + It builds `vmrestore-prod` binary and puts it into the `bin` folder. + +#### Building docker images + +Run `make package-vmrestore`. It builds `victoriametrics/vmrestore:` docker image locally. +`` is auto-generated image tag, which depends on source code in the repository. +The `` may be manually set via `PKG_TAG=foobar make package-vmrestore`.