FAQ.md: update

Aliaksandr Valialkin 2019-05-23 01:24:14 +03:00
parent f1d90e6931
commit 39bd9c7ecd

111
FAQ.md

@ -5,17 +5,31 @@ To provide the best long-term [remote storage](https://prometheus.io/docs/operat
### Which features does VictoriaMetrics have?
VictoriaMetrics has the following features:
- Native [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/) support. Additionally, VictoriaMetrics extends PromQL with useful features. See [Extended PromQL](ExtendedPromQL) for more details.
- Simple configuration. Just copy-n-paste remote storage URL to Prometheus config and that's it! See [Quick Start](Quick-Start) for more info.
- Reduced operational overhead. Offload Prometheus storage maintenance burden - scalability, durability, backups, retention policy - to VictoriaMetrics.
- Insertion rate scales to [millions of metric values per second](https://medium.com/@valyala/insert-benchmarks-with-inch-influxdb-vs-victoriametrics-e31a41ae2893).
- Storage scales to [millions of metrics](https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b) with trillions of metric values.
- Wide range of retention periods - from 1 month to 5 years. Users may create different projects (aka `storage namespaces`) with different retention periods.
- Fast query engine. It excels on [heavy queries over thousands of metrics with millions of metric values](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4).
- The lowest price on the market. We can afford this thanks to cost-effective VictoriaMetrics core that requires lower amounts of RAM, CPU and storage comparing to competitors.
- The same remote storage URL may be used by multiple Prometheus instances collecting distinct metric sets, so all these metrics may be used in a single query (aka `global querying view`). This works ideally for multiple Prometheus instances located in different subnetworks / datacenters.
- VictoriaMetrics supports backfilling, i.e. it accepts datapoints from the past.
* Supports [Prometheus querying API](https://prometheus.io/docs/prometheus/latest/querying/api/), so it can be used as Prometheus drop-in replacement in Grafana.
Additionally, VictoriaMetrics extends PromQL with opt-in [useful features](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/ExtendedPromQL).
* High performance and good scalability for both [inserts](https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b)
and [selects](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4).
[Outperforms InfluxDB and TimescaleDB by up to 20x](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae).
* [Uses 10x less RAM than InfluxDB](https://medium.com/@valyala/insert-benchmarks-with-inch-influxdb-vs-victoriametrics-e31a41ae2893) when working with millions of unique time series (aka high cardinality).
* High data compression, so [up to 70x more data points](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4)
may be crammed into a limited storage comparing to TimescaleDB.
* Optimized for storage with high-latency IO and low iops (HDD and network storage in AWS, Google Cloud, Microsoft Azure, etc). See [graphs from these benchmarks](https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b).
* A single-node VictoriaMetrics may substitute moderately sized clusters built with competing solutions such as Thanos, Uber M3, Cortex, InfluxDB or TimescaleDB.
See [vertical scalability benchmarks](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae).
* Easy operation:
* VictoriaMetrics consists of a single executable without external dependencies.
* All the configuration is done via explicit command-line flags with reasonable defaults.
* All the data is stored in a single directory pointed by `-storageDataPath` flag.
* Easy backups from [instant snapshots](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
* Storage is protected from corruption on unclean shutdown (i.e. hardware reset or `kill -9`) thanks to [the storage architecture](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
* Supports metrics' ingestion and backfilling via the following protocols:
* [Prometheus remote write API](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write)
* [InfluxDB line protocol](https://docs.influxdata.com/influxdb/v1.7/write_protocols/line_protocol_tutorial/)
* [Graphite plaintext protocol](https://graphite.readthedocs.io/en/latest/feeding-carbon.html) with [tags](https://graphite.readthedocs.io/en/latest/tags.html#carbon)
if `-graphiteListenAddr` is set.
* [OpenTSDB put message](http://opentsdb.net/docs/build/html/api_telnet/put.html) if `-opentsdbListenAddr` is set.
* Ideally works with big amounts of time series data from IoT sensors, connected car sensors and industrial sensors.
* Has open source [cluster version](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster).
### Which clients do you target?
@ -23,76 +37,77 @@ VictoriaMetrics has the following features:
The following Prometheus users may be interested in VictoriaMetrics:
- Users who don't want to bother with Prometheus' local storage operational burden - backups, replication, capacity planning, scalability, etc.
- Users with multiple Prometheus instances who want performing arbitrary queries over all the metrics collected by their Prometheus instances (aka `global querying view`).
- Users who want reducing costs for storing huge amounts of Prometheus data.
- Users who want reducing costs for storing huge amounts of time series data.
### How to start using VictoriaMetrics?
See [Quick Start](Quick-Start).
Start with [single-node version](Single-server-VictoriaMetrics). It is easy to configure and operate. It should fit the majority of use cases.
### Is it safe to enable [remote write storage](https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage) in Prometheus?
Yes. Prometheus continues writing data to local storage after enabling remote storage write, so all the existing local storage data and new data is available for querying via Prometheus as usual.
Yes. Prometheus continues writing data to local storage after enabling remote storage write, so all the existing local storage data
and new data is available for querying via Prometheus as usual.
### How does VictoriaMetrics compare to other clustered TSDBs on top of Prometheus such as [M3 from Uber](https://eng.uber.com/m3/), [Thanos](https://github.com/improbable-eng/thanos), [Cortex](https://github.com/cortexproject/cortex), etc.?
VictoriaMetrics is simpler, is more cost-effective and provides [useful extensions for PromQL](ExtendedPromQL). The simplicity is twofold:
- It is simpler to operate - just copy-n-paste [the provided URLs](Quick-Start) for [Prometheus remote_write API](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#%3Cremote_write%3E) and [Grafana datasource for Prometheus](http://docs.grafana.org/features/datasources/prometheus/) and that's it!
There is no more headache about configuration, replication, backups, scalability and capacity planning for Prometheus local storage or third-party [sidecars](https://github.com/improbable-eng/thanos/blob/master/docs/components/sidecar.md).
VictoriaMetrics is simpler, faster, more cost-effective and it provides [useful extensions for PromQL](ExtendedPromQL). The simplicity is twofold:
- It is simpler to configure and operate. There is no need in configuring third-party [sidecars](https://github.com/improbable-eng/thanos/blob/master/docs/components/sidecar.md)
or fighting with [gossip protocol](https://github.com/improbable-eng/thanos/blob/master/docs/proposals/approved/201809_gossip-removal.md).
- VictoriaMetrics has simpler architecture, which means less bugs and more useful features in a long run comparing to competing TSDBs.
### How does VictoriaMetrics compare to [InfluxDB](https://www.influxdata.com/time-series-platform/influxdb/)?
VictoriaMetrics doesn't [eat your RAM](https://medium.com/@valyala/insert-benchmarks-with-inch-influxdb-vs-victoriametrics-e31a41ae2893) and [data](https://github.com/influxdata/influxdb/search?q=data+loss&type=Issues). Additionally it is easier to operate and provides better query language - [extended PromQL](ExtendedPromQL).
VictoriaMetrics requires [10x less RAM](https://medium.com/@valyala/insert-benchmarks-with-inch-influxdb-vs-victoriametrics-e31a41ae2893) and it [works faster](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae).
It is easier to configure and operate. It provides [better query language](https://medium.com/@valyala/promql-tutorial-for-beginners-9ab455142085) than InfluxQL or Flux.
### How does VictoriaMetrics compare to [TimescaleDB](https://www.timescale.com/)?
TimescaleDB insists on using SQL as a query language. While SQL is more powerful than the [extended PromQL](ExtendedPromQL), this power is rarely required during typical TSDB usage. Real-world queries usually look simpler when written in PromQL than in SQL.
TimescaleDB insists on using SQL as a query language. While SQL is more powerful than PromQL, this power is rarely required during typical TSDB usage. Real-world queries usually [look clearer and simpler when written in PromQL than in SQL](https://medium.com/@valyala/promql-tutorial-for-beginners-9ab455142085).
Additionally, VictoriaMetrics requires [up to 70x less storage space comparing to TimescaleDB](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4) for storing the same amount of time series data.
### Does VictoriaMetrics use Prometheus technologies like other clustered TSDBs built on top of Prometheus such as [M3 from Uber](https://eng.uber.com/m3/), [Thanos](https://github.com/improbable-eng/thanos), [Cortex](https://github.com/cortexproject/cortex)?
No. VictoriaMetrics core is written in Go from scratch by [fasthttp](https://github.com/valyala/fasthttp) [author](https://github.com/valyala).
The architecture is [optimized for storing and querying large amounts of timeseries data with high cardinality](https://medium.com/devopslinks/victoriametrics-creating-the-best-remote-storage-for-prometheus-5d92d66787ac). VictoriaMetrics storage uses [certain ideas from ClickHouse](https://clickhouse.yandex/). Special thanks to [Alexey Milovidov](https://github.com/alexey-milovidov).
The architecture is [optimized for storing and querying large amounts of time series data with high cardinality](https://medium.com/devopslinks/victoriametrics-creating-the-best-remote-storage-for-prometheus-5d92d66787ac). VictoriaMetrics storage uses [certain ideas from ClickHouse](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282). Special thanks to [Alexey Milovidov](https://github.com/alexey-milovidov).
### Are there performance comparisons with other solutions?
We modified [tsbs benchmark from TimescaleDB](https://blog.timescale.com/time-series-database-benchmarks-timescaledb-influxdb-cassandra-mongodb-bc702b72927e) and run single-server performance tests for VictoriaMetrics, TimescaleDB and InfluxDB. Results are available [here](https://docs.google.com/spreadsheets/d/158AAsLMlGZ72D4MHfSdru_9dt3jpHymqo8Up_vp3LfU/edit?usp=sharing). In short, VictoriaMetrics is up to `20x` faster than TimescaleDB and InfluxDB on heavy queries when run on the same hardware. Additionally, it uses `70x` less storage space for test data comparing to TimescaleDB.
Yes:
VictoriaMetrics perfectly scales on multiple instances, so it should achieve `Nx` better throughput on `N` instances comparing to a single instance.
* [Measuring vertical scalability for time series databases: VictoriaMetrics vs InfluxDB vs TimescaleDB](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae).
* [Measuring insert performance on high-cardinality time series: VictoriaMetrics vs InfluxDB](https://medium.com/@valyala/insert-benchmarks-with-inch-influxdb-vs-victoriametrics-e31a41ae2893)
* [TSBS benchmark on high-cardinality time series: VictoriaMetrics vs InfluxDB vs TimescaleDB](https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b)
* [Standard TSBS benchmark: VictoriaMetrics vs InfluxDB vs TimescaleDB](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4)
### What is the pricing for VictoriaMetrics?
VictoriaMetrics is available in the following species:
The following versions are open source and free:
* [Single-node version](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/Single-server-VictoriaMetrics).
* [Cluster version](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster).
- On-prem cluster solution. [Contact us](mailto:info@victoriametrics.com) for pricing.
We provide commercial support for both versions. [Contact us](mailto:info@victoriametrics.com) for the pricing.
- SaaS. The pricing model for SaaS is `pay as you go`. The following resources are paid:
The following versions are commercial:
* Managed cluster in the Cloud.
* SaaS version.
* Time series values pushed into VictoriaMetrics. The pricing depends on data retention and precision - longer retention and higher precision cost more.
* Time series values scanned during queries.
* Outgoing bandwidth used for query results.
* The number of stored unique timeseries.
- [Free single-node VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/Single-server-VictoriaMetrics),
which may easily substitute moderately sized cluster built on competing solutions - Thanos, Uber M3, Cortex, InfluxDB or TimescaleDB.
Just try it right now!
[Contact us](mailto:info@victoriametrics.com) for the pricing.
### Why VictoriaMetrics doesn't support [Prometheus remote read API](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#%3Cremote_read%3E)?
Remote read API requires transferring all the raw data for all the metrics in the query over the given time range. For instance,
Remote read API requires transferring all the raw data for all the requested metrics over the given time range. For instance,
if a query covers 1000 metrics with 10K values each, then the remote read API had to return `1000*10K`=10M metric values to Prometheus.
This is slow and expensive. Just query VictoriaMetrics directly via [Prometheus Querying API](https://prometheus.io/docs/prometheus/latest/querying/api/)
or via [Prometheus datasoruce in Grafana](http://docs.grafana.org/features/datasources/prometheus/), using the corresponding
url from the `authTokens` page in [the web UI](https://victoriametrics.com/). See [Quick Start](Quick-Start) for details.
or via [Prometheus datasoruce in Grafana](http://docs.grafana.org/features/datasources/prometheus/).
### Does VictoriaMetrics deduplicate data from Prometheus instances scraping the same targets (aka `HA pairs`)?
@ -100,32 +115,20 @@ url from the `authTokens` page in [the web UI](https://victoriametrics.com/). Se
No. Data from all the Prometheus instances is saved in VictoriaMetrics without deduplication.
It is difficult to deduplicate the data because of scrape times' jitter between Prometheus instances.
We believe there is better approach - automatic failover on the Prometheus side. We are planning to create an open source app for this task.
### Is VictoraMetrics secure?
Yes. VictoriaMetrics works only via HTTPS. Both remote write URL and the URL for Grafana datasource
provide end-to-end encryption via HTTPS. These URLs embed a random 128-bit authorization key (auth token),
which may be revoked (deleted) at any time via web console. The revocation forbids access via the given URL.
It is possible to create multiple auth tokens for the same project via web console.
### How to report bugs and feature requests?
Report bugs and feature requests [here](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/new).
### Where is the source code of VictoriaMetrics?
It is in a private repository. We are planning to open source VictoriaMetrics core in the future.
Source code for the following versions is available in the following places:
* [Single-node version](https://github.com/VictoriaMetrics/VictoriaMetrics).
* [Cluster version](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster).
### Does VictoriaMetrics fit for data from IoT sensors and industrial sensors?
VictoriaMetrics is able to handle data from hundreds of millions of IoT sensors and industrial sensors.
It supports [high cardinality data](https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b)
and [perfrectly scales up](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae).
It supports [high cardinality data](https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b),
perfectly [scales up on a single node](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae)
and scales horizontally to multiple nodes.
### Where can I ask questions about VictoriaMetrics?