diff --git a/README.md b/README.md index 02ceca7b75..60ccde72ec 100644 --- a/README.md +++ b/README.md @@ -565,6 +565,34 @@ Restoring from backup: 2. Restore data from backup using [vmrestore](https://docs.victoriametrics.com/vmrestore.html) into `-storageDataPath` directory. 3. Start `vmstorage` node. +## Retention filters + +[VictoriaMetrics enterprise](https://victoriametrics.com/products/enterprise/) supports configuring multiple retentions for distinct sets of time series +by passing `-retentionFilter` command-line flag to `vmstorage` nodes. See [these docs](https://docs.victoriametrics.com/#retention-filters) for details on this feature. + +Additionally, enterprise version of VictoriaMetrics cluster supports multiple retentions for distinct sets of [tenants](#multitenancy) +by specifying filters on `vm_account_id` and/or `vm_project_id` pseudo-labels in `-retentionFilter` command-line flag. +If a tenant doesn't match any of the specified `-retentionFilter` options, then the global `-retentionPeriod` is used for it. + +For example, the following config sets retention to 1 day for [tenants](#multitenancy) with `accountID` starting with `42`, +then sets retention to 3 days for time series with label `env="dev"` or `env="staging"` from any tenant, +while the rest of the tenants will have 4 weeks retention: + +``` +-retentionFilter='{vm_account_id=~"42.*"}:1d' -retentionFilter='{env=~"dev|staging"}:3d' -retentionPeriod=4w +``` + +It is OK to mix filters on real labels with filters on `vm_account_id` and `vm_project_id` pseudo-labels. +For example, the following config sets retention to 5 days for time series with `env="dev"` label from [tenant](#multitenancy) `accountID=5`: + +``` +-retentionFilter='{vm_account_id="5",env="dev"}:5d' +``` + +See also [these docs](https://docs.victoriametrics.com/#retention-filters) for additional details on retention filters. 
+ +Enterprise binaries can be downloaded and evaluated for free from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases). + ## Downsampling Downsampling is available in [enterprise version of VictoriaMetrics](https://victoriametrics.com/products/enterprise/). It is configured with `-downsampling.period` command-line flag. The same flag value must be passed to both `vmstorage` and `vmselect` nodes. See [these docs](https://docs.victoriametrics.com/#downsampling) for details. @@ -1124,8 +1152,11 @@ Below is the output for `/path/to/vmstorage -help`: -pushmetrics.url array Optional URL to push metrics exposed at /metrics page. See https://docs.victoriametrics.com/#push-metrics . By default metrics exposed at /metrics page aren't pushed to any remote storage Supports an array of values separated by comma or specified via multiple flags. + -retentionFilter array + Retention filter in the format 'filter:retention'. For example, '{env="dev"}:3d' configures the retention for time series with env="dev" label to 3 days. See https://docs.victoriametrics.com/#retention-filters for details. This flag is available only in enterprise version of VictoriaMetrics + Supports an array of values separated by comma or specified via multiple flags. -retentionPeriod value - Data with timestamps outside the retentionPeriod is automatically deleted + Data with timestamps outside the retentionPeriod is automatically deleted. See also -retentionFilter The following optional suffixes are supported: h (hour), d (day), w (week), y (year). If suffix isn't set, then the duration is counted in months (default 1) -retentionTimezoneOffset duration The offset for performing indexdb rotation. If set to 0, then the indexdb rotation is performed at 4am UTC time per each -retentionPeriod. 
If set to 2h, then the indexdb rotation is performed at 4am EET time (the timezone with +2h offset) diff --git a/app/vmstorage/main.go b/app/vmstorage/main.go index 1f326b39b4..20faa81b6a 100644 --- a/app/vmstorage/main.go +++ b/app/vmstorage/main.go @@ -25,7 +25,7 @@ import ( ) var ( - retentionPeriod = flagutil.NewDuration("retentionPeriod", "1", "Data with timestamps outside the retentionPeriod is automatically deleted") + retentionPeriod = flagutil.NewDuration("retentionPeriod", "1", "Data with timestamps outside the retentionPeriod is automatically deleted. See also -retentionFilter") httpListenAddr = flag.String("httpListenAddr", ":8482", "Address to listen for http connections") storageDataPath = flag.String("storageDataPath", "vmstorage-data", "Path to storage data") vminsertAddr = flag.String("vminsertAddr", ":8400", "TCP address to accept connections from vminsert services") diff --git a/docs/CHANGELOG.md b/docs/CHANGELOG.md index afa1ac443b..e866d564fa 100644 --- a/docs/CHANGELOG.md +++ b/docs/CHANGELOG.md @@ -17,6 +17,8 @@ The following tip changes can be tested by building VictoriaMetrics components f **Update note 1:** the `indexdb/tagFilters` cache type at [/metrics](https://docs.victoriametrics.com/#monitoring) has been renamed to `indexdb/tagFiltersToMetricIDs` in order to make its puropose more clear. +* FEATURE: [VictoriaMetrics enterprise](https://victoriametrics.com/products/enterprise/): allow configuring multiple retentions for distinct sets of time series. See [these docs](https://docs.victoriametrics.com/#retention-filters), [this](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/143) and [this](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/289) feature request. 
+* FEATURE: [VictoriaMetrics enterprise](https://victoriametrics.com/products/enterprise/): add support for multiple retentions for distinct tenants - see [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#retention-filters) and [this](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/143) and [this](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/289) feature request. * FEATURE: allow limiting memory usage on a per-query basis with `-search.maxMemoryPerQuery` command-line flag. See [this feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3203). * FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): drop all the labels with `__` prefix from discovered targets in the same way as Prometheus does according to [this article](https://www.robustperception.io/life-of-a-label/). Previously the following labels were available during [metric-level relabeling](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs): `__address__`, `__scheme__`, `__metrics_path__`, `__scrape_interval__`, `__scrape_timeout__`, `__param_*`. Now these labels are available only during [target-level relabeling](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config). This should reduce CPU usage and memory usage for `vmagent` setups, which scrape big number of targets. * FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): improve the performance for metric-level [relabeling](https://docs.victoriametrics.com/vmagent.html#relabeling), which can be applied via `metric_relabel_configs` section at [scrape_configs](https://docs.victoriametrics.com/sd_configs.html#scrape_configs), via `-remoteWrite.relabelConfig` or via `-remoteWrite.urlRelabelConfig` command-line options. 
diff --git a/docs/Cluster-VictoriaMetrics.md b/docs/Cluster-VictoriaMetrics.md index 31e94d8f53..f096b0c804 100644 --- a/docs/Cluster-VictoriaMetrics.md +++ b/docs/Cluster-VictoriaMetrics.md @@ -569,6 +569,34 @@ Restoring from backup: 2. Restore data from backup using [vmrestore](https://docs.victoriametrics.com/vmrestore.html) into `-storageDataPath` directory. 3. Start `vmstorage` node. +## Retention filters + +[VictoriaMetrics enterprise](https://victoriametrics.com/products/enterprise/) supports configuring multiple retentions for distinct sets of time series +by passing `-retentionFilter` command-line flag to `vmstorage` nodes. See [these docs](https://docs.victoriametrics.com/#retention-filters) for details on this feature. + +Additionally, enterprise version of VictoriaMetrics cluster supports multiple retentions for distinct sets of [tenants](#multitenancy) +by specifying filters on `vm_account_id` and/or `vm_project_id` pseudo-labels in `-retentionFilter` command-line flag. +If a tenant doesn't match any of the specified `-retentionFilter` options, then the global `-retentionPeriod` is used for it. + +For example, the following config sets retention to 1 day for [tenants](#multitenancy) with `accountID` starting with `42`, +then sets retention to 3 days for time series with label `env="dev"` or `env="staging"` from any tenant, +while the rest of the tenants will have 4 weeks retention: + +``` +-retentionFilter='{vm_account_id=~"42.*"}:1d' -retentionFilter='{env=~"dev|staging"}:3d' -retentionPeriod=4w +``` + +It is OK to mix filters on real labels with filters on `vm_account_id` and `vm_project_id` pseudo-labels. +For example, the following config sets retention to 5 days for time series with `env="dev"` label from [tenant](#multitenancy) `accountID=5`: + +``` +-retentionFilter='{vm_account_id="5",env="dev"}:5d' +``` + +See also [these docs](https://docs.victoriametrics.com/#retention-filters) for additional details on retention filters. 
+ +Enterprise binaries can be downloaded and evaluated for free from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases). + ## Downsampling Downsampling is available in [enterprise version of VictoriaMetrics](https://victoriametrics.com/products/enterprise/). It is configured with `-downsampling.period` command-line flag. The same flag value must be passed to both `vmstorage` and `vmselect` nodes. See [these docs](https://docs.victoriametrics.com/#downsampling) for details. @@ -1128,8 +1156,11 @@ Below is the output for `/path/to/vmstorage -help`: -pushmetrics.url array Optional URL to push metrics exposed at /metrics page. See https://docs.victoriametrics.com/#push-metrics . By default metrics exposed at /metrics page aren't pushed to any remote storage Supports an array of values separated by comma or specified via multiple flags. + -retentionFilter array + Retention filter in the format 'filter:retention'. For example, '{env="dev"}:3d' configures the retention for time series with env="dev" label to 3 days. See https://docs.victoriametrics.com/#retention-filters for details. This flag is available only in enterprise version of VictoriaMetrics + Supports an array of values separated by comma or specified via multiple flags. -retentionPeriod value - Data with timestamps outside the retentionPeriod is automatically deleted + Data with timestamps outside the retentionPeriod is automatically deleted. See also -retentionFilter The following optional suffixes are supported: h (hour), d (day), w (week), y (year). If suffix isn't set, then the duration is counted in months (default 1) -retentionTimezoneOffset duration The offset for performing indexdb rotation. If set to 0, then the indexdb rotation is performed at 4am UTC time per each -retentionPeriod. 
If set to 2h, then the indexdb rotation is performed at 4am EET time (the timezone with +2h offset) diff --git a/docs/README.md b/docs/README.md index 15c0c23846..ebaa1805a5 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1402,7 +1402,11 @@ VictoriaMetrics does not support indefinite retention, but you can specify an ar ## Multiple retentions -A single instance of VictoriaMetrics supports only a single retention, which can be configured via `-retentionPeriod` command-line flag. If you need multiple retentions, then you may start multiple VictoriaMetrics instances with distinct values for the following flags: +Distinct retentions for distinct time series can be configured via [retention filters](#retention-filters) +in [VictoriaMetrics enterprise](https://victoriametrics.com/products/enterprise/). + +Community version of VictoriaMetrics supports only a single retention, which can be configured via [-retentionPeriod](#retention) command-line flag. +If you need multiple retentions in community version of VictoriaMetrics, then you may start multiple VictoriaMetrics instances with distinct values for the following flags: * `-retentionPeriod` * `-storageDataPath`, so the data for each retention period is saved in a separate directory @@ -1410,9 +1414,41 @@ A single instance of VictoriaMetrics supports only a single retention, which can Then set up [vmauth](https://docs.victoriametrics.com/vmauth.html) in front of VictoriaMetrics instances, so it could route requests from particular user to VictoriaMetrics with the desired retention. -The same scheme could be implemented for multiple tenants in [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html). + +Similar scheme can be applied for multiple tenants in [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html). See [these docs](https://docs.victoriametrics.com/guides/guide-vmcluster-multiple-retention-setup.html) for multi-retention setup details. 
+## Retention filters + +[Enterprise version of VictoriaMetrics](https://victoriametrics.com/products/enterprise/) supports `retention filters`, +which allow configuring multiple retentions for distinct sets of time series matching the configured [series filters](https://docs.victoriametrics.com/keyConcepts.html#filtering) +via `-retentionFilter` command-line flag. This flag accepts `filter:duration` options, where `filter` must be +a valid [series filter](https://docs.victoriametrics.com/keyConcepts.html#filtering), while the `duration` +must contain a valid [retention](#retention) for time series matching the given `filter`. If a series doesn't match +any configured `-retentionFilter`, then the retention configured via [-retentionPeriod](#retention) command-line flag is applied to it. +If a series matches multiple configured retention filters, then the smallest retention is applied. + +For example, the following config sets 3 days retention for time series with `team="juniors"` label, +30 days retention for time series with `env="dev"` or `env="staging"` label and 1 year retention for the remaining time series: + +``` +-retentionFilter='{team="juniors"}:3d' -retentionFilter='{env=~"dev|staging"}:30d' -retentionPeriod=1y +``` + +Important notes: + +- The data outside of the configured retention isn't deleted instantly - it is deleted eventually during [background merges](https://docs.victoriametrics.com/#storage). +- The `-retentionFilter` doesn't remove old data from `indexdb` (aka inverted index) until the configured [-retentionPeriod](#retention). + So the `indexdb` size can grow big under [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate) + even for small retentions configured via `-retentionFilter`. + +It is safe to update `-retentionFilter` during VictoriaMetrics restarts - the updated retention filters are applied eventually +to historical data. 
+ +See [how to configure multiple retentions in VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#retention-filters). + +Retention filters can be evaluated for free by downloading and using enterprise binaries from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases). + ## Downsampling [VictoriaMetrics Enterprise](https://victoriametrics.com/products/enterprise/) supports multi-level downsampling with `-downsampling.period` command-line flag. For example: @@ -2207,8 +2243,11 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li Optional path to a file with relabeling rules, which are applied to all the ingested metrics. The path can point either to local file or to http url. See https://docs.victoriametrics.com/#relabeling for details. The config is reloaded on SIGHUP signal -relabelDebug Whether to log metrics before and after relabeling with -relabelConfig. If the -relabelDebug is enabled, then the metrics aren't sent to storage. This is useful for debugging the relabeling configs + -retentionFilter array + Retention filter in the format 'filter:retention'. For example, '{env="dev"}:3d' configures the retention for time series with env="dev" label to 3 days. See https://docs.victoriametrics.com/#retention-filters for details. This flag is available only in enterprise version of VictoriaMetrics + Supports an array of values separated by comma or specified via multiple flags. -retentionPeriod value - Data with timestamps outside the retentionPeriod is automatically deleted + Data with timestamps outside the retentionPeriod is automatically deleted. See also -retentionFilter The following optional suffixes are supported: h (hour), d (day), w (week), y (year). If suffix isn't set, then the duration is counted in months (default 1) -retentionTimezoneOffset duration The offset for performing indexdb rotation. 
If set to 0, then the indexdb rotation is performed at 4am UTC time per each -retentionPeriod. If set to 2h, then the indexdb rotation is performed at 4am EET time (the timezone with +2h offset) diff --git a/docs/Single-server-VictoriaMetrics.md b/docs/Single-server-VictoriaMetrics.md index e038e99d0d..fa493094c4 100644 --- a/docs/Single-server-VictoriaMetrics.md +++ b/docs/Single-server-VictoriaMetrics.md @@ -1405,7 +1405,11 @@ VictoriaMetrics does not support indefinite retention, but you can specify an ar ## Multiple retentions -A single instance of VictoriaMetrics supports only a single retention, which can be configured via `-retentionPeriod` command-line flag. If you need multiple retentions, then you may start multiple VictoriaMetrics instances with distinct values for the following flags: +Distinct retentions for distinct time series can be configured via [retention filters](#retention-filters) +in [VictoriaMetrics enterprise](https://victoriametrics.com/products/enterprise/). + +Community version of VictoriaMetrics supports only a single retention, which can be configured via [-retentionPeriod](#retention) command-line flag. +If you need multiple retentions in community version of VictoriaMetrics, then you may start multiple VictoriaMetrics instances with distinct values for the following flags: * `-retentionPeriod` * `-storageDataPath`, so the data for each retention period is saved in a separate directory @@ -1413,9 +1417,41 @@ A single instance of VictoriaMetrics supports only a single retention, which can Then set up [vmauth](https://docs.victoriametrics.com/vmauth.html) in front of VictoriaMetrics instances, so it could route requests from particular user to VictoriaMetrics with the desired retention. -The same scheme could be implemented for multiple tenants in [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html). 
+ +Similar scheme can be applied for multiple tenants in [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html). See [these docs](https://docs.victoriametrics.com/guides/guide-vmcluster-multiple-retention-setup.html) for multi-retention setup details. +## Retention filters + +[Enterprise version of VictoriaMetrics](https://victoriametrics.com/products/enterprise/) supports `retention filters`, +which allow configuring multiple retentions for distinct sets of time series matching the configured [series filters](https://docs.victoriametrics.com/keyConcepts.html#filtering) +via `-retentionFilter` command-line flag. This flag accepts `filter:duration` options, where `filter` must be +a valid [series filter](https://docs.victoriametrics.com/keyConcepts.html#filtering), while the `duration` +must contain a valid [retention](#retention) for time series matching the given `filter`. If a series doesn't match +any configured `-retentionFilter`, then the retention configured via [-retentionPeriod](#retention) command-line flag is applied to it. +If a series matches multiple configured retention filters, then the smallest retention is applied. + +For example, the following config sets 3 days retention for time series with `team="juniors"` label, +30 days retention for time series with `env="dev"` or `env="staging"` label and 1 year retention for the remaining time series: + +``` +-retentionFilter='{team="juniors"}:3d' -retentionFilter='{env=~"dev|staging"}:30d' -retentionPeriod=1y +``` + +Important notes: + +- The data outside of the configured retention isn't deleted instantly - it is deleted eventually during [background merges](https://docs.victoriametrics.com/#storage). +- The `-retentionFilter` doesn't remove old data from `indexdb` (aka inverted index) until the configured [-retentionPeriod](#retention). 
+ So the `indexdb` size can grow big under [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate) + even for small retentions configured via `-retentionFilter`. + +It is safe to update `-retentionFilter` during VictoriaMetrics restarts - the updated retention filters are applied eventually +to historical data. + +See [how to configure multiple retentions in VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#retention-filters). + +Retention filters can be evaluated for free by downloading and using enterprise binaries from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases). + ## Downsampling [VictoriaMetrics Enterprise](https://victoriametrics.com/products/enterprise/) supports multi-level downsampling with `-downsampling.period` command-line flag. For example: @@ -2210,8 +2246,11 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li Optional path to a file with relabeling rules, which are applied to all the ingested metrics. The path can point either to local file or to http url. See https://docs.victoriametrics.com/#relabeling for details. The config is reloaded on SIGHUP signal -relabelDebug Whether to log metrics before and after relabeling with -relabelConfig. If the -relabelDebug is enabled, then the metrics aren't sent to storage. This is useful for debugging the relabeling configs + -retentionFilter array + Retention filter in the format 'filter:retention'. For example, '{env="dev"}:3d' configures the retention for time series with env="dev" label to 3 days. See https://docs.victoriametrics.com/#retention-filters for details. This flag is available only in enterprise version of VictoriaMetrics + Supports an array of values separated by comma or specified via multiple flags. 
-retentionPeriod value - Data with timestamps outside the retentionPeriod is automatically deleted + Data with timestamps outside the retentionPeriod is automatically deleted. See also -retentionFilter The following optional suffixes are supported: h (hour), d (day), w (week), y (year). If suffix isn't set, then the duration is counted in months (default 1) -retentionTimezoneOffset duration The offset for performing indexdb rotation. If set to 0, then the indexdb rotation is performed at 4am UTC time per each -retentionPeriod. If set to 2h, then the indexdb rotation is performed at 4am EET time (the timezone with +2h offset) diff --git a/docs/guides/guide-vmcluster-multiple-retention-setup.md b/docs/guides/guide-vmcluster-multiple-retention-setup.md index 3051e65f73..bce102c686 100644 --- a/docs/guides/guide-vmcluster-multiple-retention-setup.md +++ b/docs/guides/guide-vmcluster-multiple-retention-setup.md @@ -3,12 +3,16 @@ **Objective** -Setup Victoria Metrics TSDB with support of multiple retention periods within one installation. +Set up VictoriaMetrics cluster with support for multiple retention periods within one installation. **Challenge** -VictoriaMetrics instance (single node or vmstorage node) supports only one retention period. +If you use [VictoriaMetrics enterprise](https://victoriametrics.com/products/enterprise/), then you can use +[retention filters](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#retention-filters) for applying multiple retentions +to distinct sets of time series and/or [tenants](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy). +The community version of VictoriaMetrics supports only one retention period via [-retentionPeriod](https://docs.victoriametrics.com/#retention) command-line flag. +The following solution supports multiple retentions in the community version of VictoriaMetrics. 
**Solution** diff --git a/lib/storage/index_db_test.go b/lib/storage/index_db_test.go index c2f78bb7ff..88ac2e86f9 100644 --- a/lib/storage/index_db_test.go +++ b/lib/storage/index_db_test.go @@ -2198,6 +2198,8 @@ func newTestStorage() *Storage { retentionMsecs: maxRetentionMsecs, } s.setDeletedMetricIDs(&uint64set.Set{}) + var idb *indexDB + s.idbCurr.Store(idb) return s } diff --git a/lib/storage/partition.go b/lib/storage/partition.go index d98752924d..8d6a065e20 100644 --- a/lib/storage/partition.go +++ b/lib/storage/partition.go @@ -1685,11 +1685,22 @@ func (pt *partition) createSnapshot(srcDir, dstDir string) error { return fmt.Errorf("cannot read directory: %w", err) } for _, fi := range fis { + fn := fi.Name() if !fs.IsDirOrSymlink(fi) { + if fn == "appliedRetention.txt" { + // Copy the appliedRetention.txt file to dstDir. + // This file can be created by VictoriaMetrics enterprise. + // See https://docs.victoriametrics.com/#retention-filters . + // Do not make hard link to this file, since it can be modified over time. + srcPath := srcDir + "/" + fn + dstPath := dstDir + "/" + fn + if err := fs.CopyFile(srcPath, dstPath); err != nil { + return fmt.Errorf("cannot copy %q to %q: %w", srcPath, dstPath, err) + } + } // Skip non-directories. continue } - fn := fi.Name() if fn == "tmp" || fn == "txn" { // Skip special dirs. continue