VictoriaMetrics/docs/guides/guide-delete-or-replace-metrics/README.md
Denys Holius 336406e2e1
docs/guides/guide-delete-or-replace-metrics/README.md: adds a link to the API examples (#6863)
2024-08-23 10:01:03 +02:00


Data deletion is an operation people expect a database to have. VictoriaMetrics supports deletion, but only to a limited extent. Because of its implementation, VictoriaMetrics is an append-only database, which fits time series storage well. The drawback of this architecture is that mutating already written data is extremely expensive, so support for delete and update operations is very limited. This guide walks through the possible workarounds for deleting or changing data that has already been written to VictoriaMetrics.

Precondition

How to delete metrics

Warning: deleting time series on a regular basis is not recommended. Each call to the delete API may incur a performance penalty. The API is intended for one-off operations, such as deleting malformed data or satisfying GDPR compliance.

The delete API expects the user to specify a time series selector. So the first thing to do before deleting is to verify that the selector matches the correct series.

To check that the metrics are present in VictoriaMetrics Cluster, run the following command:

Warning: the response can return many metrics, so be careful with the series selector.

curl -s 'http://vmselect:8481/select/0/prometheus/api/v1/series?match[]=process_cpu_cores_available' | jq

See URL example for single-node here.

The expected output:

{
  "status": "success",
  "isPartial": false,
  "data": [
    {
      "__name__": "process_cpu_cores_available",
      "job": "vminsert",
      "instance": "vminsert:8480"
    },
    {
      "__name__": "process_cpu_cores_available",
      "job": "vmselect",
      "instance": "vmselect:8481"
    },
    {
      "__name__": "process_cpu_cores_available",
      "job": "vmstorage",
      "instance": "vmstorage:8482"
    }
  ]
}
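Because a broad selector can match far more series than intended, it can help to count the matches before deleting anything. The following is a minimal sketch, assuming jq is installed; `count_series` is a hypothetical helper name, not part of VictoriaMetrics.

```shell
# Hypothetical helper: reads an /api/v1/series JSON response on stdin
# and prints how many series the selector matched.
count_series() {
  jq '.data | length'
}

# Usage (cluster URL taken from the command above):
#   curl -s 'http://vmselect:8481/select/0/prometheus/api/v1/series?match[]=process_cpu_cores_available' | count_series
```

For the response shown above, this would print 3.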

When you're sure the time series selector is correct, send a POST request to the delete API with the match[]=<time-series-selector> argument. For example:

curl -s -X POST 'http://vmselect:8481/delete/0/prometheus/api/v1/admin/tsdb/delete_series?match[]=process_cpu_cores_available'

See URL example for single-node here.
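The two deployment types differ only in the URL prefix, so the request can be wrapped in a small script. This is a sketch under assumptions: `delete_url` and `delete_series` are hypothetical helper names, `http://localhost:8428` is an assumed single-node address, and the tenant defaults to 0 as in the cluster example above.

```shell
# Hypothetical helper: prints the delete endpoint for either deployment type.
# $1 = base URL, $2 = "single" or "cluster", $3 = tenant ID for cluster (default 0)
delete_url() {
  case "$2" in
    single)  printf '%s/api/v1/admin/tsdb/delete_series\n' "$1" ;;
    cluster) printf '%s/delete/%s/prometheus/api/v1/admin/tsdb/delete_series\n' "$1" "${3:-0}" ;;
    *)       echo "unknown mode: $2" >&2; return 1 ;;
  esac
}

# Hypothetical helper: sends the delete request and prints only the HTTP status code.
# $1 = endpoint URL, $2 = time series selector
# --data-urlencode makes curl send a POST and URL-encodes the selector.
delete_series() {
  curl -s -o /dev/null -w '%{http_code}\n' --data-urlencode "match[]=$2" "$1"
}

# Usage:
#   delete_series "$(delete_url http://localhost:8428 single)" 'process_cpu_cores_available'
#   delete_series "$(delete_url http://vmselect:8481 cluster)" 'process_cpu_cores_available'
```

The VictoriaMetrics API examples describe HTTP status 204 as the expected response to a successful delete request, so any other printed code is worth investigating.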

If the operation was successful, the deleted series will stop being queryable. Storage space for the deleted time series isn't freed instantly; it is freed during subsequent background merges of data files. Background merges may never occur for data from previous months, so storage space won't be freed for historical data. In this case, a forced merge may help free up storage space.

To trigger a forced merge on VictoriaMetrics Cluster, run the following command:

curl -v -X POST http://vmstorage:8482/internal/force_merge

After the merge is complete, the data will be permanently deleted from the disk.
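A forced merge can be resource-intensive, so it may be preferable to limit it to the months that actually contained the deleted data. The VictoriaMetrics documentation describes a `partition_prefix` query argument in `YYYY_MM` form for this. The sketch below only builds the URL; `force_merge_url` is a hypothetical helper name and `2022_07` is an assumed partition chosen for illustration.

```shell
# Hypothetical helper: prints the force-merge URL for one monthly partition.
# $1 = vmstorage (or single-node) base URL, $2 = partition prefix, e.g. 2022_07
force_merge_url() {
  printf '%s/internal/force_merge?partition_prefix=%s\n' "$1" "$2"
}

# Usage:
#   curl -v -X POST "$(force_merge_url http://vmstorage:8482 2022_07)"
```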

How to update metrics

By default, VictoriaMetrics doesn't provide a mechanism for replacing or updating data. As a workaround, take the following actions:

Export metrics

For example, let's export the node_memory_MemTotal_bytes metric with the labels instance="node-exporter:9100" and job="hostname.com":

curl -X POST -g http://vmselect:8481/select/0/prometheus/api/v1/export -d 'match[]=node_memory_MemTotal_bytes{instance="node-exporter:9100", job="hostname.com"}' > data.jsonl

See URL example for single-node here.

To check that the exported file contains time series, we can use cat and jq:

cat data.jsonl | jq

The expected output will look like the following:

{
  "metric": {
    "__name__": "node_memory_MemTotal_bytes",
    "job": "hostname.com",
    "instance": "node-exporter:9100"
  },
  "values": [
    33604390912,
    33604390912,
    33604390912,
    33604390912
  ],
  "timestamps": [
    1656669031378,
    1656669032378,
    1656669033378,
    1656669034378
  ]
}

In this example, we will replace the values of node_memory_MemTotal_bytes from 33604390912 with 17179869184 (roughly 32 GB to 16 GB) via sed, but any other editing tool works as well:

sed -i 's/33604390912/17179869184/g' data.jsonl

Let's check the changes in data.jsonl with cat:

cat data.jsonl | jq

The expected output will be the following:

{
  "metric": {
    "__name__": "node_memory_MemTotal_bytes",
    "job": "hostname.com",
    "instance": "node-exporter:9100"
  },
  "values": [
    17179869184,
    17179869184,
    17179869184,
    17179869184
  ],
  "timestamps": [
    1656669031378,
    1656669032378,
    1656669033378,
    1656669034378
  ]
}
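Note that the sed command replaces the digit sequence wherever it occurs in the file, which would also affect a timestamp or label value containing the same digits. A more targeted alternative is sketched below, assuming jq is installed; `replace_values` is a hypothetical helper name. It rewrites only entries of the `values` array and keeps one JSON object per line, which is what the JSON line import expects.

```shell
# Hypothetical helper: replaces a number only inside the "values" array,
# leaving labels and timestamps untouched.
# $1 = old value, $2 = new value; reads JSON lines on stdin, writes JSON lines on stdout.
replace_values() {
  jq -c --argjson old "$1" --argjson new "$2" \
    '.values |= map(if . == $old then $new else . end)'
}

# Usage:
#   replace_values 33604390912 17179869184 < data.jsonl > data.fixed.jsonl
```

Writing to a separate file such as data.fixed.jsonl (an assumed name) keeps the original export available in case the replacement needs to be redone.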

Delete metrics

See the "How to delete metrics" section above.

Import metrics

VictoriaMetrics supports many ingestion protocols; here we will import the data in JSON line format.

The next command will import metrics from data.jsonl to VictoriaMetrics:

curl -v -X POST http://vminsert:8480/insert/0/prometheus/api/v1/import -T data.jsonl

See URL example for single-node here.

Please note, importing data with old timestamps is called backfilling and may require resetting caches as described here.
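As an illustration of the cache reset, the single-node version exposes an `/internal/resetRollupResultCache` handler. The sketch below only builds the URL; `reset_cache_url` is a hypothetical helper name and `http://localhost:8428` is an assumed single-node address. For the cluster version the handler is served by vmselect, and its exact path should be taken from the cluster documentation.

```shell
# Hypothetical helper: prints the cache-reset URL for a given base URL.
# $1 = base URL of a single-node instance, e.g. http://localhost:8428
# A trailing slash on the base URL is stripped to avoid a double slash.
reset_cache_url() {
  printf '%s/internal/resetRollupResultCache\n' "${1%/}"
}

# Usage:
#   curl -Is "$(reset_cache_url http://localhost:8428)"
```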

Check imported metrics

curl -X POST -g http://vmselect:8481/select/0/prometheus/api/v1/export -d 'match[]=node_memory_MemTotal_bytes'

The expected output will look like:

{
  "metric": {
    "__name__": "node_memory_MemTotal_bytes",
    "job": "hostname.com",
    "instance": "node-exporter:9100"
  },
  "values": [
    17179869184,
    17179869184,
    17179869184,
    17179869184
  ],
  "timestamps": [
    1656669031378,
    1656669032378,
    1656669033378,
    1656669034378
  ]
}
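Instead of comparing the output by eye, the check can be scripted. This is a minimal sketch, assuming jq is installed; `all_values_equal` is a hypothetical helper name. It reads exported JSON lines on stdin and succeeds only if every entry in every `values` array equals the expected number.

```shell
# Hypothetical helper: exits with status 0 only if every value in the
# exported JSON lines equals the expected number.
# $1 = expected value; reads JSON lines on stdin.
all_values_equal() {
  # -s slurps all lines into one array, -e maps a false result to a non-zero exit status
  jq -es --argjson want "$1" '[.[].values[]] | all(. == $want)' >/dev/null
}

# Usage:
#   curl -s -X POST -g http://vmselect:8481/select/0/prometheus/api/v1/export \
#     -d 'match[]=node_memory_MemTotal_bytes' | all_values_equal 17179869184 && echo "values updated"
```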