From 9b032763fa27295fb904012d1b3467a49f5f2d14 Mon Sep 17 00:00:00 2001
From: Vika <info@victoriametrics.com>
Date: Mon, 20 Mar 2023 06:43:51 +0000
Subject: [PATCH] update wiki pages

---
 CHANGELOG.md                     |  5 +++++
 README.md                        | 18 +++++++++---------
 Single-server-VictoriaMetrics.md | 18 +++++++++---------
 3 files changed, 23 insertions(+), 18 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 387fd48..f8f8ada 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -15,6 +15,11 @@ The following tip changes can be tested by building VictoriaMetrics components f
 
 ## tip
 
+**Update note: this release contains a backwards-incompatible change in the storage data format,
+so previous versions of VictoriaMetrics will exit with the `unexpected number of substrings in the part name` error when run on data
+created by v1.90.0 or newer versions. The solution is to upgrade to v1.90.0 or a newer release.**
+
+* FEATURE: publish VictoriaMetrics binaries for Windows. See [this](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3236), [this](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3821) and [this](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70) issues.
 * FEATURE: log metrics with truncated labels if the length of label value in the ingested metric exceeds `-maxLabelValueLen`. This should simplify debugging for this case.
 * FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add support for [VictoriaMetrics remote write protocol](https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol) when [sending / receiving data to / from Kafka](https://docs.victoriametrics.com/vmagent.html#kafka-integration). This protocol allows saving egress network bandwidth costs when sending data from `vmagent` to `Kafka` located in another datacenter or availability zone. See [this feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1225).
 * FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add `--kafka.consumer.topic.concurrency` command-line flag. It controls the number of Kafka consumer workers to use by `vmagent`. It should eliminate the need to start multiple `vmagent` instances to improve data transfer rate. See [this feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1957).
diff --git a/README.md b/README.md
index 455094e..7817851 100644
--- a/README.md
+++ b/README.md
@@ -1448,12 +1448,14 @@ can be configured with the `-inmemoryDataFlushInterval` command-line flag (note
 In-memory parts are persisted to disk into `part` directories under the `<-storageDataPath>/data/small/YYYY_MM/` folder,
 where `YYYY_MM` is the month partition for the stored data. For example, `2022_11` is the partition for `parts`
 with [raw samples](https://docs.victoriametrics.com/keyConcepts.html#raw-samples) from `November 2022`.
+Each partition directory contains a `parts.json` file with the actual list of parts in the partition.
 
-The `part` directory has the following name pattern: `rowsCount_blocksCount_minTimestamp_maxTimestamp`, where:
+Every `part` directory contains a `metadata.json` file with the following fields:
 
-- `rowsCount` - the number of [raw samples](https://docs.victoriametrics.com/keyConcepts.html#raw-samples) stored in the part
-- `blocksCount` - the number of blocks stored in the part (see details about blocks below)
-- `minTimestamp` and `maxTimestamp` - minimum and maximum timestamps across raw samples stored in the part
+- `RowsCount` - the number of [raw samples](https://docs.victoriametrics.com/keyConcepts.html#raw-samples) stored in the part
+- `BlocksCount` - the number of blocks stored in the part (see details about blocks below)
+- `MinTimestamp` and `MaxTimestamp` - minimum and maximum timestamps across raw samples stored in the part
+- `MinDedupInterval` - the [deduplication interval](#deduplication) applied to the given part.
 
 Each `part` consists of `blocks` sorted by internal time series id (aka `TSID`).
 Each `block` contains up to 8K [raw samples](https://docs.victoriametrics.com/keyConcepts.html#raw-samples),
@@ -1475,9 +1477,8 @@ for fast block lookups, which belong to the given `TSID` and cover the given tim
   and [freeing up disk space for the deleted time series](#how-to-delete-time-series) are performed during the merge
 
 Newly added `parts` either successfully appear in the storage or fail to appear.
-The newly added `parts` are being created in a temporary directory under `<-storageDataPath>/data/{small,big}/YYYY_MM/tmp` folder.
-When the newly added `part` is fully written and [fsynced](https://man7.org/linux/man-pages/man2/fsync.2.html)
-to a temporary directory, then it is atomically moved to the storage directory.
+The newly added `part` is atomically registered in the `parts.json` file under the corresponding partition
+after it is fully written and [fsynced](https://man7.org/linux/man-pages/man2/fsync.2.html) to the storage.
 Thanks to this alogrithm, storage never contains partially created parts, even if hardware power off
 occurrs in the middle of writing the `part` to disk - such incompletely written `parts`
 are automatically deleted on the next VictoriaMetrics start.
@@ -1506,8 +1507,7 @@ Retention is configured with the `-retentionPeriod` command-line flag, which tak
 
 Data is split in per-month partitions inside `<-storageDataPath>/data/{small,big}` folders.
 Data partitions outside the configured retention are deleted on the first day of the new month.
-Each partition consists of one or more data parts with the following name pattern `rowsCount_blocksCount_minTimestamp_maxTimestamp`.
-Data parts outside of the configured retention are eventually deleted during
+Each partition consists of one or more data parts. Data parts outside of the configured retention are eventually deleted during
 [background merge](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
 
 The maximum disk space usage for a given `-retentionPeriod` is going to be (`-retentionPeriod` + 1) months.
diff --git a/Single-server-VictoriaMetrics.md b/Single-server-VictoriaMetrics.md
index 814ead9..625d678 100644
--- a/Single-server-VictoriaMetrics.md
+++ b/Single-server-VictoriaMetrics.md
@@ -1451,12 +1451,14 @@ can be configured with the `-inmemoryDataFlushInterval` command-line flag (note
 In-memory parts are persisted to disk into `part` directories under the `<-storageDataPath>/data/small/YYYY_MM/` folder,
 where `YYYY_MM` is the month partition for the stored data. For example, `2022_11` is the partition for `parts`
 with [raw samples](https://docs.victoriametrics.com/keyConcepts.html#raw-samples) from `November 2022`.
+Each partition directory contains a `parts.json` file with the actual list of parts in the partition.
 
-The `part` directory has the following name pattern: `rowsCount_blocksCount_minTimestamp_maxTimestamp`, where:
+Every `part` directory contains a `metadata.json` file with the following fields:
 
-- `rowsCount` - the number of [raw samples](https://docs.victoriametrics.com/keyConcepts.html#raw-samples) stored in the part
-- `blocksCount` - the number of blocks stored in the part (see details about blocks below)
-- `minTimestamp` and `maxTimestamp` - minimum and maximum timestamps across raw samples stored in the part
+- `RowsCount` - the number of [raw samples](https://docs.victoriametrics.com/keyConcepts.html#raw-samples) stored in the part
+- `BlocksCount` - the number of blocks stored in the part (see details about blocks below)
+- `MinTimestamp` and `MaxTimestamp` - minimum and maximum timestamps across raw samples stored in the part
+- `MinDedupInterval` - the [deduplication interval](#deduplication) applied to the given part.
 
 Each `part` consists of `blocks` sorted by internal time series id (aka `TSID`).
 Each `block` contains up to 8K [raw samples](https://docs.victoriametrics.com/keyConcepts.html#raw-samples),
@@ -1478,9 +1480,8 @@ for fast block lookups, which belong to the given `TSID` and cover the given tim
   and [freeing up disk space for the deleted time series](#how-to-delete-time-series) are performed during the merge
 
 Newly added `parts` either successfully appear in the storage or fail to appear.
-The newly added `parts` are being created in a temporary directory under `<-storageDataPath>/data/{small,big}/YYYY_MM/tmp` folder.
-When the newly added `part` is fully written and [fsynced](https://man7.org/linux/man-pages/man2/fsync.2.html)
-to a temporary directory, then it is atomically moved to the storage directory.
+The newly added `part` is atomically registered in the `parts.json` file under the corresponding partition
+after it is fully written and [fsynced](https://man7.org/linux/man-pages/man2/fsync.2.html) to the storage.
 Thanks to this alogrithm, storage never contains partially created parts, even if hardware power off
 occurrs in the middle of writing the `part` to disk - such incompletely written `parts`
 are automatically deleted on the next VictoriaMetrics start.
@@ -1509,8 +1510,7 @@ Retention is configured with the `-retentionPeriod` command-line flag, which tak
 
 Data is split in per-month partitions inside `<-storageDataPath>/data/{small,big}` folders.
 Data partitions outside the configured retention are deleted on the first day of the new month.
-Each partition consists of one or more data parts with the following name pattern `rowsCount_blocksCount_minTimestamp_maxTimestamp`.
-Data parts outside of the configured retention are eventually deleted during
+Each partition consists of one or more data parts. Data parts outside of the configured retention are eventually deleted during
 [background merge](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
 
 The maximum disk space usage for a given `-retentionPeriod` is going to be (`-retentionPeriod` + 1) months.