Commit Graph

59 Commits

Author SHA1 Message Date
rtm0
4c31a6a1fc
lib/storage: properly handle maxMetrics limit at metricID search
`TL;DR` This PR improves the metric IDs search in IndexDB:

- Avoid seaching for metric IDs twice when `maxMetrics` limit is
exceeded
- Use correct error type for indicating that the `maxMetrics` limit is
exceded
- Simplify the logic of deciding between per-day and global index search

A unit test has been added to ensure that this refactoring does not
break anything.

---

Function calls before the fix:

```
idb.searchMetricIDs
    |__ is.searchMetricIDs
        |__ is.searchMetricIDsInternal
            |__ is.updateMetricIDsForTagFilters
                |__ is.tryUpdatingMetricIDsForDateRange
                |                       |
                |__ is.getMetricIDsForDateAndFilters
```

- `searchMetricIDsInternal` searches metric IDs for each filter set. It
maintains a metric ID set variable which is updated every time the
`updateMetricIDsForTagFilters` function is called. After each successful
call, the function checks the length of the updated metric ID set and if
it is greater than `maxMetrics`, the function returns `too many
timeseries` error.
- `updateMetricIDsForTagFilters` uses either per-day or global index to
search metric IDs for the given filter set. The decision of which index
to use is made is made within the `tryUpdatingMetricIDsForDateRange`
function and if it returns `fallback to global search` error then the
function uses global index by calling `getMetricIDsForDateAndFilters`
with zero date.
- `tryUpdatingMetricIDsForDateRange` first checks if the given time
range is larger than 40 days and if so returns `fallback to global
search` error. Otherwise it proceeds to searching for metric IDs within
that time range by calling `getMetricIDsForDateAndFilters` for each
date.
- `getMetricIDsForDateAndFilters` searches for metric IDs for the given
date and returns `fallback to global search` error if the number of
found metric IDs is greater than `maxMetrics`.

Problems with this solution:

1. The `fallback to global search` error returned by
`getMetricIDsForDateAndFilters` in case when maxMetrics is exceeded is
misleading.
2. If `tryUpdatingMetricIDsForDateRange` proceeds to date range search
and returns `fallback to global search` error (because
`getMetricIDsForDateAndFilters` returns it) then this will trigger
global search in `updateMetricIDsForTagFilters`. However the global
search uses the same maxMetrics value which means this search is
destined to fail too. I.e. the same search is performed twice and fails
twice.
3. `too many timeseries` error is already handled in
`searchMetricIDsInternal` and therefore handing this error in
`updateMetricIDsForTagFilters` is redundant
4. updateMetricIDsForTagFilters is a better place to make a decision on
whether to use per-day or global index.

Solution:

1.  Use a dedicated error for `too many timeseries` case
2. Handle `too many timeseries` error in  `searchMetricIDsInternal` only
3. Move the per-day or global search decision from
`tryUpdatingMetricIDsForDateRange` to `updateMetricIDsForTagFilters` and
remove `fallback to global search` error.

---------

Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
2024-08-27 23:08:17 +02:00
rtm0
b51c6bf75d
lib/storage: properly register index records with RegisterMetricNames
Once the timeseries is in tsidCache, new entries won't be created in
per-day index because the RegisterMetricNames() code does consider
different dates for the same timeseries. So this case has been added.

The same bug exists for AddRows() but it is not manifested because the
index entries are finally created in updatePerDateData().

RegisterMetricNames also updated to increase the newTimeseriesCreated
counter because it actually creates new time series in index.

A unit tests has been added that check all possible data patterns
(different metric names and dates) and code branches in both
RegisterMetricNames and AddRows. The total number of new unit tests is
around 100 which increaded the running time of storage tests by 50%.

---------

Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com>
Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
2024-08-27 23:00:27 +02:00
hagen1778
6db4682d95
docs: typo fix
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 6f17ee0d0f)
2024-08-27 15:49:46 +02:00
Zhu Jiekun
5b49fd83be
lib/promrelabel: follow-up for 8958cecad6
In the previous commit 8958cecad6
the default ports (80/443) were removed for both the `scrapeURL` and
`instance` label values for those targets without a port in
`__address__`. Different values in the `instance` label generate new
time series.

This commit reverts the changes made to the `instance` label. Now,
for those targets:
- `scrapeURL` will remain unchanged.
- The `instance` label value will include the default port.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6792
(cherry picked from commit e97e966f82)
2024-08-27 15:44:07 +02:00
YuDong Tang
cab3ef8294
app/vmselect:add command-line flag -search.inmemoryBufSizeBytes (#6869)
add command-line flag `-search.inmemoryBufSizeBytes` for configuring size of in-memory buffers used by vmselect during processing of vmstorage responses. A new summary metric `vm_tmp_blocks_inmemory_file_size_bytes` is exposed to show the size of the buffer during requests processing. 

The new setting can be used by experienced users to adjust memory usage by vmselect when processing
many small read requests. Instead of allocating 4MB buffers each time, vmselect can be instructed to lower
the buffer size via `-search.inmemoryBufSizeBytes`. To make the decision whether this flag needs to be adjusted
users can consult with `vm_tmp_blocks_inmemory_file_size_bytes` which shows the actual size of buffers used
during query processing.

----------

The detailed information of this PR can be found in
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6851

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Co-authored-by: hagen1778 <roman@victoriametrics.com>
2024-08-26 14:37:45 +02:00
Yury Akudovich
f759371c00
app/vmagent: add remoteWrite.retryMinInterval and remoteWrite.retryMaxTime flags (#6289)
## Describe Your Changes

Add RemoteWrite Retry Controls

This PR introduces two new flags to the remote write functionality:
- remoteWrite.retryMinInterval
- remoteWrite.retryMaxTime

These flags provide finer control over the retry behavior for
remoteWrite operations, allowing users to customize the minimum interval
between retries and the maximum duration for retry attempts.

Fixes #5486.

## Checklist

- [x] The following checks are mandatory:

My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: Yury Akudovich <ya@matterlabs.dev>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit d0f5a9d77a)
2024-08-23 15:28:44 +02:00
Roman Khavronenko
c6e0780f4b
app/vmalert: update parsing for instant responses (#6859)
This change is made in attempt to reduce memory usage by vmalert when
parsing big instant responses from VM/Prometheus.

In
a5c427bac4
vmalert switched from std json lib to fastjson lib in order to reduce
amount of allocations, as according to highloaded profiles of vmalert
the CPU is mostly spent on GC.

But switching to fastjson resulted into excessive memory usage for cases
when vmalert has to parse long json lines, which usually happens when
instant response contains many `metric` objects.

In this change we do a mixed parsing:
1. Slice of `metric` objects is parsed with std lib to keep mem low
2. Each `metric` object is parsed with fastjson to reduce allocs

The benchmark results are the following:
```
pkg: github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource
BenchmarkParsePrometheusResponse/Instant_std+fastjson-10                    1760            668959 ns/op          280147 B/op       5781 allocs/op
MBs allocated at heap: 493.078392
mallocs: 18655472
BenchmarkParsePrometheusResponse/Instant_fastjson-10                        6109            198258 ns/op          172839 B/op       5548 allocs/op
MBs allocated at heap: 1056.384464
mallocs: 34457184
BenchmarkParsePrometheusResponse/Instant_std-10                             1287            950987 ns/op          451677 B/op       9619 allocs/op
MBs allocated at heap: 580.802976
mallocs: 13351636
```
The benchmark function code with mem measurement is available here
https://gist.github.com/hagen1778/b9c3ca7f8ca7d6b21aec9777112c5810

The benchmark contains 3 results:
1. Instant_std+fastjson is the implementation in this change
2. Instant_fastjson-10 is the implementation from
a5c427bac4
3. BenchmarkParsePrometheusResponse/Instant_std-10 is implementation
before
a5c427bac4

According to these results, this new implementation is slower than
previous, but faster than before switching to fastjson. It also has
lower number of allocations and roughly the same memory allocation on
heap with GC turned off.

---------

Other changes:
1. rm BenchmarkMetrics as it doesn't measure anything
2. simplify BenchmarkParsePrometheusResponse into
BenchmarkPromInstantUnmarshal

### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-08-22 23:56:11 +02:00
Yury Molodov
a3509add4d
vmui: add column search in table settings (#6804)
### Describe Your Changes

Add search functionality to the column display settings in the table
 #6668

![image](https://github.com/user-attachments/assets/e9bd52c3-6428-4d4f-8b7f-d83dd80b6912)

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit e35237920a)
2024-08-22 16:59:18 +02:00
Roman Khavronenko
b3c99579ea
docs: move changelog to dir (#6853)
Moving changelog-related docs to a separate dir should make it easier to
navigate in `docs/` folder.

-----------

The change shouldn't have any visual changes or changes to the links.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 9dc8d1debd)
2024-08-22 16:59:17 +02:00