Commit Graph

2550 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
37a8cc0b12
lib/logstorage: work-in-progress 2024-06-10 18:42:31 +02:00
Aliaksandr Valialkin
7e24bf99de
lib/streamaggr: return back string interning to dedupAggr after 78953723200f15ffc417064d1912bdbb7551505c
It should reduce memory allocation rate during stream deduplication
2024-06-10 18:06:25 +02:00
Aliaksandr Valialkin
6470eac7dc
lib/bytesutil: reduce the number of memory allocations per each interned string in bytesutil.InternString() from 5 to 1
This should reduce GC overhead when tens of millions of strings are interned (for example, during stream deduplication
of millions of active time series).
2024-06-10 18:06:24 +02:00
Roman Khavronenko
8c8d84e30a
lib/protoparser/opentelemetry/firehose: escape requestID before returning it to user (#6451)
All user input should be sanitized before rendering. This should prevent
possible attacks. See
https://github.com/VictoriaMetrics/VictoriaMetrics/security/code-scanning/203

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-06-10 18:06:24 +02:00
Aliaksandr Valialkin
883c0e6221
lib/streamaggr: reduce memory allocations by using dedupAggrSample buffer per each dedupAggrShard 2024-06-10 16:39:26 +02:00
Aliaksandr Valialkin
422225bfa5
lib/streamaggr: reduce the number of duplicates per each sample in BenchmarkDedupAggr from 100 to 2
This is closer to typical production setups when deduplication is used for de-duplicating of 2 samples per series.
2024-06-10 16:39:26 +02:00
Aliaksandr Valialkin
d269a95da3
lib/streamaggr: use strings.Clone() instead of bytesutil.InternString() for creating series key in dedupAggr
Our internal testing shows that this reduces GC overhead when deduplicating tens of millions of active series.
2024-06-10 16:08:47 +02:00
Aliaksandr Valialkin
9ed9e766e8
lib/streamaggr: improve performance for dedupAggr.sizeBytes() and dedupAggr.itemsCount()
These functions are called every time `/metrics` page is scraped, so it would be great
if they could be sped up for the cases when dedupAggr tracks tens of millions of active time series.
2024-06-10 16:00:05 +02:00
Aliaksandr Valialkin
387c22da49
lib/streamaggr: remove flushState arg at dedupAggr.flush(), since it is always set to true in production 2024-06-10 16:00:05 +02:00
Hui Wang
028a80613f
lib/httpserver: allow reloadAuthKey and configAuthKey to override htt… (#6338)
…pAuth.*

address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6329,
makes `reloadAuthKey`, `configAuthKey`, `flagsAuthKey`, `pprofAuthKey`
behavior the same way,
but keys like `-snapshotAuthKey`, `-forceMergeAuthKey` are still
protected by httpAuth.*. All the available key are listed in
https://docs.victoriametrics.com/single-server-victoriametrics/#security.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 61dce6f2a1)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-06-10 12:41:29 +02:00
Aliaksandr Valialkin
32aa0751a1
lib/streamaggr: follow-up for 7cb894a777
- Use bytesutil.InternString() instead of strings.Clone() for inputKey and outputKey in aggregatorpushSamples().
  This should reduce string allocation rate, since strings can be re-used between aggrState flushes.
- Reduce memory allocations at dedupAggrShard by storing dedupAggrSample by value in the active series map.
- Remove duplicate call to bytesutil.InternBytes() at Deduplicator, since it is already called inside dedupAggr.pushSamples().
- Add missing string interning at rateAggrState.pushSamples().

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6402
2024-06-07 16:35:53 +02:00
Roman Khavronenko
78121642df
lib/streamaggr: reduce number of inuse objects (#6402)
The main change is getting rid of interning of sample key. It was
discovered that for cases with many unique time series aggregated by
vmagent interned keys could grow up to hundreds of millions of objects.
This has negative impact on the following aspects:
1. It slows down garbage collection cycles, as GC has to scan all inuse
objects periodically. The higher is the number of inuse objects, the
longer it takes/the more CPU it takes.
2. It slows down the hot path of samples aggregation where each key
needs to be looked up in the map first.

The change makes code more fragile, but suppose to provide performance
optimization for heavy-loaded vmagents with stream aggregation enabled.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-06-07 16:35:52 +02:00
Roman Khavronenko
fae589bb83
lib/promrelabel: speedup label match by __name__ (#6432)
The change adds a fastpath for `equalValue` comparisons against
`__name__` label by avoiding calls to `toCanonicalLabelName` func. This
speedups matches by metric name like `'foo'`. See bench stats below:
```
benchcmp old.txt new.txt

benchmark                                           old ns/op     new ns/op     delta
BenchmarkIfExpression/equal_label:_last-10          35.6          35.1          -1.18%
BenchmarkIfExpression/equal_label:_middle-10        18.3          17.3          -5.41%
BenchmarkIfExpression/equal_label:_first-10         1.20          1.24          +2.74%
BenchmarkIfExpression/equal___name__:_last-10       10.1          4.96          -50.75%
BenchmarkIfExpression/equal___name__:_middle-10     5.79          3.16          -45.41%
BenchmarkIfExpression/equal___name__:_first-10      1.17          1.05          -9.76%
```

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-06-07 16:35:52 +02:00
Andrii Chubatiuk
93cd08f15f
lib/streamaggr: metrics to track dropped, nan samples and samples lag (#6358)
### Describe Your Changes

Added streamaggr metrics to:
 - `vm_streamaggr_samples_lag_seconds` - samples lag
- `vm_streamaggr_ignored_samples_total{reason="nan"}` - ignored NaN
samples
- `vm_streamaggr_ignored_samples_total{reason="too_old"}` - ignored old
samples

(cherry picked from commit 185fac03b3)
2024-06-06 19:22:45 +02:00
Aliaksandr Valialkin
53382ae837
lib/logstorage: work-in-progress 2024-06-06 12:27:11 +02:00
Aliaksandr Valialkin
a200fb433a
lib/logstorage: allow using eval keyword instead of math keyword in math pipe 2024-06-05 10:08:08 +02:00
Aliaksandr Valialkin
b45e466a1b
lib/logstorage: work-in-progress 2024-06-05 03:18:25 +02:00
pludov
2efd97a63c
lib/fs: support NFS implementations that return EEXIST instead of ENOTEMPTY (#6398)
### Describe Your Changes

Fix for issue #6396: according to rmdir manpage, ENOTEMPTY and EEXIST
should be treated equally

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6396

### Checklist

The following checks are **mandatory**:

- [x ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Co-authored-by: Ludovic Pollet <ludovic.pollet@exfo.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 3ddae77c63)
2024-06-04 15:30:48 +02:00
Aliaksandr Valialkin
1ce8a9a751
lib/logstorage: allow typing asc in sort pipe for the sake of consistency with desc 2024-06-04 02:29:18 +02:00
Aliaksandr Valialkin
b7b3a9e9a3
lib/logstorage: work-in-progress 2024-06-04 01:50:55 +02:00
Aliaksandr Valialkin
540bbb63a2
lib/logstorage: work-in-progress 2024-05-30 16:19:36 +02:00
Roman Khavronenko
189af53142
lib/storage: filter deleted label names and values from `/api/v1/labe… (#6342)
…ls` and `/api/v1/label/.../values`

Check for deleted metrics when `match[]` filter matches small number of
time series (optimized path).

The issue was introduced
[v1.81.0](https://docs.victoriametrics.com/changelog_2022/#v1810).

Related issue
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6300 Updates
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2978

Signed-off-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit b984f4672e)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-05-29 14:37:00 +02:00
Aliaksandr Valialkin
e83fd4a117
lib/logstorage: work-in-progress 2024-05-29 01:52:34 +02:00
Aliaksandr Valialkin
79c03fc35f
lib/logstorage: work-in-progress 2024-05-28 19:29:50 +02:00
Aliaksandr Valialkin
ce5e4c842a
lib/logstorage: fix golangci-lint warnings 2024-05-26 02:02:41 +02:00
Aliaksandr Valialkin
afa597ce2a
lib/logstorage: work-in-progress 2024-05-26 01:56:12 +02:00
Aliaksandr Valialkin
6427b3c3c0
lib/logstorage: work-in-progress 2024-05-25 22:59:21 +02:00
Aliaksandr Valialkin
9edbeca46b
lib/logstorage: re-use per-shard fields across processed blocks in pipePackJSON and pipeUnroll 2024-05-25 22:13:44 +02:00
Aliaksandr Valialkin
03fe4c8963
lib/logstorage: work-in-progress 2024-05-25 21:36:24 +02:00
Aliaksandr Valialkin
3152df2bce
lib/logstorage: work-in-progress 2024-05-25 00:31:55 +02:00
Nikolay
5025ede7bc
lib/mergeset: adds tracking for indexdb records drop (#6297)
It allows to create alert for possible item drops at indexdb. It may
happen, if ingested metric size exceeds max indexdb item size.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 69d244e6fb)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-05-24 16:08:34 +02:00
Aliaksandr Valialkin
7a2a2f173e
lib/logstorage: work-in-progress 2024-05-24 03:07:07 +02:00
Nikolay
dfbd2f8ff7
lib/storage: change default value for maxLabelValueLen to 1024 (#6313)
* It must reduce memory usage for misbehaving clients. Since
VictoriaMetrics stores sparse index inmemory.
* Reduce disk space usage for indexdb.
* Prevent possible indexDB items drops.
* It may trigger slow insert and new timeseries registration due to
default value for flag change

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6176

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-05-22 21:55:21 +02:00
Alexander Marshalov
0b70c4c1f1
[vmlogs] fixed time parsing with millisecond precision time (#6293) (#6295)
fix for #6293

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-05-22 21:54:50 +02:00
Aliaksandr Valialkin
04d0dd2542
lib/logstorage: work-in-progress 2024-05-22 21:01:28 +02:00
Roman Khavronenko
f3e893f699
lib/backup: add -s3TLSInsecureSkipVerify command-line flag (#6318)
* The new flag can be used for for skipping TLS certificates
verification when connecting to S3 endpoint. Affects vmbackup,
vmrestore, vmbackupmanager.

* replace deprecated `EndpointResolver` with `BaseEndpoint`

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1056

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit ac836bcf6c)
2024-05-22 16:40:06 +02:00
Hui Wang
5b8c3fc9d0
app/vmalert: support DNS SRV record in -remoteWrite.url (#6299)
part of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6053,
supports [DNS SRV](https://en.wikipedia.org/wiki/SRV_record) address in
`-remoteWrite.url` command-line option.

(cherry picked from commit d7b5062917)
2024-05-22 10:53:22 +02:00
Roman Khavronenko
3e8b5e74d5
lib/streamaggr: skip empty aggregators (#6307)
Prevent excessive resource usage when stream aggregation config file
contains no matchers by prevent pushing data into Aggregators object.
Before this change a lot of extra work was invoked without reason.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 7ce052b32d)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-05-20 14:46:36 +02:00
Aliaksandr Valialkin
45fbcc74e0
lib/logstorage: fix golangci-lint warnings 2024-05-20 11:04:37 +02:00
Aliaksandr Valialkin
582e7d5439
lib/logstorage: work-in-progress 2024-05-20 04:09:15 +02:00
Andrii Chubatiuk
fe332c3419
app/vmagent: add global aggregator (#6268)
Add global stream aggregation for VMAgent

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5467
(cherry picked from commit f153f54d11)
2024-05-17 14:01:31 +02:00
Nikolay
ee4a94a371
follow-up for c6c5a5a186 (#6265)
* adds datadog extensions for statsd:
  - multiple packed values (v1.1)
  - additional types distribution, histogram

* adds type check and append metric type to the labels with special tag
name `__statsd_metric_type__`. It simplifies streaming aggregation
config.

* remove statsd support from cluster, since cluster doesn't support
stream aggregation.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit b2765c45d0)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-05-17 13:49:24 +02:00
Aliaksandr Valialkin
28626db066
lib/logstorage: work-in-progress
(cherry picked from commit 0aa19a2837)
2024-05-16 09:35:55 +02:00
Aliaksandr Valialkin
5dbc4ad5ef
lib/streamaggr: properly return output key from getOutputKey
The bug has been introduced in cc2647d212

(cherry picked from commit b617dc9c0b)
2024-05-16 09:35:53 +02:00
Aliaksandr Valialkin
b1ee7bca1a
lib/logstorage: work-in-progress 2024-05-14 03:06:02 +02:00
Aliaksandr Valialkin
f52275bbd7
lib/logstorage: work-in-progress
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6258
2024-05-14 01:49:58 +02:00
Aliaksandr Valialkin
207b4bd91d
lib/storage: fix SearchQuery.Unmarshal() after 32193b6059 2024-05-14 01:39:01 +02:00
Aliaksandr Valialkin
32193b6059
lib/encoding: optimize UnmarshalVarUint64, UnmarshalVarInt64 and UnmarshalBytes a bit
Change the return values for these functions - now they return the unmarshaled result plus
the size of the unmarshaled result in bytes, so the caller could re-slice the src for further unmarshaling.

This improves performance of these functions in hot loops of VictoriaLogs a bit.
2024-05-14 01:30:25 +02:00
Aliaksandr Valialkin
2e12119a9e
lib/stringsutil: add LessNatural() function for natural sorting
Natural sorting is needed for sort_by_label_natural() and sort_by_label_natural_desc()
functions in MetricsQL - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6192
and https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6256

Natural sorting will be also used by `| sort ...` pipe in VictoriaLogs -
see https://docs.victoriametrics.com/victorialogs/logsql/#sort-pipe

(cherry picked from commit 707f3a69db)
2024-05-13 17:08:56 +02:00
Hui Wang
ec56f4625e
storage: correctly apply -inmemoryDataFlushInterval when it's set t… (#6221)
…o minimum supported value 1s
pendingRowsFlushInterval was bumped to 2s in
73f0a805e2

(cherry picked from commit 4c80b17027)
2024-05-13 16:50:02 +02:00