Commit Graph

2032 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
eea088d87f
docs/CHANGELOG.md: clarify description for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4336 bugfix
This is a follow-up for 5eb5df96e2
2023-07-06 22:42:02 -07:00
Alexander Marshalov
eb611c3dc3
fix removing storage data dir before restoring from backup (#598)
* fix removing storage data dir before restoring from backup

Signed-off-by: Alexander Marshalov <_@marshalov.org>

* fix review comment

Signed-off-by: Alexander Marshalov <_@marshalov.org>

* fix review comment

Signed-off-by: Alexander Marshalov <_@marshalov.org>

* fixes after merge with `enterprise-single-node` branch

Signed-off-by: Alexander Marshalov <_@marshalov.org>

---------

Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-07-06 22:32:12 -07:00
Aliaksandr Valialkin
eda26a8352
lib/backup/actions: remove misleading comment about the default value for Concurrency field 2023-07-06 22:31:40 -07:00
Aliaksandr Valialkin
ebd08cd822
lib/logstorage: go fmt 2023-07-06 22:24:18 -07:00
Aliaksandr Valialkin
5a12a518a3
lib/logstorage: fix make test-pure tests 2023-07-06 22:22:08 -07:00
Aliaksandr Valialkin
f2f9532fa5
lib/httputils: fix test after b49d04b3dc
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4459
2023-07-06 22:21:43 -07:00
Haleygo
b029286298
fix parse for invalid partial RFC3339 format (#4539)
The validation was needed for covering corner cases when storage is tested with data from 1970.
This resulted into unexpected search results, as year was parsed incorrectly from the given timestamp.


Co-authored-by: hagen1778 <roman@victoriametrics.com>
2023-07-06 22:09:35 -07:00
Alexander Marshalov
677c8a5465
show backup progress percentage in vmbackup log during backup uploading and restoring progress percentage in vmrestore log during backup downloading (#4460) (#4530)
Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-07-06 21:56:54 -07:00
Aliaksandr Valialkin
a9eb2409ea
app/vlstorage: export vl_active_merges and vl_merges_total metrics 2023-07-06 21:38:09 -07:00
Aliaksandr Valialkin
08634ae612
app/vlinsert/jsonline: code prettifying 2023-07-06 21:35:55 -07:00
Aliaksandr Valialkin
efee71986f
app/vlselect/logsql: sort query results by _time if their summary size doesnt exceed -select.maxSortBufferSize 2023-07-06 21:25:00 -07:00
Aliaksandr Valialkin
1c39af56ab
app/victoria-logs: add ability to debug data ingestion by passing debug query arg to data ingestion API 2023-07-06 21:19:58 -07:00
Aliaksandr Valialkin
374890294e
app/victoria-logs: initial code release 2023-07-06 17:30:05 -07:00
Aliaksandr Valialkin
de574e7128
lib/storage: do not create flock.lock files at partition directories, since it is created at the Storage level 2023-07-06 17:26:37 -07:00
Aliaksandr Valialkin
833a0e25a7
lib/netutil: ignore arificial timeout generated by net/http.Server
This prevents from the inflated vm_tcplistener_read_timeouts_total counter
2023-07-06 17:26:15 -07:00
Aliaksandr Valialkin
115667df82
lib/mergeset: do not create flock.lock file at mergeset table, since it is created at the lib/storage.Storage level 2023-07-06 17:25:45 -07:00
Aliaksandr Valialkin
ed5f4a0c5a
lib/fs: add ReaderAt.Path() function
This function is going to be used in VictoriaLogs
2023-07-06 17:25:19 -07:00
Aliaksandr Valialkin
4c80193a86
lib/encoding: add MarshalBool/UnmarshalBool and GetUint32s/PutUint32s functions
These functions are going to be used by VictoriaLogs
2023-07-06 17:24:52 -07:00
Aliaksandr Valialkin
d01f0a89db
lib/cgroup: add SetGOGC() function
This function is going to be used by VictoriaLogs
2023-07-06 17:24:31 -07:00
Aliaksandr Valialkin
af6c14d5e7
lib/bytesutil: substitute parentheses with slashes in ByteBuffer.Path() output, so it can be passed to path manipulating functions
This is needed for the upcoming VictoriaLogs
2023-07-06 17:23:52 -07:00
Aliaksandr Valialkin
427ce69426
app/vmselect: move common http functionality from app/vmselect/searchutils to lib/httputils
While at it, move app/vmselect/bufferedwriter to lib/bufferedwriter, since it is going to be used in VictoriaLogs
2023-07-06 17:22:23 -07:00
Aliaksandr Valialkin
46210c4d5e
lib/promutils.ParseTime(): add support for timestamps in milliseconds
See https://stackoverflow.com/questions/76437098/how-to-handle-time-unit-and-step-while-ingesting-or-querying-in-victoriametrics/76438405

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4459
2023-07-06 17:11:54 -07:00
Nikolay
dd7ebd6779
lib/storage: creates parts.json on start-up if it not exists. (#4450)
* lib/storage: creates parts.json on start-up if it not exists.
It fixes migrations from versions below v1.90.0.
Previously parts.json was created only after successful merge.
But if merge was interruped for some reason (OOM or shutdown), parts.json wasn't created and partitions left after interruped merge weren't properly deleted.
Since VM cannot check if it must be removed or not.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4336

* Apply suggestions from code review

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* Update lib/storage/partition.go

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

---------

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2023-07-06 17:10:26 -07:00
Roman Khavronenko
09c05608f2
lib/storage: add comment for how mustBeDeleted field should be used (#4454)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-07-06 17:02:44 -07:00
Roman Khavronenko
897d17a5b3
lib/mergeset: add comment for how mustBeDeleted field should be used (#4449)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-07-06 17:00:55 -07:00
Alexander Marshalov
4084dba9e4
fixed service name detection for consulagent service discovery in case of a difference in service name and service id (#4390) (#4439)
Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-07-06 16:53:29 -07:00
Aliaksandr Valialkin
3bc3fb6adf
lib/vmselectapi: move the code for checking the expected client errors into a isExpectedError() function 2023-07-06 16:37:59 -07:00
Aliaksandr Valialkin
5b8095a30a
lib/promscrape: disable support for service discovery and metrics scrape via http2
Reasons for disabling http2:

- http2 is used very rarely comparing to http for Prometheus metrics exposition and service discovery
- http2 is much harder to debug than http
- http2 has very bad security record because of its complexity - see https://portswigger.net/research/http2

VictoriaMetrics components are compiled with nethttpomithttp2 tag because of these issues.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4283
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4274

This is a follow-up for 72c3cd47eb
2023-07-06 16:04:31 -07:00
Aliaksandr Valialkin
6a3cee5c2c
lib/promscrape/discoveryutils: re-use checkRedirect function for both client and blockingClient
Also document follow_redirects option at https://docs.victoriametrics.com/sd_configs.html#http-api-client-options

This is a follow-up for b3d0ff463a

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4282
2023-07-06 10:52:13 -07:00
Alexander Marshalov
b3f8bb5b50
vmbackupmanager bugfixes: (#577)
- error on running with empty -dst dir and without -runOnStart
- error on restoring with backup, created before v1.90.0
2023-07-05 22:08:04 -07:00
Zakhar Bessarab
bf4120a3d9
lib/vmselectapi: extend error handling to ignore "reset by peer" (#4498)
This is a followup for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4418 to also handle "connection reset by peer" errors in connection handling logic.
This error can be triggered just the same as described in original PR: when query was closed on vmselect side and connection has been interrupted.

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-06-22 11:24:18 +02:00
hagen1778
dde01c826d
lib/vmselectapi: properly check for net.ErrClosed
This error may be wrapped in another error, and should normally be tested using
`errors.Is(err, net.ErrClosed)`.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-06-09 10:42:03 +02:00
Roman Khavronenko
d677c2a5a6
lib/promscrape/discoveryutils: properly check for net.ErrClosed (#4426)
This error may be wrapped in another error, and should normally be tested using
`errors.Is(err, net.ErrClosed)`.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit dfe53a36fc)
2023-06-09 10:41:07 +02:00
Roman Khavronenko
fb9b8f6b1b
app/vmagent: mention enable_http2 in changelog (#4403)
Follow-up after
72c3cd47eb

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 3305a6901c)
2023-06-09 10:40:24 +02:00
Haleygo
6edf94c4b9
vmagent:scrape config support enable_http2 (#4295)
app/vmagent: support `enable_http2` in scrape config

This change adds HTTP2 support for scrape config
and improves compatibility with Prometheus config.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4283

(cherry picked from commit 72c3cd47eb)
2023-06-09 10:40:17 +02:00
Roman Khavronenko
dfb05c884b
lib/vmselectapi: suppress "broken pipe" error logs on vmstorage side (#4418)
The "broken pipe" error is emitted when the connection has been interrupted abruptly.
It could happen due to unexpected network glitch or because connection was
interrupted by remote client. In both cases, remote client will notice
connection breach and handle it on its own. No need in logging this error
on both: server and client side.

This change should reduce the amount of log noise on vmstorage side. In the same time,
it is not expected to lose any information, since important logs should be still
emitted by the vmselect.

To conduct an experiment for testing this change see the following instructions:
1. Setup vmcluster with at least 2 storage nodes, 1 vminsert and 1 vmselect
2. Run vmselect with complexity limit checked on the client side: `-search.maxSamplesPerQuery=1`
3. Ingest some data and query it back: `count({__name__!=""})`
4. Observe the logs on vmselect and vmstorage side

Before the change, vmselect will log message about complexity limits exceeded. When this happens,
vmselect closes network connections to vmstorage nodes signalizing that it doesn't expect any data back.
Both vmstorage processes will try to push data to the connection and will fail with "broken pipe" error,
means that vmselect closed the connection.

After the change, vmstorages should remain silent. And vmselect will continue emittin the error message
about complexity limits exceeded.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-06-08 08:31:05 -07:00
Nikolay
043431093a
app/vmauth: properly handle LOCAL proxy protocol command (#4373)
app/vmauth: properly handle LOCAL proxy protocol command

It is required for handling health checks from load balancers

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3335
(cherry picked from commit f263031fe9)
2023-06-02 13:29:15 +02:00
Haleygo
73a8f763a0
vmagent:support follow_redirects on SD level (#4286)
* vmagent:support follow_redirects on SD level

* fix follow_redirects on sd level

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4282
(cherry picked from commit b3d0ff463a)
2023-06-02 13:19:35 +02:00
Aliaksandr Valialkin
c30f0e51d7
lib/promrelabel: use monospace font at textarea for writing relabel configs on /metric-relabel-debug and /target-relabel-debug pages
This simplifies visual inspection of indentation in yaml configs
2023-05-18 20:49:47 -07:00
Aliaksandr Valialkin
0ebfb91aba
lib/storage: revert the migration from global to per-day index for (MetricName -> TSID)
This reverts the following commits:
- e0e16a2d36
- 2ce02a7fe6

The reason for revert: the updated logic breaks assumptions made
when fixing https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 .
For example, if a time series stop receiving new samples during the first
day after the indexdb rotation, there are chances that the time series
won't be registered in the new indexdb. This is OK until the next indexdb
rotation, since the time series is registered in the previous indexdb,
so it can be found during queries. But the time series will become invisible
for search after the next indexdb rotation, while its data is still there.

There is also incompletely solved issue with the increased CPU and disk IO resource
usage just after the indexdb rotation. There was an attempt to fix it, but it didn't fix
it in full, while introducing the issue mentioned above. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401

TODO: to find out the solution, which simultaneously solves the following issues:
- increased memory usage for setups high churn rate and long retention (e.g. what the reverted commit does)
- increased CPU and disk IO usage during indexdb rotation ( https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 )
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698
2023-05-18 11:28:54 -07:00
Aliaksandr Valialkin
0397b3f0f7
lib/handshake: do not pollute logs with cannot read hello messages on TCP health checks
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1762
2023-05-18 10:37:59 -07:00
Aliaksandr Valialkin
67beb8c856
lib/storage: follow-up after 2ce02a7fe6
- Document the change at docs/CHANGELOG.md
- Clarify comments for non-trivial code touched by the commit
- Improve the logic behind maybeCreateIndexes():
  - Correctly create per-day indexes if the indexdb rotation is performed during
    the first hour or the last hour of the day by UTC.
    Previously there was a possibility of missing index entries on that day.
  - Increase the duration for creating new indexes in the current indexdb for up to 22 hours
    after indexdb rotation. This should reduce the increased resource usage
    after indexdb rotation.
    It is safe to postpone index creation for the current day until the last hour
    of the current day after indexdb rotation by UTC, since the corresponding (date, ...)
    entries exist in the previous indexdb.
- Search for TSID by (date, MetricName) in both the current and the previous indexdb.
  Previously the search was performed only in the current indexdb. This could lead
  to excess creation of per-day indexes for the current day just after indexdb rotation.
- Search for (date, metricID) entries in both the current and the previous indexdb.
  Previously the search was performed only in the current indexdb. This could lead
  to excess creation of per-day indexes for the current day just after indexdb rotation.
2023-05-16 23:31:59 -07:00
Roman Khavronenko
c3b1d9ee21
lib/storage: introduce per-day MetricName=>TSID index (#4252)
The new index substitutes global MetricName=>TSID index
used for locating TSIDs on ingestion path.
For installations with high ingestion and churn rate, global
MetricName=>TSID index can grow enormously making
index lookups too expensive. This also results into bigger
than expected cache growth for indexdb blocks.

New per-day index supposed to be much smaller and more efficient.
This should improve ingestion speed and reliability during
re-routings in cluster.

The negative outcome could be occupied disk size, since
per-day index is more expensive comparing to global index.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-16 23:18:11 -07:00
Aliaksandr Valialkin
bc98ea9a8d
lib/storage: reduce the unimportant logging during Storage start / stop
This should improve the visibility of potentially important logs
2023-05-16 15:32:35 -07:00
Aliaksandr Valialkin
05113bba09
lib/mergeset: remove superflouos logging when opening and closing the Table
The logged messages had little useful info, while they were polluting log output during VictoriaMetrics start/stop
2023-05-16 15:32:35 -07:00
Aliaksandr Valialkin
4a5b5c5020
lib/mergeset: close and open the table before making snapshots at TestTableCreateSnapshotAt()
This gives guarantees that all the in-memory data is written to disk at the snapshot time.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4272
See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4316
2023-05-16 15:32:34 -07:00
Aliaksandr Valialkin
f09745f613
lib/{mergeset,storage}: make it clear that DebugFlush() doesn't store all the recently ingested data to disk
DebugFlush() makes sure that the recently ingested data becomes visible to search.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4272
2023-05-16 11:55:58 -07:00
Alexander Marshalov
ad35081066
backup metadata are written in separate file (#560)
Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-05-16 11:24:44 -07:00
Zakhar Bessarab
a2fc912c43
lib/storage: follow-up after a50d63c376 (#4289)
* lib/storage: follow-up after a50d63c376

- ensure retentionMsecs is rounded to day
- remove localTimeOffset in test as localOffset is ignored when using `UnixMilli`

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/storage: restore retention timezone offset effect on retention deadline

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-05-16 10:13:20 -07:00
Aliaksandr Valialkin
9461d3fdfa
lib/promutils: add ParseTimeAt() function 2023-05-13 20:12:55 -07:00