VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-20 23:46:23 +01:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	4cae725edf	lib/encoding/zstd: switch back from atomic.Pointer to atomic.Value for map[...]... The map[...]... is already a pointer type, so atomic.Pointer[map[...]...] results in double pointer. This is a follow-up for `140e7b6b74`	2023-07-20 20:56:11 -07:00
Aliaksandr Valialkin	49bd2905fa	lib/promscrape: follow-up after `6aa50ca954` - Improve docs - Hide `debug relabeling` column when -promscrape.dropOriginalLabels command-line flag is set - Inline the code from the added template functions, since the code is harder to follow with the template functions, especially when these functions have misleading names. Also, these functions are used only in one place, e.g. they do not reduce the amounts of code. - Hide `click to show original labels` title at `labels` column when original labels aren't available. - Show the reason why original labels aren't available at /service-discovery page. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4597	2023-07-20 19:14:33 -07:00
Aliaksandr Valialkin	f548adce0b	app/vlinsert/loki: follow-up after `09df5b66fd` - Parse protobuf if Content-Type isn't set to `application/json` - this behavior is documented at https://grafana.com/docs/loki/latest/api/#push-log-entries-to-loki - Properly handle gzip'ped JSON requests. The `gzip` header must be read from `Content-Encoding` instead of `Content-Type` header - Properly flush all the parsed logs with the explicit call to vlstorage.MustAddRows() at the end of query handler - Check JSON field types more strictly. - Allow parsing Loki timestamp as floating-point number. Such a timestamp can be generated by some clients, which store timestamps in float64 instead of int64. - Optimize parsing of Loki labels in Prometheus text exposition format. - Simplify tests. - Remove lib/slicesutil, since there are no more users for it. - Update docs with missing info and fix various typos. For example, it should be enough to have `instance` and `job` labels as stream fields in most Loki setups. - Allow empty or missing timestamps in the ingested logs. The current timestamp at VictoriaLogs side is then used for the ingested logs. This simplifies debugging and testing of the provided HTTP-based data ingestion APIs. The remaining MAJOR issue, which needs to be addressed: victoria-logs binary size increased from 13MB to 22MB after adding support for Loki data ingestion protocol at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4482 . This is because of shitty protobuf dependencies. They must be replaced with another protobuf implementation similar to the one used at lib/prompb or lib/prompbmarshal .	2023-07-20 16:48:21 -07:00
Alexander Marshalov	70773f53d7	allow configuring staleness interval in stream aggregation (#4667 ) (#4670 ) --------- Signed-off-by: Alexander Marshalov <_@marshalov.org> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2023-07-20 16:07:33 +02:00
Haleygo	da60a68d09	vmalert: init unit test (#4596 ) vmalert: support unit tests See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2945 --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2023-07-20 15:07:10 +02:00
Dmytro Kozlov	6aa50ca954	app/vmagent: fix creating target id if `--promscrape.dropOriginalLabels` flag was used (#4616 ) * app/vmagent: fix creating target id if `--promscrape.dropOriginalLabels` flag was used * app/vmagent: hide links if OriginalLabels was dropped * app/vmagent: update CHANGELOG.md and added information to the docs * app/vmagent: fix comments	2023-07-20 10:13:39 +02:00
Zakhar Bessarab	09df5b66fd	app/vlinsert: add support of loki push protocol (#4482 ) * app/vlinsert: add support of loki push protocol - implemented loki push protocol for both Protobuf and JSON formats - added examples in documentation - added example docker-compose Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vlinsert: move protobuf metric into its own file Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * deployment/docker/victorialogs/promtail: update reference to docker image Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * deployment/docker/victorialogs/promtail: make volume name unique Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vlinsert/loki: add license reference Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * deployment/docker/victorialogs/promtail: fix volume name Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs/VictoriaLogs/data-ingestion: add stream fields for loki JSON ingestion example Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vlinsert/loki: move entities to places where those are used Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vlinsert/loki: refactor to use common components - use CommonParameters from insertutils - stop ingestion after first error similar to elasticsearch and jsonline Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vlinsert/loki: address review feedback - add missing logstorage.PutLogRows calls - refactor tenant ID parsing to use common function - reduce number of allocations for parsing by reusing logfields slices - add tests and benchmarks for requests processing funcs Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-07-20 10:10:55 +02:00
Aliaksandr Valialkin	140e7b6b74	all: replace atomic.Value with atomic.Pointer[T] This eliminates the need in .(*T) casting for results obtained from Load() Leave atomic.Value for map, since atomic.Pointer[map[...]...] makes double pointer to map, because map is already a pointer type.	2023-07-19 17:42:06 -07:00
Roman Khavronenko	c32a01c52e	docs: follow-up after `aec4b5db81` (#4638 ) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-07-19 10:10:51 +02:00
Aliaksandr Valialkin	163572ea97	lib/logstorage: `go fmt` after `a8000b74c5`	2023-07-18 16:04:51 -07:00
Aliaksandr Valialkin	a8000b74c5	lib/logstorage: properly encode `"offset"` search word just after _time filter	2023-07-18 16:00:06 -07:00
Aliaksandr Valialkin	ed00b03ecb	lib/logstorage: add ability to specify offset for the selected _time filter The following syntax is supported: _time:filter offset off For example: - _time:5m offset 1h - 5-minute duration one hour before the current time - _time:2023 offset 2w - 2023 year with the 2 weeks offset in the past	2023-07-17 19:07:42 -07:00
Aliaksandr Valialkin	118b093bdd	lib/logstorage: log the -retentionPeriod and -futureRetention values when the ingested log entry has timestamp outside the configured retention This should simplify debugging	2023-07-17 19:07:41 -07:00
Aliaksandr Valialkin	bdfb80668d	lib/logstorage: support for short form of _time:(now-duration, now] filter: _time:duration	2023-07-17 19:07:40 -07:00
Aliaksandr Valialkin	3bf58326e7	lib/logstorage: LogsQL: replace exact_prefix("...") with exact("...") This makes LogsQL queries more consistent with i("...") and i("...") syntax	2023-07-17 19:07:40 -07:00
Aliaksandr Valialkin	8815080030	app/vmselect/promql: add the ability to copy all the labels from `one` side of group_left()/group_right() operation This is performed by specifying `*` inside group_left()/group_right(). Also allow specifying prefix for the copied labels via `group_left(...) prefix "..."` and `group_right(...) prefix "..."` syntax. For example, the following query adds all the namespace-related labels to pod info, and prefixes all the copied label names with "ns_" prefix: kube_pod_info on(namespace) group_left(*) prefix "ns_" kube_namespace_labels This resolves the following StackOverflow questions: - https://stackoverflow.com/questions/76661818/how-to-add-namespace-labels-to-pod-labels-in-prometheus - https://stackoverflow.com/questions/76653997/how-can-i-make-a-new-copy-of-kube-namespace-labels-metric-with-a-different-name	2023-07-17 19:07:39 -07:00
Aliaksandr Valialkin	4cb024d8a3	all: add support for `or` filters in series selectors This commit adds ability to select series matching distinct filters via a single series selector. For example, the following selector selects series with either {env="prod",job="a"} or {env="dev",job="b"} labels: {env="prod",job="a" or env="dev",job="b"} The `or` filter is supported in all the VictoriaMetrics tools now. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3997 Uses https://github.com/VictoriaMetrics/metricsql/pull/14	2023-07-16 00:06:33 -07:00
Aliaksandr Valialkin	6685f6ce7c	lib/storage: move series registration in caches from createAllIndexesForMetricName into a separate function - putSeriesToCache This makes the code more clear and easier to read This is a follow-up for `7094fa38bc`	2023-07-13 23:13:23 -07:00
Aliaksandr Valialkin	0c49552849	lib/mergeset: skip common prefix in binarySearchKey() function This should improve performance a bit when the search is performed among items with long common prefix	2023-07-13 22:04:59 -07:00
Aliaksandr Valialkin	3dacdcb707	lib/storage: optimize BenchmarkIndexDBGetTSIDs() - Sort MetricName tags only once before the benchmark loop. - Obtain indexSearch per each benchmark loop in order to give a chance for background merge for the recently created parts	2023-07-13 21:56:53 -07:00
Aliaksandr Valialkin	443661a5da	lib/storage: properly free up resources from newTestStorage() by calling stopTestStorage()	2023-07-13 17:13:24 -07:00
Aliaksandr Valialkin	7094fa38bc	lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no corresponding entry either. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. 
Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do not change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . 
At the same time the change fixes the issue, which could result in lost access to time series, which stop receiving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for `1f28b46ae9`	2023-07-13 16:07:30 -07:00
Aliaksandr Valialkin	3b50b94f7a	lib/storage: fix possible test failure in TestStorageAddRowsConcurrent The number of parts in the snapshot partition may be zero if concurrent goroutine just started creating new partition, but didn't put data into it yet when the current goroutine made a snapshot.	2023-07-13 15:03:45 -07:00
Aliaksandr Valialkin	4ba19f6b32	lib/mergeset: simplify flushInmemoryParts() a bit	2023-07-13 12:33:30 -07:00
Aliaksandr Valialkin	a79e53d82a	lib/logstorage: fix TestValuesEncoder() on 32-bit architectures	2023-07-13 11:27:13 -07:00
Dmytro Kozlov	79c42814cf	lib/logstorage: fix panic (#4620 )	2023-07-13 09:53:41 +02:00
Zakhar Bessarab	51a9cc9783	docs: make `httpAuth.` flags description less ambiguous (#4588 ) docs: make `httpAuth.` flags description less ambiguous Currently, it may confuse users whether `httpAuth.` flags are used by HTTP client or server configuration(see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4586 for example). Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs: fix a typo Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-07-07 13:50:13 +02:00
Aliaksandr Valialkin	152ca00fb8	docs/CHANGELOG.md: clarify description for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4336 bugfix This is a follow-up for `5eb5df96e2`	2023-07-06 17:09:03 -07:00
Aliaksandr Valialkin	8a07621a0c	lib/promscrape: disable support for service discovery and metrics scrape via http2 Reasons for disabling http2: - http2 is used very rarely comparing to http for Prometheus metrics exposition and service discovery - http2 is much harder to debug than http - http2 has very bad security record because of its complexity - see https://portswigger.net/research/http2 VictoriaMetrics components are compiled with nethttpomithttp2 tag because of these issues. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4283 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4274 This is a follow-up for `72c3cd47eb`	2023-07-06 16:03:37 -07:00
Alexander Marshalov	af53c7cc78	fix removing storage data dir before restoring from backup (#598 ) * fix removing storage data dir before restoring from backup Signed-off-by: Alexander Marshalov <_@marshalov.org> * fix review comment Signed-off-by: Alexander Marshalov <_@marshalov.org> * fix review comment Signed-off-by: Alexander Marshalov <_@marshalov.org> * fixes after merge with `enterprise-single-node` branch Signed-off-by: Alexander Marshalov <_@marshalov.org> --------- Signed-off-by: Alexander Marshalov <_@marshalov.org>	2023-07-06 14:16:18 -07:00
Aliaksandr Valialkin	3286ca3318	lib/backup/actions: remove misleading comment about the default value for Concurrency field	2023-07-06 14:07:08 -07:00
Aliaksandr Valialkin	792860db10	lib/promscrape/discoveryutils: re-use checkRedirect function for both client and blockingClient Also document follow_redirects option at https://docs.victoriametrics.com/sd_configs.html#http-api-client-options This is a follow-up for `b3d0ff463a` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4282	2023-07-06 10:51:33 -07:00
Alexander Marshalov	fc67d94e86	vmbackupmanager bugfixes: (#577 ) - error on running with empty -dst dir and without -runOnStart - error on restoring with backup, created before v1.90.0	2023-07-05 22:07:15 -07:00
Aliaksandr Valialkin	3c5623ce7f	lib/logstorage: go fmt	2023-07-04 14:13:14 -07:00
Aliaksandr Valialkin	6d35d21f60	lib/logstorage: fix `make test-pure` tests	2023-07-04 13:14:30 -07:00
Aliaksandr Valialkin	d1dd25122a	lib/httputils: fix test after `b49d04b3dc` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4459	2023-07-04 09:40:12 -07:00
Haleygo	5fc0ee43d4	fix parse for invalid partial RFC3339 format (#4539 ) The validation was needed for covering corner cases when storage is tested with data from 1970. This resulted into unexpected search results, as year was parsed incorrectly from the given timestamp. Co-authored-by: hagen1778 <roman@victoriametrics.com>	2023-07-03 13:11:49 +02:00
Alexander Marshalov	1cc06e39cd	show backup progress percentage in vmbackup log during backup uploading and restoring progress percentage in vmrestore log during backup downloading (#4460 ) (#4530 ) Signed-off-by: Alexander Marshalov <_@marshalov.org>	2023-06-28 14:44:45 +02:00
Aliaksandr Valialkin	83aa78dfb4	app/vlstorage: export vl_active_merges and vl_merges_total metrics	2023-06-21 20:58:57 -07:00
Aliaksandr Valialkin	dde9ceed07	app/vlinsert/jsonline: code prettifying	2023-06-21 19:39:22 -07:00
Aliaksandr Valialkin	7346bb4f03	app/vlselect/logsql: sort query results by _time if their summary size doesn't exceed -select.maxSortBufferSize	2023-06-21 01:11:25 -07:00
Aliaksandr Valialkin	00c3dbd15d	app/victoria-logs: add ability to debug data ingestion by passing `debug` query arg to data ingestion API	2023-06-20 20:02:46 -07:00
Aliaksandr Valialkin	87b66db47d	app/victoria-logs: initial code release	2023-06-19 22:55:12 -07:00
Aliaksandr Valialkin	aeac39cfd1	lib/storage: do not create flock.lock files at partition directories, since it is created at the Storage level	2023-06-19 22:48:37 -07:00
Aliaksandr Valialkin	0f01eea4e9	lib/netutil: ignore artificial timeout generated by net/http.Server This prevents from the inflated vm_tcplistener_read_timeouts_total counter	2023-06-19 22:46:40 -07:00
Aliaksandr Valialkin	298aab3f54	lib/mergeset: do not create flock.lock file at mergeset table, since it is created at the lib/storage.Storage level	2023-06-19 22:45:31 -07:00
Aliaksandr Valialkin	371182f299	lib/fs: add ReaderAt.Path() function This function is going to be used in VictoriaLogs	2023-06-19 22:42:27 -07:00
Aliaksandr Valialkin	497ec3f3e6	lib/encoding: add MarshalBool/UnmarshalBool and GetUint32s/PutUint32s functions These functions are going to be used by VictoriaLogs	2023-06-19 22:40:55 -07:00
Aliaksandr Valialkin	3409317a67	lib/cgroup: add SetGOGC() function This function is going to be used by VictoriaLogs	2023-06-19 22:39:00 -07:00
Aliaksandr Valialkin	c1bed35b39	lib/bytesutil: substitute parentheses with slashes in ByteBuffer.Path() output, so it can be passed to path manipulating functions This is needed for the upcoming VictoriaLogs	2023-06-19 22:37:26 -07:00
Aliaksandr Valialkin	78eaa056c0	app/vmselect: move common http functionality from app/vmselect/searchutils to lib/httputils While at it, move app/vmselect/bufferedwriter to lib/bufferedwriter, since it is going to be used in VictoriaLogs	2023-06-19 22:34:20 -07:00
Aliaksandr Valialkin	b49d04b3dc	lib/promutils.ParseTime(): add support for timestamps in milliseconds See https://stackoverflow.com/questions/76437098/how-to-handle-time-unit-and-step-while-ingesting-or-querying-in-victoriametrics/76438405 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4459	2023-06-19 22:25:04 -07:00
Nikolay	5eb5df96e2	lib/storage: creates parts.json on start-up if it not exists. (#4450 ) * lib/storage: creates parts.json on start-up if it not exists. It fixes migrations from versions below v1.90.0. Previously parts.json was created only after successful merge. But if merge was interrupted for some reason (OOM or shutdown), parts.json wasn't created and partitions left after interrupted merge weren't properly deleted, since VM cannot check if they must be removed or not. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4336 * Apply suggestions from code review Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> * Update lib/storage/partition.go Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> --------- Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2023-06-15 11:19:22 +02:00
Roman Khavronenko	f50f35a8e0	lib/storage: add comment for how `mustBeDeleted` field should be used (#4454 ) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-06-15 11:17:45 +02:00
Roman Khavronenko	f71cc99a8c	lib/mergeset: add comment for how `mustBeDeleted` field should be used (#4449 ) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-06-14 18:13:16 +02:00
Alexander Marshalov	40d12be607	fixed service name detection for consulagent service discovery in case of a difference in service name and service id (#4390 ) (#4439 ) Signed-off-by: Alexander Marshalov <_@marshalov.org>	2023-06-12 16:16:43 +02:00
Roman Khavronenko	dfe53a36fc	lib/promscrape/discoveryutils: properly check for net.ErrClosed (#4426 ) This error may be wrapped in another error, and should normally be tested using `errors.Is(err, net.ErrClosed)`. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-06-09 09:26:33 +02:00
Roman Khavronenko	3305a6901c	app/vmagent: mention `enable_http2` in changelog (#4403 ) Follow-up after `72c3cd47eb` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-06-05 16:31:58 +02:00
Haleygo	72c3cd47eb	vmagent:scrape config support enable_http2 (#4295 ) app/vmagent: support `enable_http2` in scrape config This change adds HTTP2 support for scrape config and improves compatibility with Prometheus config. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4283	2023-06-05 15:56:49 +02:00
Nikolay	f263031fe9	app/vmauth: properly handle LOCAL proxy protocol command (#4373 ) app/vmauth: properly handle LOCAL proxy protocol command It is required for handling health checks from load balancers https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3335	2023-05-31 15:37:59 +02:00
Haleygo	b3d0ff463a	vmagent:support follow_redirects on SD level (#4286 ) * vmagent:support follow_redirects on SD level * fix follow_redirects on sd level https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4282	2023-05-26 09:39:45 +02:00
Aliaksandr Valialkin	1f2f74e70e	lib/promrelabel: use monospace font at textarea for writing relabel configs on /metric-relabel-debug and /target-relabel-debug pages This simplifies visual inspection of indentation in yaml configs	2023-05-18 20:48:41 -07:00
Aliaksandr Valialkin	1f28b46ae9	lib/storage: revert the migration from global to per-day index for (MetricName -> TSID) This reverts the following commits: - `e0e16a2d36` - `2ce02a7fe6` The reason for revert: the updated logic breaks assumptions made when fixing https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 . For example, if a time series stops receiving new samples during the first day after the indexdb rotation, there are chances that the time series won't be registered in the new indexdb. This is OK until the next indexdb rotation, since the time series is registered in the previous indexdb, so it can be found during queries. But the time series will become invisible for search after the next indexdb rotation, while its data is still there. There is also an incompletely solved issue with the increased CPU and disk IO resource usage just after the indexdb rotation. There was an attempt to fix it, but it didn't fix it in full, while introducing the issue mentioned above. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 TODO: to find out the solution, which simultaneously solves the following issues: - increased memory usage for setups with high churn rate and long retention (e.g. what the reverted commit does) - increased CPU and disk IO usage during indexdb rotation ( https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 ) - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 Possible solution - to create the new indexdb in one hour before the indexdb rotation and to gradually pre-populate it with the needed index data during the last hour before indexdb rotation. Then the new indexdb will contain all the needed data just after the rotation, so it won't trigger increased CPU and disk IO.	2023-05-18 11:30:49 -07:00
Haleygo	1531d757ea	fix lint check	2023-05-17 13:51:36 +02:00
Aliaksandr Valialkin	e0e16a2d36	lib/storage: follow-up after `2ce02a7fe6` - Document the change at docs/CHANGELOG.md - Clarify comments for non-trivial code touched by the commit - Improve the logic behind maybeCreateIndexes(): - Correctly create per-day indexes if the indexdb rotation is performed during the first hour or the last hour of the day by UTC. Previously there was a possibility of missing index entries on that day. - Increase the duration for creating new indexes in the current indexdb for up to 22 hours after indexdb rotation. This should reduce the increased resource usage after indexdb rotation. It is safe to postpone index creation for the current day until the last hour of the current day after indexdb rotation by UTC, since the corresponding (date, ...) entries exist in the previous indexdb. - Search for TSID by (date, MetricName) in both the current and the previous indexdb. Previously the search was performed only in the current indexdb. This could lead to excess creation of per-day indexes for the current day just after indexdb rotation. - Search for (date, metricID) entries in both the current and the previous indexdb. Previously the search was performed only in the current indexdb. This could lead to excess creation of per-day indexes for the current day just after indexdb rotation.	2023-05-16 23:19:27 -07:00
Roman Khavronenko	2ce02a7fe6	lib/storage: introduce per-day MetricName=>TSID index (#4252 ) The new index substitutes global MetricName=>TSID index used for locating TSIDs on ingestion path. For installations with high ingestion and churn rate, global MetricName=>TSID index can grow enormously making index lookups too expensive. This also results into bigger than expected cache growth for indexdb blocks. New per-day index supposed to be much smaller and more efficient. This should improve ingestion speed and reliability during re-routings in cluster. The negative outcome could be occupied disk size, since per-day index is more expensive comparing to global index. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-05-16 15:46:42 -07:00
Aliaksandr Valialkin	278278af95	lib/storage: reduce the unimportant logging during Storage start / stop This should improve the visibility of potentially important logs	2023-05-16 15:14:21 -07:00
Aliaksandr Valialkin	d330c7e6fc	lib/mergeset: remove superfluous logging when opening and closing the Table The logged messages had little useful info, while they were polluting log output during VictoriaMetrics start/stop	2023-05-16 15:01:25 -07:00
Aliaksandr Valialkin	3cbc0975f6	lib/mergeset: close and open the table before making snapshots at TestTableCreateSnapshotAt() This gives guarantees that all the in-memory data is written to disk at the snapshot time. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4272 See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4316	2023-05-16 14:55:11 -07:00
Aliaksandr Valialkin	09b403d38a	lib/{mergeset,storage}: make it clear that DebugFlush() doesn't store all the recently ingested data to disk DebugFlush() makes sure that the recently ingested data becomes visible to search. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4272	2023-05-16 11:50:17 -07:00
Alexander Marshalov	3b2dc2b098	backup metadata are written in separate file (#560 ) Signed-off-by: Alexander Marshalov <_@marshalov.org>	2023-05-16 11:24:54 -07:00
Zakhar Bessarab	242050ba94	lib/storage: follow-up after `a50d63c376` (#4289 ) * lib/storage: follow-up after `a50d63c376` - ensure retentionMsecs is rounded to day - remove localTimeOffset in test as localOffset is ignored when using `UnixMilli` Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage: restore retention timezone offset effect on retention deadline Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-05-16 17:14:08 +02:00
Aliaksandr Valialkin	1c47acda11	lib/promutils: add ParseTimeAt() function	2023-05-13 20:12:31 -07:00
Aliaksandr Valialkin	616175b1ce	lib/promutils: properly return error when incorrect Prometheus label names are passed to NewLabelsFromString() Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4284 See also https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4304	2023-05-12 16:52:29 -07:00
Aliaksandr Valialkin	318a87c36f	Revert "lib/promrelabel: show error message if labels not in prometheus exposition format (#4304 )" This reverts commit `193a9c3328`. Reason for revert: the commit doesn't fix the real issue with promutils.NewLabelsFromString() function, which must return error when improperly formatted Prometheus metric with labels is passed to it. See https://github.com/prometheus/docs/blob/main/content/docs/instrumenting/exposition_formats.md#text-format-example E.g. the promutils.NewLabelsFromString() must return error when the following strings are passed to it: - `{foo:"bar"}`, since `:` is disallowed in Prometheus text exposition format. The correct value is `{foo="bar"}` - `{"foo":"bar"}`, since label name shouldn't be quoted. The correct value is `{foo="bar"}`. The reverted commit introduces another set of bugs, which happily accept the following invalid input: - `{foo=~"bar"}` - `{foo!="bar"}` - `{foo!~"bar"}` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4284 See also https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4304	2023-05-12 16:07:37 -07:00
Aliaksandr Valialkin	160453b86c	lib/protoparser/csvimport: properly parse the last empty column in CSV line Do not ignore the last empty column in CSV line. While at it, properly parse CSV columns in single quotes, e.g. `'foo,bar',baz` is parsed as two columns - `foo,bar` and `baz` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4048 See also https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4298	2023-05-12 15:51:41 -07:00
Aliaksandr Valialkin	b7fe7b801c	Revert "lib/protoparser: fix skip csv line when metric can be collect from the line (#4298 )" This reverts commit `410ae99c2e`. Reason for revert: the commit masks the real issue instead of fixing it. The real issue is that the scanner.NextColumn() skips the last column if it is empty. The commit also introduces two bugs: - a panic if all the metric values in CSV line are empty - silent import of CSV lines with too small number of columns Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4048 See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4298	2023-05-12 15:22:27 -07:00
Dmytro Kozlov	193a9c3328	lib/promrelabel: show error message if labels not in prometheus exposition format (#4304 ) lib/promrelabel: show error message if labels not in prometheus exposition format https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4284	2023-05-12 10:42:56 +02:00
Dmytro Kozlov	410ae99c2e	lib/protoparser: fix skip csv line when metric can be collect from the line (#4298 ) * lib/protoparser: fix skip csv line when metric can be collect from the line * lib/protoparser: fix comment	2023-05-12 11:04:16 +03:00
Alexander Marshalov	9855b38da2	fixed error with double slash in vmbackupmanager (#557 ) Signed-off-by: Alexander Marshalov <_@marshalov.org>	2023-05-11 13:38:07 -07:00
Aliaksandr Valialkin	73812c71a5	lib/promutils: properly parse time strings with timezones at ParseTime()	2023-05-11 13:24:00 -07:00
Aliaksandr Valialkin	da037cafc5	lib/bytesutil: `go fmt` after `2ec17bed2c`	2023-05-10 20:29:03 -07:00
Aliaksandr Valialkin	2ec17bed2c	lib/bytesutil: add benchmarks for ToUnsafeString() and ToUnsafeBytes()	2023-05-10 12:59:26 -07:00
Alexander Marshalov	2e494e2375	fixed typos in documentation and commandline flags descriptions (#4275 )	2023-05-10 09:50:41 +02:00
Aliaksandr Valialkin	b9bb64ce55	lib/promscrape/discovery/consulagent: substitute metaPrefix with the `__meta_consulagent_` plaintext string This simplifies future code navigation and search for the specific meta-label starting from __meta_consulagent_* prefix. For example, `grep __meta_consulagent_namespace` finds the exact place where this label is defined. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3953 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4217	2023-05-08 23:40:13 -07:00
Aliaksandr Valialkin	7db647e924	lib/fs: move common code outside arch-specific implementations of mustRemoveDirAtomic() This is a follow-up for `73b6c23271` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-05-08 23:10:20 -07:00
Aliaksandr Valialkin	887555669e	Revert "lib/streamaggr: discard samples with timestamps outside of aggregation interval (#4199 )" This reverts commit `9e99f2f5b3`. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4068 Reason for revert: this breaks valid use cases: - If timestamps aren't specified in the incoming samples on purpose. For example, if stream aggregation is used as StatsD replacement. StatsD protocol has no timestamp concept for incoming samples. See https://github.com/b/statsd_spec - If all the samples must be aggregated, even if they contain stale timestamps. For example, if the stream aggregation produces some counter of some events, it may be better to count all the events even if they were delayed before being ingested into VictoriaMetrics. It is also unclear how to determine whether the sample becomes stale. For example, if the aggregation interval equals 1h, and the previous aggregation cycle just finished 10 minutes ago, what to do with the newly incoming sample with the timestamp 30 minutes older than the current time? The answer highly depends on the context, so it is unsafe to unconditionally use a single logic for dropping the old samples here.	2023-05-08 16:52:27 -07:00
Aliaksandr Valialkin	74155afb71	docs: clarify docs after `5ee344824f` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4183	2023-05-08 16:11:44 -07:00
Aliaksandr Valialkin	ec3943d14a	app/vmselect: small cleanup after `4f3f9950d0` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3807	2023-05-08 14:57:11 -07:00
Aliaksandr Valialkin	80946f06c2	app/{vmselect,vmctl}: move ParseTime() to lib/promutils Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4091 This is a follow-up for `e2053baf32`	2023-05-08 14:17:57 -07:00
Alexander Marshalov	8225a48b56	fixed `vm_promscrape_config_last_reload_successful` metric value recovery after successful reloading with unchanged content (#4260 ) (#4268 ) Signed-off-by: Alexander Marshalov <_@marshalov.org>	2023-05-08 13:32:51 +02:00
Nikolay	8f4de6fa47	lib/storage: properly update link for entry at dateMetricID cache (#4258 ) previously during sync for mutable and immutable cache parts, link for hotEntry with current date may be not properly updated it corrupts cache for backfilling metrics and increased cpu load	2023-05-05 21:45:47 -07:00
Zakhar Bessarab	4e71003620	lib/promscrape/discovery/kubernetes: follow-up for `d5e94721db` (#4255 ) - add changelog reference to an author - fix tests - add metadata to match Prometheus behavior Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-05-05 14:41:17 +02:00
Vasilchenko Anton	22e65402af	Add endpoint labels for pod targets discovered from endpoint but has different ports (#4253 ) Signed-off-by: Vasilchenko Anton <vasilchenko-as@yandex.ru>	2023-05-05 15:46:07 +04:00
Zakhar Bessarab	aca256735c	lib/storage: fix indexdb rotation infinite loop (#4249 ) When using `retentionTimezoneOffset` and having local timezone being more than 4 hours different from UTC indexdb retention calculation could return negative value. This caused indexdb rotation to get in loop. Fix calculation of offset to use `retentionTimezoneOffset` value properly and add test to cover all legit timezone configs. See: - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4207 - https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4206 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2023-05-04 17:16:48 +02:00
Alexander Marshalov	56b84140a9	added new consulagent service discovery (#3953 ) (#4217 )	2023-05-04 11:36:21 +02:00
Alexander Marshalov	2eb27ddb22	max value for `memory.allowedPercent` changed from 200 to 100 (#4171 ) (#4251 ) Signed-off-by: Alexander Marshalov <_@marshalov.org>	2023-05-04 11:34:57 +02:00
justcompile	49b77ec01a	squash commits (#4166 )	2023-05-03 10:51:08 +02:00
Nikolay	4786f036de	lib/backup: fixes path generation for windows (#4133 ) replaces custom fsync function with standard Fsync methods for files. fixes pattern matching for parts and properly generate backup path for local fs. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-05-03 10:48:53 +02:00
Nikolay	73b6c23271	lib/fs: do not panic at windows at dir deletion (#4132 ) Windows doesn't allow to remove dir with opened files. Usually it's a case for snapshots, hard links cannot be removed if file is opened. With this change, dir will be renamed and properly deleted at the next process start. It's recommended to restart vmstorage/vmsingle for snapshots deletion completion periodically. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-05-03 10:47:02 +02:00
Zakhar Bessarab	bf3b6732bd	lib/promscrape/discovery/kubernetes: add common labels to all ports discovered from endpoints (#4235 ) * lib/promscrape/discovery/kubernetes: add common labels to all ports discovered from endpoints Sets `__meta_kubernetes_endpoints_name` and `__meta_kubernetes_namespace` labels to all ports of pod. Prometheus sets those labels to all ports in pod (`0ab9553611/discovery/kubernetes/endpoints.go (L267C15-L269)`) even if port is not matching any service. See: #4154 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/promscrape/discovery/kubernetes: fix test for updated discovery logic Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-05-03 02:17:33 +02:00
Roman Khavronenko	eb746a4dab	Revert "http server: limit max concurrent requests (#4185 )" (#4215 ) This reverts commit `77f76371` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-04-27 13:02:47 +02:00
Zakhar Bessarab	9e99f2f5b3	lib/streamaggr: discard samples with timestamps outside of aggregation interval (#4199 ) * lib/streamaggr: discard samples with timestamps not matching aggregation interval Samples with timestamps lower than `now - aggregation_interval` are likely to be written via backfilling and should not be used for calculation of aggregation. See #4068 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/streamaggr: make log message more descriptive, fix imports Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-04-27 11:59:49 +02:00
Haleygo	03150c8973	lib/opentsdbhttp: fix a typo preventing from using writeconcurrencylimiter (#4208 )	2023-04-27 09:22:42 +02:00
Nikolay	5ee344824f	lib/promscrape: adds filter for consul_sd_configs: (#4184 ) * lib/promscrape: adds filter for consul_sd_configs: it allows advanced filtering for consul service discovery requests https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4183 * typo fix * removes deprecation mentions since it's not relevant * Update docs/CHANGELOG.md Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> --------- Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2023-04-26 19:16:27 +02:00
Dmytro Kozlov	bc17f4828c	app/vmagent,lib/persistentqueue: show warning message if `--remoteWrite.maxDiskUsagePerURL` flag lower than 500MB (#4196 ) * app/vmagent,lib/persistentqueue: show warning message if `--remoteWrite.maxDiskUsagePerURL` flag lower than 500MB * app/vmagent,lib/persistentqueue: linter fix * app/vmagent,lib/persistentqueue: fix comment	2023-04-26 13:23:01 +03:00
Yury Molodov	4f3f9950d0	vmui: add metric relabel debug (#3889 ) * feat: add metric relabel debug (#3807) * fix: add link to relabeling cookbook * lib/promrelabel: merge, fix conflicts * lib/promrelabel: fix diff * docs/vmui: add metric relabel playground --------- Co-authored-by: dmitryk-dk <kozlovdmitriyy@gmail.com>	2023-04-26 11:53:29 +03:00
Roman Khavronenko	77f76371d0	http server: limit max concurrent requests (#4185 ) * lib/httpserver: introduce `-http.maxConcurrentRequests` command-line flag Introduce `-http.maxConcurrentRequests` command-line flag to protect VM components from resource exhaustion during unexpected spikes of HTTP requests. By default, the new flag's value is set to 0 which means no limits are applied. Signed-off-by: hagen1778 <roman@victoriametrics.com> * lib/httpserver: mention http.maxConcurrentRequests in docs Signed-off-by: hagen1778 <roman@victoriametrics.com> --------- Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-04-24 14:52:06 +02:00
Zakhar Bessarab	472fe3fd03	lib/httpserver: add handler to serve `/robots.txt` and deny search indexing (#4143 ) This handler will instruct search engines that indexing is not allowed for the content exposed to the internet. This should help to address issues like #4128 when instances are exposed to the internet without authentication.	2023-04-18 16:47:26 +04:00
Aliaksandr Valialkin	2a4c48c59d	lib/{mergeset,storage}: make mustReadPartNames() code more clear	2023-04-14 23:16:59 -07:00
Aliaksandr Valialkin	52006149b2	lib/storage: replace OpenStorage() with MustOpenStorage() Callers of OpenStorage() log the returned error and exit. The error logging and exit can be performed inside MustOpenStorage() alongside with printing the stack trace for better debuggability. This simplifies the code at caller side.	2023-04-14 23:02:40 -07:00
Aliaksandr Valialkin	2a2036160d	lib/storage: fix a bug, which prevents from reading pre-v1.90.0 parts The bug has been introduced in `c0b852d50d`	2023-04-14 22:33:08 -07:00
Aliaksandr Valialkin	3727251910	lib/fs: add MustReadDir() function Use fs.MustReadDir() instead of os.ReadDir() across the code in order to reduce the code verbosity. The fs.MustReadDir() logs the error with the directory name and the call stack on error before exit. This information should be enough for debugging the cause of the error.	2023-04-14 22:10:46 -07:00
Aliaksandr Valialkin	60d92894c5	lib/storage: validate rows in partition.AddRows() only during tests	2023-04-14 20:52:36 -07:00
Aliaksandr Valialkin	df619bdff0	all: consistently use fs.MustClose() for closing lock files	2023-04-14 20:14:21 -07:00
Aliaksandr Valialkin	2a3b19e1d2	lib/fs: convert CreateFlockFile to MustCreateFlockFile Callers of CreateFlockFile log the returned err and exit. It is better to log the error inside the MustCreateFlockFile together with the path to the specified directory and the call stack. This simplifies the code at the callers' side while leaving the debuggability at the same level.	2023-04-14 19:50:01 -07:00
Aliaksandr Valialkin	c0b852d50d	lib/{storage,mergeset}: convert InitFromFilePart to MustInitFromFilePart Callers of InitFromFilePart log the error and exit. It is better to log the error with the path to the part and the call stack directly inside the MustInitFromFilePart() function. This simplifies the code at callers' side while leaving the same level of debuggability.	2023-04-14 15:46:12 -07:00
Aliaksandr Valialkin	9183a439c7	lib/filestream: change Create() to MustCreate() Callers of this function log the returned error and exit. It is better logging the error together with the path to the filename and call stack directly inside the function. This simplifies the code at callers' side without reducing the level of debuggability	2023-04-14 15:12:48 -07:00
Aliaksandr Valialkin	5eb163a08a	lib/filestream: transform Open() -> MustOpen() Callers of this function log the returned error and exit. Let's log the error with the path to the filename and call stack inside the function. This simplifies the code at callers' side without reducing the level of debuggability.	2023-04-14 15:03:42 -07:00
Aliaksandr Valialkin	fda1a54343	lib/fs: improve error logging at ReaderAt.MustReadAt() - Add 'BUG:' prefix to error messages related to programming errors aka bugs. - Consistently log the path to the file in all the messages in order to improve debuggability.	2023-04-14 14:51:06 -07:00
Aliaksandr Valialkin	f341b7b3f8	lib/fs: substitute ReadFullData with MustReadData Callers of ReadFullData() log the error and then exit. So let's log the error with the path to the filename and the call stack inside MustReadData(). This simplifies the code at callers' side, while leaving the debuggability at the same level.	2023-04-14 14:39:29 -07:00
Aliaksandr Valialkin	bd6de6406a	lib/fs: improve error logging inside MustWriteData Log the path to file on errors inside MustWriteData(). This improves debuggability of errors, which may occur inside MustWriteData().	2023-04-14 14:32:45 -07:00
Aliaksandr Valialkin	e0595af2bf	lib/{mergeset,storage}: remove isInMerge flag from parts only when they weren't removed yet from the list of active parts This prevents from possible panic during access to pw.p when it is set to nil at partWrapper.decRef() called inside swapSrcWithDstParts()	2023-04-14 00:08:11 -07:00
Aliaksandr Valialkin	9f8209d593	docs/CHANGELOG.md: run at least 4 background mergers on systems with less than 4 CPU cores This reduces the probability of sudden spike in the number of small parts when all the background mergers are busy with big merges.	2023-04-13 23:43:17 -07:00
Aliaksandr Valialkin	550d5c7ea4	lib/{mergeset,storage}: make sure that getFlushToDiskDeadline() takes into account only in-memory parts	2023-04-13 23:43:17 -07:00
Aliaksandr Valialkin	809fbaeaac	lib/fs: add Must prefix to CopyDirectory and CopyFile functions Callers of these functions log the returned error and then exit. Let's log the error with the call stack inside the function itself. This simplifies the code at callers' side, while leaving the same level of debuggability in case of errors.	2023-04-13 23:02:59 -07:00
Aliaksandr Valialkin	780abc3b3b	lib/fs: rename SymlinkRelative to MustSymlinkRelative Callers of this function log the returned error and then exit. Let's log the error with the call stack inside the function itself. This simplifies the code at callers' side, while leaving the same level of debuggability in case of errors.	2023-04-13 22:52:55 -07:00
Aliaksandr Valialkin	5f487ed996	lib/fs: rename HardLinkFiles to MustHardLinkFiles Callers of this function log the returned error and then exit. Let's log the error with the call stack inside the function itself. This simplifies the code at callers' side, while leaving the same level of debuggability in case of errors.	2023-04-13 22:48:07 -07:00
Aliaksandr Valialkin	30425ca81a	lib/fs: rename WriteFileAtomically to MustWriteAtomic Callers of this function log the returned error and exit. So let's just log the error with the given filepath and the call stack inside the function itself and then exit. This simplifies the code at callers' place while leaves the same level of debuggability in case of errors.	2023-04-13 22:41:15 -07:00
Aliaksandr Valialkin	036a7b7365	lib/fs: replace MkdirAllIfNotExist->MustMkdirIfNotExist and MkdirAllFailIfExist->MustMkdirFailIfExist Callers of these functions log the returned error and then exit. The returned error already contains the path to directory, which was failed to be created. So let's just log the error together with the call stack inside these functions. This leaves the debuggability of the returned error at the same level while allows simplifying the code at callers' side. While at it, properly use MustMkdirFailIfExist instead of MustMkdirIfNotExist inside inmemoryPart.MustStoreToDisk(). It is expected that the inmemoryPart.MustStoreToDisk() must fail if there is already a directory under the given path.	2023-04-13 22:11:59 -07:00
Aliaksandr Valialkin	344209e5e6	lib/fs: rename MustWriteFileAndSync to MustWriteSync in order to improve readability a bit This is a follow-up for `2a8395be05`	2023-04-13 21:43:32 -07:00
Aliaksandr Valialkin	b15c5961ab	lib/{mergeset,storage}: remove unused `path` field from blockStreamWriter This is a follow-up after `42bba64aa7`	2023-04-13 21:39:59 -07:00
Aliaksandr Valialkin	2a8395be05	lib/fs: replace WriteFileAndSync with MustWriteAndSync When WriteFileAndSync fails, then the caller eventually logs the error message and exits. The error message returned by WriteFileAndSync already contains the path to the file, which couldn't be created. This information alongside the call stack is enough for debugging the issue. So just use log.Panicf("FATAL: ...") inside MustWriteAndSync(). This simplifies error handling at caller side a bit.	2023-04-13 21:33:19 -07:00
Aliaksandr Valialkin	25f089de9d	lib/{mergeset,storage}: properly fsync part directory listing after writing in-memory part to disk This is a follow-up after `42bba64aa7` Previously the part directory listing was fsync'ed implicitly inside partHeader.WriteMetadata() by calling fs.WriteFileAtomically(). Now it must be fsync'ed explicitly. There is no need in fsync'ing the parent directory, since it is fsync'ed by the caller when updating parts.json file.	2023-04-13 21:19:04 -07:00
Aliaksandr Valialkin	42bba64aa7	lib/{mergeset,storage}: explicitly fsync the created part directory listing Previously the created part directory listing was fsynced implicitly when storing metadata.json file in it. Also remove superfluous fsync for part directory listing, which was called at blockStreamWriter.MustClose(). After that the metadata.json file is created, so an additional fsync for the directory contents is needed.	2023-04-13 21:03:08 -07:00
Aliaksandr Valialkin	e1211a1187	app/vmstorage: deprecate -bigMergeConcurrency command-line flag Improperly configured -bigMergeConcurrency command-line flag usually leads to uncontrolled growth of unmerged parts, which, in turn, increases CPU usage and query durations. So it is better deprecating this flag. In rare cases -smallMergeConcurrency command-line flag can be used instead for controlling the concurrency of background merges.	2023-04-13 20:40:24 -07:00
Aliaksandr Valialkin	ca54e58c1f	lib/{fs,persistentqueue}: use filepath.Join() instead of concatenating path parts with `/` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4014	2023-04-13 20:13:45 -07:00
Aliaksandr Valialkin	90b876cd1e	app/vmbackupmanager: sync with enterprise-single-node branch after 41a54c775891c87e3d5ed59ff0769c869dd2fe71	2023-04-13 19:29:06 -07:00
Zakhar Bessarab	81f28f0f1f	lib/backup/actions: store metadata(creation and completion time) in backup files (#4117 ) This makes it easier to understand exact point in time which is included in this backup. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-04-12 18:51:27 +02:00
Haleygo	0ad6010c91	fix sort pendingDateMetricsIDs (#4102 )	2023-04-10 10:23:12 -07:00
Dmytro Kozlov	244c18fa38	app/vmctl: add multiple filters defined in `--vm-native-filter-match` flag to discovered metric names (#4063 ) * app/vmctl: add multiple filters defined in `--vm-native-filter-match` flag to discovered metric names * app/vmctl: fix comments * app/vmctl: move function buildMatchWithFilter to the correct place * app/vmctl: update CHANGELOG.md * app/vmctl: fix CI, remove error wrapping * app/vmctl: fix CI, simplify `Set()`	2023-04-06 15:06:52 -07:00
Aliaksandr Valialkin	593c151831	lib/encoding: fix test after `4725549cb2`	2023-04-05 21:38:37 -07:00
Aliaksandr Valialkin	19b189e9b7	lib/storage: use shorter code after `03bde173b7`	2023-04-02 21:35:52 -07:00
faceair	38fc55976e	lib/storage: fix reuse pendingMetricRow (#4049 )	2023-04-02 21:35:50 -07:00
faceair	f3af8331ec	lib/storage: remove unused code (#4050 )	2023-04-02 21:24:42 -07:00
Aliaksandr Valialkin	f638496298	lib/promscrape: do not re-use previously loaded scrape targets on failed attempt to load updated scrape targets at file_sd_configs The logic employed for re-using the previously loaded scrape target was broken initially. The commit `cc0427897c` tried to fix it, but the new logic became too complex and fragile. So it is better to just remove this logic, since the targets from temporarily broken file should be eventually loaded on next attempts every -promscrape.fileSDCheckInterval This also allows removing fragile hacks around __vm_filepath label. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3989	2023-04-02 21:05:28 -07:00
Dmytro Kozlov	cc0427897c	lib/promscrape: fix the problem with scrape work duplicates when file_sd_config can't be read (#4027 ) * lib/promscrape: fix the problem with scrape work duplicates when file_sd_config can't be read * lib/promscrape: clarified comment * lib/promscrape: made better approach to handle a problem with growing []ScrapeWork on each error when loading config lib/promscrape: added CHANGELOG.md * Update docs/CHANGELOG.md --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-04-02 20:26:13 -07:00
Roman Khavronenko	27b958ba8b	lib/storage: check for free disk space before opening tables (#4035 ) * lib/storage: check for free disk space before opening tables We check for free disk space before call to `openTable`, so `Storage` can be set to ReadOnly before mergeWorkers start. Before the change, there was a chance that merges will start even if Storage has to start in ReadOnly mode because of `-storage.minFreeDiskSpaceBytes` limit. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4023 Signed-off-by: hagen1778 <roman@victoriametrics.com> * lib/storage: chore Signed-off-by: hagen1778 <roman@victoriametrics.com> * Update lib/storage/storage.go --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-03-31 23:50:27 -07:00
Aliaksandr Valialkin	4d00107b92	lib/fs: follow-up for `ec45f1bc5f` Properly close response body before checking for the response code. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4034	2023-03-31 22:42:10 -07:00
Aliaksandr Valialkin	d577657fb7	lib/streamaggr: follow-up for `ff72ca14b9` - Make sure that the last successfully loaded config is used on hot-reload failure - Properly cleanup resources occupied by already initialized aggregators when the current aggregator fails to be initialized - Expose distinct vmagent_streamaggr_config_reload* metrics per each -remoteWrite.streamAggr.config This should simplify monitoring and debugging failed reloads - Remove race condition at app/vminsert/common.MustStopStreamAggr when calling sa.MustStop() while sa could be in use at realoadSaConfig() - Remove lib/streamaggr.aggregator.hasState global variable, since it may negatively impact scalability on system with big number of CPU cores at hasState.Store(true) call inside aggregator.Push(). - Remove fine-grained aggregator reload - reload all the aggregators on config change instead. This simplifies the code a bit. The fine-grained aggregator reload may be returned back if there will be demand from real users for it. - Check -relabelConfig and -streamAggr.config files when single-node VictoriaMetrics runs with -dryRun flag - Return back accidentally removed changelog for v1.87.4 at docs/CHANGELOG.md Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3639	2023-03-31 22:30:38 -07:00

2118 Commits