Commit Graph

2053 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
43b24164ef
all: add Windows build for VictoriaMetrics
This commit changes background merge algorithm, so it becomes compatible with Windows file semantics.

The previous algorithm for background merge:

1. Merge source parts into a destination part inside tmp directory.
2. Create a file in txn directory with instructions on how to atomically
   swap source parts with the destination part.
3. Perform instructions from the file.
4. Delete the file with instructions.

This algorithm guarantees that either source parts or destination part
is visible in the partition after unclean shutdown at any step above,
since the remaining files with instructions is replayed on the next restart,
after that the remaining contents of the tmp directory is deleted.

Unfortunately this algorithm doesn't work under Windows because
it disallows removing and moving files, which are in use.

So the new algorithm for background merge has been implemented:

1. Merge source parts into a destination part inside the partition directory itself.
   E.g. now the partition directory may contain both complete and incomplete parts.
2. Atomically update the parts.json file with the new list of parts after the merge,
   e.g. remove the source parts from the list and add the destination part to the list
   before storing it to parts.json file.
3. Remove the source parts from disk when they are no longer used.

This algorithm guarantees that either source parts or destination part
is visible in the partition after unclean shutdown at any step above,
since incomplete partitions from step 1 or old source parts from step 3 are removed
on the next startup by inspecting parts.json file.

This algorithm should work under Windows, since it doesn't remove or move files in use.
This algorithm has also the following benefits:

- It should work better for NFS.
- It fits object storage semantics.

The new algorithm changes data storage format, so it is impossible to downgrade
to the previous versions of VictoriaMetrics after upgrading to this algorithm.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3236
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3821
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70
2023-03-19 01:36:51 -07:00
Aliaksandr Valialkin
6460475e3b
lib/{mergeset,storage}: prevent from long wait time when creating a snapshot under high data ingestion rate
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3551
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3873
2023-03-19 00:15:30 -07:00
Aliaksandr Valialkin
a26c6628fd
lib/{fs,mergeset,storage}: substitute os.Open()+os.File.Readdir() with os.ReadDir()
This simplifies code a bit
2023-03-17 21:03:37 -07:00
Zakhar Bessarab
6a5d236245
lib/storage: log original labels set when label value is truncated (#3952)
lib/storage: log original labels set when label value is truncated
2023-03-14 10:59:40 +01:00
Nikolay
927d9da270
lib/storage: correctly handle io.EOF error for pre-fetched metrics (#3946)
io.EOF shouldn't be returned from this function. It breaks all search
API logic and may result in empty query results.
2023-03-11 23:29:43 -08:00
Nikolay
7a3e16e774
lib/netutil: fixes panic at proxy protocol (#3905)
it may occur if non proxy protocol message received by tcp server.
Listener Accept method must return only non-recoverable errors.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3335
2023-03-07 08:50:18 -08:00
Nikolay
6bfe9cc733
lib{mergset,storage}: prevent possible race condition with logging st… (#3900)
lib{mergset,storage}: prevent possible race condition with logging stats for merges

Previously partwrapper could be release by background process and reference for part may be invalid 
during logging stats. It will lead to panic at vmstorage
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3897
2023-03-03 12:33:42 +01:00
Haleygo
d056be710b
fix some typo (#3898) 2023-03-03 11:02:13 +01:00
Aliaksandr Valialkin
46127b432d
lib/bytesutil: add -internStringDisableCache and -internStringCacheExpireDuration command-line flags
This commit is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3872
2023-02-27 14:16:49 -08:00
Aliaksandr Valialkin
0d3f31f60e
lib/storage: follow-up for 39cdc546dd
- Use flag.Duration instead of flagutil.Duration for -snapshotCreateTimeout,
  since the flagutil.Duration is intended mostly for big durations, e.g. days, months and years,
  while the -snapshotCreateTimeout is usually smaller than one hour.
- Add links to https://docs.victoriametrics.com/#how-to-work-with-snapshots in docs/CHANGELOG.md,
  so readers could easily find the corresponding docs when reading the changelog.
- Properly remove all the created directories on unsuccessful attempt to create
  snapshot in Storage.CreateSnapshot().

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3551
2023-02-27 13:07:38 -08:00
Zakhar Bessarab
39cdc546dd
lib/storage: enhancements for snapshots process (#3873)
* lib/{fs,mergeset,storage}: skip `.must-remove.` dirs when creating snapshot (#3858)

* lib/{mergeset,storage}: add timeout configuration for snapshots creation, remove incomplete snapshots from storage

* docs: fix formatting

* app/vmstorage: add metrics to track status of snapshots

* app/vmstorage: use `vm_http_requests_total` metric for snapshot endpoints metrics, rename new flag to make name more clear

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* app/vmstorage: update flag name in docs

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* app/vmstorage: reflect new metrics names change in docs

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-02-27 12:12:03 -08:00
Zakhar Bessarab
5fadd58cf6
lib/promscrape: correctly register vm_promscrape_config_* metrics (#3876)
* lib/promscrape: set `vm_promscrape_config_last_reload_successful` to 1 if there was no promscrape config provided

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/promscrape: register `vm_promscrape_config_*` metrics only in case promscrape config is used

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-02-27 11:53:53 -08:00
Aliaksandr Valialkin
1a6f2f07fd
lib/httpserver: use github.com/klauspost/compress/gzhttp for compressing http responses
This allows removing gzip-related code from lib/httpserver.
2023-02-27 10:33:43 -08:00
Aliaksandr Valialkin
f7ef80aaad
.golangci.yml: properly enable revive linter and fix all the warnings it detects 2023-02-26 12:18:59 -08:00
Aliaksandr Valialkin
ffa327d6d1
app/vmagent: use the provided auth options when checking whether the remote storage supports VictoriaMetrics remote write protocol
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3847
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1225
2023-02-26 12:07:47 -08:00
Zakhar Bessarab
d8eaa511b0
lib/{fs,mergeset,storage}: skip .must-remove. dirs when creating snapshot (#3858) (#3867) 2023-02-24 12:38:42 -08:00
Aliaksandr Valialkin
c6ad3692ad
lib/promscrape: follow-up for 43e104a83f
- Return immediately on context cancel during the backoff sleep.
  This should help with https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3747

- Add a comment describing why the second attempt to obtain the response from remote side
  is perfromed immediately after the first attempt.

- Remove fasthttp dependency from lib/promscrape/discoveryutils

- Set context deadline before calling doRequestWithPossibleRetry().
  This simplifies the doRequestWithPossibleRetry() a bit.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3293
2023-02-24 12:20:42 -08:00
Zakhar Bessarab
43e104a83f
fix: do not use exponential backoff for first retry of scrape request (#3824)
* fix: do not use exponential backoff for first retry of scrape request (#3293)

* lib/promscrape: refactor `doRequestWithPossibleRetry` backoff to simplify logic

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* Update lib/promscrape/client.go

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* lib/promscrape: refactor `doRequestWithPossibleRetry` to make it more straightforward

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2023-02-24 11:39:56 -08:00
Aliaksandr Valialkin
c734416f86
lib/protoparser: fix golangci-lint warning after f579cac297 2023-02-23 18:50:34 -08:00
Aliaksandr Valialkin
c080443fef
app/vmagent: automatically detect whether the remote storage supports VictoriaMetrics remote write protocol
Substitute -remoteWrite.useVMProto with -remoteWrite.forcePromProto command-line flag,
which can be used for forcing Prometheus remote write protocol in cases when the remote storage
supports VictoriaMetrics remote write protocol.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3847
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1225
2023-02-23 17:36:55 -08:00
Aliaksandr Valialkin
e688121de8
lib/promscrape/discovery/kuma: substitute blocking HTTP call with non-blocking HTTP call at discoveryutils.Client 2023-02-23 15:13:08 -08:00
Mattias Ängehov
6d019a3c37
Azure Service Discovery - Fix token fetch for Container Apps/App Services (#3832)
* Modify API version when running in Container App

* Handle expires on from token response

Response from IMDS does not always contain expires in value which is
currently used to get the token expiry time. An example resources that
doesn't provide it are Container Apps and App Service.

Signed-off-by: Mattias Ängehov <mattias.angehov@castoredc.com>

* Fix client id parameter for user assigned identity

* Apply suggestions from code review

---------

Signed-off-by: Mattias Ängehov <mattias.angehov@castoredc.com>
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2023-02-22 19:19:53 -08:00
Aliaksandr Valialkin
510f78a96b
all: consistently use http.Method{Get,Post,Put} across the codebase
This is a follow-up after 9dec3c8f80
2023-02-22 18:58:46 -08:00
my-git9
9dec3c8f80
chore: Use http constants to replace numbers (#3846)
Signed-off-by: xin.li <xin.li@daocloud.io>
2023-02-22 18:53:05 -08:00
Aliaksandr Valialkin
9fbd45a22f
lib/promscrape/discovery/kuma: follow-up for 317fef95f9
- Do not generate __meta_server label, since it is unavailable in Prometheus.
- Add a link to https://docs.victoriametrics.com/sd_configs.html#kuma_sd_configs to docs/CHANGELOG.md,
  so users could click it and read the docs without the need to search the corresponding docs.
- Remove kumaTarget struct, since it is easier generating labels for discovered targets
  directly from the response returned by Kuma. This simplifies the code.
- Store the generated labels for discovered targets inside atomic.Value. This allows reading them
  from concurrent goroutines without the need to use mutex.
- Use synchronouse requests to Kuma instead of long polling, since there is a little sense
  in the long polling when the Kuma server may return 304 Not Modified response every -promscrape.kumaSDCheckInterval.
- Remove -promscrape.kuma.waitTime command-line flag, since it is no longer needed when long polling isn't used.
- Set default value for -promscrape.kumaSDCheckInterval to 30s in order to be consistent with Prometheus.
- Remove unnecessary indirections for string literals, which are used only once, in order to improve code readability.
- Remove unused fields from discoveryRequest and discoveryResponse.
- Update tests.
- Document why fetch_timeout and refresh_interval options are missing in kuma_sd_config.
- Add docs to discoveryutils.RequestCallback and discoveryutils.ResponseCallback,
  since these are public types.

Side notes: it is weird that Prometheus implementation for kuma_sd_configs sets `instance` label,
since usually this label is set by the Prometheus itself to __address__ after the relabeling phase.
See https://www.robustperception.io/life-of-a-label/

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3389

See https://github.com/prometheus/prometheus/issues/7919
and https://github.com/prometheus/prometheus/pull/8844
as a reference implementation in Prometheus
2023-02-22 17:51:51 -08:00
Aliaksandr Valialkin
eb08579452
lib/promscrape/discovery: add a comment explaining why duplicates are removed from the generated target labels 2023-02-22 17:51:51 -08:00
Zakhar Bessarab
d2b92d3264
lib/promscrape: fix cancelling in-flight scrape requests during configuration reload (#3853)
* lib/promscrape: fix cancelling in-flight scrape requests during configuration reload (see #3747)

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/promscrape: fix order of params for `doRequestWithPossibleRetry` to follow codestyle

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/promscrape: accept deadline explicitly and extend passed context for local use

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-02-22 17:05:16 +01:00
Alexander Marshalov
317fef95f9
add kuma_sd_config for Kuma Control Plane targets discovery (#3389) (#3840) 2023-02-22 13:59:56 +01:00
Aliaksandr Valialkin
76f2c70be3
app/vmagent: add support for VictoriaMetrics remote write protocol, which allows saving up to 10x on network bandwidth costs under high load
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1225
2023-02-20 19:11:30 -08:00
Aliaksandr Valialkin
5c4f5b83fc
all: rename ParseStream -> stream.Parse
This is a follow-up for 057698f7fb
2023-02-13 10:52:05 -08:00
Aliaksandr Valialkin
ccdddf7996
lib/protoparser/promremotewrite: extract stream parsing code into a separate stream package
This is a follow-up for 057698f7fb
2023-02-13 10:46:54 -08:00
Aliaksandr Valialkin
9be1398b92
lib/protoparser/native: extract stream parsing code into a separate stream package
This is a follow-up for 057698f7fb
2023-02-13 10:43:05 -08:00
Aliaksandr Valialkin
8830607021
lib/protoparser/graphite: extract stream parsing code into a separate stream package 2023-02-13 10:32:36 -08:00
Aliaksandr Valialkin
a646841c07
lib/protoparser/csvimport: extract stream parsing code into a separate stream package
This is a follow-up for 057698f7fb
2023-02-13 10:25:46 -08:00
Aliaksandr Valialkin
7568658c19
lib/protoparser/vmimport: extract stream parsing code into a separate stream package
This is a follow-up for 057698f7fb
2023-02-13 10:20:19 -08:00
Aliaksandr Valialkin
af37717108
lib/protoparser/opentsdbhttp: extract stream parsing code into a separate stream package
This is a follow-up for 057698f7fb
2023-02-13 10:16:03 -08:00
Aliaksandr Valialkin
7720d403c0
lib/protoparser/opentsdb: extract stream parsing code into a separate stream package
This is a follow-up for 057698f7fb
2023-02-13 10:03:16 -08:00
Aliaksandr Valialkin
fe196e0b7a
lib/protoparser/influx: extract stream parsing code into a separate stream package
This is a follow-up for 057698f7fb
2023-02-13 09:58:52 -08:00
Aliaksandr Valialkin
f83d6d69b2
lib/protoparser/datadog: extract stream parsing code into a separate stream package
This is a follow-up for 057698f7fb
2023-02-13 09:51:47 -08:00
Roman Khavronenko
057698f7fb
lib/protoparser/prometheus: move streamparser to subpackage (#3814)
`lib/protoparser/prometheus` is used by various applications,
such as `app/vmalert`. The recent change to the
`lib/protoparser/prometheus` package introduced a new dependency
of `lib/writeconcurrencylimiter` which exposes some metrics.
Because of the dependency, now all applications which have this
dependency also expose these metrics.

Creating a new `lib/protoparser/prometheus/stream` package helps
to remove these metrics from apps which use `lib/protoparser/prometheus`
as dependency.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3761

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-02-13 09:26:07 -08:00
Droxenator
8ea02eaa8e
fixed opentsdbListenAddr timestamp conversion (#3810)
Co-authored-by: Andrei Ivanov <a.ivanov@corp.mail.ru>
2023-02-13 16:07:53 +01:00
Oleksandr Redko
9fff48c3e3
app,lib: fix typos in comments (#3804) 2023-02-13 13:27:13 +01:00
Aliaksandr Valialkin
f9b3409ee3
lib/promscrape/discovery/openstack: use port 80 for the discovered target by default if it isnt specified in the config 2023-02-11 14:41:58 -08:00
Aliaksandr Valialkin
3ec8a4dc80
lib/{mergeset,storage}: allow at least 3 concurrent flushes during background merges on systems with 1 or 2 CPU cores
This should prevent from data ingestion slowdown and query performance degradation
on systems with small number of CPU cores (1 or 2), when big merge is performed.

This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3790

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337
2023-02-11 12:08:52 -08:00
Zakhar Bessarab
f13a255918
lib/promscrape: fix cancelling in-flight scrape requests during configuration reload (#3791)
* lib/promscrape: fix cancelling in-flight scrape requests during configuration reload when using `streamParse` mode (see #3747)

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* Update docs/CHANGELOG.md

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-02-09 11:13:06 -08:00
Aliaksandr Valialkin
a8e88e74cc
lib/backup/azremote: fix after upgrading github.com/Azure/azure-sdk-for-go/sdk/storage/azblob from v0.6.1 to v1.0.0 2023-02-08 09:18:23 -08:00
Karan Sharma
146fd2eca3
sd/nomad: panic in nomad watcher because of nil map (#3784)
properly initialize url.Values
2023-02-08 09:43:29 +01:00
Aliaksandr Valialkin
67b01329a0
lib/writeconcurrencylimiter: initialize concurrencyLimitCh before exporting vm_concurrent_insert_capacity and vm_concurrent_insert_current metrics
This will result in proper calculations for the the alerting rule:

 avg_over_time(vm_concurrent_insert_current[1m]) >= vm_concurrent_insert_capacity

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3761
2023-02-07 11:08:17 -08:00
Aliaksandr Valialkin
8b9ebf625a
lib/promscrape: add a comment explaining the logic behind adding exported_ perfix to metric names
This is a follow-up for 7b87fac8e7

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3557
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3406
2023-02-01 12:00:52 -08:00
Dmytro Kozlov
7b87fac8e7
lib/promscrape: fix honor_labels behavior (#3739)
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-02-01 11:21:44 -08:00
Nikolay
9254e494f9
lib/storage: fixes finalDedup for backfilled data (#3737)
previously historical data backfilling may trigger force merge for previous month every hour
it consumes cpu, disk io and decrease cluster performance.
Following commit fixes it by applying deduplication for InMemoryParts
2023-02-01 09:54:21 -08:00
Aliaksandr Valialkin
ac8bc77688
lib/bytesutil/internstring.go: increase the limit on the maximum string lengths, which can be interned
The limit has been increased from 300 bytes to 500 bytes according to the collected production stats.
This allows reducing CPU usage without significant increase of RAM usage in most practical cases.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3692
2023-01-31 10:56:55 -08:00
Aliaksandr Valialkin
0788be35eb
lib/promscrape/discovery/azure: add __meta_azure_machine_size label in the same way as Prometheus does
See https://github.com/prometheus/prometheus/pull/11650
2023-01-27 17:07:12 -08:00
Aliaksandr Valialkin
ab57b92932
lib/promscrape/discovery/kubernetes: add support for __meta_kubernetes_pod_container_id
See https://github.com/prometheus/prometheus/issues/11843
and https://github.com/prometheus/prometheus/pull/11844
2023-01-27 16:34:06 -08:00
Aliaksandr Valialkin
1b81d8f542
lib/netutil: move IsTrivialNetworkError() function there, since it is used in multiple places across the code 2023-01-27 13:24:30 -08:00
Aliaksandr Valialkin
eedb294754
lib/netutil: typo fix in the error message 2023-01-27 10:38:38 -08:00
Aliaksandr Valialkin
28d92a2f31
lib/netutil: limit the time needed for reading proxy protocol headers
This should prevent from misconfigured proxies and from possible Slowloris-type DoS attacks
(see https://en.wikipedia.org/wiki/Slowloris_(computer_security) )

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3335
2023-01-26 23:46:51 -08:00
Nikolay
73256fe438
lib/netutil: init implimentation of proxy protocol (#3687)
* lib/netutil: init implimentation of proxy protocol
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3335

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-01-26 23:08:35 -08:00
Nikolay
465a285324
lib/storage: properly release parts inMerge lock (#3711)
if storage doesn't have enough disk space, finalDedupWatcher holds inMerge lock for all parts and never release it until storage restart
2023-01-26 08:05:20 -08:00
Aliaksandr Valialkin
d655d6b047
lib/streamaggr: add ability to de-duplicate input samples before aggregation 2023-01-25 09:14:49 -08:00
Roman Khavronenko
c7c4786f3f
discover/ec2: bump API version (#3702)
Switch to the actual API version `2016-11-15`,
since the old version doesn't provide access to all
the fields which implementation expects.
For example, old API missing `zone_id` field
in `DescribeAvailabilityZonesResponse` response.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3700

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-01-24 10:42:55 +01:00
Aliaksandr Valialkin
a971bcc3fe
lib/bytesutil: do not intern long strings, since they may need big amounts of additional memory for the cache
Allow users fine-tuning the maximum string length for interning via -internStringMaxLen command-line flag.
This may be used for fine-tuning RAM vs CPU usage for certain workloads.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3692
2023-01-23 23:36:22 -08:00
Aliaksandr Valialkin
f7acdb13db
app/{vmagent,vminsert}: follow-up for 1cfa183c2b
- Call httpserver.GetQuotedRemoteAddr() and httpserver.GetRequestURI() only when the error occurs.
  This saves CPU time on fast path when there are no parsing errors.
- Create a helper function - httpserver.LogError() - for logging the error with the request uri and remote addr context.
2023-01-23 22:26:53 -08:00
Artem Navoiev
1cfa183c2b
add error handler for parsing prometheus text format to vmagent and v… (#3693)
* add error handler for parsing prometheus text format to vmagent and vminsert

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

* fix typo

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

* typo

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

* fix variables naming and error message

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2023-01-23 22:14:34 -08:00
Aliaksandr Valialkin
babecd8363
lib/promscrape: follow-up for 393876e52a
- Document the change in docs/CHANGELOG.md
- Reduce memory usage when sending stale markers even more by parsing the response in stream parsing mode
- Update the TestSendStaleSeries

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3668
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3675
2023-01-23 21:52:59 -08:00
Roman Khavronenko
393876e52a
lib/promscrape: limit number of sent stale series at once (#3686)
Stale series are sent when there is a difference between current
and previous scrapes. Those series which disappeared in the current scrape
are marked as stale and sent to the remote storage.

Sending stale series requires memory allocation and in case when too many
series disappear in the same it could result in noticeable memory spike.
For example, re-deploy of a big fleet of service can result into
excessive memory usage for vmagent, because all the series with old
pod name will be marked as stale and sent to the remote write storage.

This change limits the number of stale series which can be sent at once,
so memory usage remains steady.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3668
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3675
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-01-23 21:15:59 -08:00
Aliaksandr Valialkin
2c4e384f07
lib/promscrape: properly log the actual response size after c4229a1bba 2023-01-23 21:04:50 -08:00
Aliaksandr Valialkin
ba5a6c851c
lib/storage: use deterministic random generator in tests
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3683
2023-01-23 20:10:32 -08:00
Aliaksandr Valialkin
1a3a6ef907
lib/mergeset: use deterministic random generator in tests
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3683
2023-01-23 19:43:49 -08:00
Aliaksandr Valialkin
7030429958
lib/mergeset: fix data race in BenchmarkInmemoryBlockMarshal 2023-01-23 19:43:18 -08:00
Aliaksandr Valialkin
a11dc6689a
lib/decimal: use consistent randomizer in tests
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3683
2023-01-23 19:23:39 -08:00
Aliaksandr Valialkin
0a4d8dc777
lib/uint64set: use repeatable randomizer in tests
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3683
2023-01-23 19:22:58 -08:00
Aliaksandr Valialkin
3d1cb011b6
lib/encoding: make deterministic tests which rely on math/rand
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3683
2023-01-23 18:41:09 -08:00
Tobias Jungel
777038fe44
app/vmbackup: prevent password leaks (#3672)
This prevents vmbackup from leaking passwords into logs like shown below.

2023-01-11T15:00:01.050Z        info    VictoriaMetrics/lib/logger/flag.go:12   build version: vmbackup-20221214-211706-tags-v1.85.1-0-g09a70d3e9
2023-01-11T15:00:01.050Z        info    VictoriaMetrics/lib/logger/flag.go:13   command-line flags
2023-01-11T15:00:01.050Z        info    VictoriaMetrics/lib/logger/flag.go:20     -dst="fs:///vm-backups/latest"
2023-01-11T15:00:01.050Z        info    VictoriaMetrics/lib/logger/flag.go:20     -snapshot.createURL="http://user:super_sercret123@victoriametricspshot/create"
2023-01-11T15:00:01.050Z        info    VictoriaMetrics/lib/logger/flag.go:20     -storageDataPath="/storage"
2023-01-11T15:00:01.050Z        info    VictoriaMetrics/app/vmbackup/main.go:53 Snapshot create url http://user:super_sercret123@victoriametrics:8428/snapshot/create
2023-01-11T15:00:01.050Z        info    VictoriaMetrics/app/vmbackup/main.go:60 Snapshot delete url http://user:super_sercret123@victoriametrics:8428/snapshot/delete
2023-01-18 11:35:21 -08:00
Aliaksandr Valialkin
2ac530eb28
lib/{storage,mergeset}: wake up background merges as soon as there is a potential work for them
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3647
2023-01-18 01:10:18 -08:00
Aliaksandr Valialkin
b8409d6600
lib/{storage,mergeset}: do not run assisted merges when flushing pending samples to parts
Assisted merges are intended to be performed by goroutines, which accept the incoming samples,
in order to limit the data ingestion rate.

The worker, which converts pending samples to parts, shouldn't be penalized by assisted merges,
since this may result in increased number of pending rows as seen at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3647#issuecomment-1385039142
when the assisted merge takes too much time.
2023-01-18 00:20:58 -08:00
Aliaksandr Valialkin
1ac025bbc9
lib/storage: use better naming for a function returning new []rawRows - newRawRowsBlock() -> newRawRows() 2023-01-18 00:01:03 -08:00
Aliaksandr Valialkin
68463c9e87
lib/promscrape: follow-up for d79f1b106c
- Document the fix at docs/CHANGELOG.md
- Limit the concurrency for sendStaleMarkers() function in order to limit its memory usage
  when big number of targets disappear and staleness markers are sent
  for all the metrics exposed by these targets.
- Make sure that the writeRequestCtx is returned to the pool
  when there is no need to send staleness markers.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3668
2023-01-17 23:11:56 -08:00
lzfhust
d79f1b106c
using writeRequestCtxPool when delete kubernetes clusters from kubernetes_sd_configs (#3669) 2023-01-17 22:57:56 -08:00
Zakhar Bessarab
322d96bfe5
discovery/{consul,nomad}: fix cancelling serviceWatcher in-flight requests (#3658)
* lib/promscrape/discovery/{consul,nomad}: fix background service update watches not canceling requests on serviceWatcher stop

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/promscrape/discovery/{consul,nomad}: fix closing serviseWatcher during scrape job restart

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* wip

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3468

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-01-17 21:47:11 -08:00
Scott Kevill
46b3b76d6d
lib/fs: use unix.Statfs() / unix.Statvfs() when using a path (#3663) 2023-01-17 21:19:26 -08:00
Aliaksandr Valialkin
289af65071
lib/promscrape: properly apply series limit
Fix the following issues:

- Series limit wasn't applied when staleness tracking was disabled.
- Series limit didn't prevent from sending staleness markers for new series exceeding the limit.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3660

Thanks to @hagen1778 for the initial attempt to fix the issue
at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3665
2023-01-17 10:14:49 -08:00
Aliaksandr Valialkin
09d7fa2737
lib/{mergeset,storage}: do not slow down concurrently executed queries during assisted merges
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3647
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3641
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
2023-01-16 14:31:52 -08:00
Nikolay
20f28eb9d6
/lib/promscrape: use correct err logger for scrape unmarshalling (#3645)
/lib/promscrape: use correct err logger for scrape unmarshalling
It correctly suppresses scrape errors and adds correct context for err msg
2023-01-12 17:40:42 +01:00
Aliaksandr Valialkin
e2498af530
lib/promscrape: log the number of unsuccessful scrapes during the last -promscrape.suppressScrapeErrorsDelay
This commit is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3413
Thanks to @jelmd for the pull request.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2575
2023-01-12 01:09:32 -08:00
Aliaksandr Valialkin
ec23ab6bc2
lib/promscrape/discovery: missing changes after b4ad3a3b4c 2023-01-11 23:02:45 -08:00
Aliaksandr Valialkin
b4ad3a3b4c
lib/promscrape: follow-up for 8537533beb
- Add a comment describing the purpose of the `role` field inside `apiConfig` struct
- Revert changes at lib/promscrape/discovery/dockerswarm/dockerswarm.go ,
  since they reduce code readability. E.g. the reader needs to look up the named string constants
  in order to get their values.
2023-01-11 22:54:18 -08:00
Zakhar Bessarab
8537533beb
lib/promscrape/discovery/dockerswarm: fix discovery filters being applied to all objects (#3632)
* lib/promscrape/discovery/dockerswarm: fix discovery filters being applied to all objects

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* Update docs/CHANGELOG.md

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-01-11 22:50:34 -08:00
Aliaksandr Valialkin
95ce1ba6ce
lib/httpserver: directly pass flag value to CheckAuthFlag()
There is no sense in passing a pointer to flag value there.

This is a follow-up for 4225a0bd75
2023-01-10 15:52:23 -08:00
Zakhar Bessarab
4225a0bd75
Use httpAuth.* flags as a fallback for endpoints protected by *AuthKey flags (#3582)
* {lib/server, app/}: use `httpAuth.*` flag as fallback for `*AuthKey` if it is not set

* lib/ingestserver/opentsdbhttp: fix opentdb HTTP handler not respecting `httpAuth.*` flags

* Apply suggestions from code review

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-01-10 15:46:13 -08:00
Aliaksandr Valialkin
cbe62f23ba
lib/promscrape/discovery/gce: follow-up for b2ccdaaa2f
- Use promutils.Labels.GetLabels() instead of comparing promutils.Labels.Labels to nil.
  This make the code more consistent with other places.

- Mention the release where the issue has been introduced at docs/CHANGELOG.md.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3624
2023-01-10 13:51:03 -08:00
Zakhar Bessarab
b2ccdaaa2f
lib/promscrape/discovery/gce: fix crash in case instance does not have any labels set (#3625)
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-01-10 11:07:11 +01:00
Aliaksandr Valialkin
2f3ddd4884
app/vmselect/promql: avoid memory allocations and copying from source timeseries to the returned result at timeseriesToResult() 2023-01-09 22:38:59 -08:00
Aliaksandr Valialkin
7afcca0c51
all: use metricsql.CompileRegexp instead of regexp.Compile for compiling regexps used in graphite queries
This should speed up repeated queries, since metricsql.CompileRegexp returns regexps from the cache
on subsequent calls for the same input regexp.
2023-01-09 21:43:08 -08:00
Aliaksandr Valialkin
e5eca54951
lib/promscrape/discovery/nomad: sync nomad_sd_configs fields with the Prometheus implementation
See the list of configs supported by Prometheus at f88a0a7d83/discovery/nomad/nomad.go (L76-L84)

- Removed "token" option. In can be set either via NOMAD_TOKEN env var or via `bearer_token` config option.
- Removed "scheme" option. It is automatically detected depending on whether the `tls_config` is set.
- Removed "services" and "tags" options, since they aren't supported by Prometheus.
- Added "region" option. If it is missing, then the region is read from NOMAD_REGION env var.
  If this var is empty, then it is set to "global" in the same way as Nomad client does.
  See 865ee8d37c/api/api.go (L297)
  and 865ee8d37c/api/api.go (L555-L556)
- If the "server" option is missing, then it is read from NOMAD_ADDR in the same way
  as Nomad client does - see 865ee8d37c/api/api.go (L294-L296)

This is a follow-up for 8aee209c53

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3367
2023-01-09 21:14:48 -08:00
Roman Khavronenko
8aee209c53
lib/promscrape: remove datacenter field from nomad_sd_config (#3612)
Looks like `datacenter` field isn't part of `/v1/services` API.
See https://developer.hashicorp.com/nomad/api-docs/services#list-services
and https://developer.hashicorp.com/nomad/api-docs/services#read-service

Related issues:
https://github.com/traefik/traefik/issues/9109
https://github.com/prometheus/prometheus/issues/11776

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-01-09 09:07:40 +01:00
Aliaksandr Valialkin
28f8dc41b0
lib/promscrape/discoveryutils: cleanup after 5df9fddaf2
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3468
2023-01-07 01:26:54 -08:00
Zakhar Bessarab
5df9fddaf2
lib/promscrape/discoveryutils: use correct timeout for blocking requests (#3609)
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-01-07 01:13:03 -08:00
Aliaksandr Valialkin
41e00a0df7
lib/storage: simplify the fix from 488940502c
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3566
2023-01-07 01:04:43 -08:00
Dmytro Kozlov
488940502c
lib/storage: fix returning camelcase label names (#3608)
* lib/storage: fix returning camelcase label names

* doc: add change log

* Update docs/CHANGELOG.md

* Update docs/CHANGELOG.md

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-01-07 00:50:14 -08:00
Aliaksandr Valialkin
5fe7ff24c2
lib/streamaggr: limit the the number of concurrent flushes of the aggregate data to the exact number of available CPUs
This should reduce the maximum memory usage during concurrent flushes of the aggregate data
2023-01-07 00:18:51 -08:00
Aliaksandr Valialkin
ad5bfe3089
lib/promscrape: reduce the number of concurrently executed processScrapedData calls from 2x of the number of CPUs to the number of CPUs
This should reduce the maximum memory usage for processScrapedData() function by 2x.
The only part, which can be IO-bound in the processScrapedData() is pushData() call,
when it buffers data to persistent queue if the remote storage cannot keep up
with the data ingestion speed. In this case it is OK if the scrape pace will be limited.
2023-01-07 00:14:30 -08:00
Aliaksandr Valialkin
af263fe881
all: small improvements in error messages and command-line flag descriptions related to concurrency limiters 2023-01-07 00:11:44 -08:00
Aliaksandr Valialkin
45f39e291e
lib/writeconcurrencylimiter: moved the error generation from incConcurrency() to the caller place 2023-01-06 23:45:58 -08:00
Aliaksandr Valialkin
986a05e18d
lib/promscrape: limit the concurrency during parsing and relabeling the scraped samples
This should reduce memory usage when scraping big number of targets,
since this limits the summary memory usage during concurrent parsing and relabeling
by the number of available CPU cores.
2023-01-06 22:59:17 -08:00
Aliaksandr Valialkin
5c4bd4f7c1
lib/streamaggr: limit the number of concurrent flushes of aggregate metrics in order to limit memory usage 2023-01-06 22:39:13 -08:00
Aliaksandr Valialkin
c63755c316
lib/writeconcurrencylimiter: improve the logic behind -maxConcurrentInserts limit
Previously the -maxConcurrentInserts was limiting the number of established client connections,
which write data to VictoriaMetrics. Some of these connections could be idle.
Such connections do not consume big amounts of CPU and RAM, so there is a little sense in limiting
the number of such connections. So now the -maxConcurrentInserts command-line option
limits the number of concurrently executed insert requests, not including idle connections.

It is recommended removing -maxConcurrentInserts command-line option, since the default value
for this option should work good for most cases.
2023-01-06 22:20:19 -08:00
Aliaksandr Valialkin
463b957e54
lib/promscrape/discovery/{consul,nomad}: wait until the deleted serviceWatchers are stopped inside updateServices() call
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3468
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3367
2023-01-05 21:52:33 -08:00
Aliaksandr Valialkin
f392913d00
lib/promscrape: follow-up after bced9fb978
- Document the bugfix at docs/CHANGELOG.md
- Wait until all the worker goroutines are done in consulWatcher.mustStop()
- Do not log `context canceled` errors when discovering consul serviceNames
- Removed explicit handling of gzipped responses at lib/promscrape/discoveryutils.Client,
  since this handling is automatically performed by net/http.Transport.
  See DisableCompression option at https://pkg.go.dev/net/http#Transport .
- Remove explicit handling of the proxyURL, since it is automatically handled
  by net/http.Transport. See Proxy option at https://pkg.go.dev/net/http#Transport .
- Expliticly set MaxIdleConnsPerHost, since its default value equals to 2.
  Such a small value may result in excess tcp connection churn
  when more than 2 concurrent requests are processed by lib/promscrape/discoveryutils.Client.
- Do not set explicitly the `Host` request header, since it is automatically set by net/http.Client.
- Backport the bugfix to the recently added nomad_sd_configs - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3367

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3468
2023-01-05 21:13:06 -08:00
Zakhar Bessarab
bced9fb978
lib/promscrape/discoveryutils: switch to native http client from fasthttp (#3568) 2023-01-05 19:34:47 -08:00
Roman Khavronenko
5bdd880142
vmstorage: add more context to the flock acquiring msg (#3584)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3578

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-01-05 18:30:42 -08:00
Aliaksandr Valialkin
9f348cf8a1
lib/promscrape/discovery/nomad: follow-up after 48f371a46c
- Remove undocumented `username` and `password` config options from `nomad_sd_config`.
  TODO: probably, remove these options from `consul_sd_config` too?
  These options exist there for backwards compatibility purposes.

- Add __meta_nomad_service_alloc_id and __meta_nomad_service_job_id meta-labels
  These labels contain AllocID and JobID fields for the discovered Nomad services.

- Various typo fixes.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3367
2023-01-05 18:07:20 -08:00
Aliaksandr Valialkin
1a28f0e5b3
lib/promrelabel: pass query args via query string at /metric-relabel-debug and /target-relabel-debug pages if their length doesnt exceed 1000
This allows copy-n-pasting the url to another browser window and seeing the same result.

The limit in 1000 chars is selected in order to prevent from potential issues with systems
which limit the url length such as Internet Explorer - see https://stackoverflow.com/questions/812925/what-is-the-maximum-possible-length-of-a-query-string

If the limit is exceeded, then query args are sent via POST method and aren't visible in the url.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3580
2023-01-05 16:48:04 -08:00
Karan Sharma
48f371a46c
lib/promscrape: add Prometheus-compatible service discovery for Nomad (#3549)
Add nomad_sd_config support for service discovery
2023-01-05 23:03:58 +01:00
Zakhar Bessarab
185cdcd813
lib/promscrape/discovery/dockerswarm: fix query encoding of filters (#3586)
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-01-05 03:34:25 -08:00
Aliaksandr Valialkin
0dea3b71da
lib/promscrape: pre-fetch metric_relabel_configs rules when debugging metric relabeling for a particular target
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3407
2023-01-05 03:26:49 -08:00
Aliaksandr Valialkin
a1076abcbf
lib/promscrape: follow-up for a7e29c38bc
- Document the bugfix at docs/CHANGELOG.md
- Make the fix more durable against future changes when droppedTargetsMap.Register may be called from other places.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3580
2023-01-05 02:52:08 -08:00
Zakhar Bessarab
a7e29c38bc
lib/promscrape/targetstatus: fix crash during droppedTarget registration (#3595)
* lib/promscrape/targetstatus: fix crash during droppedTarget registration in case original labels are not present

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/promscrape/targetstatus: address review comment

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-01-05 02:39:31 -08:00
Aliaksandr Valialkin
0e1f0ade31
lib/streamaggr: sort by and without labels in the aggregate output metric name
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3460
2023-01-05 02:08:44 -08:00
Aliaksandr Valialkin
66947ee5a2
lib/streamaggr: remove unused fields 2023-01-04 13:33:46 -08:00
Aliaksandr Valialkin
5bca3a5be2
app/vmselect: remove dependency on lib/promscrape from app/vmselect 2023-01-03 23:28:27 -08:00
Aliaksandr Valialkin
fa13bbc48a
app/{vmagent,vminsert}: add support for streaming aggregation
See https://docs.victoriametrics.com/stream-aggregation.html

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3460
2023-01-03 22:19:21 -08:00
Aliaksandr Valialkin
add2c4bf07
lib/bytesutil: add InternBytes() function as a shortcut to InternString(ToUnsafeString(..)) 2023-01-03 22:16:22 -08:00
Aliaksandr Valialkin
7b264b0c23
lib/promrelabel: allow calling Match on nil IfExpression
This simplifies the caller side of IfExpression
2023-01-03 21:44:03 -08:00
Roman Khavronenko
2cedb3e883
csvimport: support empty values (#3565)
Before, if the imported line contained multiple metrics and one
or more of them had an empty values - the whole line was ignored.

Now, only metrics with empty values are ignored, and the rest
of the metrics are accepted successfully.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3540

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-12-29 11:52:10 -08:00
Aliaksandr Valialkin
c4229a1bba
lib/promscrape: log the actual response size in the error message when the response size exceeds -promscrape.maxScrapeSize
This is a follow-up for 7ad9fff7e5
2022-12-28 14:42:11 -08:00
Aliaksandr Valialkin
1b16118e17
lib/{storage,mergeset}: tune the threshold for assisted merge
The https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425#issuecomment-1359117221
reveals that CPU usage for incoming queries may significantly increase when the number
of in-memory parts becomes too big.

This commit reduces the maximum number of in-memory parts before starting the assisted merge
during data ingestion. This should reduce CPU usage for incoming queries,
since they need to inspect lower number of in-memory parts.

This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425
2022-12-28 14:39:24 -08:00
Clément Nussbaumer
7ad9fff7e5
fix(promscrape): check MaxScrapeSize after gzip decompression (#3550) 2022-12-28 12:19:41 -08:00
Aliaksandr Valialkin
293dda7169
lib/snapshot: improve log message on unexpected status code during attempts to create or delete snapshots
Use "unexpected status code returned from %q: %d; expecting %d" log message format
instead of less clear format "unexpected status code returned from %q; expecting %d; got %d"

This is a follow-up for c612bb165e
2022-12-28 11:41:50 -08:00
Zakhar Bessarab
c612bb165e
lib/snapshot: fix error message format for failed HTTP request (#3559) 2022-12-28 18:04:11 +01:00
Aliaksandr Valialkin
0076422350
lib/promscrape/discovery/azure: typo fix 2022-12-21 21:25:16 -08:00
Aliaksandr Valialkin
fa236c5a84
lib/promrelabel: make fmt after d3de110070 2022-12-21 20:24:57 -08:00
Aliaksandr Valialkin
31886aef3d
lib/promrelabel: add support for keepequal and dropequal relabeling actions
These actions are supported by Prometheus starting from v2.41.0

See https://github.com/prometheus/prometheus/pull/11564 ,
https://github.com/prometheus/prometheus/issues/11556
and https://github.com/prometheus/prometheus/issues/3756

Side note:

It's a pity that Prometheus developers decided inventing `keepequal` and `dropequal`
relabeling actions instead of adding support for `keep_if_equal` and `drop_if_equal` relabeling
actions supported by VictoriaMetrics since June 2020 - see 2a39ba639d .
2022-12-21 20:04:55 -08:00
Aliaksandr Valialkin
3300546eab
lib/bytesutil: make sure that the cleanup code is performed only by a single goroutine out of many concurrently running goroutines
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3466
2022-12-21 13:07:24 -08:00
Zakhar Bessarab
4be4645142
app/vmbackupmanager: add metrics for better observability (#488)
* app/vmbackupmanager: add metrics for better observability, include more information to `/api/v1/backups` API call response

* app/vmbackupmanager: drop old metrics before creating new ones

* app/vmbackupmanager: use `_total` postfix for counter metrics

* app/vmbackupmanager: remove `_total` postfix for gauge-like metrics

* app/vmbackupmanager: add `_last_run_failed` metrics for backups and retention

* app/vmbackupmanager: address review feedback

* app/vmbackupmanager: fix metric name

* app/vmbackupmanager: address review feedback, remove background updates of metrics, add restoring state of `_last_run_failed` metric from remote storage

* app/vmbackupmanager: improve performance for backup size calculation

* app/vmbackupmanager: refactor backup and retention runs to deduplicate each run logic

* {app/vmbackupmanager,lib/formatutil}: move HumanizeBytes into lib package

* app/vmbackupmanager: fix creating new metrics instead of reusing existing ones

* lit/formatutil: add comment to make linter happy

* app/vmbackupmanager: address review feedback
2022-12-20 14:18:06 -08:00
Aliaksandr Valialkin
4e55b67a44
lib/storage: clear the err if it is set to io.EOF when searching for the TSID by metricID
This is expected error after when recently added indexdb data isn't available for search yet
or wasn't flushed to disk after unclean shutdown of VictoriaMetrics.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3515
2022-12-20 14:05:29 -08:00
Aliaksandr Valialkin
944effca54
lib/storage: do not check for the result returned by db.doExtDB() where this isn't necessary
This simplifies the code a bit
2022-12-19 13:23:13 -08:00
Aliaksandr Valialkin
0bf3ae9559
lib/promscrape/discovery/consul: expose service tags in individual labels __meta_consul_tag_<tagname>
This simplifies copying service tags to target labels with the following relabeling rule:

- action: labelmap
  regex: __meta_consul_tag_(.+)

See https://stackoverflow.com/questions/44339461/relabeling-in-prometheus
2022-12-19 13:08:11 -08:00
Aliaksandr Valialkin
6c98b56935
lib/storage: search for TSIDs for the given metricIDs in the previous indexdb if they aren't found in the current indexdb
The issue triggers after the indexdb rotation for time series, which stop receiving new samples.
This results in missing data for such time series in query responses.

This commit should address the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3502

The issue has been introduced in 2dd93449d8
2022-12-19 12:03:09 -08:00
Aliaksandr Valialkin
dc0b08efb0
lib/storage: optimize partSearch.searchBHS() for common case when the TSID for the current block header is bigger or equal to the current tsid
This should help improving performance at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425
2022-12-19 10:28:03 -08:00
Aliaksandr Valialkin
057fb2120b
lib/storage: properly set buf capacity inside marshalMetricID
Previously it was always set to 0. In theory this could result into incorrect marshaling
of metricIDs.

The issue has been introduced in 5e4dfe50c6
2022-12-19 10:14:38 -08:00
Aliaksandr Valialkin
4cb83f0f4a
lib/logger: follow-up for 72f8fce107
- Document the change at docs/CHANELOG.md
- Log fatal errors if the -loggerJSONFields contains unexpected values
- Rename -loggerJsonFields to -loggerJSONFields for the sake of consistency naming commonly used in Go

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2348
2022-12-16 17:42:07 -08:00
Michal Kralik
72f8fce107
lib/logger: support for renaming json fields (#3488) 2022-12-16 17:26:32 -08:00
Aliaksandr Valialkin
65f8fc527f
lib/promscrape: stop dropping metric name if relabeling rules do not instruct to do this on the /metric-relabel-debug page 2022-12-16 17:02:41 -08:00
Aliaksandr Valialkin
ad8852759d
lib/storage: skip missing tsids in the current block header by using binary search
This improves performance by up to 10x when big number of the requested TSIDs
are missing in the searched parts.

This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425
2022-12-14 22:06:51 -08:00
Aliaksandr Valialkin
4de9d35458
lib/flagutil/bytes.go: properly handle values bigger than 2GiB on 32-bit architectures
This fixes handling of values bigger than 2GiB for the following command-line flags:

- -storage.minFreeDiskSpaceBytes
- -remoteWrite.maxDiskUsagePerURL
2022-12-14 19:26:31 -08:00
Aliaksandr Valialkin
5d30080555
lib/flagutil: support for TB and TiB suffixes for command-line flags, which accept byte sizes 2022-12-14 17:52:32 -08:00
Zakhar Bessarab
a50120a212
lib/backup/azremote: fix copying for parts larger than 256M by using async copy (#3479)
* lib/backup/azremote: fix copying for parts larger than 256M by using async copy

* lib/backup/azremote: add description of an error for log message
2022-12-13 09:32:57 -08:00
Aliaksandr Valialkin
0d41d933e9
lib/mergeset: reduce the parts threshold before starting assisted merges
This should improve query speed in general case.

This is a follow-up for d1af6046c7
2022-12-13 09:13:49 -08:00
Aliaksandr Valialkin
d1af6046c7
lib/{mergeset,storage}: do not block small merges by pending big merges - assist with small merges instead
Blocked small merges may result into big number of small parts, which, in turn,
may result in increased CPU and memory usage during queries, since queries need to inspect
all the existing small parts.

The issue has been introduced in 8189770c50

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337
2022-12-12 17:00:50 -08:00
Aliaksandr Valialkin
3b18931050
lib/bytesutil: cache results for all the input strings, which were passed during the last 5 minutes from FastStringMatcher.Match(), FastStringTransformer.Transform() and InternString()
Previously only up to 100K results were cached.
This could result in sub-optimal performance when more than 100K unique strings were actually used.
For example, when the relabeling rule was applied to a million of unique Graphite metric names
like in the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3466

This commit should reduce the long-term CPU usage for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3466
after all the unique Graphite metrics are registered in the FastStringMatcher.Transform() cache.

It is expected that the number of unique strings, which are passed to FastStringMatcher.Match(),
FastStringTransformer.Transform() and to InternString() during the last 5 minutes,
is limited, so the function results fit memory. Otherwise OOM crash can occur.
This should be the case for typical production workloads.
2022-12-12 14:41:13 -08:00
Aliaksandr Valialkin
7ae744fce6
lib/protoparser/datadog: do not re-use previously parsed field values if they are missing in the currently parsed message
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3432
2022-12-11 13:09:25 -08:00
Aliaksandr Valialkin
a30ae502ef
lib/promscrape: allow editing relabeling configs and labels at /target-relabel-debug page
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3407
2022-12-10 12:44:45 -08:00
Aliaksandr Valialkin
a8b8e23d68
lib/promscrape: implement target-level and metric-level relabel debugging
Target-level debugging is performed by clicking the 'debug' link at the corresponding target
on either http://vmagent:8429/targets page or on http://vmagent:8428/service-discovery page.

Metric-level debugging is perfromed at http://vmagent:8429/metric-relabel-debug page.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3407

See https://docs.victoriametrics.com/vmagent.html#relabel-debug
2022-12-10 02:09:44 -08:00
Aliaksandr Valialkin
2406c0dcfd
docs/CHANGELOG.md: document the bugfix at 05b42601c3
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3247
2022-12-08 18:35:28 -08:00
Zakhar Bessarab
05b42601c3
lib/promscrape/discovery/azure: remove API server from URL returned by azure (#3403)
* lib/promscrape/discovery/azure: remove API server from URL returned by azure

* lib/promscrape/discovery/azure: validate nextLink contains same URL as apiServer
2022-12-08 18:29:10 -08:00
Aliaksandr Valialkin
8434aa142d
lib/querytracer: fix remaining tests after 49ebc48809 2022-12-08 18:18:06 -08:00
Aliaksandr Valialkin
5b9e6b9d24
lib/storage: follow-up after 7c0ae3a86a
- Update docs at https://docs.victoriametrics.com/#deduplication
- Optimize the deduplication loop a bit

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3333
2022-12-08 18:16:57 -08:00
Roman Khavronenko
7c0ae3a86a
lib/storage: keep sample with the biggest value on timestamp conflict (#3421)
The change leaves raw sample with the biggest value for identical
timestamps per each `-dedup.minScrapeInterval` discrete interval
when the deduplication is enabled.

```
benchstat old.txt new.txt
name                                         old time/op    new time/op    delta
DeduplicateSamples/minScrapeInterval=1s-10      817ns ± 2%     832ns ± 3%      ~     (p=0.052 n=10+10)
DeduplicateSamples/minScrapeInterval=2s-10     1.56µs ± 1%    2.12µs ± 0%   +35.19%  (p=0.000 n=9+7)
DeduplicateSamples/minScrapeInterval=5s-10     1.32µs ± 3%    1.65µs ± 2%   +25.57%  (p=0.000 n=10+10)
DeduplicateSamples/minScrapeInterval=10s-10    1.13µs ± 2%    1.50µs ± 1%   +32.85%  (p=0.000 n=10+10)

name                                         old speed      new speed      delta
DeduplicateSamples/minScrapeInterval=1s-10   10.0GB/s ± 2%   9.9GB/s ± 3%      ~     (p=0.052 n=10+10)
DeduplicateSamples/minScrapeInterval=2s-10   5.24GB/s ± 1%  3.87GB/s ± 0%   -26.03%  (p=0.000 n=9+7)
DeduplicateSamples/minScrapeInterval=5s-10   6.22GB/s ± 3%  4.96GB/s ± 2%   -20.37%  (p=0.000 n=10+10)
DeduplicateSamples/minScrapeInterval=10s-10  7.28GB/s ± 2%  5.48GB/s ± 1%   -24.74%  (p=0.000 n=10+10)
```

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3333
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-08 18:06:11 -08:00
Aliaksandr Valialkin
3019ec3da6
lib/querytracer: fix tests after 49ebc48809 2022-12-08 17:21:38 -08:00
Aliaksandr Valialkin
56b8980915
lib/promscrape: allow using sample_limit and series_limit options in stream parsing mode
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3458
2022-12-08 16:33:38 -08:00
Aliaksandr Valialkin
49ebc48809
lib/querytracer: put the version of VictoriaMetrics in the first message of query trace
This should simplify further debugging, since the first thing to start the debugging by query trace
is to know the version of VictoriaMetrics, which produced this trace.
2022-12-07 09:46:39 -08:00
Pedro Gonçalves
1e0666abb4
Datadog - Add device as a tag if it's present as a field in the series object (#3431)
* Datadog - Add device as a tag if it's present as a field in the series object

* address PR comments
2022-12-05 23:06:03 -08:00
Aliaksandr Valialkin
d99d222f0a
lib/{storage,mergeset}: log the duration for flushing in-memory parts on graceful shutdown 2022-12-05 21:30:48 -08:00
Aliaksandr Valialkin
8189770c50
all: add -inmemoryDataFlushInterval command-line flag for controlling the frequency of saving in-memory data to disk
The main purpose of this command-line flag is to increase the lifetime of low-end flash storage
with the limited number of write operations it can perform. Such flash storage is usually
installed on Raspberry PI or similar appliances.

For example, `-inmemoryDataFlushInterval=1h` reduces the frequency of disk write operations
to up to once per hour if the ingested one-hour worth of data fits the limit for in-memory data.

The in-memory data is searchable in the same way as the data stored on disk.
VictoriaMetrics automatically flushes the in-memory data to disk on graceful shutdown via SIGINT signal.
The in-memory data is lost on unclean shutdown (hardware power loss, OOM crash, SIGKILL).

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337
2022-12-05 15:16:14 -08:00
Aliaksandr Valialkin
544ea89f91
lib/{mergeset,storage}: add start background workers via startBackgroundWorkers() function 2022-12-04 00:01:04 -08:00
Aliaksandr Valialkin
33dda2809b
lib/mergeset: panic when too long item is passed to Table.AddItems() 2022-12-03 23:32:16 -08:00
Aliaksandr Valialkin
932c1f90ae
lib/storage: remove duplicate logging for filepath on errors 2022-12-03 23:15:22 -08:00
Aliaksandr Valialkin
044a304adb
lib/storage: pass a single arg - rowsPerBlock - to getCompressLevel() function instead of two args 2022-12-03 23:10:16 -08:00
Aliaksandr Valialkin
cb44976716
lib/{storage,mergeset}: use a single sync.WaitGroup for all background workers
This simplifies the code
2022-12-03 23:03:08 -08:00
Aliaksandr Valialkin
28e6d9e1ff
lib/storage: properly pass retentionMsecs to OpenStorage() at TestIndexDBRepopulateAfterRotation 2022-12-03 23:02:10 -08:00
Aliaksandr Valialkin
343c69fc15
lib/{mergeset,storage}: pass compressLevel to blockStreamWriter.InitFromInmemoryPart
This allows packing in-memory blocks with different compression levels
depending on its contents. This may save memory usage.
2022-12-03 22:46:48 -08:00
Aliaksandr Valialkin
6d87462f4b
lib/mergeset: use the given compressLevel for index and metaindex compression in in-memory part
Previously only data was compressed with the given compressLevel
2022-12-03 22:34:54 -08:00
Aliaksandr Valialkin
f3e3a3daeb
lib/{mergeset,storage}: take into account byte slice capacity when returning the size of in-memory part
This results in more correct reporting of memory usage for in-memory parts
2022-12-03 22:30:36 -08:00
Aliaksandr Valialkin
c4150995ad
lib/mergeset: reduce the time needed for the slowest tests 2022-12-03 22:26:33 -08:00
Aliaksandr Valialkin
45299efe22
lib/{storage,mergeset}: consistency rename: `flushRaw{Rows,Items} -> flushPending{Rows,Items} 2022-12-03 22:17:46 -08:00
Aliaksandr Valialkin
5ca58cc4fb
lib/storage: optimization: do not scan block for rows outside retention if it is covered by the retention 2022-12-03 22:14:12 -08:00
Aliaksandr Valialkin
152ac564ab
lib/storage: remove logging redundant path values in a single error message 2022-12-03 22:13:13 -08:00
Aliaksandr Valialkin
93764746c2
lib/filestream: remove logging redundant path values in a single error message 2022-12-03 22:01:51 -08:00
Aliaksandr Valialkin
4f28513b1a
lib/fs: remove logging redundant path values in a single error message 2022-12-03 22:00:20 -08:00
Aliaksandr Valialkin
7c3c08d102
lib/backup: remove logging duplicate path values in a single error message 2022-12-03 21:55:06 -08:00
Aliaksandr Valialkin
14660d4df5
all: typo fix: the the -> the 2022-12-03 21:53:01 -08:00
Aliaksandr Valialkin
ddc3d6b5c3
lib/mergeset: drop the crufty code responsible for direct upgrade from releases prior v1.28.0
Upgrade to v1.84.0, wait until the "finished round 2 of background conversion" message
appears in the log and then upgrade to newer release.
2022-12-03 21:17:31 -08:00
Aliaksandr Valialkin
05c65bd83f
lib/storage: speed up search for data block for the given tsids
Use binary search instead of linear scan for looking up the needed
data block inside index block.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425
2022-12-03 20:58:32 -08:00
Aliaksandr Valialkin
299285b147
lib/storage: fix TestUpdateCurrHourMetricIDs test when it runs on the first hour of the day by UTC 2022-12-02 18:52:37 -08:00
Aliaksandr Valialkin
e9636b4c69
lib/{mergeset,storage}: re-use the code for removing isInMerge flag at parts
Move the common code into releasePartsToMerge() method and consistently use it throughout the code.
2022-12-02 18:52:37 -08:00
Aliaksandr Valialkin
f325410c26
lib/promscrape: optimize service discovery speed
- Return meta-labels for the discovered targets via promutils.Labels
  instead of map[string]string. This improves the speed of generating
  meta-labels for discovered targets by up to 5x.

- Remove memory allocations in hot paths during ScrapeWork generation.
  The ScrapeWork contains scrape settings for a single discovered target.
  This improves the service discovery speed by up to 2x.
2022-11-29 21:26:00 -08:00
Aliaksandr Valialkin
295c84df66
lib/promscrape/discovery: add a benchmark for measuring the performance of creating pod meta-labels 2022-11-29 20:27:48 -08:00
Aliaksandr Valialkin
654e94f420
lib/promscrape: add exported_ prefix to metric names exported by scrape targets if they clash with automatically generated metrics
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3406
2022-11-28 18:37:09 -08:00
匠心零度
fa0ce10275
lib/storage: remove extra error check (#3396) 2022-11-28 16:43:31 -08:00
Aliaksandr Valialkin
58d459e8a8
app/{vminsert,vmagent}: follow-up after 53a63c6c4c
Extend /api/v1/import/prometheus with the support for Pushgateway way of specifying additional labels.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1415
2022-11-25 16:48:14 -08:00
Roman Khavronenko
03d88bc066
vmagent: expose metrics for tracking config state (#3375)
Expose `vm_relabel_config_*` and `vm_promscrape_config_*` metrics
for tracking relabel and scrape configuration hot-reloads.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3345
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-22 00:38:43 +02:00
Aliaksandr Valialkin
95f0266558
lib/promscrape/discovery/gce: do not pass filter arg when discovering zones
The filter arg isn't supported by zones API in GCE.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3202
2022-11-21 22:32:05 +02:00
Aliaksandr Valialkin
353396aa23
lib/workingsetcache: expose -cacheExpireDuration command-line flag for fine-tuning of the cache expiration
While at it, decrease -prevCacheRemovalPercent from 0.2 to 0.1 and increase -cacheExpireDuration from 20 minutes to 30 minutes.

This is needed for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3343
2022-11-17 19:59:13 +02:00
Aliaksandr Valialkin
5955d23232
lib/promscrape: add a benchmark for internLabelStrings() 2022-11-16 23:02:49 +02:00
Aliaksandr Valialkin
a75137c1c2
lib/mergeset: properly reset bsr.bhIdx after the call to blockStreamReader.readNextBHS()
The issue has been introduced in 58b40f514c

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3343
2022-11-16 21:23:35 +02:00
Aliaksandr Valialkin
c3362e3db4
lib/workingsetcache: add -prevCacheRemovalPercent command-line flag for tuning memory usage vs CPU usage ratio
Reduce the default value of this flag from 1% to 0.2% after 71335e6024

This flag should help determining the best ratio for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3343
2022-11-16 12:39:39 +02:00
Aliaksandr Valialkin
4106f197f2
lib/mergeset: retain the buffer with the data used by indexBlock.bhs, inside indexBlock.buf
Previously indexBlock.bhs pointed to the buffer, which could be changed over time.
This could result in incorrect time series search over time.

This is a follow-up for 58b40f514c

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3343
2022-11-16 12:09:23 +02:00
Aliaksandr Valialkin
58b40f514c
lib/mergeset: remove string allocation and copying when unmarshaling blockHeader
This should reduce CPU usage for the case from https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3343
2022-11-15 16:30:54 +02:00
Aliaksandr Valialkin
71335e6024
lib/workingsetcache: tune cache miss threshold for resetting the previous cache from 5% to 1%
It has been appeared that some production workloads could suffer for some time
after every reset of the previous cache when it gets less than 5% of requests
after the needed item isn't found in the current cache. This could result
in reduced cache hit rates, which, in turn, could increase CPU, disk IO and RAM
usage needed for reading, unpacking and caching the missed data from disk.

This commit reduces the cache miss threshold for resetting the previous cache from 5% to 1%.
This should reduce the possible negative impact after each cache reset by at least 5x,
while reducing the total memory used by caches.

This is a follow-up for d906d8573e
2022-11-10 13:31:54 +02:00
Aliaksandr Valialkin
86bce7f5f9
lib/promscrape: add more cases to TestAddRowToTimeseries
This is a follow-up for 16fdd2af8a
2022-11-09 16:13:56 +02:00
Jeremy PLANCKEEL
16fdd2af8a
test(golang): add test to function addRowToTimeseries (#3282)
Co-authored-by: jplanckeel-externe <jplanckeel.externe@bedrockstreaming.com>
2022-11-09 15:41:26 +02:00
Aliaksandr Valialkin
b8839df32c
lib/protoparser/opentsdb: follow-up after 04b0e4e7bf
- Simplify the parser code to be less error prone
- Document the change
- Add a test for OpenTSDB put line with trailing whitespace without tags

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3290
2022-11-09 15:35:05 +02:00
Roman Khavronenko
04b0e4e7bf
protoparser/opentsdb: allow lines without tags (#3303)
According to http://opentsdb.net/docs/build/html/api_telnet/put.html
"At least one tag pair must be present".
However, in VictoriaMetrics datamodel tags aren't required.
This could be confusing for users. Allowing accept lines without
tags seems to do no harm.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3290
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-09 15:32:47 +02:00
Aliaksandr Valialkin
7fa5d043f5
lib/promscrape/discovery/consul: add __meta_consul_partition label in the same way as Prometheus does
See https://github.com/prometheus/prometheus/pull/11482
2022-11-07 15:25:53 +02:00
Aliaksandr Valialkin
daa70e6560
lib/storage: follow-up for 790768f20b
- Document the bugfix at docs/CHANGELOG.md
- Simplify the bugfix a bit
2022-11-07 14:04:08 +02:00
Aliaksandr Valialkin
f9dc3da9e2
lib/storage: typo fix after 32d48f8dfbb03174858c00bdfe6d9d22431dc8d8 2022-11-07 13:58:27 +02:00
Aliaksandr Valialkin
116811d761
lib/envtemplate: allow non-env var names inside "%{ ... }" 2022-11-07 13:58:27 +02:00
Aliaksandr Valialkin
dd88c628aa
lib/storage: remove unused isFull field from hourMetricIDs struct 2022-11-07 13:58:26 +02:00
Łukasz Marszał
790768f20b
Fix issue-3309 - currHourMetricIDs shouldn't contain metrics from prev hour (#3320)
* fix issue-3309 currHourMetricIDs shouldn't contain metrics from prev hour

* Update storage.go
2022-11-07 13:55:37 +02:00
Aliaksandr Valialkin
869e0f9f85
lib/promrelabel: go fmt after 5cec9706dc 2022-10-29 05:17:10 +03:00
Aliaksandr Valialkin
5cec9706dc
lib/promrelabel: add a test from https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3251
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3251
2022-10-29 04:33:38 +03:00
Aliaksandr Valialkin
320ae1c60a
lib/envflag: small refactoring after 518c340ae3 and 02096e06d0 2022-10-29 02:28:58 +03:00
Aliaksandr Valialkin
76e8888272
lib/promscrape: properly add exported_ prefix to labels, which clash with target labels if honor_labels: true option isn't set.
The issue was in the `labels := dst[offset:]` line in the beginning of appendExtraLabels() function.
The `dst` may be re-allocated when adding extra labels to it. In this case the addition of `exported_`
prefix to labels inside `labels` slice become invisible in the returned `dst` labels.

While at it, properly handle some corner cases:

- Add additional `exported_` prefix to clashing metric labels with already existing `exported_` prefix.
- Store scraped metric names in `exported___name__` label if scrape target contains `__name__` label.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3278

Thanks to @jplanckeel for the initial attempt to fix this issue
at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3281
2022-10-28 22:14:26 +03:00
Aliaksandr Valialkin
454baf84d6
lib/promscrape/discovery/kubernetes: do not print an empty kubeconfig_file option in yaml at /config page 2022-10-28 22:14:25 +03:00
Aliaksandr Valialkin
518c340ae3
lib/envtemplate: allow referring env vars from other env vars via %{ENV_VAR} syntax
This is a follow-up for 02096e06d0
2022-10-26 14:49:33 +03:00
Aliaksandr Valialkin
02096e06d0
lib/envflag: allow referring environment variables in command-line flags 2022-10-26 01:52:05 +03:00
Aliaksandr Valialkin
c4265322f4
lib/fs: add canOverwrite arg to WriteFileAtomically when it is allowed to overwrite the file atomically if it already exists 2022-10-26 01:07:34 +03:00
Aliaksandr Valialkin
d9bbf24183
app/{vminsert,vmselect}/netstorage: allow calling Init()+MustStop() in a loop
Previously netstorage.MustStop() call didn't free up all the resources,
so the subsequent call to nestorage.Init() would panic.

This allows writing tests, which call nestorage.Init() + nestorage.MustStop() in a loop.
2022-10-25 17:47:17 +03:00
Aliaksandr Valialkin
8e998aa1a1
lib/storage: add support for retention filters (aka multiple retentions for distinct sets of time series)
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/143
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/289
2022-10-24 16:40:20 +03:00
Aliaksandr Valialkin
dba218a8ce
lib/storage: skip blocks outside the configured retention during search
Blocks outside the configured retention are eventually deleted during background merge.
But such blocks may reside in the storage for long time until background merge.
Previously VictoriaMetrics could spend additional CPU time on processing such blocks
during search queries. Now these blocks are skipped.
2022-10-24 02:52:44 +03:00
Aliaksandr Valialkin
e2f0b76ebf
lib/storage: do not pass retentionMsecs and isReadOnly args explicitly - access them via Storage arg
This makes code easier to read.

This is a follow-up after d2d30581a0
2022-10-24 01:31:04 +03:00
Aliaksandr Valialkin
89a1108b1a
lib/storage: small code cleanups 2022-10-24 01:17:47 +03:00
Aliaksandr Valialkin
05512fdd74
lib/storage: re-use newTestStorage() instead of manually initializing Storage mock
This is a follow-up for d2d30581a0
2022-10-23 16:24:00 +03:00
Aliaksandr Valialkin
d2d30581a0
lib/storage: pass Storage to table and partition instead of getDeletedMetricIDs callback
This improves code readability a bit.
2022-10-23 16:10:04 +03:00
Aliaksandr Valialkin
54f35c175c
lib/storage: small refactoring: move retentionDeadline to blockStreamMerger
This allows defining per-block retention in the future by updating the getRetentionDeadline function
2022-10-23 16:10:02 +03:00
Aliaksandr Valialkin
187e294a53
lib/storage: use a single reference to the currently merged block - bsm.Block during the block merge loop 2022-10-23 14:08:57 +03:00
Aliaksandr Valialkin
d0a9ca1bc2
lib/storage: properly pass uint64 constant to fmt.Errorf on 32-bit platforms 2022-10-23 12:48:00 +03:00
Aliaksandr Valialkin
5e4dfe50c6
lib/storage: subsitute searchTSIDs functions with more lightweight searchMetricIDs function
The searchTSIDs function was searching for metricIDs matching the the given tag filters
and then was locating the corresponding TSID entries for the found metricIDs.

The TSID entries aren't needed when searching for time series names (aka MetricName),
so this commit removes the uneeded TSID search from the implementation of /api/v1/series API.
This improves perfromance of /api/v1/series calls.

This commit also improves performance a bit for /api/v1/query and /api/v1/query_range calls,
since now these calls cache small metricIDs instead of big TSID entries
in the indexdb/tagFilters cache (now this cache is named indexdb/tagFiltersToMetricIDs)
without the need to compress the saved entries in order to save cache space.

This commit also removes concurrency limiter during searching for matching time series,
which was introduced in 8f16388428, since the concurrency
for all the read queries is already limited with -search.maxConcurrentRequests command-line flag.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2022-10-23 12:23:47 +03:00
Aliaksandr Valialkin
4128ad71e2
lib/storage: move common code to newRawRowsBlock() function 2022-10-21 14:46:55 +03:00
Aliaksandr Valialkin
b5674164c6
lib/storage: simplify code a bit after 3f5959c053 2022-10-21 14:39:27 +03:00
Aliaksandr Valialkin
fd7c86ae25
lib/{mergeset,storage}: simplify the code a bit after ae55ad8749 2022-10-21 14:33:03 +03:00
Aliaksandr Valialkin
99d67ac8ad
lib/storage: validate timestamps in the block only if they use encoding, which needs validation
This reduces CPU usage when there is no sense in validating timestamps.

This is a follow-up for 5fa9525498

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2998
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3011
2022-10-21 00:52:32 +03:00
Aliaksandr Valialkin
3f5959c053
lib/storage: try generating initial parts from inmemory rows with identical sizes under high ingestion rate
This should improve background merge rate under high load a bit
2022-10-20 23:28:24 +03:00
Aliaksandr Valialkin
891ff6af2a
lib/workingsetcache: increase default cache expiration from 10 minutes to 20 minutes
This increases the maximum time for cache population with new entries from 20 minutes to 40 minutes.
This

This change shouldn't increase memory usage for caches, since the prev cache cleaner
should free up memory by deleting unused prev cache as soon as possible.
See 08ca45d238 for details on prev cache cleaner.
2022-10-20 21:48:25 +03:00
Aliaksandr Valialkin
08ca45d238
lib/workingsetcache: move the cleaner for the prev cache into a separate goroutine
This makes the code more clear after d906d8573e
2022-10-20 21:45:29 +03:00
Aliaksandr Valialkin
4cd173bbaa
lib/procutil: stop immediately after receiving the second SIGINT or SIGTERM signal
Previously VictoriaMetrics apps could stop responding to SIGINT and SIGTERM signals
if they hang for some reason in graceful shutdown procedure.
2022-10-20 21:40:20 +03:00
Aliaksandr Valialkin
150e99d403
lib/{mergeset,storage}: avoid unaligned 64-bit atomic operation panic on 32-bit platforms
The panic has been introduced in 68f3a02589

While at it, add padding to shard structs in order to avoid false sharing on mordern CPUs

This should improve scalability on systems with many CPU cores
2022-10-20 16:25:43 +03:00
Aliaksandr Valialkin
d906d8573e
lib/workingsetcache: drop the previous cache whenever it recieves less than 5% of requests comparing to the current cache
This means that the majority of requests are successfully served from the current cache,
so the previous cache can be reset in order to free up memory.
2022-10-20 10:47:58 +03:00
Aliaksandr Valialkin
817aeafd69
lib/workingsetcache: use per-bucket stats counters instead of global stats counters for cache hits/misses
This should improve cache scalability on systems with many CPU cores.
2022-10-20 09:12:17 +03:00
Aliaksandr Valialkin
9c02c39487
lib/workingsetcache: randomize interval for swapping curr and prev caches
This should make CPU usage smoother over time, since different caches
will be swapped at different times.
2022-10-20 08:42:43 +03:00
Nikolay
1059c4d84a
lib/promscrape/discovery/kubernetes: correctly wrap error (#3250)
* lib/promscrape/discovery/kubernetes: correctly wrap error
follow-up after 1304824201

* Update docs/CHANGELOG.md

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-10-18 20:37:42 +03:00
Aliaksandr Valialkin
069401a304
all: log error when environment variables referred from -promscrape.config are missing
This should prevent from using incorrect config files
2022-10-18 10:47:16 +03:00
Aliaksandr Valialkin
fb50730ba7
lib/storage: double the number of rawRows shards on multi-core systems
This should increase data ingestion scalability on multi-core systems at the cost of slightly higher memory usage
2022-10-17 18:19:51 +03:00
Aliaksandr Valialkin
ae55ad8749
lib/{storage,mergeset}: do not hold per-shard lock in fast path when adding per-shard items to the flush list 2022-10-17 18:01:26 +03:00
Aliaksandr Valialkin
b6e8c1403a
lib/promrelabel: add relabeling tests when the source label is missing 2022-10-17 14:47:52 +03:00
Aliaksandr Valialkin
2e3be68617
lib/bytesutil: make sure that the string passed to FastStringMather.Match() is copied before using it as a key in the internal cache map
This prevents from possible corruption of the internal cache map
when the underlying byte slice used by the string key is modified.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3227
2022-10-14 09:51:19 +03:00
Nikolay
b856581ad3
lib/backup: set s3 default region to us-west-2 (#3224)
* lib/backup: set s3 default region to us-west-2
it should fix an error with region detection for bucket, if AWS_REGION env var is not set

* Update lib/backup/s3remote/s3.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-10-13 10:30:07 +03:00
Aliaksandr Valialkin
185cff307b
lib/mergeset: mention in the error message the path to the part, which triggered the error
This should improve debuggability
2022-10-12 09:54:21 +03:00
Aliaksandr Valialkin
50f5eae0e0
lib/promrelabel: remove unconditional sorting of the labels in ParsedConfigs.Apply(), since the sorting isnt needed in many places
Sort labels explicitly after calling the ParsedConfigs.Apply() when needed.

This reduces CPU usage when performing metric-level relabeling, where labels' sorting isn't needed.
2022-10-09 14:51:16 +03:00