VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-18 14:40:26 +01:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	f4dce57ebe	lib/streamaggr/streamaggr.go: typo fix after `5e29ef5ed5`: IgnoredNaNSamples -> ignoredNaNSamples	2024-07-15 21:59:03 +02:00
Aliaksandr Valialkin	cbc637d1dd	app/vmagent/remotewrite: follow-up for `f153f54d11` - Move the remaining code responsible for stream aggregation initialization from remotewrite.go to streamaggr.go . This improves code maintainability a bit. - Properly shut down streamaggr.Aggregators initialized inside remotewrite.CheckStreamAggrConfigs(). This prevents from potential resource leaks. - Use separate functions for initializing and reloading of global stream aggregation and per-remoteWrite.url stream aggregation. This makes the code easier to read and maintain. This also fixes INFO and ERROR logs emitted by these functions. - Add an ability to specify `name` option in every stream aggregation config. This option is used as `name` label in metrics exposed by stream aggregation at /metrics page. This simplifies investigation of the exposed metrics. - Add `path` label additionally to `name`, `url` and `position` labels at metrics exposed by streaming aggregation. This label should simplify investigation of the exposed metrics. - Remove `match` and `group` labels from metrics exposed by streaming aggregation, since they have little practical applicability: it is hard to use these labels in query filters and aggregation functions. - Rename the metric `vm_streamaggr_flushed_samples_total` to less misleading `vm_streamaggr_output_samples_total` . This metric shows the number of samples generated by the corresponding streaming aggregation rule. This metric has been added in the commit `861852f262` . See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6462 - Remove the metric `vm_streamaggr_stale_samples_total`, since it is unclear how it can be used in practice. This metric has been added in the commit `861852f262` . See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6462 - Remove Alias and aggrID fields from streamaggr.Options struct, since these fields aren't related to optional params, which could modify the behaviour of the constructed streaming aggregator. Convert the Alias field to regular argument passed to LoadFromFile() function, since this argument is mandatory. - Pass Options arg to LoadFromFile() function by reference, since this structure is quite big. This also allows passing nil instead of Options when default options are enough. - Add `name`, `path`, `url` and `position` labels to `vm_streamaggr_dedup_state_size_bytes` and `vm_streamaggr_dedup_state_items_count` metrics, so they have consistent set of labels comparing to the rest of streaming aggregation metrics. - Convert aggregator.aggrStates field type from `map[string]aggrState` to `[]aggrOutput`, where `aggrOutput` contains the corresponding `aggrState` plus all the related metrics (currently only `vm_streamaggr_output_samples_total` metric is exposed with the corresponding `output` label per each configured output function). This simplifies and speeds up the code responsible for updating per-output metrics. This is a follow-up for the commit `2eb1bc4f81` . See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6604 - Added missing urls to docs ( https://docs.victoriametrics.com/stream-aggregation/ ) in error messages. These urls help users figuring out why VictoriaMetrics or vmagent generates the corresponding error messages. The urls were removed for unknown reason in the commit `2eb1bc4f81` . - Fix incorrect update for `vm_streamaggr_output_samples_total` metric in flushCtx.appendSeriesWithExtraLabel() function. While at it, reduce memory usage by limiting the maximum number of samples per flush to 10K. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5467 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6268	2024-07-15 20:25:36 +02:00
Aliaksandr Valialkin	a8356f3a26	vendor: update github.com/VictoriaMetrics/metrics from v1.34.1 to v1.35.0 Fix potential memory leaks across VictoriaMetrics codebase after metrics.UnregisterSet(s) call because of missing s.UnregisterAllMetrics() call. This is a follow-up for `6a6e34ab8e` . It is OK if some vmauth metrics aren't visible for a few microseconds when the previous metrics are unregistered and new metrics weren't registered yet. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6247 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4690 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6252 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5805	2024-07-15 10:45:39 +02:00
Aliaksandr Valialkin	4878152678	lib/{storage,mergeset}: do not allow setting dataFlushInterval to values smaller than pending{Items,Rows}FlushInterval Pending rows and items unconditionally remain in memory for up to pending{Items,Rows}FlushInterval, so there is no any sense in setting dataFlushInterval (the interval for guaranteed flush of in-memory data to disk) to values smaller than pending{Items,Rows}FlushInterval, since this doesn't affect the interval for flushing pending rows and items from memory to disk. This is a follow-up for `4c80b17027` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6221	2024-07-15 10:11:23 +02:00
Aliaksandr Valialkin	c3d2351948	lib/streamaggr: consistently use alphabetical order of benchmarked stream aggregation outputs	2024-07-15 09:53:26 +02:00
Aliaksandr Valialkin	21f049e211	lib/streamaggr: follow-up for `9c3d44c8c9` - Consistently enumerate stream aggregation outputs in alphabetical order across the source code and docs. This should simplify future maintenance of the corresponding code and docs. - Fix the link to `rate_sum()` at `see also` section of `rate_avg()` docs. - Make more clear the docs for `rate_sum()` and `rate_avg()` outputs. - Encapsulate output metric suffix inside rateAggrState. This eliminates possible bugs related to incorrect suffix passing to newRateAggrState(). - Rename rateAggrState.total field to less misleading rateAggrState.increase name, since it calculates counter increase in the current aggregation window. - Set rateLastValueState.prevTimestamp on the first sample in time series instead of the second sample. This makes more clear the code logic. - Move the code for removing outdated entries at rateAggrState into removeOldEntries() function. This make the code logic inside rateAggrState.flushState() more clear. - Do not write output sample with zero value if there are no input series, which could be used for calculating the rate, e.g. if only a single sample is registered for every input series. - Do not take into account input series with a single registered sample when calculating rate_avg(), since this leads to incorrect results. - Move {rate,total}AggrState.flushState() function to the end of rate.go and total.go files, so they look more similar. This shuld simplify future mantenance. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6243	2024-07-15 08:44:48 +02:00
Aliaksandr Valialkin	bc1f92d7f5	app/vmagent/remotewrite: follow-up for `87fd400dfc` - Drop samples and return true from remotewrite.TryPush() at fast path when all the remote storage systems are configured with the disabled on-disk queue, every in-memory queue is full and -remoteWrite.dropSamplesOnOverload is set to true. This case is quite common, so it should be optimized. Previously additional CPU time was spent on per-remoteWriteCtx relabeling and other processing in this case. - Properly count the number of dropped samples inside remoteWriteCtx.pushInternalTrackDropped(). Previously dropped samples were counted only if -remoteWrite.dropSamplesOnOverload flag is set. In reality, the samples are dropped when they couldn't be sent to the queue because in-memory queue is full and on-disk queue is disabled. The remoteWriteCtx.pushInternalTrackDropped() function is called by streaming aggregation for pushing the aggregated data to the remote storage. Streaming aggregation cannot wait until the remote storage processes pending data, so it drops aggregated samples in this case. - Clarify the description for -remoteWrite.disableOnDiskQueue command-line flag at -help output, so it is clear that this flag can be set individually per each -remoteWrite.url. - Make the -remoteWrite.dropSamplesOnOverload flag global. If some of the remote storage systems are configured with the disabled on-disk queue, then there is no sense in keeping samples on some of these systems, while dropping samples on the remaining systems, since this will result in global stall on the remote storage system with the disabled on-disk queue and with the -remoteWrite.dropSamplesOnOverload=false flag. vmagent will always return false from remotewrite.TryPush() in this case. This will result in infinite duplicate samples written to the remaining remote storage systems. That's why the -remoteWrite.dropSamplesOnOverload is forcibly set to true if more than one -remoteWrite.disableOnDiskQueue flag is set. This allows proceeding with newly scraped / pushed samples by sending them to the remaining remote storage systems, while dropping them on overloaded systems with the -remoteWrite.disableOnDiskQueue flag set. - Verify that the remoteWriteCtx.TryPush() returns true in the TestRemoteWriteContext_TryPush_ImmutableTimeseries test. - Mention in vmagent docs that the -remoteWrite.disableOnDiskQueue command-line flag can be set individually per each -remoteWrite.url. See https://docs.victoriametrics.com/vmagent/#disabling-on-disk-persistence Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6248 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6065	2024-07-13 02:30:10 +02:00
Aliaksandr Valialkin	43fc1183b9	app/vmalert: switch from table-driven tests to f-tests This makes test code more clear and reduces the number of code lines by 500. This also simplifies debugging tests. See https://itnext.io/f-tests-as-a-replacement-for-table-driven-tests-in-go-8814a8b19e9e While at it, consistently use t.Fatal* instead of t.Error* across tests, since t.Error* requires more boilerplate code, which can result in additional bugs inside tests. While t.Error* allows writing logging errors for the same, this doesn't simplify fixing broken tests most of the time. This is a follow-up for `a9525da8a4`	2024-07-12 22:45:50 +02:00
hagen1778	c835a6351e	lib/streamaggr: add missing test cases Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `2f65956259`)	2024-07-12 14:19:17 +02:00
Hui Wang	f3cbd62823	vmagent: fix `vm_streamaggr_flushed_samples_total` counter (#6604 ) We use `vm_streamaggr_flushed_samples_total` to show the number of produced samples by aggregation rule, previously it was overcounted, and doesn't account for `output_relabel_configs`. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6462 --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `2eb1bc4f81`)	2024-07-12 14:19:17 +02:00
hagen1778	7370f84b97	lib/bakcup/azremote: follow-up after `5fd3aef549` Simplify tests by converting them to f-tests. `5fd3aef549` Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `03e4c5c19c`)	2024-07-10 15:17:06 +02:00
justinrush	e65e55e2dd	lib/backup: add support for Azure Managed Identity (#6518 ) ### Describe Your Changes These changes support using Azure Managed Identity for the `vmbackup` utility. It adds two new environment variables: * `AZURE_USE_DEFAULT_CREDENTIAL`: Instructs the `vmbackup` utility to build a connection using the [Azure Default Credential](https://pkg.go.dev/github.com/Azure/azure-sdk-for-go/sdk/azidentity@v1.5.2#NewDefaultAzureCredential) mode. This causes the Azure SDK to check for a variety of environment variables to try and make a connection. By default, it tries to use managed identity if that is set up. This will close https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5984 ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). ### Testing However you normally test the `vmbackup` utility using Azure Blob should continue to work without any changes. The set up for that is environment specific and not listed out here. Once regression testing has been done you can set up [Azure Managed Identity](https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/overview) so your resource (AKS, VM, etc), can use that credential method. Once it is set up, update your environment variables according to the updated documentation. I added unit tests to the `FS.Init` function, then made my changes, then updated the unit tests to capture the new branches. I tested this in our environment, but with SAS token auth and managed identity and it works as expected. --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Justin Rush <jarush@epic.com> Co-authored-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `5fd3aef549`)	2024-07-10 12:26:21 +02:00
Aliaksandr Valialkin	a1decb5ca1	app/vlinsert/loki: use easyproto instead for parsing Loki protobuf messages	2024-07-10 03:05:55 +02:00
Aliaksandr Valialkin	b8a8d3d6f1	lib/logstorage: drop all the pipes from the query when calculating the number of matching logs at /select/logsql/hits API	2024-07-10 00:39:16 +02:00
Aliaksandr Valialkin	d6415b2572	all: consistently use 'any' instead of 'interface{}' 'any' type is supported starting from Go1.18. Let's consistently use it instead of 'interface{}' type across the code base, since `any` is easier to read than 'interface{}'.	2024-07-10 00:23:26 +02:00
Aliaksandr Valialkin	9edeecabc8	lib: consistently use f-tests instead of table-driven tests This makes easier to read and debug these tests. This also reduces test lines count by 15% from 3K to 2.5K . See https://itnext.io/f-tests-as-a-replacement-for-table-driven-tests-in-go-8814a8b19e9e . While at it, consistently use t.Fatal* instead of t.Error, since t.Error usually leads to more complicated and fragile tests, while it doesn't bring any practical benefits over t.Fatal*.	2024-07-09 22:39:13 +02:00
Aliaksandr Valialkin	f3be3573e7	lib/promscrape/discovery/vultr: follow-up after `17e3d019d2` - Sort the discovered labels in alphabetical order at https://docs.victoriametrics.com/sd_configs/#vultr_sd_configs - Rename VultrConfigs to VultrSDConfigs to be consistent with the naming for other SD configs. - Prepare query arg filters for `list instances API` at newAPIConfig() instead of passing them in a separate listParams struct. This simplifies the code a bit. - Return error when bearer token isn't set at vultr_sd_configs, since this token is mandatory according to https://docs.victoriametrics.com/sd_configs/#vultr_sd_configs - Remove unused fields from the parsed response from Vultr list instances API in order to simplify the code a bit. - Remove double logging of errors inside getInstances() function, since these errors must be already logged by the caller. - Simplify tests, so they are easier to maintain. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6041 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6068	2024-07-05 17:40:39 +02:00
Aliaksandr Valialkin	6397c38a0a	lib/logstorage: use quicktemplate.AppendJSONString instead of strconv.AppendQuote for encoding JSON strings The strconv.AppendQuote improperly encodes special chars such as \x1b . They must be encoded as \u001b . See https://github.com/VictoriaMetrics/victorialogs-datasource/issues/24	2024-07-05 01:22:49 +02:00
Aliaksandr Valialkin	172ae1adf7	Revert `c6c5a5a186` and `b2765c45d0` Reason for revert: There are many statsd servers exist: - https://github.com/statsd/statsd - classical statsd server - https://docs.datadoghq.com/developers/dogstatsd/ - statsd server from DataDog built into DatDog Agent ( https://docs.datadoghq.com/agent/ ) - https://github.com/avito-tech/bioyino - high-performance statsd server - https://github.com/atlassian/gostatsd - statsd server in Go - https://github.com/prometheus/statsd_exporter - statsd server, which exposes the aggregated data as Prometheus metrics These servers can be used for efficient aggregating of statsd data and sending it to VictoriaMetrics according to https://docs.victoriametrics.com/#how-to-send-data-from-graphite-compatible-agents-such-as-statsd ( the https://github.com/prometheus/statsd_exporter can be scraped as usual Prometheus target according to https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter ). Adding support for statsd data ingestion protocol into VictoriaMetrics makes sense only if it provides significant advantages over the existing statsd servers, while has no significant drawbacks comparing to existing statsd servers. The main advantage of statsd server built into VictoriaMetrics and vmagent - getting rid of additional statsd server. The main drawback is non-trivial and inconvenient streaming aggregation configs, which must be used for the ingested statsd metrics ( see https://docs.victoriametrics.com/stream-aggregation/ ). These configs are incompatible with the configs for standalone statsd servers. So you need to manually translate configs of the used statsd server to stream aggregation configs when migrating from standalone statsd server to statsd server built into VictoriaMetrics (or vmagent). Another important drawback is that it is very easy to shoot yourself in the foot when using built-in statsd server with the -statsd.disableAggregationEnforcement command-line flag or with improperly configured streaming aggregation. In this case the ingested statsd metrics will be stored to VictoriaMetrics as is without any aggregation. This may result in high CPU usage during data ingestion, high disk space usage for storing all the unaggregated statsd metrics and high CPU usage during querying, since all the unaggregated metrics must be read, unpacked and processed during querying. P.S. Built-in statsd server can be added to VictoriaMetrics and vmagent after figuring out more ergonomic specialized configuration for aggregating of statsd metrics. The main requirements for this configuration: - easy to write, read and update (ideally it should work out of the box for most cases without additional configuration) - hard to misconfigure (e.g. hard to shoot yourself in the foot) It would be great if this configuration will be compatible with the configuration of the most widely used statsd server. In the mean time it is recommended continue using external statsd server. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6265 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5053 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5052 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/206 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4600	2024-07-03 23:57:49 +02:00
Aliaksandr Valialkin	7a60e8abf7	lib/promscrape: use prompbmarshal.MustParsePromMetrics function at parseData() test function The prompbmarshal.MustParsePromMetrics function has been added in the commit `cc4d57d650`	2024-07-03 16:10:37 +02:00
Aliaksandr Valialkin	cd152693c6	Revert "Exemplar support (#5982 )" This reverts commit `5a3abfa041`. Reason for revert: exemplars aren't in wide use because they have numerous issues which prevent their adoption (see below). Adding support for examplars into VictoriaMetrics introduces non-trivial code changes. These code changes need to be supported forever once the release of VictoriaMetrics with exemplar support is published. That's why I don't think this is a good feature despite that the source code of the reverted commit has an excellent quality. See https://docs.victoriametrics.com/goals/ . Issues with Prometheus exemplars: - Prometheus still has only experimental support for exemplars after more than three years since they were introduced. It stores exemplars in memory, so they are lost after Prometheus restart. This doesn't look like production-ready feature. See `0a2f3b3794/content/docs/instrumenting/exposition_formats.md (L153-L159)` and https://prometheus.io/docs/prometheus/latest/feature_flags/#exemplars-storage - It is very non-trivial to expose exemplars alongside metrics in your application, since the official Prometheus SDKs for metrics' exposition ( https://prometheus.io/docs/instrumenting/clientlibs/ ) either have very hard-to-use API for exposing histograms or do not have this API at all. For example, try figuring out how to expose exemplars via https://pkg.go.dev/github.com/prometheus/client_golang@v1.19.1/prometheus . - It looks like exemplars are supported for Histogram metric types only - see https://pkg.go.dev/github.com/prometheus/client_golang@v1.19.1/prometheus#Timer.ObserveDurationWithExemplar . Exemplars aren't supported for Counter, Gauge and Summary metric types. - Grafana has very poor support for Prometheus exemplars. It looks like it supports exemplars only when the query contains histogram_quantile() function. It queries exemplars via special Prometheus API - https://prometheus.io/docs/prometheus/latest/querying/api/#querying-exemplars - (which is still marked as experimental, btw.) and then displays all the returned exemplars on the graph as special dots. The issue is that this doesn't work in production in most cases when the histogram_quantile() is calculated over thousands of histogram buckets exposed by big number of application instances. Every histogram bucket may expose an exemplar on every timestamp shown on the graph. This makes the graph unusable, since it is litterally filled with thousands of exemplar dots. Neither Prometheus API nor Grafana doesn't provide the ability to filter out unneeded exemplars. - Exemplars are usually connected to traces. While traces are good for some I doubt exemplars will become production-ready in the near future because of the issues outlined above. Alternative to exemplars: Exemplars are marketed as a silver bullet for the correlation between metrics, traces and logs - just click the exemplar dot on some graph in Grafana and instantly see the corresponding trace or log entry! This doesn't work as expected in production as shown above. Are there better solutions, which work in production? Yes - just use time-based and label-based correlation between metrics, traces and logs. Assign the same `job` and `instance` labels to metrics, logs and traces, so you can quickly find the needed trace or log entry by these labes on the time range with the anomaly on metrics' graph. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5982	2024-07-03 16:09:18 +02:00
Aliaksandr Valialkin	a5d60ad78e	app/vmagent/remotewrite,lib/streamaggr: re-use common code in tests after `879771808b` - Export streamaggr.LoadFromData() function, so it could be used in tests outside the lib/streamaggr package. This allows removing a hack with creation of temporary files at TestRemoteWriteContext_TryPush_ImmutableTimeseries. - Move common code for mustParsePromMetrics() function into lib/prompbmarshal package, so it could be used in tests for building []prompbmarshal.TimeSeries from string. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6205 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6206	2024-07-03 15:22:51 +02:00
Aliaksandr Valialkin	f8779d1ed2	lib/streamaggr: follow-up for the commit `c0e4ccb7b5` - Clarify docs for `Ignore aggregation intervals on start` feature. - Make more clear the code dealing with ignoreFirstIntervals at aggregator.runFlusher() functions. It is better from readability and maintainability PoV using distinct a.flush() calls for distinct cases instead of merging them into a single a.flush() call. - Take into account the first incomplete interval when tracking the number of skipped aggregation intervals, since this behaviour is easier to understand by the end users. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6137	2024-07-02 21:34:48 +02:00
Andrii Chubatiuk	252aa5a3ab	lib/protoparser/graphite: added -graphite.sanitizeMetricName flag (#6489 ) ### Describe Your Changes Added flag to sanitize graphite metrics fixes #6077 ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `476faf5578`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-07-02 17:16:00 +02:00
Aliaksandr Valialkin	7426b40250	lib/logstorage: allow writing `after N` in front of `before N` at `stream_context` pipe	2024-07-02 01:39:45 +02:00
Andrii Chubatiuk	937ae2ca90	lib/streamaggr: added stale samples metric, added metrics labels (#6462 ) ### Describe Your Changes - added stale metrics counters for input and output samples - added labels for aggregator metrics => `name="{rwctx}:{aggrId}:{aggrSuffix}"` - rwctx - global or number starting from 1 - aggrid - aggregator id starting from 1 - aggrSuffix - <interval>_(by\|without)_label1_label2_labeln e.g: `name="global:1:1m_without_instance_pod"` ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `861852f262`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-07-01 15:01:49 +02:00
Aliaksandr Valialkin	208a624d4d	lib/logstorage: properly search for the surrounding logs in `stream_context` pipe The set of log fields in the found logs may differ from the set of log fields present in the log stream. So compare only the log fields in the found logs when searching for the matching log entry in the log stream. While at it, return _stream field in the delimiter log entry, since this field is used by VictoriaLogs Web UI for grouping logs by log streams.	2024-07-01 02:33:00 +02:00
Aliaksandr Valialkin	76a58ae08d	lib/logstorage: add ability to store sorted log position into a separate field with `sort ... rank <fieldName>` syntax	2024-07-01 01:46:03 +02:00
Aliaksandr Valialkin	d0dca7b8c5	lib/logstorage: add delimiter between log chunks returned from `\| stream_context` pipe	2024-07-01 01:46:02 +02:00
Aliaksandr Valialkin	4b3477e62b	lib/logstorage: add `stream_context` pipe, which allows selecting surrounding logs for the matching logs	2024-06-28 19:15:19 +02:00
Aliaksandr Valialkin	2f28819bb1	lib/logstorage: it is safe using `\| unroll` pipe in live tailing `\| unroll` pipe can make multiple copies of rows from the input row. This doesn't break live tailing, so allow `\| unroll` pipe in live tailing.	2024-06-27 19:45:12 +02:00
Aliaksandr Valialkin	b26acec9a8	app/vlselect: properly return live tailing results	2024-06-27 15:06:15 +02:00
Aliaksandr Valialkin	dd62a2b9d6	lib/logstorage: work-in-progress	2024-06-27 14:21:03 +02:00
Andrii Chubatiuk	580d02c3f8	added IMDSv2 for YC SD (#6524 ) ### Describe Your Changes Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5513 ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).	2024-06-26 19:12:35 +02:00
rtm0	48a5c4cb01	Fix Date metricid cache consistency under concurrent use (#6534 ) ### Describe Your Changes Fix Date metricid cache consistency under concurrent use. When one goroutine calls Has() and does not find the cache entry in the immutable map it will acquire a lock and check the mutable map. And it is possible that before that lock is acquired, the entry is moved from the mutable map to the immutable map by another goroutine causing a cache miss. The fix is to check the immutable map again once the lock is acquired. ### Checklist The following checks are mandatory: - [x ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2024-06-26 19:12:34 +02:00
Aliaksandr Valialkin	d5cbda3424	app/vlstorage: add -retention.maxDiskSpaceUsageBytes command-line flag for limiting the retention at VictoriaLogs by disk space usage	2024-06-25 17:30:46 +02:00
Aliaksandr Valialkin	f24123a776	lib/logstorage: parse syslog structured data into separate fields in order to simplify further querying of this data	2024-06-25 14:54:25 +02:00
Aliaksandr Valialkin	1716c4e609	lib/logstorage: properly parse timezone offset at TryParseTimestampRFC3339Nano() The TryParseTimestampRFC3339Nano() must properly parse RFC3339 timestamps with timezone offsets. While at it, make tryParseTimestampISO8601 function private in order to prevent from improper usage of this function from outside the lib/logstorage package. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6508	2024-06-25 14:54:24 +02:00
Aliaksandr Valialkin	2a7fcba330	lib/logstorage: make golangci-lint happy	2024-06-25 03:06:28 +02:00
Aliaksandr Valialkin	7026498359	lib/httpserver: revert `9b7e532172` Reason for revert: this commit doesn't resolve real security issues, while it complicates the resulting code in subtle ways (aka security circus). Comparison of two strings (passwords, auth keys) takes a few nanoseconds. This comparison is performed in non-trivial http handler, which takes thousands of nanoseconds, and the request handler timing is non-deterministic because of Go runtime, Go GC and other concurrently executed goroutines. The request handler timing is even more non-deterministic when the application is executed in shared environments such as Kubernetes, where many other applications may run on the same host and use shared resources of this host (CPU, RAM bandwidth, network bandwidth). Additionally, it is expected that the passwords and auth keys are passed via TLS-encrypted connections. Establishing TLS connections takes additional non-trivial time (millions of nanoseconds), which depends on many factors such as network latency, network congestion, etc. This makes impossible to conduct timing attack on passwords and auth keys in VictoriaMetrics components. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6423/files Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6392	2024-06-25 01:51:06 +02:00
Aliaksandr Valialkin	7de6f5b4ce	lib/logstorage: work-in-progress	2024-06-25 00:44:57 +02:00
Andrii Chubatiuk	50783fca4d	app/vmagent: add max_scrape_size to scrape config (#6434 ) Related to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6429 ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `1e83598be3`)	2024-06-20 14:00:22 +02:00
Slava Bobik	a7266785ce	Fixed a typo in the FastQueue mutex comment (#6514 ) ### Describe Your Changes Fixed a small typo in a comment about the mutex inside the FastQueue struct ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). (cherry picked from commit `d236604d39`)	2024-06-20 14:00:08 +02:00
Aliaksandr Valialkin	d5224f3363	lib/logstorage: work-in-progress	2024-06-20 03:10:37 +02:00
Zakhar Bessarab	886f545f81	lib/fs/fscore: do not trim content from path (#6503 ) ### Describe Your Changes Trimming content which is loaded from an external pass leads to obscure issues in case user-defined input contained trimmed chars. For example. user-defined password "foo\n" will become "foo" while user will expect it to contain a new line. --- For example, a user defines a password which ends with `\n`. This often happens when user Kubernetes secrets and manually encodes value as base64-encoded string. In this case vmauth configuration might look like: ``` users: - url_prefix: - http://vminsert:8480/insert/0/prometheus/api/v1/write name: foo username: foo password: "foobar\n" ``` vmagent configuration for this setup will use the following flags: ``` -remoteWrite.url=http://vmauth:8427/ -remoteWrite.basicAuth.passwordFile=/tmp/vmagent-password -remoteWrite.basicAuth.username="foo" ``` Where `/tmp/vmagent-password` is a file with `foobar\n` password. Before this change such configuration will result in `401 Unauthorized` response received by vmagent since after file content will become `foobar`. --- An example with Kubernetes operator which uses a secret to reference the same password in multiple configurations. <details> <summary>See full manifests</summary> `Secret`: ``` apiVersion: v1 data: name: Zm9v # foo password: Zm9vYmFy # foobar\n username: Zm9v= # foo kind: Secret metadata: name: vmuser ``` `VMUser`: ``` apiVersion: operator.victoriametrics.com/v1beta1 kind: VMUser metadata: name: vmagents spec: generatePassword: false name: vmagents targetRefs: - crd: kind: VMAgent name: some-other-agent namespace: example username: foo # note - the secret above is referenced to provide password passwordRef: name: vmagent key: password ``` `VMAgent`: ``` apiVersion: operator.victoriametrics.com/v1beta1 kind: VMAgent metadata: name: example spec: selectAllByDefault: true scrapeInterval: 5s replicaCount: 1 remoteWrite: - url: "http://vmauth-vmauth-example:8427/api/v1/write" # note - the secret above is referenced as well basicAuth: username: name: vmagent key: username password: name: vmagent key: password ``` </details> Since both config target exactly the same `Secret` object it is expected to work, but apparently the result will be `401 Unauthrized` error. ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `201fd6de1e`)	2024-06-19 10:37:12 +02:00
Nihal	8fd46caa22	victoria-metrics: constant-time comparison of credentials like authkeys and basic auth credentials (#6423 ) Changes for constant-time comparison of credentials like authkeys and basic auth credentials. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6392 --------- Signed-off-by: Syed Nihal <syed.nihal@nokia.com> (cherry picked from commit `9b7e532172`)	2024-06-19 10:37:09 +02:00
Aliaksandr Valialkin	c10a646d19	app/vlinsert/syslog: allow accepting syslog messages with different configs at different ports	2024-06-17 23:16:58 +02:00
hagen1778	863f1c2513	lib/streamaggr: remove accidentally committed changes Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `34771ab293`)	2024-06-17 14:25:45 +02:00
Roman Khavronenko	df7e300071	app/vmselect/promql: check for ranged vectors in aggr funcs if implicit conversions are disabled (#6450 ) Check for ranged vector arguments in aggregate expressions when `-search.disableImplicitConversion` or `-search.logImplicitConversion` are enabled. For example, `sum(up[5m])` will fail to execute if these flags are set. ### Describe Your Changes Please provide a brief description of the changes you made. Be as specific as possible to help others understand the purpose and impact of your modifications. ### Checklist The following checks are mandatory: - [*] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `6149adbe10`)	2024-06-17 14:25:43 +02:00
Aliaksandr Valialkin	1750991119	lib/logstorage: work-in-progress	2024-06-17 12:13:25 +02:00
Andrii Chubatiuk	8ca1813bd2	lib/flagutil: use month limit for duration flag for parsed duration assessment (#6486 ) use maxMonths limit for parsed duration flag value https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6330 --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `faf67aa8b5`)	2024-06-14 15:21:32 +02:00
Andrii Chubatiuk	abc233a902	lib/backup/s3remote: fixed credsFilePath flag (#6488 ) properly use credsFilePath flag value https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6353 --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `e678a9aa51`)	2024-06-14 14:14:58 +02:00
Roman Khavronenko	5df50e5645	lib/streamaggr: prevent `rate_sum` and `rate_avg` from producing NaNs (#6482 ) ### Describe Your Changes * check if `lastValue` was seen at least twice with different timestamps. Otherwise, the difference between last timestamp and previous timestamp could be `0` and will result into `NaN` calculation * check if there items left in lastValue map after staleness cleanup. Otherwise, `rate_avg` could have produce `NaN` result. ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `51d19485bb`)	2024-06-14 13:26:42 +02:00
Aliaksandr Valialkin	2bbf62b6f6	lib/leveledbytebufferpool: do not pool byte slices bigger than 2^18 bytes Previously byte slices up to 2^20 bytes (e.g. 1Mb) were cached because of a typo in the commit `c14dafce43` . This could result in increased memory usage when vmagent scrapes many regular targets, which expose relatively small number of metrics (e.g. up to a few thousand per target) and a few large targets such as kube-state-metrics, which expose more than 10 thousand metrics. This is common case for Kubernetes monitoring. While at it, remove pools for very small byte slices, since they are rarely used during scraping.	2024-06-13 17:02:05 +02:00
Aliaksandr Valialkin	faf07fbc67	lib/bytesutil: optimize internStringMap cleanup - Make it in a separate goroutine, so it doesn't slow down regular intern() calls. - Do not lock internStringMap.mutableLock during the cleanup routine, since now it is called from a single goroutine and reads only the readonly part of the internStringMap. This should prevent from locking regular intern() calls for new strings during cleanups. - Add jitter to the cleanup interval in order to prevent from synchornous increase in resource usage during cleanups. - Run the cleanup twice per -internStringCacheExpireDuration . This should save 30% CPU time spent on cleanup comparing to the previous code, which was running the cleanup 3 times per -internStringCacheExpireDuration .	2024-06-13 15:09:42 +02:00
Zakhar Bessarab	ac16d1dc1b	lib/promscrape: increase default value for promscrape.maxDroppedTargets to 10_000 (#6459 ) ### Describe Your Changes This limit can be increased since after `4513893ead` tracking of dropped targets uses much less memory per entry. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6381#issuecomment-2156708228 ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> (cherry picked from commit `34071ac660`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-06-13 09:28:16 +02:00
LHHDZ	41e4135371	app/vmauth: fix discovering backend IPs when `url_prefix` contains hostname with `srv+` prefix (#6401 ) This change fixes the following panic: ``` 2024-06-04T11:16:52.899Z warn app/vmauth/auth_config.go:353 cannot discover backend SRV records for http://srv+localhost:8080: lookup localhost on 10.100.10.4:53: server misbehaving; use it literally panic: runtime error: integer divide by zero goroutine 9 [running]: github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver.handlerWrapper.func1() /Users/lhhdz/wd/projects/go/VictoriaMetrics/lib/httpserver/httpserver.go:291 +0x58 panic({0x103115100?, 0x10338d700?}) /Users/lhhdz/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.3.darwin-arm64/src/runtime/panic.go:770 +0x124 main.getLeastLoadedBackendURL({0x0?, 0x22?, 0x1400014757b?}, 0x1400013c120?) /Users/lhhdz/wd/projects/go/VictoriaMetrics/app/vmauth/auth_config.go:473 +0x210 main.(*URLPrefix).getBackendURL(0x140000aa080) /Users/lhhdz/wd/projects/go/VictoriaMetrics/app/vmauth/auth_config.go:312 +0xb8 ``` --------- Co-authored-by: Haley Wang <haley@victoriametrics.com>	2024-06-12 11:47:44 +02:00
Aliaksandr Valialkin	9135b404d9	lib/logstorage: work-in-progress	2024-06-11 17:51:01 +02:00
Aliaksandr Valialkin	9bd16790c0	lib/streamaggr: prevent from data race inside dedupAggrShard when samplesBuf can be updated in pushSamples() while their values are read in the flush() loop without das.mu lock This issue has been introduced in the commit `253c0cffbe`	2024-06-11 17:31:38 +02:00
Aliaksandr Valialkin	37a8cc0b12	lib/logstorage: work-in-progress	2024-06-10 18:42:31 +02:00
Aliaksandr Valialkin	7e24bf99de	lib/streamaggr: return back string interning to dedupAggr after 78953723200f15ffc417064d1912bdbb7551505c It should reduce memory allocation rate during stream deduplication	2024-06-10 18:06:25 +02:00
Aliaksandr Valialkin	6470eac7dc	lib/bytesutil: reduce the number of memory allocations per each interned string in bytesutil.InternString() from 5 to 1 This should reduce GC overhead when tens of millions of strings are interned (for example, during stream deduplication of millions of active time series).	2024-06-10 18:06:24 +02:00
Roman Khavronenko	8c8d84e30a	lib/protoparser/opentelemetry/firehose: escape requestID before returning it to user (#6451 ) All user input should be sanitized before rendering. This should prevent possible attacks. See https://github.com/VictoriaMetrics/VictoriaMetrics/security/code-scanning/203 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-06-10 18:06:24 +02:00
Aliaksandr Valialkin	883c0e6221	lib/streamaggr: reduce memory allocations by using dedupAggrSample buffer per each dedupAggrShard	2024-06-10 16:39:26 +02:00
Aliaksandr Valialkin	422225bfa5	lib/streamaggr: reduce the number of duplicates per each sample in BenchmarkDedupAggr from 100 to 2 This is closer to typical production setups when deduplication is used for de-duplicating of 2 samples per series.	2024-06-10 16:39:26 +02:00
Aliaksandr Valialkin	d269a95da3	lib/streamaggr: use strings.Clone() instead of bytesutil.InternString() for creating series key in dedupAggr Our internal testing shows that this reduces GC overhead when deduplicating tens of millions of active series.	2024-06-10 16:08:47 +02:00
Aliaksandr Valialkin	9ed9e766e8	lib/streamaggr: improve performance for dedupAggr.sizeBytes() and dedupAggr.itemsCount() These functions are called every time `/metrics` page is scraped, so it would be great if they could be sped up for the cases when dedupAggr tracks tens of millions of active time series.	2024-06-10 16:00:05 +02:00
Aliaksandr Valialkin	387c22da49	lib/streamaggr: remove flushState arg at dedupAggr.flush(), since it is always set to true in production	2024-06-10 16:00:05 +02:00
Hui Wang	028a80613f	lib/httpserver: allow reloadAuthKey and configAuthKey to override htt… (#6338 ) …pAuth.* address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6329, makes `reloadAuthKey`, `configAuthKey`, `flagsAuthKey`, `pprofAuthKey` behavior the same way, but keys like `-snapshotAuthKey`, `-forceMergeAuthKey` are still protected by httpAuth.*. All the available key are listed in https://docs.victoriametrics.com/single-server-victoriametrics/#security. --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `61dce6f2a1`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-06-10 12:41:29 +02:00
Aliaksandr Valialkin	32aa0751a1	lib/streamaggr: follow-up for `7cb894a777` - Use bytesutil.InternString() instead of strings.Clone() for inputKey and outputKey in aggregatorpushSamples(). This should reduce string allocation rate, since strings can be re-used between aggrState flushes. - Reduce memory allocations at dedupAggrShard by storing dedupAggrSample by value in the active series map. - Remove duplicate call to bytesutil.InternBytes() at Deduplicator, since it is already called inside dedupAggr.pushSamples(). - Add missing string interning at rateAggrState.pushSamples(). Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6402	2024-06-07 16:35:53 +02:00
Roman Khavronenko	78121642df	lib/streamaggr: reduce number of inuse objects (#6402 ) The main change is getting rid of interning of sample key. It was discovered that for cases with many unique time series aggregated by vmagent interned keys could grow up to hundreds of millions of objects. This has negative impact on the following aspects: 1. It slows down garbage collection cycles, as GC has to scan all inuse objects periodically. The higher is the number of inuse objects, the longer it takes/the more CPU it takes. 2. It slows down the hot path of samples aggregation where each key needs to be looked up in the map first. The change makes code more fragile, but suppose to provide performance optimization for heavy-loaded vmagents with stream aggregation enabled. --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-06-07 16:35:52 +02:00
Roman Khavronenko	fae589bb83	lib/promrelabel: speedup label match by `__name__` (#6432 ) The change adds a fastpath for `equalValue` comparisons against `__name__` label by avoiding calls to `toCanonicalLabelName` func. This speedups matches by metric name like `'foo'`. See bench stats below: ``` benchcmp old.txt new.txt benchmark old ns/op new ns/op delta BenchmarkIfExpression/equal_label:_last-10 35.6 35.1 -1.18% BenchmarkIfExpression/equal_label:_middle-10 18.3 17.3 -5.41% BenchmarkIfExpression/equal_label:_first-10 1.20 1.24 +2.74% BenchmarkIfExpression/equal___name__:_last-10 10.1 4.96 -50.75% BenchmarkIfExpression/equal___name__:_middle-10 5.79 3.16 -45.41% BenchmarkIfExpression/equal___name__:_first-10 1.17 1.05 -9.76% ``` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-06-07 16:35:52 +02:00
Andrii Chubatiuk	93cd08f15f	lib/streamaggr: metrics to track dropped, nan samples and samples lag (#6358 ) ### Describe Your Changes Added streamaggr metrics to: - `vm_streamaggr_samples_lag_seconds` - samples lag - `vm_streamaggr_ignored_samples_total{reason="nan"}` - ignored NaN samples - `vm_streamaggr_ignored_samples_total{reason="too_old"}` - ignored old samples (cherry picked from commit `185fac03b3`)	2024-06-06 19:22:45 +02:00
Aliaksandr Valialkin	53382ae837	lib/logstorage: work-in-progress	2024-06-06 12:27:11 +02:00
Aliaksandr Valialkin	a200fb433a	lib/logstorage: allow using `eval` keyword instead of `math` keyword in `math` pipe	2024-06-05 10:08:08 +02:00
Aliaksandr Valialkin	b45e466a1b	lib/logstorage: work-in-progress	2024-06-05 03:18:25 +02:00
pludov	2efd97a63c	lib/fs: support NFS implementations that return EEXIST instead of ENOTEMPTY (#6398 ) ### Describe Your Changes Fix for issue #6396: according to rmdir manpage, ENOTEMPTY and EEXIST should be treated equally https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6396 ### Checklist The following checks are mandatory: - [x ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Co-authored-by: Ludovic Pollet <ludovic.pollet@exfo.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `3ddae77c63`)	2024-06-04 15:30:48 +02:00
Aliaksandr Valialkin	1ce8a9a751	lib/logstorage: allow typing `asc` in `sort` pipe for the sake of consistency with `desc`	2024-06-04 02:29:18 +02:00
Aliaksandr Valialkin	b7b3a9e9a3	lib/logstorage: work-in-progress	2024-06-04 01:50:55 +02:00
Aliaksandr Valialkin	540bbb63a2	lib/logstorage: work-in-progress	2024-05-30 16:19:36 +02:00
Roman Khavronenko	189af53142	lib/storage: filter deleted label names and values from `/api/v1/labe… (#6342 ) …ls` and `/api/v1/label/.../values` Check for deleted metrics when `match[]` filter matches small number of time series (optimized path). The issue was introduced [v1.81.0](https://docs.victoriametrics.com/changelog_2022/#v1810). Related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6300 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2978 Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `b984f4672e`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-05-29 14:37:00 +02:00
Aliaksandr Valialkin	e83fd4a117	lib/logstorage: work-in-progress	2024-05-29 01:52:34 +02:00
Aliaksandr Valialkin	79c03fc35f	lib/logstorage: work-in-progress	2024-05-28 19:29:50 +02:00
Aliaksandr Valialkin	ce5e4c842a	lib/logstorage: fix golangci-lint warnings	2024-05-26 02:02:41 +02:00
Aliaksandr Valialkin	afa597ce2a	lib/logstorage: work-in-progress	2024-05-26 01:56:12 +02:00
Aliaksandr Valialkin	6427b3c3c0	lib/logstorage: work-in-progress	2024-05-25 22:59:21 +02:00
Aliaksandr Valialkin	9edbeca46b	lib/logstorage: re-use per-shard fields across processed blocks in pipePackJSON and pipeUnroll	2024-05-25 22:13:44 +02:00
Aliaksandr Valialkin	03fe4c8963	lib/logstorage: work-in-progress	2024-05-25 21:36:24 +02:00
Aliaksandr Valialkin	3152df2bce	lib/logstorage: work-in-progress	2024-05-25 00:31:55 +02:00
Nikolay	5025ede7bc	lib/mergeset: adds tracking for indexdb records drop (#6297 ) It allows to create alert for possible item drops at indexdb. It may happen, if ingested metric size exceeds max indexdb item size. --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `69d244e6fb`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-05-24 16:08:34 +02:00
Aliaksandr Valialkin	7a2a2f173e	lib/logstorage: work-in-progress	2024-05-24 03:07:07 +02:00
Nikolay	dfbd2f8ff7	lib/storage: change default value for maxLabelValueLen to 1024 (#6313 ) * It must reduce memory usage for misbehaving clients. Since VictoriaMetrics stores sparse index inmemory. * Reduce disk space usage for indexdb. * Prevent possible indexDB items drops. * It may trigger slow insert and new timeseries registration due to default value for flag change https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6176 --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-05-22 21:55:21 +02:00
Alexander Marshalov	0b70c4c1f1	[vmlogs] fixed time parsing with millisecond precision time (#6293 ) (#6295 ) fix for #6293 Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-05-22 21:54:50 +02:00
Aliaksandr Valialkin	04d0dd2542	lib/logstorage: work-in-progress	2024-05-22 21:01:28 +02:00
Roman Khavronenko	f3e893f699	lib/backup: add `-s3TLSInsecureSkipVerify` command-line flag (#6318 ) * The new flag can be used for for skipping TLS certificates verification when connecting to S3 endpoint. Affects vmbackup, vmrestore, vmbackupmanager. * replace deprecated `EndpointResolver` with `BaseEndpoint` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1056 Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `ac836bcf6c`)	2024-05-22 16:40:06 +02:00
Hui Wang	5b8c3fc9d0	app/vmalert: support DNS SRV record in `-remoteWrite.url` (#6299 ) part of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6053, supports [DNS SRV](https://en.wikipedia.org/wiki/SRV_record) address in `-remoteWrite.url` command-line option. (cherry picked from commit `d7b5062917`)	2024-05-22 10:53:22 +02:00
Roman Khavronenko	3e8b5e74d5	lib/streamaggr: skip empty aggregators (#6307 ) Prevent excessive resource usage when stream aggregation config file contains no matchers by prevent pushing data into Aggregators object. Before this change a lot of extra work was invoked without reason. Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `7ce052b32d`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-05-20 14:46:36 +02:00
Aliaksandr Valialkin	45fbcc74e0	lib/logstorage: fix golangci-lint warnings	2024-05-20 11:04:37 +02:00
Aliaksandr Valialkin	582e7d5439	lib/logstorage: work-in-progress	2024-05-20 04:09:15 +02:00
Andrii Chubatiuk	fe332c3419	app/vmagent: add global aggregator (#6268 ) Add global stream aggregation for VMAgent https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5467 (cherry picked from commit `f153f54d11`)	2024-05-17 14:01:31 +02:00
Nikolay	ee4a94a371	follow-up for `c6c5a5a186` (#6265 ) * adds datadog extensions for statsd: - multiple packed values (v1.1) - additional types distribution, histogram * adds type check and append metric type to the labels with special tag name `__statsd_metric_type__`. It simplifies streaming aggregation config. * remove statsd support from cluster, since cluster doesn't support stream aggregation. --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `b2765c45d0`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-05-17 13:49:24 +02:00
Aliaksandr Valialkin	28626db066	lib/logstorage: work-in-progress (cherry picked from commit `0aa19a2837`)	2024-05-16 09:35:55 +02:00
Aliaksandr Valialkin	5dbc4ad5ef	lib/streamaggr: properly return output key from getOutputKey The bug has been introduced in `cc2647d212` (cherry picked from commit `b617dc9c0b`)	2024-05-16 09:35:53 +02:00
Aliaksandr Valialkin	b1ee7bca1a	lib/logstorage: work-in-progress	2024-05-14 03:06:02 +02:00
Aliaksandr Valialkin	f52275bbd7	lib/logstorage: work-in-progress Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6258	2024-05-14 01:49:58 +02:00
Aliaksandr Valialkin	207b4bd91d	lib/storage: fix SearchQuery.Unmarshal() after `32193b6059`	2024-05-14 01:39:01 +02:00
Aliaksandr Valialkin	32193b6059	lib/encoding: optimize UnmarshalVarUint64, UnmarshalVarInt64 and UnmarshalBytes a bit Change the return values for these functions - now they return the unmarshaled result plus the size of the unmarshaled result in bytes, so the caller could re-slice the src for further unmarshaling. This improves performance of these functions in hot loops of VictoriaLogs a bit.	2024-05-14 01:30:25 +02:00
Aliaksandr Valialkin	2e12119a9e	lib/stringsutil: add LessNatural() function for natural sorting Natural sorting is needed for sort_by_label_natural() and sort_by_label_natural_desc() functions in MetricsQL - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6192 and https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6256 Natural sorting will be also used by `\| sort ...` pipe in VictoriaLogs - see https://docs.victoriametrics.com/victorialogs/logsql/#sort-pipe (cherry picked from commit `707f3a69db`)	2024-05-13 17:08:56 +02:00
Hui Wang	ec56f4625e	storage: correctly apply `-inmemoryDataFlushInterval` when it's set t… (#6221 ) …o minimum supported value 1s pendingRowsFlushInterval was bumped to 2s in `73f0a805e2` (cherry picked from commit `4c80b17027`)	2024-05-13 16:50:02 +02:00
Andrii Chubatiuk	b9eb527d98	lib/streamaggr: added rate_sum and rate_avg to benchmarks, lint fix (#6264 ) fixed lint for rate outputs (cherry picked from commit `ce25d68b45`)	2024-05-13 16:49:59 +02:00
Andrii Chubatiuk	d9cddf1ad8	lib/streamaggr: added rate and rate_avg output (#6243 ) Added `rate` and `rate_avg` output Resource usage is the same as for increase output, tested on a benchmark --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `9c3d44c8c9`)	2024-05-13 16:49:39 +02:00
hagen1778	84a896cd6e	lib/logstorage: make linter happy Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `17283fab6c`)	2024-05-13 16:49:37 +02:00
Aliaksandr Valialkin	147704aab0	lib/logstorage: initial implementation of pipes in LogsQL See https://docs.victoriametrics.com/victorialogs/logsql/#pipes	2024-05-12 16:36:01 +02:00
Aliaksandr Valialkin	9dc9c892b7	lib/encoding: optimizing UnmarshalVarUint64 and UnmarshalVarInt64 a bit	2024-05-12 16:35:24 +02:00
Aliaksandr Valialkin	87338633b1	lib/slicesutil: add helper functions for setting slice length and extending its capacity The added helper functions - SetLength() and ExtendCapacity() - replace error-prone code with simple function calls.	2024-05-12 11:33:49 +02:00
Aliaksandr Valialkin	9607902289	lib/storage: remove outdated misleading comments	2024-05-12 10:25:06 +02:00
Roman Khavronenko	0bed453737	Feature allow configuring disableOnDiskQueue and dropSamplesOnOverload per url (#6248 ) * FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): allow configuring `-remoteWrite.disableOnDiskQueue` and `-remoteWrite.dropSamplesOnOverload` cmd-line flags per each `-remoteWrite.url`. See this [pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6065). Thanks to @rbizos for implementaion! * FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add labels `path` and `url` to metrics `vmagent_remotewrite_push_failures_total` and `vmagent_remotewrite_samples_dropped_total`. Now number of failed pushes and dropped samples can be tracked per `-remoteWrite.url`. --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Raphael Bizos <r.bizos@criteo.com> (cherry picked from commit `87fd400dfc`)	2024-05-10 14:32:23 +02:00
Roman Khavronenko	7be6fcd8fd	lib/streamaggr: set correct suffix `<output>_prometheus` (#6228 ) Set correct suffix `<output>_prometheus` for aggregation outputs `increase_prometheus` and `total_prometheus` Before, outputs `total` and `total_prometheus` or `increase` and `increase_prometheus` had the same suffix. Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `8a03e987cb`)	2024-05-10 14:29:01 +02:00
Andrii Chubatiuk	f3d65ba902	streamaggr: made labels compressor shared (#6173 ) Though labels compressor is quite resource intensive, each aggregator and deduplicator instance has it's own compressor. Made it shared across all aggregators to consume less resources while using multiple aggregators. Co-authored-by: Roman Khavronenko <hagen1778@gmail.com> (cherry picked from commit `a9283e06a3`)	2024-05-10 14:28:59 +02:00
Zhu Jiekun	10063e98a6	feature: [vmagent] Add service discovery support for Vultr (#6068 ) ### Describe Your Changes related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6041 #### Added - Added service discovery support for Vultr. #### Docs - `CHANGELOG.md`, `sd_configs.md`, `vmagent.md` are updated. #### Note - Useful links: - Vultr API: https://www.vultr.com/api/#tag/instances/operation/list-instances - Vultr client SDK: https://github.com/vultr/govultr - Prometheus SD: https://github.com/prometheus/prometheus/tree/main/discovery/vultr --- ### Checklist The following checks are mandatory: - [X] I have read the [Contributing Guidelines](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/CONTRIBUTING.md) - [x] All commits are signed and include `Signed-off-by` line. Use `git commit -s` to include `Signed-off-by` your commits. See this [doc](https://git-scm.com/book/en/v2/Git-Tools-Signing-Your-Work) about how to sign your commits. - [x] Tests are passing locally. Use `make test` to run all tests locally. - [x] Linting is passing locally. Use `make check-all` to run all linters locally. Further checks are optional for External Contributions: - [X] Include a link to the GitHub issue in the commit message, if issue exists. - [x] Mention the change in the [Changelog](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/docs/CHANGELOG.md). Explain what has changed and why. If there is a related issue or documentation change - link them as well. Tips for writing a good changelog message:: * Write a human-readable changelog message that describes the problem and solution. * Include a link to the issue or pull request in your changelog message. * Use specific language identifying the fix, such as an error message, metric name, or flag name. * Provide a link to the relevant documentation for any new features you add or modify. - [ ] After your pull request is merged, please add a message to the issue with instructions for how to test the fix or try the feature you added. Here is an [example](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4048#issuecomment-1546453726) - [x] Do not close the original issue before the change is released. Please note, in some cases Github can automatically close the issue once PR is merged. Re-open the issue in such case. - [x] If the change somehow affects public interfaces (a new flag was added or updated, or some behavior has changed) - add the corresponding change to documentation. Signed-off-by: Jiekun <jiekun.dev@gmail.com> (cherry picked from commit `17e3d019d2`)	2024-05-10 14:28:51 +02:00
hagen1778	864fbf9125	Statsd protocol compatibility (#5053 ) In this PR I added compatibility with [statsd protocol](https://github.com/b/statsd_spec) with tags to be able to send metrics directly from statsd clients to vmagent or directly to VM. For example its compatible with [statsd-instrument](https://github.com/Shopify/statsd-instrument) and [dogstatsd-ruby](https://github.com/DataDog/dogstatsd-ruby) gems Related issues: #5052, #206, #4600 (cherry picked from commit `c6c5a5a186`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-05-10 14:28:37 +02:00
Ted Possible	0206a01d03	Exemplar support (#5982 ) This code adds Exemplars to VMagent and the promscrape parser adhering to OpenMetrics Specifications. This will allow forwarding of exemplars to Prometheus and other third party apps that support OpenMetrics specs. --------- Signed-off-by: Ted Possible <ted_possible@cable.comcast.com> (cherry picked from commit `5a3abfa041`)	2024-05-10 13:14:17 +02:00
Zakhar Bessarab	f6eb8912b0	lib/mergeset: improve test coverage (#6118 ) Add test to cover the code path with overflowing shards buffers and triggering merge to partition. This test covers the code path which leaded to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5959 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> (cherry picked from commit `329c3cbdf0`)	2024-04-30 10:30:12 +02:00
hagen1778	59b3f21708	Revert "app/vmbackup: introduce new flag type URL (#6152 )" This reverts commit `029060af60`. (cherry picked from commit `679844feaf`)	2024-04-24 17:08:26 +02:00
Roman Khavronenko	ff73b66182	app/vmbackup: introduce new flag type URL (#6152 ) The new flag type is supposed to be used for specifying URL values which could contain sensitive information such as auth tokens in GET params or HTTP basic authentication. The URL flag also allows loading its value from files if `file://` prefix is specified. As example, the new flag type was used in app/vmbackup as it requires specifying `authKey` param for making the snapshot. See related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5973 Thanks to @wasim-nihal for initial implementation https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6060 --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `029060af60`)	2024-04-24 17:08:24 +02:00
hagen1778	342290275e	app/streamaggr: follow-up after `c0e4ccb7b5` * rm vmagent mentions from vminsert flags * improve documentation wording, add links to related sections * mention `ignore_first_intervals` in the stream aggr options * update flags description * add basic test for config parsing validation Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `bae3874e6a`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-04-22 14:39:23 +02:00
Andrii Chubatiuk	131367fb59	lib/streamaggr: add option to ignore first N aggregation intervals (#6137 ) Stream aggregation may yield inaccurate results if it processes incomplete data. This issue can arise when data is sourced from clients that maintain a queue of unsent data, such as Prometheus or vmagent. If the queue isn't fully cleared within the aggregation interval, only a portion of the time series may be included in that period, leading to distorted calculations. To mitigate this we add an option to ignore first N aggregation intervals. It is expected, that client queues will be cleared during the time while aggregation ignores first N intervals and all subsequent aggregations will be correct. (cherry picked from commit `c0e4ccb7b5`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-04-22 14:34:36 +02:00
Aliaksandr Valialkin	4318f34644	lib/protoparser: substitute hybrid channel-based pools with plain sync.Pool Using plain sync.Pool simplifies the code without increasing memory usage and CPU usage. So it is better to use plain sync.Pool from readability and maintainability PoV. This is a follow-up for `8942f290eb`	2024-04-20 22:02:39 +02:00
Aliaksandr Valialkin	9004bc098e	all: use clear() built-in Go function for clearing []prompbmarshal.TimeSeries and []prompbmarshal.Label slices This makes the code a bit clear.	2024-04-20 21:00:24 +02:00
Aliaksandr Valialkin	e138f4827e	lib/storage: search for all the values for the given label before applying filters and limits It is incorrect applying the limit on the number of values to search without applying filters, since the returned subset of label values may miss the label values matching the given filters. This is a follow-up for `66630c7960`	2024-04-18 20:40:19 +02:00
Aliaksandr Valialkin	0c87cbe338	all: replace old https://docs.victoriametrics.com/relabeling.html url with the new one - https://docs.victoriametrics.com/relabeling/	2024-04-18 03:23:15 +02:00
Aliaksandr Valialkin	b933001d2d	all: replace old https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html url with the new one - https://docs.victoriametrics.com/single-server-victoriametrics/	2024-04-18 03:11:49 +02:00
Aliaksandr Valialkin	a21d1fcf57	all: replace old https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html url with the new one - https://docs.victoriametrics.com/cluster-victoriametrics/	2024-04-18 02:56:28 +02:00
Aliaksandr Valialkin	513e69c55e	all: replace old https://docs.victoriametrics.com/sd_configs.html url with the new one - https://docs.victoriametrics.com/sd_configs/	2024-04-18 02:28:26 +02:00
Aliaksandr Valialkin	728bcf0585	all: replace old https://docs.victoriametrics.com/stream-aggregation.html url with the new one - https://docs.victoriametrics.com/stream-aggregation/	2024-04-18 02:20:00 +02:00
Aliaksandr Valialkin	6fe21520ce	all: replace old https://docs.victoriametrics.com/vmbackup.html url with the new one - https://docs.victoriametrics.com/vmbackup/	2024-04-18 01:58:00 +02:00
Aliaksandr Valialkin	0211a04a52	all: replace the outdated url https://docs.victoriametrics.com/vmagent.html with the new one - https://docs.victoriametrics.com/vmagent/	2024-04-18 01:32:57 +02:00
Aliaksandr Valialkin	e615bcced7	lib/storage: improve performance for /api/v1/label/labelName/values when match[] contains only a single filter on labelName This speeds up auto-suggestion for metric names in VMUI and Grafana, which use the following query in this case: /api/v1/label/__name__/values?match[]={__name__=~".some_value."} When the user types `some_value` in the query input field.	2024-04-18 01:18:03 +02:00
Aliaksandr Valialkin	164032cd9b	lib/httpserver: add support for automatic issuing of TLS certificates via Lets Encrypt service Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5949	2024-04-17 23:53:51 +02:00
Aliaksandr Valialkin	30c96ba8d7	app/{vminsert,vmselect}: support for `srv+addr` scheme for specifying DNS SRV addresses at -storageNode flag The new scheme is consistent with SRV urls introduced at `b426d10847` and `dc326f70b4` Deprecte the old scheme: `dns+srv:addr` by removing it from the docs. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6053	2024-04-17 23:15:05 +02:00
Aliaksandr Valialkin	2177675b34	lib/netutil: move creation of GetCertificate callback into a separate function This improves code readability a bit	2024-04-17 22:11:10 +02:00
Aliaksandr Valialkin	284d99e269	app/vmagent: support for DNS SRV urls at -remoteWrite.url, scrape target urls and service discovery urls Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6053	2024-04-17 20:56:23 +02:00
Aliaksandr Valialkin	e627810146	app/vmauth: add support for configuring backends via DNS SRV urls	2024-04-17 20:56:21 +02:00
Aliaksandr Valialkin	a9ec77fee5	lib/promscrape/discovery/consul: typo fix in the comment: enteprise -> enterprise	2024-04-17 12:08:01 +02:00
Aliaksandr Valialkin	5320cc3198	lib/{mergeset,storage}: log deleting directories inside partitions if they are missing in parts.json This should improve debuggability of unexpected deletion of directories inside partitions. While at it, log the proper path to parts.json when the directory for big part is missing in the partition. parts.json is located inside directory with small parts, and there is no parts.json file inside directory with big parts.	2024-04-17 12:00:10 +02:00
Aliaksandr Valialkin	9b2a5c8844	lib/storage: improve comments inside functions responsible for creating indexes for newly registered time series	2024-04-17 11:30:51 +02:00
Zakhar Bessarab	da8bb106aa	lib/mergeset: fix flushing incorrect set of inmemoryBlocks (#6089 ) Follow-up for `bace9a2501` Related: - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6069 - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5959 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-04-11 09:26:48 +02:00
wanshuangcheng	52a4ae0b28	chore: fix function names in comment (#6076 ) Signed-off-by: wanshuangcheng <wanshuangcheng@outlook.com>	2024-04-08 15:38:51 +02:00
Aliaksandr Valialkin	222e8b5a7b	lib/streamaggr: update the minimum allowed timestamp for incoming samples before flushing the samples to the storage This should prevent from dropping samples with old timestamps during long flushes. This is a follow-up for `1cedaf61cb`	2024-04-04 02:26:08 +03:00
Aliaksandr Valialkin	ecd782c75e	app/vmagent: follow-up for `b3b29ba6ac` - Automatically reload changed TLS root CA pointed by -remoteWrite.tlsCAFile command-line flag - Automatically reload changed TLS root CA configured via oauth2.tsl_config.ca_file option at -promscrape.config - Document the change as a feature instead of a bug at docs/CHANGELOG.md - Simplify the code at lib/promauth, which is responsible for reloading changed TLS root CA files. - Simplify the usage of lib/promauth.Config.NewRoundTripper() - now it accepts the base http.Transport instead of a callback, which can change the internal http.Transport. - Reuse the default tls config if lib/promauth.Config doesn't contain tls-specific configs. This should reduce memory usage a bit when tls isn't used for scraping big number of targets. - Do not re-read TLS root CA files on every processed request. Re-read them once per second. This should reduce CPU usage when scraping big number of targets over https. - Do not store cert.pem and key.pem files in TestTLSConfigWithCertificatesFilesUpdate, since they can be loaded from byte slices via crypto/tls.X509KeyPair(). - Remove obsolete comparisons of string representations for authConfig and proxyAuthConfig at areEqualScrapeConfigs(). Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5725 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5526 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2171	2024-04-04 01:26:38 +03:00

1 2 3 4 5 ...

2709 Commits