Commit Graph

20 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
62e4baf556
lib/logstorage: use simpler in-memory cache instead of workingsetcache for caching recently ingested _stream values and recently queried set of streams
These caches aren't expected to grow big, so it is OK to use the most simplest cache based on sync.Map.
The benefit of this cache compared to workingsetcache is better scalability on systems with many CPU cores,
since it doesn't use mutexes at fast path.
An additional benefit is lower memory usage on average, since the size of in-memory cache equals
working set for the last 3 minutes.

The downside is that there is no upper bound for the cache size, so it may grow big during workload spikes.
But this is very unlikely for typical workloads.

(cherry picked from commit 0f24078146)
2024-10-18 11:42:16 +02:00
Aliaksandr Valialkin
f9d86a913c
lib/logstorage: do not persist streamIDCache, since it may go out of sync with partition directories, which can be changed manually between VictoriaLogs restarts
Partition directories can be manually deleted and copied from another sources such as backups or other VitoriaLogs instances.
In this case the persisted cache becomes out of sync with partitions. This can result in missing index entries
during data ingestion or in incorrect results during querying. So it is better to do not persist caches.
This shouldn't hurt VictoriaLogs performance just after the restart too much, since its caches usually contain
small amounts of data, which can be quickly re-populated from the persisted data.

(cherry picked from commit 8aa144fa74)
2024-10-18 11:42:16 +02:00
Aliaksandr Valialkin
f627d7f686
app/vlstorage: add support for forced merge via /internal/force_merge HTTP endpoint
(cherry picked from commit 3c73dbbacc)
2024-10-15 11:49:31 +02:00
Aliaksandr Valialkin
180137a377
lib/logstorage: improve the performance of obtaining _stream column value
Substitute global streamTagsCache with per-blockSearch cache for ((stream.id) -> (_stream value)) entries.
This improves scalability of obtaining _stream values on a machine with many CPU cores, since every CPU
has its own blockSearch instance.

This also should reduce memory usage when querying logs over big number of streams, since per-blockSearch
cache of ((stream.id) -> (_stream value)) entries is limited in size, and its lifetime is bounded by a single query.
2024-09-24 20:57:39 +02:00
Aliaksandr Valialkin
08fe7949d1
lib/logstorage: consistently use nsecsPerDay constant and remove nsecPerDay constant 2024-09-06 16:18:15 +02:00
Aliaksandr Valialkin
d5cbda3424
app/vlstorage: add -retention.maxDiskSpaceUsageBytes command-line flag for limiting the retention at VictoriaLogs by disk space usage 2024-06-25 17:30:46 +02:00
Aliaksandr Valialkin
3152df2bce
lib/logstorage: work-in-progress 2024-05-25 00:31:55 +02:00
Aliaksandr Valialkin
147704aab0
lib/logstorage: initial implementation of pipes in LogsQL
See https://docs.victoriametrics.com/victorialogs/logsql/#pipes
2024-05-12 16:36:01 +02:00
Aliaksandr Valialkin
ca1e78bd16
lib/logstorage: consistently use atomic.* types instead of atomic.* functions on regular types
See ea9e2b19a5
2024-02-24 00:29:39 +02:00
Aliaksandr Valialkin
92e098012a
lib/logstorage: consistently use atomic.* type for refCount and mustDrop fields in datadb and storage structs in the same way as it is used in lib/storage
See ea9e2b19a5 and a204fd69f1
2024-02-24 00:28:56 +02:00
Aliaksandr Valialkin
d52fd73f18
all: add up to 10% random jitter to the interval between periodic tasks performed by various components
This should smooth CPU and RAM usage spikes related to these periodic tasks,
by reducing the probability that multiple concurrent periodic tasks are performed at the same time.
2024-01-22 18:39:16 +02:00
Zakhar Bessarab
85b604d414
lib/logstorage: fix free space check (#5113)
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-10-03 17:50:16 +02:00
Aliaksandr Valialkin
120f3bc467
lib/logstorage: follow-up for 8a23d08c21
- Compare the actual free disk space to the value provided via -storage.minFreeDiskSpaceBytes
  directly inside the Storage.IsReadOnly(). This should work fast in most cases.
  This simplifies the logic at lib/storage.

- Do not take into account -storage.minFreeDiskSpaceBytes during background merges, since
  it results in uncontrolled growth of small parts when the free disk space approaches -storage.minFreeDiskSpaceBytes.
  The background merge logic uses another mechanism for determining whether there is enough
  disk space for the merge - it reserves the needed disk space before the merge
  and releases it after the merge. This prevents from out of disk space errors during background merge.

- Properly handle corner cases for flushing in-memory data to disk when the storage
  enters read-only mode. This is better than losing the in-memory data.

- Return back Storage.MustAddRows() instead of Storage.AddRows(),
  since the only case when AddRows() can return error is when the storage is in read-only mode.
  This case must be handled by the caller by calling Storage.IsReadOnly()
  before adding rows to the storage.
  This simplifies the code a bit, since the caller of Storage.MustAddRows() shouldn't handle
  errors returned by Storage.AddRows().

- Properly store parsed logs to Storage if parts of the request contain invalid log lines.
  Previously the parsed logs could be lost in this case.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4737
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4945
2023-10-02 20:38:00 +02:00
Zakhar Bessarab
dfdada055c
lib/logstorage: switch to read-only mode when running out of disk space (#4945)
* lib/logstorage: switch to read-only mode when running out of disk space

Added support of `--storage.minFreeDiskSpaceBytes` command-line flag to allow graceful handling of running out of disk space at `--storageDataPath`.

See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4737
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/logstorage: fix error handling logic during merge

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/logstorage: fix log level

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
2023-10-02 17:09:57 +02:00
crossoverJie
64b27c9217
lib/logstorage: Set ptwHot to nil when the partition pointed by ptwHot is dropped (#4902) 2023-08-29 11:22:53 +02:00
Aliaksandr Valialkin
1e1aa94ffb
lib/logstorage: eliminate data race when clearing s.ptwHot after deleting the corresponding partition
The previous code could result in the following data race:
1. The s.ptwHot partition is marked to be deleted
2. ptw.decRef() is called on it
3. ptw.pt is set to nil
4. s.ptwHot.pt is accessed from concurrent goroutine, which leads to panic.

The change clears s.ptwHot under s.partitionsLock in order to prevent from the data race.

This is a follow-up for 8d50032dd6

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4895
2023-08-29 11:12:07 +02:00
crossoverJie
db0ae3fffb
lib/logstorage: add nil check for ptwHot.pt (#4896)
(cherry picked from commit cde5029bce)
2023-08-27 09:14:28 +02:00
Aliaksandr Valialkin
e1f7e0b455
lib/logstorage: log the -retentionPeriod and -futureRetention values when the ingested log entry has timestamp outside the configured retention
This should simplify debugging
2023-07-17 18:23:45 -07:00
Aliaksandr Valialkin
08634ae612
app/vlinsert/jsonline: code prettifying 2023-07-06 21:35:55 -07:00
Aliaksandr Valialkin
374890294e
app/victoria-logs: initial code release 2023-07-06 17:30:05 -07:00