VictoriaMetrics/docs/VictoriaLogs/data-ingestion/Vector.md
2023-07-06 21:35:22 -07:00

Vector setup

Specify the Elasticsearch sink type in vector.toml to send the collected logs to VictoriaLogs:

[sinks.vlogs]
  inputs = [ "your_input" ]
  type = "elasticsearch"
  endpoints = [ "http://localhost:9428/insert/elasticsearch/" ]
  mode = "bulk"
  api_version = "v8"
  healthcheck.enabled = false

  [sinks.vlogs.query]
    _msg_field = "message"
    _time_field = "timestamp"
    _stream_fields = "host,container_name"

Replace the localhost:9428 address in the endpoints option with the real TCP address of VictoriaLogs.

Replace your_input with the name of the input section that collects logs. See these docs for details.
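To illustrate how the names line up, here is a minimal sketch that assumes the logs come from Vector's docker_logs source. The choice of source is an assumption made for this example only; any source or transform name can be listed in inputs. The docker_logs source is used because it emits the host and container_name fields referenced in _stream_fields.

```toml
# Assumed source for illustration: collects logs from local Docker containers.
# The section name after "sources." is what the sink refers to in "inputs".
[sources.your_input]
  type = "docker_logs"

[sinks.vlogs]
  # Must match the source section name above.
  inputs = [ "your_input" ]
  type = "elasticsearch"
  endpoints = [ "http://localhost:9428/insert/elasticsearch/" ]
  mode = "bulk"
  api_version = "v8"
  healthcheck.enabled = false

  [sinks.vlogs.query]
    _msg_field = "message"
    _time_field = "timestamp"
    _stream_fields = "host,container_name"
```

If a different source is used, check that the fields listed in _stream_fields are actually present in its events.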

See these docs for details on parameters specified in the [sinks.vlogs.query] section.

It is recommended to verify that the initial setup generates the needed log fields and uses the correct stream fields. This can be done by specifying the debug parameter in the [sinks.vlogs.query] section and then inspecting the VictoriaLogs logs:

[sinks.vlogs]
  inputs = [ "your_input" ]
  type = "elasticsearch"
  endpoints = [ "http://localhost:9428/insert/elasticsearch/" ]
  mode = "bulk"
  api_version = "v8"
  healthcheck.enabled = false

  [sinks.vlogs.query]
    _msg_field = "message"
    _time_field = "timestamp"
    _stream_fields = "host,container_name"
    debug = "1"

If some log fields must be skipped during data ingestion, list them in the ignore_fields parameter. For example, the following config instructs VictoriaLogs to ignore the log.offset and event.original fields in the ingested logs:

[sinks.vlogs]
  inputs = [ "your_input" ]
  type = "elasticsearch"
  endpoints = [ "http://localhost:9428/insert/elasticsearch/" ]
  mode = "bulk"
  api_version = "v8"
  healthcheck.enabled = false

  [sinks.vlogs.query]
    _msg_field = "message"
    _time_field = "timestamp"
    _stream_fields = "host,container_name"
    ignore_fields = "log.offset,event.original"

When Vector ingests logs into VictoriaLogs at a high rate, it may be necessary to tune the batch.max_events option. For example, the following config is optimized for a higher than usual ingestion rate:

[sinks.vlogs]
  inputs = [ "your_input" ]
  type = "elasticsearch"
  endpoints = [ "http://localhost:9428/insert/elasticsearch/" ]
  mode = "bulk"
  api_version = "v8"
  healthcheck.enabled = false

  [sinks.vlogs.query]
    _msg_field = "message"
    _time_field = "timestamp"
    _stream_fields = "host,container_name"

  [sinks.vlogs.batch]
    max_events = 1000

If Vector sends logs to VictoriaLogs in another datacenter, it may be useful to enable data compression via the compression = "gzip" option. This usually reduces network bandwidth usage and costs by up to 5 times:

[sinks.vlogs]
  inputs = [ "your_input" ]
  type = "elasticsearch"
  endpoints = [ "http://localhost:9428/insert/elasticsearch/" ]
  mode = "bulk"
  api_version = "v8"
  healthcheck.enabled = false
  compression = "gzip"

  [sinks.vlogs.query]
    _msg_field = "message"
    _time_field = "timestamp"
    _stream_fields = "host,container_name"

By default, the ingested logs are stored in the (AccountID=0, ProjectID=0) tenant. If you need to store logs in another tenant, specify the needed tenant via the [sinks.vlogs.request.headers] section. For example, the following vector.toml config instructs Vector to store the data in the (AccountID=12, ProjectID=34) tenant:

[sinks.vlogs]
  inputs = [ "your_input" ]
  type = "elasticsearch"
  endpoints = [ "http://localhost:9428/insert/elasticsearch/" ]
  mode = "bulk"
  api_version = "v8"
  healthcheck.enabled = false

  [sinks.vlogs.query]
    _msg_field = "message"
    _time_field = "timestamp"
    _stream_fields = "host,container_name"

  [sinks.vlogs.request.headers]
    AccountID = "12"
    ProjectID = "34"
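The options above are shown one at a time, but they are separate sink settings and can presumably be combined in a single section. The following sketch puts compression, batch tuning and tenant selection together. All values are copied from the examples above and are not tuned recommendations.

```toml
[sinks.vlogs]
  inputs = [ "your_input" ]
  type = "elasticsearch"
  endpoints = [ "http://localhost:9428/insert/elasticsearch/" ]
  mode = "bulk"
  api_version = "v8"
  healthcheck.enabled = false
  # Compress request bodies, e.g. when sending to another datacenter.
  compression = "gzip"

  [sinks.vlogs.query]
    _msg_field = "message"
    _time_field = "timestamp"
    _stream_fields = "host,container_name"

  # Batch size taken from the high ingestion rate example.
  [sinks.vlogs.batch]
    max_events = 1000

  # Store logs in the (AccountID=12, ProjectID=34) tenant.
  [sinks.vlogs.request.headers]
    AccountID = "12"
    ProjectID = "34"
```

It may be easier to enable these options one by one and check the result with the debug parameter before combining them.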

See also: