Commit Graph

2068 Commits

Author SHA1 Message Date
PrometheusBot
5ae22fa2c0
Update common Prometheus files (#2798)
Signed-off-by: prombot <prometheus-team@googlegroups.com>
2023-09-09 20:42:39 +02:00
dependabot[bot]
e590476dc7
build(deps): bump golang.org/x/sys from 0.10.0 to 0.12.0 (#2797)
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.10.0 to 0.12.0.
- [Commits](https://github.com/golang/sys/compare/v0.10.0...v0.12.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-09 17:09:30 +02:00
Daniel Swarbrick
685b98ec7f
Optionally fetch ARP stats via rtnetlink instead of procfs (#2777)
* Optionally fetch ARP stats via rtnetlink instead of procfs

Implement collection of ARP stats via rtnetlink to work around
shortcomings in the output of /proc/net/arp, which truncates InfiniBand
link-layer addresses.

Fixes: #2776

---------

Signed-off-by: Daniel Swarbrick <daniel.swarbrick@gmail.com>
Co-authored-by: Ben Kochie <superq@gmail.com>
2023-09-09 16:41:09 +02:00
Ben Kochie
cda1d820bb
Update to Go 1.21 (#2796)
* Update Go build to 1.21.
* Update machine images to Ubuntu 22.04 current.

Signed-off-by: Ben Kochie <superq@gmail.com>
2023-09-09 15:44:48 +02:00
Daniel Swarbrick
381f32b1c5 btrfs: close btrfs.FS handle after use
Despite being quite hard to provoke (< 10% in my testing), the btrfs
collector would occasionally leave stale FDs relating to btrfs
mountpoints, making the filesystems unable to be unmounted.

Fixes: #2772.

Signed-off-by: Daniel Swarbrick <daniel.swarbrick@gmail.com>
2023-08-21 16:00:00 +02:00
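A minimal sketch of the close-after-use pattern this commit describes. The `fsHandle` interface, `fakeFS` type, and `readLabel` function are hypothetical names for illustration only; they are not the btrfs library's API or the collector's code.

```go
package main

import "fmt"

// fsHandle is a hypothetical stand-in for a filesystem handle that holds an
// open file descriptor until Close is called.
type fsHandle interface {
	Label() (string, error)
	Close() error
}

// fakeFS is an in-memory handle used only for this illustration.
type fakeFS struct {
	label  string
	closed bool
}

func (f *fakeFS) Label() (string, error) { return f.label, nil }
func (f *fakeFS) Close() error           { f.closed = true; return nil }

// readLabel opens a handle, reads from it, and closes it before returning,
// whether or not the read succeeds.
func readLabel(open func() (fsHandle, error)) (string, error) {
	fs, err := open()
	if err != nil {
		return "", err
	}
	defer fs.Close()
	return fs.Label()
}

func main() {
	h := &fakeFS{label: "data"}
	label, err := readLabel(func() (fsHandle, error) { return h, nil })
	fmt.Println(label, err, h.closed)
}
```

Deferring the close immediately after a successful open keeps the descriptor from outliving the scrape, which is what would otherwise block an unmount.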
Josh Bradley
f2b274350a
fix(qdisc) flag naming corrected for consistency (#2782)
* fix collector qdisc flag naming for consistency

---------

Signed-off-by: jbradleynh <jbradley@fastly.com>
2023-08-21 07:48:09 +02:00
John Kordich
e120d958f5 Change log message from Warn to Debug
Signed-off-by: John Kordich <jkordich@gmail.com>

Co-authored-by: Ben Kochie <superq@gmail.com>
Signed-off-by: John Kordich <jkordich@gmail.com>
2023-08-20 13:38:47 +02:00
John Kordich
933b1c1797 Add new node_cpu_frequency_hertz metric
Revert changes to node_cpu_info and add new node_cpu_frequency_hertz
metric for measuring CPU frequency from /proc/cpuinfo

Signed-off-by: John Kordich <jkordich@gmail.com>
2023-08-20 13:38:47 +02:00
John Kordich
e84c278107 Update e2e-output.txt with new expected metric values
Changes the e2e-output.txt file to have the expected CPU MHz values
for the node_cpu_info metric.

Signed-off-by: John Kordich <jkordich@gmail.com>
2023-08-20 13:38:47 +02:00
John Kordich
223ebbd50c Add CPU MHz as the value for "node_cpu_info" metric
For CPUs which don't have an available (or insertable) cpufreq driver,
the /proc/cpuinfo file can sometimes have accurate CPU core frequency
measurements. This change replaces the constant value of "1" for the
"node_cpu_info" metric with the parsed CPU MHz value from
/proc/cpuinfo for each core.

Signed-off-by: John Kordich <jkordich@gmail.com>
2023-08-20 13:38:47 +02:00
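As a rough illustration of reading per-core frequency from /proc/cpuinfo, the sketch below scans for `cpu MHz` lines and parses their values. `parseCPUMHz` is a hypothetical helper, not the collector's implementation, and the sample text assumes the x86 layout of /proc/cpuinfo; other architectures may not expose this field.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseCPUMHz returns the value of every "cpu MHz" line, in order of
// appearance, which on x86 corresponds to one entry per logical CPU.
func parseCPUMHz(cpuinfo string) ([]float64, error) {
	var mhz []float64
	sc := bufio.NewScanner(strings.NewReader(cpuinfo))
	for sc.Scan() {
		key, value, ok := strings.Cut(sc.Text(), ":")
		if !ok || strings.TrimSpace(key) != "cpu MHz" {
			continue
		}
		f, err := strconv.ParseFloat(strings.TrimSpace(value), 64)
		if err != nil {
			return nil, fmt.Errorf("parsing %q: %w", value, err)
		}
		mhz = append(mhz, f)
	}
	return mhz, sc.Err()
}

func main() {
	// Sample input in the assumed x86 format.
	sample := "processor\t: 0\ncpu MHz\t\t: 2400.000\nprocessor\t: 1\ncpu MHz\t\t: 1800.500\n"
	mhz, err := parseCPUMHz(sample)
	if err != nil {
		panic(err)
	}
	for i, f := range mhz {
		// Convert MHz to hertz for a base-unit metric value.
		fmt.Printf("cpu %d: %.0f Hz\n", i, f*1e6)
	}
}
```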
takt
6225435677
Upgrade github.com/ema/qdisc to v1.0.0 to improve qdisc collector performance (#2779)

Signed-off-by: Oliver Geiselhardt-Herms <ogh@deepl.com>
Co-authored-by: Oliver Geiselhardt-Herms <ogh@deepl.com>
2023-08-18 15:20:22 +02:00
Daniel Swarbrick
37ce0bab8c
Sync build tags in *_test.go (#2767)
Ensure that unwanted tests are correctly excluded when various build
tags are specified, i.e. when the code that they test would be excluded
from compilation.

Signed-off-by: Daniel Swarbrick <daniel.swarbrick@gmail.com>
2023-08-15 11:38:13 +02:00
Daniel Swarbrick
3fb5f70b0c Drop redundant GOOS build tags if already in filename
Drop redundant GOOS build tags at start of file if the constraint is
already specified by the filename, e.g. foo_GOOS.go or
foo_GOOS_GOARCH.go, avoiding potential confusion in future.

cf. https://pkg.go.dev/cmd/go#hdr-Build_constraints

Signed-off-by: Daniel Swarbrick <daniel.swarbrick@gmail.com>
2023-08-08 14:30:39 +02:00
dependabot[bot]
c6c28d915c
build(deps): bump github.com/jsimonetti/rtnetlink from 1.3.3 to 1.3.4 (#2765)
Bumps [github.com/jsimonetti/rtnetlink](https://github.com/jsimonetti/rtnetlink) from 1.3.3 to 1.3.4.
- [Release notes](https://github.com/jsimonetti/rtnetlink/releases)
- [Commits](https://github.com/jsimonetti/rtnetlink/compare/v1.3.3...v1.3.4)

---
updated-dependencies:
- dependency-name: github.com/jsimonetti/rtnetlink
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-02 18:13:53 +02:00
dependabot[bot]
60f08e0aac
build(deps): bump github.com/prometheus/procfs from 0.11.0 to 0.11.1 (#2763)
Bumps [github.com/prometheus/procfs](https://github.com/prometheus/procfs) from 0.11.0 to 0.11.1.
- [Release notes](https://github.com/prometheus/procfs/releases)
- [Commits](https://github.com/prometheus/procfs/compare/v0.11.0...v0.11.1)

---
updated-dependencies:
- dependency-name: github.com/prometheus/procfs
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-02 18:13:17 +02:00
dependabot[bot]
35278b94f8
build(deps): bump github.com/beevik/ntp from 1.1.1 to 1.3.0 (#2762)
Signed-off-by: Ben Kochie <superq@gmail.com>
2023-08-02 17:49:21 +02:00
Benoît Knecht
3b9613cfae
collector/netdev_linux.go: Fallback to 32-bit stats (#2757)
On some platforms, `msg.Attributes.Stats64` is `nil` because the kernel doesn't
expose 64-bit stats. In that case, return `msg.Attributes.Stats` instead, which
are the 32-bit equivalent.

Note that `RXOtherhostDropped` isn't available in that case, so we hardcode it
to zero.

Fixes #2756.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
2023-08-01 15:58:53 +02:00
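The fallback described in this commit could look roughly like the following. The struct types here are simplified, hypothetical stand-ins that only borrow the field names mentioned in the commit message (`Stats`, `Stats64`, `RXOtherhostDropped`); the map keys and the `counters` function are likewise assumptions for illustration.

```go
package main

import "fmt"

// linkStats32 and linkStats64 are hypothetical, trimmed-down counter structs.
type linkStats32 struct {
	RXPackets, TXPackets uint32
}

type linkStats64 struct {
	RXPackets, TXPackets uint64
	RXOtherhostDropped   uint64
}

// linkAttributes mimics a link message whose Stats64 may be nil when the
// kernel does not expose 64-bit stats.
type linkAttributes struct {
	Stats   *linkStats32
	Stats64 *linkStats64
}

// counters prefers the 64-bit stats and falls back to the 32-bit ones.
func counters(a linkAttributes) map[string]uint64 {
	if a.Stats64 != nil {
		return map[string]uint64{
			"receive_packets":           a.Stats64.RXPackets,
			"transmit_packets":          a.Stats64.TXPackets,
			"receive_otherhost_dropped": a.Stats64.RXOtherhostDropped,
		}
	}
	if a.Stats == nil {
		return map[string]uint64{}
	}
	return map[string]uint64{
		"receive_packets":  uint64(a.Stats.RXPackets),
		"transmit_packets": uint64(a.Stats.TXPackets),
		// Not available in the 32-bit struct, so it is fixed at zero.
		"receive_otherhost_dropped": 0,
	}
}

func main() {
	legacy := linkAttributes{Stats: &linkStats32{RXPackets: 7, TXPackets: 3}}
	fmt.Println(counters(legacy))
}
```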
L
5d1b96c936 Include drm collector in README
The DRM collector was missing from the README; this change adds it together with a short description.

Signed-off-by: L <3177243+LukeLR@users.noreply.github.com>
2023-07-31 13:14:13 +01:00
PrometheusBot
8fb4f78ce5
Update common Prometheus files (#2752)
Signed-off-by: prombot <prometheus-team@googlegroups.com>
2023-07-18 21:08:19 +02:00
PrometheusBot
fa481315b5
Synchronize common files from prometheus/prometheus (#2736)
* Update common Prometheus files

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Fixup linting issues

* Disable unused-parameter check.
* Fixup minor linting issues.

Signed-off-by: Ben Kochie <superq@gmail.com>

---------

Signed-off-by: prombot <prometheus-team@googlegroups.com>
Signed-off-by: Ben Kochie <superq@gmail.com>
Co-authored-by: Ben Kochie <superq@gmail.com>
2023-07-18 10:46:59 +02:00
Ben Kochie
2d8069208c
Release v1.6.1 (#2747)
Rebuild with latest Go compiler bugfix release.

Signed-off-by: Ben Kochie <superq@gmail.com>
2023-07-17 13:58:44 +02:00
Ben Kochie
7c564bcbef
Fixup hwmon chip include (#2739)
Pass the correct include value to the device filter function.
* Add new bogus hwmon fixture.
* Update end-to-end test to use hwmon chip include flag.

Signed-off-by: Ben Kochie <superq@gmail.com>
2023-07-10 12:46:30 +02:00
Conall O'Brien
c241ecf8bd
Update all Include and Exclude variables to use the systemdUnit naming prefix (#2740)

Leave an annotation about using regexps instead of device_filter.go, so
@SuperQ doesn't need to remember everything.

Signed-off-by: Conall O'Brien <conall@conall.net>
2023-07-10 12:25:18 +02:00
Gabi Davar
f4344579d5
Add missing ethtool flag documentation (#2743)
Signed-off-by: Gabi Davar <grizzly.nyo@gmail.com>
2023-07-08 09:36:36 +02:00
Conall O'Brien
8b4dc82488
Add include and exclude filter for hwmon collector (#2699)
* Add include and exclude chip name flags to the hwmon collector, following the example in the systemd collector

---------

Signed-off-by: Conall O'Brien <conall@conall.net>
Co-authored-by: Ben Kochie <superq@gmail.com>
2023-07-07 10:30:24 +02:00
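A sketch of how an include/exclude filter on chip names might work, assuming regular-expression patterns that must match the whole name. `chipFilter`, `newChipFilter`, and `ignored` are hypothetical names, and allowing both patterns at once is an assumption of this sketch rather than documented behavior of the collector's flags.

```go
package main

import (
	"fmt"
	"regexp"
)

// chipFilter is a hypothetical include/exclude filter; a nil pattern means
// that side of the filter is not configured.
type chipFilter struct {
	include *regexp.Regexp
	exclude *regexp.Regexp
}

// newChipFilter compiles the patterns, anchoring them to the full name.
func newChipFilter(include, exclude string) (*chipFilter, error) {
	f := &chipFilter{}
	var err error
	if include != "" {
		if f.include, err = regexp.Compile("^(?:" + include + ")$"); err != nil {
			return nil, err
		}
	}
	if exclude != "" {
		if f.exclude, err = regexp.Compile("^(?:" + exclude + ")$"); err != nil {
			return nil, err
		}
	}
	return f, nil
}

// ignored reports whether the chip should be skipped.
func (f *chipFilter) ignored(chip string) bool {
	if f.exclude != nil && f.exclude.MatchString(chip) {
		return true
	}
	if f.include != nil && !f.include.MatchString(chip) {
		return true
	}
	return false
}

func main() {
	f, err := newChipFilter("coretemp.*|nvme.*", "")
	if err != nil {
		panic(err)
	}
	for _, chip := range []string{"coretemp_0", "acpitz", "nvme_nvme0"} {
		fmt.Printf("%s ignored=%v\n", chip, f.ignored(chip))
	}
}
```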
Ben Kochie
ed57c15e2c
Merge pull request #2644 from v-zhuravlev/mixin_alerts
Mixin: Add and update alerts
2023-07-04 09:12:00 +02:00
dependabot[bot]
a24344d4a8 build(deps): bump github.com/prometheus/client_golang
Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.15.1 to 1.16.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prometheus/client_golang/compare/v1.15.1...v1.16.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-07-03 11:57:59 +02:00
dependabot[bot]
74da9f9d85 build(deps): bump github.com/beevik/ntp from 1.0.0 to 1.1.1
Bumps [github.com/beevik/ntp](https://github.com/beevik/ntp) from 1.0.0 to 1.1.1.
- [Release notes](https://github.com/beevik/ntp/releases)
- [Changelog](https://github.com/beevik/ntp/blob/main/RELEASE_NOTES.md)
- [Commits](https://github.com/beevik/ntp/compare/v1.0.0...v1.1.1)

---
updated-dependencies:
- dependency-name: github.com/beevik/ntp
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-07-03 11:57:04 +02:00
Michal
c31ebb4359
Add cpu vulnerabilities reporting from sysfs (#2721)
* Add cpu vulnerabilities reporting from sysfs

---------

Signed-off-by: Michal Wasilewski <michal@mwasilewski.net>
2023-07-01 14:21:49 +02:00
prombot
3e3ab1778b Update common Prometheus files
Signed-off-by: prombot <prometheus-team@googlegroups.com>
2023-07-01 13:14:03 +02:00
Vitaly Zhuravlev
e8d7f4e8b3 Revert alerts pending durations
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly
3e250a95a0 Update NodeSystemSaturation severity
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev
b7dfb32bfc Set severity of NodeCPUHighUsage to info
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev
6bdc1d9c98 Add thresholds for memory, disk and system alerts
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev
77ae769179 Add thresholds for memory alerts
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev
2111e70ac7 Add comma after 'mounted on'
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev
e48e7909f4 Extend alert description
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev
da32f8de17 Decrease NodeSystemdServiceFailed severity to warning
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev
580c497261 Add NodeSystemSaturation and NodeMemoryMajorPagesFaults
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev
e15e7d6a7b Fix NodeMemoryHighUtilization alert
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev
c3ec6e8af1 Add diskDevice selector
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev
962de6c921 Add %(nodeExporterSelector)s to Network and conntrack alerts
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev
94fc82e418 Add NodeDiskIOSaturation alert
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev
614030bb80 Set 'at' everywhere as preposition for instance
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:52 +08:00
Vitaly Zhuravlev
3d8075da7d Decrease NodeNetwork*Errs pending period
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:51 +08:00
Vitaly Zhuravlev
74794182a7 Add failed systemd service alert
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:51 +08:00
Vitaly Zhuravlev
fd2d62af63 Add CPU and memory alerts
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:51 +08:00
Vitaly Zhuravlev
0e0399d41e Decrease NodeFilesystem pending time to 15m
30m is too long, and there is a risk of running out of disk space or inodes completely if something is filling up the disk very fast (such as a log file).

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:51 +08:00
Vitaly Zhuravlev
fc967aa992 Add mountpoint to NodeFilesystem alerts
This helps to identify the alerting filesystem.

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
2023-06-29 23:26:51 +08:00
Ben Kochie
a11de2ede5
Update golangci-lint config (#2722)
* Migrate from Python codespell to golangci-lint misspell.
* Inline errcheck exclude list in the golangci-lint config.

Signed-off-by: Ben Kochie <superq@gmail.com>
2023-06-21 10:07:30 +02:00