Commit Graph

250 Commits

Author SHA1 Message Date
Ben Kochie
a516d4de4a
Cleanup cgroups collector (#2414)
* Correctly name collector file.
* Fix cgroup summary type as gauge.
* Use a boolean metric rather than a label for enabled.

Signed-off-by: Ben Kochie <superq@gmail.com>
2022-06-24 17:15:31 +02:00
Kobe Biello
45c75f1dbc
Add cgroup summary collector (#2408)
* add cgroups summary collector

Signed-off-by: biello <bellusa@qq.com>
Co-authored-by: bielu <bielu@zuoyebang.com>
2022-06-24 12:05:13 +02:00
Fionera
9ece38fca9 refactor: Use netlink for tcpstat collector
Signed-off-by: Tim Windelschmidt <t.windelschmidt@babiel.com>
2022-04-25 10:13:06 +02:00
Ben Kochie
9155971e07
Update Go modues
Update to latest releases.
* Fix up perf collector syntax.

Signed-off-by: Ben Kochie <superq@gmail.com>
2022-03-30 11:47:09 +02:00
Ben Kochie
eecc2b1dea
Add device filter flags to arp collector
Allow filtering APR entries based on device. Useful for ignoring
entries for network namespaces (containers).

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-12-16 15:41:10 +01:00
heyitao
7dbf358915 delete duplicate items
Signed-off-by: heyitao <linuxgcc@163.com>
2021-12-09 11:50:10 +01:00
Ben Kochie
1d5afd05b5
Sanitize UTF-8 in dmi collector (#2229)
Replace invalid UTF-8 chars with "�" string.

Fixes: https://github.com/prometheus/node_exporter/issues/2228

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-12-01 11:13:43 +01:00
Jacob Vosmaer
5c8d162ca6
Add node_softirqs_total metric (#2221)
This adds a new Linux metric, node_softirqs_total, which corresponds
to the 'softirq' line in /proc/stat. This metric is disabled by
default and it can be enabled with '--collector.stat.softirq'.

Signed-off-by: Jacob Vosmaer <jacob@gitlab.com>
2021-12-01 09:55:13 +01:00
Martin Kennelly
4065902fe5
Add TCPTimeouts to netstat default filter (#2189)
TCP timeouts count is a useful signal to show
abnormal network performance and is another
signal to aid debugging. This metric can be
used to generate proactive alerts for host
network namespace workloads.

Signed-off-by: Martin Kennelly <mkennell@redhat.com>
2021-11-18 09:34:55 +01:00
Benjamin Drung
d85cbaa17c
ethtool: Prevent duplicate metric names (#2187)
Sanitizing the metric names can lead to duplicate metric names:

```
caller=level.go:63 level=error caller="error gathering metrics: [from Gatherer #2] collected metric \"node_ethtool_giant_hdr\" { label:<name:\"device\" value:\"ens192\" > untyped:<value:0" msg=" > } was collected before with the same name and label values"
```

Generate a map from the sanitized metric names to the metric names from
ethtool. In case of duplicate sanitized metric names drop both metrics,
because it is unknown which one to take.

Fixes: https://github.com/prometheus/node_exporter/issues/2185
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-11-15 11:22:36 +01:00
Johannes 'fish' Ziemke
85e20238e7
Add clocksource metrics to time collector (#2197)
* Add clocksource metrics to time collector

This closes #1336

Signed-off-by: Johannes 'fish' Ziemke <github@freigeist.org>
2021-11-12 11:45:31 +01:00
Benjamin Drung
9def2f9222
Add DMI collector (#2131)
Add a DMI collector to expose the Desktop Management Interface (DMI)
info from `/sys/class/dmi/id/`. This will expose information about the
BIOS, mainboard, chassis, and product.

Closes: https://github.com/prometheus/node_exporter/issues/303
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-10-27 13:56:37 +02:00
W. Andrew Denton
56e6dcb1fa ethtool: add newline to settings fixture.
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2021-10-20 19:11:30 +02:00
W. Andrew Denton
77a3c9bc20 Gather additional link info from ethtool.
The ethtool_cmd	struct from the	linux kernel contains information about the speeds and features	supported by a
network device. This includes speeds and duplex	but also features like autonegotiate and 802.3x pause frames.

Closes #1444

Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2021-10-20 19:11:30 +02:00
Kiril Vladimirov
1721de0c38
collector: Unwrap glob textfile directories (#1985)
* collector: Unwrap glob textfile directories
* collector: Store full path in mtime's file label

The point is to avoid duplicated gauges from files with the same name in
different directories.

This introduces support for exporting from multiple directories matching
given pattern (e.g. `/home/*/metrics/`).

Signed-off-by: Kiril Vladimirov <kiril@vladimiroff.org>
2021-10-18 14:05:21 +02:00
Aleksei Zakharov
0e6b23c338
Lnstat: expose metrics from /proc/net/stat (#1771)
* Lnstat initial commit

Signed-off-by: Aleksei Zakharov <zaharov@selectel.ru>
Co-authored-by: Johannes 'fish' Ziemke <github@freigeist.org>
2021-09-28 10:24:18 +02:00
ventifus
0aec407666
Refactor diskstats (#2141)
* Refactor diskstats_linux to use procfs.
* Add `node_disk_info` metric.

Signed-off-by: W. Andrew Denton <git@flying-snail.net>
Co-authored-by: W. Andrew Denton <git@flying-snail.net>
2021-09-28 10:14:12 +02:00
Sergei Semenchuk
5de46c6bac
collect flag_info and bug_info only for one core (#2156)
Signed-off-by: binjip978 <binjip978@gmail.com>
2021-09-28 07:44:03 +02:00
Sergei Semenchuk
2b490d645e
add path label to rapl collector (#2146)
Signed-off-by: binjip978 <binjip978@gmail.com>
2021-09-27 22:57:03 +02:00
Benjamin Drung
b6215e649c Add os release collector
Currently Node Exporter has a metric called `node_uname_info` which of
course exposes uname info. While this is nice, it does not help if you
are running different OSes which could have similar uname info.

Therefore parse `/etc/os-release` or `/usr/lib/os-release` and expose a
`node_os_info` metric which provide information regarding the OS
release/version of the node. Also expose the major.minor part of the OS
release version as `node_os_version`.

Since the os-release files will not change often, cache the parsed
content and only refresh the cache if the modification time changes.

This `os` collector will read files outside of `/proc` and `/sys`, but
the os-release file is widely used and the format is standardized:
https://www.freedesktop.org/software/systemd/man/os-release.html

Bug: https://github.com/prometheus/node_exporter/issues/1574
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-08-19 14:04:21 +02:00
Benjamin Drung
26ca609183 ethtool: Expose node_ethtool_info metric
Add a `node_ethtool_info` metric to all ethtool devices to expose driver
information with following labels:

 * bus_info
 * driver
 * expansion_rom_version
 * firmware_version
 * version

This metric is useful to monitor the firmware version to be up-to-date.

Note: The version label might be malformed due to bug #39 in ethtool:
https://github.com/safchain/ethtool/issues/39

Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-08-16 16:09:35 +02:00
Ben Kochie
97d4b01691
Bump prometheus/procfs library
Pull in bug fix for noisy logging.

Fixes: https://github.com/prometheus/node_exporter/issues/2086

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-07-21 21:40:21 +02:00
Ben Kochie
a6ebe10455
Merge branch 'master' into nvme 2021-07-12 17:09:51 +02:00
Luiz Angelo Daros de Luca
00aa2f34ce Add tapestats to collect tape devices statistics
It is based on diskstats to allow metrics reuse by simply
s/disk/tape/ the query.

Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
2021-07-09 21:01:08 -03:00
Benjamin Drung
b23146db3f Add nvme collector
Add a collector for NVMes to expose the firmware versions. This requires
procfs >= 0.7.0.

Fixes #1891
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
2021-07-06 13:38:15 +02:00
Kozlov Alexander
02ee897c03
Added conntrack statistics metrics (#1155)
* Added conntrack statistics metrics

Signed-off-by: Aleksandr Kozlov <avlkozlov@avito.ru>
Co-authored-by: Aleksandr Kozlov <avlkozlov@avito.ru>
Co-authored-by: Ben Kochie <superq@gmail.com>
2021-06-23 11:52:43 +02:00
W. Andrew Denton
892893ff05 ethtool: Log stats collection errors.
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2021-05-14 10:07:30 -07:00
W. Andrew Denton
807f3c3af3 ethtool: Remove end-to-end testing.
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2021-05-03 09:35:49 -07:00
W. Andrew Denton
596ff45f8f ethtool: Add a new ethtool stats collector (metrics equivalent to "ethtool -S")
Signed-off-by: W. Andrew Denton <git@flying-snail.net>
2021-04-29 11:07:26 -07:00
Ben Kochie
9893fca77e
Add flag to ignore network speed if it is unknown
Some devices (ex virtual) don't have a speed and report `-1` as the
speed value. Add a flag to allow ignoring speed on these devices.

Fixes: https://github.com/prometheus/node_exporter/issues/1967

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-03-18 11:36:31 +01:00
Ben Kochie
23e5b245a4
Sanitize strings from /sys/class/power_supply
Avoid panic on invalid UTF-8 from /sys/class/power_supply by
sanitizing strings parsed from the kernel.
* Add a broken string to the test fixtures.

Fixes: https://github.com/prometheus/node_exporter/issues/1979

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-03-03 18:05:51 +01:00
Ben Kochie
a37d3f659c
Release 1.1.0
* Update Build
  - Update CircleCI orb.
  - Update CIrcleCI Machine image.
  - Use golang-builder 1.15.
* Update Go modules.
* Fixup fixtures for XFS bug.

NOTE: We have improved some of the flag naming conventions (PR #1743). The old names are
      deprecated and will be removed in 2.0. They will continue to work for backwards
      compatibility.

* [CHANGE] Improve filter flag names #1743
* [CHANGE] Add btrfs and powersupplyclass to list of exporters enabled by default #1897
* [FEATURE] Add fibre channel collector #1786
* [FEATURE] Expose cpu bugs and flags as info metrics. #1788
* [FEATURE] Add network_route collector #1811
* [FEATURE] Add zoneinfo collector #1922
* [ENHANCEMENT] Add more InfiniBand counters #1694
* [ENHANCEMENT] Add flag to aggr ipvs metrics to avoid high cardinality metrics #1709
* [ENHANCEMENT] Adding backlog/current queue length to qdisc collector #1732
* [ENHANCEMENT] Include TCP OutRsts in netstat metrics #1733
* [ENHANCEMENT] Add pool size to entropy collector #1753
* [ENHANCEMENT] Remove CGO dependencies for OpenBSD amd64 #1774
* [ENHANCEMENT] bcache: add writeback_rate_debug stats #1658
* [ENHANCEMENT] Add check state for mdadm arrays via node_md_state metric #1810
* [ENHANCEMENT] Expose XFS inode statistics #1870
* [ENHANCEMENT] Expose zfs zpool state #1878
* [ENHANCEMENT] Added an ability to pass collector.supervisord.url via SUPERVISORD_URL environment variable #1947
* [BUGFIX] filesystem_freebsd: Fix label values #1728
* [BUGFIX] Fix various procfs parsing errors #1735
* [BUGFIX] Handle no data from powersupplyclass #1747
* [BUGFIX] udp_queues_linux.go: change upd to udp in two error strings #1769
* [BUGFIX] Fix node_scrape_collector_success behaviour #1816
* [BUGFIX] Fix NodeRAIDDegraded to not use a string rule expressions #1827
* [BUGFIX] Fix node_md_disks state label from fail to failed #1862
* [BUGFIX] Handle EPERM for syscall in timex collector #1938
* [BUGFIX] bcache: fix typo in a metric name #1943
* [BUGFIX] Fix XFS read/write stats (https://github.com/prometheus/procfs/pull/343)

Signed-off-by: Ben Kochie <superq@gmail.com>
2021-02-05 21:23:23 +01:00
Ben Kochie
1729558e11
Merge pull request #1922 from kwisniewski98/zone
Add zoneinfo collector
2021-02-05 13:57:54 +01:00
Ben Kochie
78682c80af
Merge pull request #1786 from deusnefum/master
Add fibre channel collector
2021-02-03 18:22:59 +01:00
mhiles
5a28930e2e change fc_host everywhere, update fixtures
Signed-off-by: mhiles <hiles@hpe.com>
2021-02-03 09:35:58 -05:00
mhiles
19790920b2 update fibrechannel fixture
Signed-off-by: mhiles <hiles@hpe.com>
2021-01-29 10:38:45 -05:00
mhiles
6e02d5bff1 invalid_tx_word_total -> invalid_tx_words_total
Signed-off-by: mhiles <hiles@hpe.com>
2021-01-29 10:28:22 -05:00
Wisniewski, Krzysztof2
997a8fbb7f Add zoneinfo collector
Signed-off-by: Wisniewski, Krzysztof2 <Krzysztof2.Wisniewski@intel.com>
2021-01-26 12:00:35 +01:00
Hu Shuai
0900cb5655 bcache: fix typo
Fix typos in collector/bcache_linux.go and collector/fixtures/e2e-output.txt

Signed-off-by: Hu Shuai <hus.fnst@cn.fujitsu.com>
2021-01-25 13:50:16 +08:00
Ben Kochie
9fab63825b
Merge pull request #1753 from binjip978/entropy
add pool size to entropy collector
2021-01-24 14:57:33 +01:00
Artur Molchanov
519203e3d0 Expose zfs zpool state
Create a metric node_zfs_zpool_state.

Signed-off-by: Artur Molchanov <artur.molchanov@easybrain.com>
2020-10-27 17:39:13 +03:00
Ondrej Baudys
ed10485073
Expose XFS inode statistics (#1869) (#1870)
* Expose XFS inode statistics (#1869)

Also fixes #1177

@SuperQ @discordianfish

Signed-off-by: Ondrej Baudys <obaudys@gmail.com>
Co-authored-by: obaudys@gmail.com <ondrej.baudys@nextgen.net>
2020-10-22 18:14:33 +02:00
Christian Rohmann
a3aaf63bb1
Add check state for mdadm arrays via node_md_state metric (#1810)
* Expose metric for state=check for node_md_state
* Added new e2e output fixture including md201 which is in checking state and a the new state=check labeled metric for all other md

Signed-off-by: Christian Rohmann <github@frittentheke.de>
2020-09-27 13:44:45 +02:00
Julius Volz
d05aac43e4 Fix capitalization of CPU acronym throughout
Signed-off-by: Julius Volz <julius.volz@gmail.com>
2020-09-03 23:34:33 +02:00
Aleksei Zakharov
0478ddef69
bcache: add writeback_rate_debug stats (#1658)
* bcache: add writeback_rate_debug export

Signed-off-by: Aleksei Zakharov <zaharov@selectel.ru>
2020-08-24 17:40:31 +02:00
domchan
503e4fc848
Expose cpu bugs and flags as info metrics. (#1788)
* Expose cpu bugs and flags as info metrics with a regexp filter.
* Automatically enable CPU info metrics when using flags or bugs feature.

Signed-off-by: domgoer <domdoumc@gmail.com>
2020-07-17 18:32:23 +02:00
mhiles
10f6841caf counter compliance again
Signed-off-by: mhiles <hiles@hpe.com>
2020-07-13 12:00:43 -04:00
mhiles
d80ae383b7 conform to metric naming conventions
Signed-off-by: mhiles <hiles@hpe.com>
2020-07-13 10:30:52 -04:00
mhiles
076c953488 update fixtures / e2e test for fibre channel
Signed-off-by: mhiles <hiles@hpe.com>
2020-07-13 09:30:19 -04:00
Ben Kochie
5d42d4d99f
Merge pull request #1732 from fach/master
Adding backlog/current queue length to qdisc collector
2020-06-22 20:15:04 +02:00