This enables the collection of pressure stall information as exposed
by the `/proc/pressure` interface added in the 4.20 release of the
Linux kernel.
Closes#1174
Signed-off-by: Daniele Sluijters <daenney@users.noreply.github.com>
The cpu frequency information is not always needed and/or available.
This change allows the cpu frequency metrics to be enabled/disabled
separately from the other cpu metrics, and also prevents a frequency
metric failure (such as a parse error) from failing the main cpu
collector.
Fixes#1241
Signed-off-by: Paul Gier <pgier@redhat.com>
* netclass_linux: remove varying labels from the 'up' metric
This moves the variable label values such as 'operstate' out of
the 'network_up' metric and into a separate metric called '_info'.
This allows the 'up' metric to remain continous over state changes.
Fixes#1236
Signed-off-by: Paul Gier <pgier@redhat.com>
* Rename interface to device in netclass collector
This makes it consistent with other networking metrics like node_network_receive_bytes_total
This closes#1223
Signed-off-by: Johannes 'fish' Ziemke <github@freigeist.org>
* netstat: Add TCP In/Out Segs
In order to get a better idea of TCP packet loss, we need to know how
many `node_netstat_Tcp_OutSegs` there are so we can compare this to
`node_netstat_Tcp_RetransSegs`.
Signed-off-by: Ben Kochie <superq@gmail.com>
* Update fixtures
Signed-off-by: Ben Kochie <superq@gmail.com>
The format of /proc/diskstats is changing in linux-4.19 to include some
additional fields. See: https://www.kernel.org/doc/Documentation/iostats.txt
* collector/diskstats: use constants for some hard coded strings
* collector/diskstats: update diskstats for linux-4.19
* collector/diskstats: remove kernel doc url from individual metrics
Signed-off-by: Paul Gier <pgier@redhat.com>
* infiniband: Add not connected i40iw0/ports/1 fixtures
* infiniband: Handle issue when iWARP* RDMA modules are not available
This is related to #966, and handle this error,
Jun 07 13:33:24 hostname node_exporter[81888]: time="2018-06-07T13:33:24+02:00" level=error msg="ERROR: infiniband
collector failed after 0.000929s: strconv.ParseUint: parsing \"N/A (no PMA)\": invalid syntax" source="collector.go:132"
Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>
Fix typo on unit description of metric `*read_time_seconds_total` from milliseconds to seconds.
Signed-off-by: Marco Tulio R Braga <marco.tulio@mtulio.eng.br>
* vendor: Update prometheus/procfs
Signed-off-by: Hannes Körber <hannes.koerber@haktec.de>
* mountstats: Use new NFS protocol field
In https://github.com/prometheus/procfs/pull/100, the NFSTransportStats
struct was expanded by a field called protocol that specifies the NFS
protocol in use, either "tcp" or "udp". This commit adds the protocol as
a label to all NFS metrics exported via the mountstats collector.
Signed-off-by: Hannes Körber <hannes.koerber@haktec.de>
* Update fixtures for UDP mount
Signed-off-by: Hannes Körber <hannes.koerber@haktec.de>
* add sys/class/net parsing from procfs and expose its metrics
Signed-off-by: Jan Klat <jenik@klatys.cz>
* change code to use int pointers per procfs change, move netclass to separate collector, change metric naming
Signed-off-by: Jan Klat <jenik@klatys.cz>
* bump year in licence, remove redundant newline, correct fixtures
Signed-off-by: Jan Klat <jenik@klatys.cz>
* fix style
Signed-off-by: Jan Klat <jenik@klatys.cz>
* change carrier changes to counter type
Signed-off-by: Jan Klat <jenik@klatys.cz>
* fix e2e output
Signed-off-by: Jan Klat <jenik@klatys.cz>
* add fixtures
Signed-off-by: Jan Klat <jenik@klatys.cz>
* update vendor, use fixtures correctly
Signed-off-by: Jan Klat <jenik@klatys.cz>
* change fixtures (device in /sys/class/net should be symlinked)
Signed-off-by: Jan Klat <jenik@klatys.cz>
* correct fixtures for 64k page, updated readme
Signed-off-by: Jan Klat <jenik@klatys.cz>
* Send "Personality unknown" to debug, not info, remove unnecessary newline.
* Add support for "linear" personality.
* Always set number of active disks to 0 when a device is inactive.
* Add total disks calculation to unknown personalites.
Signed-off-by: Ben Kochie <superq@gmail.com>
* cpu: Add a 2nd label 'package' to metric node_cpu_core_throttles_total
This commit fixes the node_cpu_core_throttles_total metrics on
multi-socket systems as the core_ids are the same for each package.
I.e. we need to count them seperately.
Rename the node_package_throttles_total metric label `node` to `package`.
Reorganize the sys.ttar archive and use the same symlinks as the Linux
kernel. Also, the new fixtures now use a dual-socket dual-core cpu w/o
HT/SMT (node0: cpu0+1, node1: cpu2+3) as well as processor-less
(memory-only) NUMA node 'node2' (this is a very rare case).
Signed-off-by: Karsten Weiss <knweiss@gmail.com>
* cpu: Use the direct /sys path to the cpu files.
Use the direct path /sys/devices/system/cpu/cpu[0-9]* (without symlinks)
instead of /sys/bus/cpu/devices/cpu[0-9]*.
The latter path also does not exist e.g. on RHEL 6.9's kernel.
Signed-off-by: Karsten Weiss <knweiss@gmail.com>
* cpu: Reverse core+package throttle processing order
Signed-off-by: Karsten Weiss <knweiss@gmail.com>
* cpu: Add documentation URLs
Signed-off-by: Karsten Weiss <knweiss@gmail.com>
Netstat is 40% of the metrics on my laptop, many of which
are highly detailed information about IP internals in the kernel.
~300 such metrics on every machine in your fleet is excessive,
so focus on key metrics by default, overridable by the user.
Fixes#515
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
Vmstat has over 100 fields, most of which are highly
detailed debug information. Trim this down to only
essential fields by default, configurable by flag.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Only report core throttles per core, not per cpu
* Add topology/core_id to the cpu sysfs fixtures
* Add new cpu fixtures to ttar file
* Merge core_id reading and thermal throttle accounting
* Declare core_id
* updates for zfsonlinux 0.7.5
* add constants for KSTAT_DATA_* types
* added e2e test for negative values represented by uint64 that can result from ZFS bugs
* Unify CPU collector conventions
Add a common CPU metric description.
* All collectors use the same `nodeCpuSecondsDesc`.
* All collectors drop the `cpu` prefix for `cpu` label values.
* Fix subsystem string in cpu_freebsd.
* Fix Linux CPU freq label names.
* Improve stat linux metric names.
cpu is no longer used.
* node_cpu -> node_cpu_seconds_total for Linux
* Improve filesystem metric names with units
* Improve units and names of linux disk stats
Remove sector metrics, the bytes metrics cover those already.
* Infiniband counters should end in _total
* Improve timex metric names, convert to more normal units.
See
3c073991eb/kernel/time/ntp.c (L909)
for what stabil means, looks like a moving average of some form.
* Update test fixture
* For meminfo metrics that had "kB" units, add _bytes
* Interrupts counter should have _total
Linux "guest" metrics for VMs are already accounted for in node_cpu
`user` and `nice` metrics. Separate these into their own metric to
avoid duplication of data.
* cpu: Metric 'package_throttles_total' is per package.
'package_throttles_total' is per package, not per cpu. This also reduces
the total number of cpu time series a lot (esp for multi core cpus).
* cpu: Better handling of a cpulist edge-case.
* cpu: Extract the package number from the directory name.
Do not rely on the range index.
* cpu: Add package_throttle_count for node0 cpu1
This file must be ignored by the cpu collector.