This change adds a new collector called "nfs" that parses the contents
of /proc/net/rpc/nfs and turns it into metrics. It can be used to
inspect the number of operations per type, but also to keep an eye on an
excessive number of retransmissions, which may indicate connectivity
issues.
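As a rough illustration of the retransmission counter mentioned above, the following sketch extracts the `rpc` line, which carries the call, retransmission and auth-refresh counts. The function name `parseRPCLine`, the `rpcStats` type and the sample values are made up for this example; it is not the collector's actual code.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// rpcStats holds the three counters of the "rpc" line (hypothetical type).
type rpcStats struct {
	Calls, Retransmissions, AuthRefreshes uint64
}

// parseRPCLine scans input shaped like /proc/net/rpc/nfs and returns the
// counters from the line that starts with "rpc".
func parseRPCLine(input string) (rpcStats, error) {
	scanner := bufio.NewScanner(strings.NewReader(input))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) != 4 || fields[0] != "rpc" {
			continue
		}
		var vals [3]uint64
		for i, f := range fields[1:] {
			v, err := strconv.ParseUint(f, 10, 64)
			if err != nil {
				return rpcStats{}, fmt.Errorf("invalid rpc field %q: %w", f, err)
			}
			vals[i] = v
		}
		return rpcStats{vals[0], vals[1], vals[2]}, nil
	}
	if err := scanner.Err(); err != nil {
		return rpcStats{}, err
	}
	return rpcStats{}, fmt.Errorf("no rpc line found")
}

func main() {
	// Illustrative sample, not taken from a real host.
	sample := "net 70 70 69 45\nrpc 1000 25 1000\n"
	stats, err := parseRPCLine(sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("calls=%d retransmissions=%d\n", stats.Calls, stats.Retransmissions)
}
```

A retransmission count that grows quickly relative to the call count is the signal worth alerting on.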
I've picked the name "nfs" because most operating systems use "nfs" for
the client component and "nfsd" for the server component. If we want to
add stats for the NFS server as well, that collector should be called
"nfsd".
We seem to have a small number of Linux servers here that have lines in
/proc/mdstat that cannot be parsed by the node exporter, due to them
containing attributes that are not matched by the regular expression
("super 1.2").
Extend the regular expression to skip this data, just like we do for all
of the other status lines.
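To show the idea, here is a minimal sketch of a status-line pattern that tolerates extra attributes such as "super 1.2" between the block count and the device counts. The pattern and the name `parseStatusLine` are assumptions for illustration; the regular expression in the collector may differ.

```go
package main

import (
	"fmt"
	"regexp"
)

// statusLineRE is a hypothetical pattern: ".*" skips any attributes
// (for example "super 1.2") that appear before the "[total/active]" pair.
var statusLineRE = regexp.MustCompile(`(\d+) blocks .*\[(\d+)/(\d+)\] \[[U_]+\]`)

// parseStatusLine returns the block count and the total/active device
// counts as strings, or ok=false if the line does not match.
func parseStatusLine(line string) (blocks, total, active string, ok bool) {
	m := statusLineRE.FindStringSubmatch(line)
	if m == nil {
		return "", "", "", false
	}
	return m[1], m[2], m[3], true
}

func main() {
	for _, line := range []string{
		"      248896 blocks [2/2] [UU]",
		"      1953512312 blocks super 1.2 [2/2] [UU]",
	} {
		blocks, total, active, ok := parseStatusLine(line)
		fmt.Println(blocks, total, active, ok)
	}
}
```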
* Prefer device path based names over exported names
For some sensors (like coretemp) it is possible that multiple
instances exist, thus base the name on the device path and not on
the exported name.
* Update end-to-end test for dual socket machines
Explicitly have 2 coretemp instances with a symlink for the device
so that the hwmon collector must pick that name (or fail).
* Add Linux NUMA "numastat" metrics
Read the `numastat` metrics from /sys/devices/system/node/node* when reading NUMA meminfo metrics.
* Update end-to-end test output.
* Add `numastat` metrics as counters.
* Add tests for error conditions.
* Refactor meminfo numa metrics struct
* Refactor meminfoKey into a simple struct of metric data.
This makes it easier to pass slices of metrics around.
* Refactor tests.
* Fixup: Add suggested fixes.
* Fixup: More fixes
* Add another scanner.Err() return
* Add "_total" to counter metrics.
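The numastat handling in this change can be sketched roughly as follows, assuming each line of a node's `numastat` file is a "name value" pair and that counter names receive a "_total" suffix. The function name `parseNumastat` and the sample input are hypothetical, and the sketch omits the node label the collector would attach.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseNumastat turns "name value" lines into a map keyed by the counter
// name with "_total" appended.
func parseNumastat(input string) (map[string]uint64, error) {
	metrics := map[string]uint64{}
	scanner := bufio.NewScanner(strings.NewReader(input))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) == 0 {
			continue // tolerate blank lines
		}
		if len(fields) != 2 {
			return nil, fmt.Errorf("unexpected numastat line %q", scanner.Text())
		}
		v, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil {
			return nil, fmt.Errorf("invalid value in %q: %w", scanner.Text(), err)
		}
		metrics[fields[0]+"_total"] = v
	}
	// Report read errors instead of silently returning partial data.
	if err := scanner.Err(); err != nil {
		return nil, err
	}
	return metrics, nil
}

func main() {
	// Illustrative input only.
	metrics, err := parseNumastat("numa_hit 100\nnuma_miss 5\n")
	if err != nil {
		panic(err)
	}
	fmt.Println(metrics["numa_hit_total"], metrics["numa_miss_total"])
}
```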
* Add hwmon support (mainly known from lm-sensors)
This commit adds initial support for Linux hardware sensors, exported
through sysfs.
Details of the interface can be found at
https://www.kernel.org/doc/Documentation/hwmon/sysfs-interface
* Add end-to-end test with some real life data
* Cleanup comments on hwmon collector
* Drop raw sensor name from hwmon output
* Let the sensor label be "sensor"
* Add hwmon short description to README.
It turns out that on some kernels (notably CentOS 6) there is an empty line
inserted at the beginning of /sys/devices/system/node/node*/meminfo
files. This leads to a node_exporter crash on such kernels.
Fix this by checking for empty string first.
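A minimal sketch of the empty-line check, assuming per-node meminfo lines of the form "Node <id> <Name>: <value> [kB]". The function name `parseNodeMeminfo` is hypothetical and the real collector code may be structured differently.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseNodeMeminfo parses per-node meminfo content and returns values in
// bytes where a "kB" unit is present, skipping empty lines.
func parseNodeMeminfo(input string) (map[string]uint64, error) {
	out := map[string]uint64{}
	scanner := bufio.NewScanner(strings.NewReader(input))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" {
			continue // some kernels emit a leading empty line
		}
		fields := strings.Fields(line)
		if len(fields) < 4 || fields[0] != "Node" {
			return nil, fmt.Errorf("unexpected meminfo line %q", line)
		}
		v, err := strconv.ParseUint(fields[3], 10, 64)
		if err != nil {
			return nil, fmt.Errorf("invalid value in %q: %w", line, err)
		}
		if len(fields) == 5 && fields[4] == "kB" {
			v *= 1024
		}
		out[strings.TrimSuffix(fields[2], ":")] = v
	}
	if err := scanner.Err(); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	// The sample starts with an empty line, as seen on the affected kernels.
	sample := "\nNode 0 MemTotal:       1024 kB\nNode 0 HugePages_Total:     3\n"
	values, err := parseNodeMeminfo(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(values["MemTotal"], values["HugePages_Total"])
}
```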
Signed-off-by: Pavel Borzenkov <pavel.borzenkov@gmail.com>
Add a new collector which exposes the content of the /sys/kernel/mm/ksm
directory. This directory contains control and statistics files for
the Kernel Samepage Merging daemon.
The collector is not enabled by default.
Signed-off-by: Pavel Borzenkov <pavel.borzenkov@gmail.com>
It is sometimes useful to understand the distribution of free/occupied
memory between NUMA nodes to deal with performance problems. To do so,
add a new meminfo_numa collector that exports per-node statistics,
along with unit and end-to-end tests for it.
Signed-off-by: Pavel Borzenkov <pavel.borzenkov@gmail.com>
This test runs a selection of collectors against the fixtures and
compares the output to a reference.
The uname and filesystem collectors are disabled because they use system
calls that cannot be fixtured easily.
Fixed file-nr update function
Fixed file-nr test case
Fixed file-nr test case again
Fixed file-nr separator to tab
Updated file-nr to filenr.
Fixed file-nr test cases, added comments
Remove reporting the second value from file-nr as it will always be zero in Linux 2.6 and greater
Renaming file-nr to filefd
Updated build constraint
Updates and code cleanup for filefd.
Updated enabledCollectors with the correct name for filefd
Fixed filefd test wording
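The file-nr handling described in the commits above can be sketched as follows, assuming /proc/sys/fs/file-nr holds three tab-separated values (allocated handles, unused handles, maximum) and that the second value is dropped. The function name `parseFileNR` and the sample numbers are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseFileNR returns the allocated and maximum file handle counts from
// content shaped like /proc/sys/fs/file-nr. The middle field is ignored.
func parseFileNR(content string) (allocated, maximum uint64, err error) {
	fields := strings.Split(strings.TrimSpace(content), "\t")
	if len(fields) != 3 {
		return 0, 0, fmt.Errorf("expected 3 tab-separated fields, got %d", len(fields))
	}
	if allocated, err = strconv.ParseUint(fields[0], 10, 64); err != nil {
		return 0, 0, fmt.Errorf("invalid allocated value: %w", err)
	}
	if maximum, err = strconv.ParseUint(fields[2], 10, 64); err != nil {
		return 0, 0, fmt.Errorf("invalid maximum value: %w", err)
	}
	return allocated, maximum, nil
}

func main() {
	// Illustrative sample content.
	allocated, maximum, err := parseFileNR("1024\t0\t1631329\n")
	if err != nil {
		panic(err)
	}
	fmt.Println(allocated, maximum)
}
```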
Initial work on sockstat
Fixed package name
Finished implementation of the sockstat plugin
Added a missed return value
Added sockstat to default plugins to start
Fixed scanner read on sockstat
Fixed sockstat Linux test for TCP alloc
update sockstat test case
Updated sockstat to return TCP and UDP memory in bytes instead of page count
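The page-to-byte conversion can be sketched like this, assuming a /proc/net/sockstat line consists of a protocol prefix followed by key/value pairs, with `mem` given in pages. The function name `sockstatMemBytes` is hypothetical; `os.Getpagesize` supplies the page size at runtime.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// sockstatMemBytes finds the "mem" value (in pages) on a sockstat line
// and converts it to bytes using the given page size.
func sockstatMemBytes(line string, pageSize uint64) (uint64, error) {
	fields := strings.Fields(line)
	// fields[0] is the protocol prefix, e.g. "TCP:"; the rest are pairs.
	for i := 1; i+1 < len(fields); i += 2 {
		if fields[i] != "mem" {
			continue
		}
		pages, err := strconv.ParseUint(fields[i+1], 10, 64)
		if err != nil {
			return 0, fmt.Errorf("invalid mem value in %q: %w", line, err)
		}
		return pages * pageSize, nil
	}
	return 0, fmt.Errorf("no mem field in %q", line)
}

func main() {
	pageSize := uint64(os.Getpagesize())
	// Illustrative sample line.
	bytes, err := sockstatMemBytes("TCP: inuse 4 orphan 0 tw 4 alloc 17 mem 2", pageSize)
	if err != nil {
		panic(err)
	}
	fmt.Println(bytes)
}
```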
This collector exposes two metrics:
- net_bonding_slaves: configured slaves per bonding interface
- net_bonding_slaves_active: currently active slaves per bonding
interface
This collector exports the following metrics:
- raid_drive_temperature: drive temperature
- raid_drive_count: drive error and event counters
- raid_adapter_disk_presence: disk presence per adapter