This collector is based on the adjtimex(2) system call. It returns three values: a status telling whether time is synchronised, the offset to the remote reference, and the local clock frequency adjustment. The values are taken from the kernel's timekeeping data structures, so the collector does not need to know how synchronisation is implemented. By that I mean one should not care whether time is updated using ntpd, systemd-timesyncd, ptpd, and so on. Since every time sync implementation ends up telling the kernel what the time status is, one can simply skip the software in between and look at the results of the syncing. As a positive side effect this makes the collector very quick and conceptually specific: it does not monitor the availability of the NTP server, the network in between, DNS resolution, or other unrelated but necessary things.

The minimum set of values to keep an eye on are the following three:

The node_timex_sync_status tells whether the local clock is in sync with a remote clock. The value is set to zero when synchronisation to a reliable server is lost, or when the time sync software is misconfigured.

The node_timex_offset_seconds tells how far the local clock is off compared to the reference. With multiple time references this value is the outcome of the RFC 5905 adjustment algorithm. Ideally the offset should be close to zero; how large a value is acceptable depends on the use case. For example, a typical web server is probably fine with an offset of about 0.1 seconds or less, but that would not be good enough for a mobile phone base station operator.

The node_timex_freq tells the amount of adjustment applied to the local clock tick frequency. For example, if the offset is one second and growing, the local clock needs to be instructed to tick quicker. The numeric value itself is not very important, and occasional small adjustments are fine. When the frequency is unusually unstable, one can assume that time stamps will not be accurate very far into the sub-second range. Obviously, explaining why the local clock frequency behaves like a passenger on a roller coaster is a different matter. Explanations can vary from system load to environmental issues such as a machine being physically too hot.

The rest of the measurements can help when debugging. If you run a clock server, you probably want to collect and keep track of everything.

Pull-request: https://github.com/prometheus/node_exporter/pull/664
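As a rough illustration of the three values described above, the following Go sketch shows how raw adjtimex(2) results could be turned into a sync status, an offset in seconds, and a frequency adjustment in parts per million. It is not the collector's actual code: the timexSample type and the function names are hypothetical, and the sample numbers in main are made up. The constants STA_NANO and TIME_ERROR and the 16-bit fractional scaling of the frequency field come from the Linux adjtimex(2) interface.

```go
package main

import "fmt"

// Constants taken from the Linux adjtimex(2) interface (<sys/timex.h>).
const (
	staNano   = 0x2000 // STA_NANO: offset is reported in nanoseconds, not microseconds
	timeError = 5      // TIME_ERROR: return value meaning the clock is not synchronised
)

// timexSample is a hypothetical container for the raw values of one adjtimex(2) call.
type timexSample struct {
	State  int   // return value of adjtimex(2)
	Status int32 // status bit field
	Offset int64 // offset to the reference, in microseconds or nanoseconds
	Freq   int64 // frequency adjustment in ppm with a 16-bit binary fraction
}

// syncStatus returns 1 when the kernel considers the clock synchronised, else 0.
func syncStatus(s timexSample) float64 {
	if s.State == timeError {
		return 0
	}
	return 1
}

// offsetSeconds converts the raw offset to seconds, honouring STA_NANO.
func offsetSeconds(s timexSample) float64 {
	divisor := 1e6 // microseconds by default
	if s.Status&staNano != 0 {
		divisor = 1e9 // nanoseconds when STA_NANO is set
	}
	return float64(s.Offset) / divisor
}

// freqPPM converts the scaled frequency value to parts per million.
func freqPPM(s timexSample) float64 {
	return float64(s.Freq) / 65536
}

func main() {
	// Made-up sample values, for illustration only.
	s := timexSample{State: 0, Status: staNano, Offset: 250000, Freq: 131072}
	fmt.Printf("sync_status=%v offset_seconds=%v freq_ppm=%v\n",
		syncStatus(s), offsetSeconds(s), freqPPM(s))
}
```

On a real Linux host the raw values would come from an adjtimex(2) call made in read-only mode; the sketch keeps the conversion separate so it can be checked without touching the system clock.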
Node exporter
Prometheus exporter for hardware and OS metrics exposed by *NIX kernels, written in Go with pluggable metric collectors.
The WMI exporter is recommended for Windows users.
Collectors
There is varying support for collectors on each operating system. The tables below list all existing collectors and the supported systems.
Which collectors are used is controlled by the --collectors.enabled flag.
Enabled by default
Name | Description | OS |
---|---|---|
arp | Exposes ARP statistics from /proc/net/arp. | Linux |
bcache | Exposes bcache statistics from /sys/fs/bcache/. | Linux |
conntrack | Shows conntrack statistics (does nothing if no /proc/sys/net/netfilter/ present). | Linux |
cpu | Exposes CPU statistics. | Darwin, Dragonfly, FreeBSD, Linux |
diskstats | Exposes disk I/O statistics. | Darwin, Linux |
edac | Exposes error detection and correction statistics. | Linux |
entropy | Exposes available entropy. | Linux |
exec | Exposes execution statistics. | Dragonfly, FreeBSD |
filefd | Exposes file descriptor statistics from /proc/sys/fs/file-nr. | Linux |
filesystem | Exposes filesystem statistics, such as disk space used. | Darwin, Dragonfly, FreeBSD, Linux, OpenBSD |
hwmon | Exposes hardware monitoring and sensor data from /sys/class/hwmon/. | Linux |
infiniband | Exposes network statistics specific to InfiniBand and Intel OmniPath configurations. | Linux |
ipvs | Exposes IPVS status from /proc/net/ip_vs and stats from /proc/net/ip_vs_stats. | Linux |
loadavg | Exposes load average. | Darwin, Dragonfly, FreeBSD, Linux, NetBSD, OpenBSD, Solaris |
mdadm | Exposes statistics about devices in /proc/mdstat (does nothing if no /proc/mdstat present). | Linux |
meminfo | Exposes memory statistics. | Darwin, Dragonfly, FreeBSD, Linux |
netdev | Exposes network interface statistics such as bytes transferred. | Darwin, Dragonfly, FreeBSD, Linux, OpenBSD |
netstat | Exposes network statistics from /proc/net/netstat. This is the same information as netstat -s. | Linux |
sockstat | Exposes various statistics from /proc/net/sockstat. | Linux |
stat | Exposes various statistics from /proc/stat. This includes boot time, forks and interrupts. | Linux |
textfile | Exposes statistics read from local disk. The --collector.textfile.directory flag must be set. | any |
time | Exposes the current system time. | any |
timex | Exposes selected adjtimex(2) system call stats. | Linux |
uname | Exposes system information as provided by the uname system call. | Linux |
vmstat | Exposes statistics from /proc/vmstat. | Linux |
wifi | Exposes WiFi device and station statistics. | Linux |
xfs | Exposes XFS runtime statistics. | Linux (kernel 4.4+) |
zfs | Exposes ZFS performance statistics. | Linux |
Disabled by default
Name | Description | OS |
---|---|---|
bonding | Exposes the number of configured and active slaves of Linux bonding interfaces. | Linux |
buddyinfo | Exposes statistics of memory fragments as reported by /proc/buddyinfo. | Linux |
devstat | Exposes device statistics. | Dragonfly, FreeBSD |
drbd | Exposes Distributed Replicated Block Device statistics (to version 8.4). | Linux |
interrupts | Exposes detailed interrupts statistics. | Linux, OpenBSD |
ksmd | Exposes kernel and system statistics from /sys/kernel/mm/ksm. | Linux |
logind | Exposes session counts from logind. | Linux |
meminfo_numa | Exposes memory statistics from /proc/meminfo_numa. | Linux |
mountstats | Exposes filesystem statistics from /proc/self/mountstats. Exposes detailed NFS client statistics. | Linux |
nfs | Exposes NFS client statistics from /proc/net/rpc/nfs. This is the same information as nfsstat -c. | Linux |
ntp | Exposes local NTP daemon health to check time. | any |
qdisc | Exposes queuing discipline statistics. | Linux |
runit | Exposes service status from runit. | any |
supervisord | Exposes service status from supervisord. | any |
systemd | Exposes service and system status from systemd. | Linux |
tcpstat | Exposes TCP connection status information from /proc/net/tcp and /proc/net/tcp6. (Warning: the current version has potential performance issues in high load situations.) | Linux |
Deprecated
These collectors will be (re)moved in the future.
Name | Description | OS |
---|---|---|
gmond | Exposes statistics from Ganglia. | any |
megacli | Exposes RAID statistics from MegaCLI. | Linux |
Textfile Collector
The textfile collector is similar to the Pushgateway, in that it allows exporting of statistics from batch jobs. It can also be used to export static metrics, such as what role a machine has. The Pushgateway should be used for service-level metrics. The textfile module is for metrics that are tied to a machine.
To use it, set the --collector.textfile.directory flag on the Node exporter. The collector will parse all files in that directory matching the glob *.prom using the text format.
To atomically push completion time for a cron job:
echo my_batch_job_completion_time $(date +%s) > /path/to/directory/my_batch_job.prom.$$
mv /path/to/directory/my_batch_job.prom.$$ /path/to/directory/my_batch_job.prom
To statically set roles for a machine using labels:
echo 'role{role="application_server"} 1' > /path/to/directory/role.prom.$$
mv /path/to/directory/role.prom.$$ /path/to/directory/role.prom
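For batch jobs that are themselves written in Go, the same write-then-rename pattern can be sketched as below. This is only an illustration under stated assumptions: writeMetricFile and writeAndReadBack are hypothetical helper names, the demo writes into a throwaway temporary directory rather than a real textfile directory, and the 0644 file mode is an assumption made so that an exporter running as a different user could read the file.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// writeMetricFile writes body to a temporary file in dir and then renames it
// to name, so a reader never sees a half-written *.prom file.
func writeMetricFile(dir, name, body string) error {
	// The temporary name ends in a random suffix, so it does not match *.prom.
	tmp, err := os.CreateTemp(dir, name+".*")
	if err != nil {
		return err
	}
	// Clean up on failure; after a successful rename this fails harmlessly.
	defer os.Remove(tmp.Name())

	if _, err := tmp.WriteString(body); err != nil {
		tmp.Close()
		return err
	}
	// Assumption: the exporter may run as another user, so make the file readable.
	if err := tmp.Chmod(0o644); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	// Rename within one directory replaces the target in a single step.
	return os.Rename(tmp.Name(), filepath.Join(dir, name))
}

// writeAndReadBack is a demo helper: it writes into a fresh temporary
// directory and returns what ended up in the final file.
func writeAndReadBack(name, body string) (string, error) {
	dir, err := os.MkdirTemp("", "textfile-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	if err := writeMetricFile(dir, name, body); err != nil {
		return "", err
	}
	data, err := os.ReadFile(filepath.Join(dir, name))
	return string(data), err
}

func main() {
	body := fmt.Sprintf("my_batch_job_completion_time %d\n", time.Now().Unix())
	got, err := writeAndReadBack("my_batch_job.prom", body)
	if err != nil {
		fmt.Fprintln(os.Stderr, "write failed:", err)
		os.Exit(1)
	}
	fmt.Print(got)
}
```

In a real job, writeMetricFile would be called with the directory passed to --collector.textfile.directory. Creating the temporary file in that same directory matters, because a rename is only atomic within one filesystem.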
Building and running
go get github.com/prometheus/node_exporter
cd ${GOPATH-$HOME/go}/src/github.com/prometheus/node_exporter
make
./node_exporter <flags>
To see all available configuration flags:
./node_exporter -h
Running tests
make test
Using Docker
The node_exporter is designed to monitor the host system. It's not recommended to deploy it as a Docker container because it requires access to the host system. If you need to run it in Docker, you can deploy this exporter using the node-exporter Docker image with the following options and bind-mounts:
docker run -d -p 9100:9100 \
-v "/proc:/host/proc:ro" \
-v "/sys:/host/sys:ro" \
-v "/:/rootfs:ro" \
--net="host" \
quay.io/prometheus/node-exporter \
--collector.procfs /host/proc \
--collector.sysfs /host/sys \
--collector.filesystem.ignored-mount-points "^/(sys|proc|dev|host|etc)($|/)"
Be aware though that the mountpoint label in various metrics will now have /rootfs as a prefix.
Using a third-party repository for RHEL/CentOS/Fedora
There is a community-supplied COPR repository. It closely follows upstream releases.