Commit Graph

499 Commits

Author SHA1 Message Date
Ben Kochie
1ab4a460c7 Update ppc64le end-to-end fixture.
Signed-off-by: Ben Kochie <superq@gmail.com>
2018-04-18 09:12:21 +02:00
Ben Kochie
0f5be132ac
Merge pull request #904 from prometheus/superq/if_alias
Fix parsing of interface aliases in netdev linux
2018-04-17 13:37:21 +02:00
Ben Kochie
a528966dcd Fix parsing of interface aliases in netdev linux
Very old kernels expose interface aliases as `foo0:0`; adjust the line
parsing to handle these names.

Signed-off-by: Ben Kochie <superq@gmail.com>
2018-04-17 13:15:02 +02:00
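A minimal sketch of the kind of line parsing this commit message describes, assuming the usual `/proc/net/dev` layout of `name: counters...`. The helper name `splitNetDevLine` is hypothetical and is not taken from the collector; cutting at the last colon is one possible way to keep alias names such as `foo0:0` intact, not necessarily the approach the commit uses.

```go
package main

import (
	"fmt"
	"strings"
)

// splitNetDevLine (hypothetical helper) splits one data line of /proc/net/dev
// into the interface name and the remaining counter fields. It cuts at the
// LAST colon so that alias names such as "foo0:0" are not truncated.
func splitNetDevLine(line string) (name string, fields []string, ok bool) {
	idx := strings.LastIndex(line, ":")
	if idx < 0 {
		return "", nil, false
	}
	name = strings.TrimSpace(line[:idx])
	fields = strings.Fields(line[idx+1:])
	return name, fields, name != ""
}

func main() {
	lines := []string{
		"  eth0: 100 2 0 0",
		"foo0:0: 300 4 0 0",
	}
	for _, l := range lines {
		name, fields, ok := splitNetDevLine(l)
		fmt.Println(name, fields, ok)
	}
}
```

Splitting at the first colon instead would report the alias as `foo0` and push `0` into the counter fields.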
Ben Kochie
f6008b242b
Merge pull request #901 from mischief/bsd_boottime
collector: implement node_boot_time_seconds for OpenBSD/NetBSD/Darwin
2018-04-17 07:48:39 +02:00
Jürgen Hötzel
de0632c2e9 Fix memory corruption when number of filesystems > 16 (#900)
Signed-off-by: Juergen Hoetzel <juergen@archlinux.org>
2018-04-16 12:39:15 +02:00
mischief
26a385d7ab collector: implement node_boot_time_seconds for OpenBSD/NetBSD/Darwin
Signed-off-by: mischief <mischief@offblast.org>
2018-04-15 08:26:46 +00:00
Ben Kochie
015b86670a
Update ppc64le e2e output.
Signed-off-by: Ben Kochie <superq@gmail.com>
2018-04-14 15:28:06 +02:00
Ben Kochie
0507b0c9a2
Fix formatting.
Signed-off-by: Ben Kochie <superq@gmail.com>
2018-04-14 15:02:20 +02:00
Dmitriy Lukyanchikov
eddd1b9357 Fix netdev collector for linux (#890)
fix variable name, fix transmitHeader extracting
modify fixtures to run tests with updated netdev_linux collector

Signed-off-by: dmitriy-lukyanchikov <d.lukyanchikov@anchorfree.com>
2018-04-14 13:58:56 +02:00
Derek Marcotte
fe86e908da Update ppc64 fixtures to unbreak end-to-end.
efc1fdb added new labels.

Signed-off-by: Derek Marcotte <554b8425@razorfever.net>
2018-04-13 06:33:38 -04:00
Karsten Weiss
7e392e6634 Fix spelling mistakes found by codespell
Signed-off-by: Karsten Weiss <knweiss@gmail.com>
2018-04-09 18:27:17 +02:00
Karsten Weiss
efc1fdb6d0 cpu: Add a 2nd label 'package' to metric node_cpu_core_throttles_total (#871)
* cpu: Add a 2nd label 'package' to metric node_cpu_core_throttles_total

This commit fixes the node_cpu_core_throttles_total metrics on
multi-socket systems as the core_ids are the same for each package.
I.e. we need to count them separately.

Rename the node_package_throttles_total metric label `node` to `package`.

Reorganize the sys.ttar archive and use the same symlinks as the Linux
kernel. Also, the new fixtures now use a dual-socket dual-core cpu w/o
HT/SMT (node0: cpu0+1, node1: cpu2+3) as well as processor-less
(memory-only) NUMA node 'node2' (this is a very rare case).

Signed-off-by: Karsten Weiss <knweiss@gmail.com>

* cpu: Use the direct /sys path to the cpu files.

Use the direct path /sys/devices/system/cpu/cpu[0-9]* (without symlinks)
instead of /sys/bus/cpu/devices/cpu[0-9]*.

The latter path also does not exist e.g. on RHEL 6.9's kernel.

Signed-off-by: Karsten Weiss <knweiss@gmail.com>

* cpu: Reverse core+package throttle processing order

Signed-off-by: Karsten Weiss <knweiss@gmail.com>

* cpu: Add documentation URLs

Signed-off-by: Karsten Weiss <knweiss@gmail.com>
2018-04-09 18:01:52 +02:00
Brian Brazil
31ce32f1fe
Greatly trim what netstat collector exposes by default (#876)
Netstat is 40% of the metrics on my laptop, many of which
are highly detailed information about IP internals in the kernel.
~300 such metrics on every machine in your fleet is excessive,
so focus on key metrics by default, overridable by the user.

Fixes #515

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2018-03-30 19:28:08 +01:00
Ben Kochie
cf3edadcbb Update fixtures
* Add oom_kill to fixture.
* Update e2e outputs.
* Put regexp in order.

Signed-off-by: Ben Kochie <superq@gmail.com>
2018-03-29 22:00:02 +01:00
Brian Brazil
499c342fed Greatly reduce the metrics vmstat returns by default.
Vmstat has over 100 fields, most of which are highly
detailed debug information. Trim this down to only
essential fields by default, configurable by flag.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2018-03-29 22:00:02 +01:00
Brian Brazil
c8c144587e
Enable bonding collector by default. (#872)
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2018-03-29 15:18:12 +01:00
Ben Kochie
779090db7e
Update ppc64le fixture (#867)
Update to match standard e2e output.

Signed-off-by: Ben Kochie <superq@gmail.com>
2018-03-27 17:05:20 +02:00
Mario Trangoni
1f11a86d59 Fix nfs golint issues (#863)
* procfs: update vendoring

Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>

* procfs: fix e2e tests after nfs changes

Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>
2018-03-22 22:25:37 +01:00
Ben Kochie
7b720df1c5
Use lowercase cpu label name in interrupts (#849)
To match other CPU related metric labels, use a lowercase named label.
2018-03-08 15:04:49 +01:00
Johannes 'fish' Ziemke
424ca8e322 Drop exec_ in boot_timestamp_seconds on *bsd (#839)
This closes #827.
2018-03-08 12:59:48 +01:00
colmbuckley
098f975b48 Correct the ClocksPerSec scaling factor on Darwin (#846)
* Update cpu_darwin.go

Change the definition of ClocksPerSec to read from limits.h

* Update cpu_darwin.go
2018-03-07 11:56:57 +01:00
Julius Volz
864a6ee935 Treat custom textfile metric timestamps as errors (#769)
This is clearer behavior and users will notice and fix their textfiles faster
than if we just output a warning.
2018-02-27 19:43:38 +01:00
Rene Treffer
c504c7e264 Only report core throttles per core, not per cpu (#836)
* Only report core throttles per core, not per cpu

* Add topology/core_id to the cpu sysfs fixtures

* Add new cpu fixtures to ttar file

* Merge core_id reading and thermal throttle accounting

* Declare core_id
2018-02-27 19:43:15 +01:00
Ben Kochie
e0d54a509c
Cleanup NFS metrics (#834)
* Cleanup NFS metrics

* Update `nfs` metric names to match `nfsd`.
* Remove unneeded `tcp` label from TCP connections metric.
* Remove unneeded `v` on `nfsd` metrics.
* Enable all `nfs` v4 client metrics.
* Remove `nfs` metric name overrides.

* Add ppc64le fixture.

* Fix typo.
2018-02-21 07:25:41 +01:00
Ben Kochie
3f41a2fecb
Update ppc64le fixture (#832)
Updates fixture for ppc64le arch to latest output.
2018-02-19 20:43:33 +01:00
Ben Kochie
d33a447047
Remove deprecated prometheus.InstrumentHandlerFunc (#831)
Update the Prometheus Go client usage to call `promhttp.Handler()` instead
of `prometheus.InstrumentHandlerFunc()`.
2018-02-19 15:44:59 +01:00
Richard Elling
d7348a5c78 updates for zfsonlinux 0.7.5 (#779)
* updates for zfsonlinux 0.7.5

* add constants for KSTAT_DATA_* types

* added e2e test for negative values represented by uint64 that can result from ZFS bugs
2018-02-16 15:46:31 +01:00
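The last bullet refers to negative values that show up as very large unsigned numbers. A small sketch of that effect, assuming the value arrives as a decimal string; `parseKstatSigned` is a hypothetical name, and the commit message does not say whether the collector reinterprets values this way.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseKstatSigned (hypothetical helper) parses a value printed as an
// unsigned 64-bit integer and reinterprets the same bits as a signed value.
func parseKstatSigned(raw string) (int64, error) {
	u, err := strconv.ParseUint(raw, 10, 64)
	if err != nil {
		return 0, err
	}
	// Converting uint64 to int64 keeps the bit pattern, so values above
	// math.MaxInt64 come out negative.
	return int64(u), nil
}

func main() {
	v, err := parseKstatSigned("18446744073709551615")
	fmt.Println(v, err)
}
```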
Ben Kochie
6468e7c80b
Enable NFS client metrics by default. (#828)
Enable NFS client metrics by default now that it no longer prints errors
on scrape if there are no metrics to display.

Also fixup the nfsd README to match the nfs entry.
2018-02-16 15:42:47 +01:00
Ralf Horstmann
8d9c7ca659 Use swpginuse instead of swpgonly in meminfo_openbsd (#813)
All tools in OpenBSD base system use swpginuse instead of swpgonly
for reporting swap usage (snmpd, swapctl, top, vmstat), so let
memory collector use that as well for consistency.
2018-02-16 11:34:41 +01:00
Kasinath Kottukkal
f6965e1812 Add overlay to defIgnoredFSTypes (#824)
* Add overlay to defIgnoredFSTypes

To avoid statfs() errors if node_exporter is running as a non-privileged user.

* Updated defIgnoredFSTypes values in sorted order
2018-02-16 09:47:50 +01:00
Ben Kochie
01bd99fb1a
Refactor NFS client collector (#816)
* Update vendor github.com/prometheus/procfs/...

* Refactor NFS collector

Use new procfs library to parse NFS client stats.

* Ignore nfs proc file not existing.

* Refactor with reflection to walk the structs.
2018-02-15 13:40:38 +01:00
Brian Brazil
52c031890e
Add _seconds suffix to node_time. (#823)
2018-02-14 16:59:08 +00:00
Ben Kochie
05eabe60fb
Fix error output in nfsd collector. (#821)
2018-02-14 13:57:35 +01:00
Ben Kochie
3de2542d21
Fix NFSd metric type (#819)
RPC Count should be a counter, not a gauge.
2018-02-13 17:03:22 +01:00
Matt Layher
544488ddd6 Fix remaining metric naming issues (#799)
2018-02-12 18:53:31 +01:00
Ben Kochie
6a041692ed
Add NFS Server metrics collector. (#803)
* Add NFS Server metrics collector.

* Add File Handles metrics.

* Add nfsd IO stats.

* Add metrics for NFSd threads.

* Add metrics for NFSd read ahead cache.

* Add NFSd network traffic counters.

* Add RPC metrics.

* Add V2 requests metrics.

* Add NFSv3 metrics.

* Add NFSv4 metrics.

* Update reply cache comment.

* Update help text.
2018-02-12 17:56:05 +01:00
Brian Brazil
1072f2868d Fix log level regression in #533
2018-02-07 15:16:20 +00:00
Brian Brazil
7e41a2b279 Ignore /var/lib/docker by default. (#814)
The node exporter runs unprivileged, so it cannot statfs any filesystems
under this directory causing log spam. In addition there tends to be
high churn in the filesystems here (as it's basically application
monitoring) which can cause high cardinality and in one case caused
Prometheus's index symbol table to get very large.
Accordingly this should be ignored to reduce log spam and avoid
performance issues. The filesystems themselves can in principle be
monitored via container oriented exporters, and the underlying
filesystems will still be monitored.
2018-02-06 17:10:59 +01:00
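A sketch of how a mount-point exclusion of this kind can be expressed as a regular expression. The pattern and the names `ignoredMountPoints` and `isIgnored` are assumptions for illustration; the commit message does not quote the actual default value.

```go
package main

import (
	"fmt"
	"regexp"
)

// ignoredMountPoints is an assumed pattern, not quoted from the commit. The
// trailing ($|/) makes it match the directory itself and anything below it,
// without matching sibling paths that merely share the prefix.
var ignoredMountPoints = regexp.MustCompile(`^/(dev|proc|sys|var/lib/docker)($|/)`)

// isIgnored reports whether a mount point should be skipped.
func isIgnored(mountPoint string) bool {
	return ignoredMountPoints.MatchString(mountPoint)
}

func main() {
	for _, mp := range []string{
		"/",
		"/var/lib/docker",
		"/var/lib/docker/overlay2/abc/merged",
		"/var/lib/dockerx",
	} {
		fmt.Println(mp, isIgnored(mp))
	}
}
```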
Ralf Horstmann
29ac809e48 Use unified CPU metric description on OpenBSD (#810)
2018-02-01 23:59:19 +01:00
Derek Marcotte
fde5d2c6c9 Remove unsafe typecasts from sysctl_bsd getStructTimeval. (#741)
There is a simpler way.
2018-02-01 18:43:40 +01:00
Ben Kochie
14d60958d6
Unify CPU collector conventions (#806)
* Unify CPU collector conventions

Add a common CPU metric description.
* All collectors use the same `nodeCpuSecondsDesc`.
* All collectors drop the `cpu` prefix for `cpu` label values.

* Fix subsystem string in cpu_freebsd.

* Fix Linux CPU freq label names.
2018-02-01 18:42:20 +01:00
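The label convention in the second bullet above (drop the `cpu` prefix from `cpu` label values) can be sketched as follows. `cpuLabel` is a hypothetical helper name, and the device names are assumed to look like `cpu0`, `cpu1`, and so on.

```go
package main

import (
	"fmt"
	"strings"
)

// cpuLabel (hypothetical helper) turns a device name such as "cpu3" into the
// bare label value "3". Names without the prefix are returned unchanged.
func cpuLabel(device string) string {
	return strings.TrimPrefix(device, "cpu")
}

func main() {
	for _, d := range []string{"cpu0", "cpu12", "7"} {
		fmt.Println(d, "->", cpuLabel(d))
	}
}
```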
Ralf Horstmann
e3c76b1f0c Add OpenBSD CPU collector (#805)
2018-02-01 18:33:49 +01:00
Tom Wilkie
6833eec187 Fix tests.
2018-01-31 15:22:17 +00:00
Tom Wilkie
0316bacceb Only use one dbus connection, required some refactoring.
2018-01-31 15:19:18 +00:00
Tom Wilkie
a7fd6b8743 Export systemd timer last trigger sec.
2018-01-31 15:07:04 +00:00
Ben Kochie
111e3af437
Remove obsolete megacli collector. (#798)
This collector has been replaced by the textfile collector tool
`storcli.py`.
2018-01-23 11:25:42 +01:00
Julius Volz
6cac74f0e0
Add unit suffix to textfile collector mtime metric (#796)
2018-01-22 14:02:19 +01:00
Brian Brazil
a98067a294 Make metrics better follow guidelines (#787)
* Improve stat linux metric names.

cpu is no longer used.

* node_cpu -> node_cpu_seconds_total for Linux

* Improve filesystem metric names with units

* Improve units and names of linux disk stats

Remove sector metrics, the bytes metrics cover those already.

* Infiniband counters should end in _total

* Improve timex metric names, convert to more normal units.

See
3c073991eb/kernel/time/ntp.c (L909)
for what stabil means, looks like a moving average of some form.

* Update test fixture

* For meminfo metrics that had "kB" units, add _bytes

* Interrupts counter should have _total
2018-01-17 17:55:55 +01:00
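The meminfo bullet above (add `_bytes` to metrics that had "kB" units) can be illustrated with a small parser. This is a sketch under the assumption that lines look like `MemTotal:  16384 kB`; `parseMeminfoLine` is a hypothetical name, and the scaling by 1024 is the conventional reading of the `kB` unit in `/proc/meminfo`, not something the commit message spells out.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMeminfoLine (hypothetical helper) converts one /proc/meminfo line into
// a metric name and value. Lines carrying a "kB" unit are scaled to bytes and
// get a "_bytes" suffix; unitless lines are passed through unchanged.
func parseMeminfoLine(line string) (string, float64, error) {
	parts := strings.Fields(line)
	if len(parts) < 2 {
		return "", 0, fmt.Errorf("malformed meminfo line: %q", line)
	}
	name := strings.TrimSuffix(parts[0], ":")
	value, err := strconv.ParseFloat(parts[1], 64)
	if err != nil {
		return "", 0, err
	}
	if len(parts) == 3 && parts[2] == "kB" {
		name += "_bytes"
		value *= 1024
	}
	return name, value, nil
}

func main() {
	for _, l := range []string{"MemTotal:       16384 kB", "HugePages_Total:       0"} {
		name, value, err := parseMeminfoLine(l)
		fmt.Println(name, value, err)
	}
}
```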
Ben Kochie
b4d7ba119a
Add fixture for ppc64le (#785)
* Add support for per-architecture fixtures.
* Add output for ppc64le.
2018-01-11 13:56:19 +01:00
Nick Owens
0629a081db multiply page size after float64 coercion to avoid signed integer overflow (#780)
2018-01-08 15:36:49 +01:00
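A short sketch of the overflow this commit title refers to. The `int32` types and the function names are assumptions chosen to make the effect visible; the commit title does not name the actual types or variables involved.

```go
package main

import "fmt"

// pagesToBytesOverflow multiplies in the integer type first. With 32-bit
// signed operands the product can wrap around before it is converted.
func pagesToBytesOverflow(pages, pageSize int32) float64 {
	return float64(pages * pageSize)
}

// pagesToBytes converts each operand to float64 BEFORE multiplying, so the
// product is computed in floating point and cannot wrap.
func pagesToBytes(pages, pageSize int32) float64 {
	return float64(pages) * float64(pageSize)
}

func main() {
	// 1<<20 pages of 4096 bytes is 4 GiB, which does not fit in an int32.
	var pages, pageSize int32 = 1 << 20, 4096
	fmt.Println(pagesToBytesOverflow(pages, pageSize))
	fmt.Println(pagesToBytes(pages, pageSize))
}
```

The order of conversion and multiplication is the whole fix: both functions take the same inputs, but only the second keeps the full product.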