1139 Commits

Author SHA1 Message Date
Matt Layher
dcb31670d6 Makefile: add checkmetrics target, use in CI (#797) 2018-02-13 18:04:03 +01:00
Ben Kochie
3de2542d21
Fix NFSd metric type (#819)
RPC Count should be a counter, not a gauge.
2018-02-13 17:03:22 +01:00
Matt Layher
544488ddd6 Fix remaining metric naming issues (#799) 2018-02-12 18:53:31 +01:00
Ben Kochie
6a041692ed
Add NFS Server metrics collector. (#803)
* Add NFS Server metrics collector.

* Add File Handles metrics.

* Add nfsd IO stats.

* Add metrics for NFSd threads.

* Add metrics for NFSd read ahead cache.

* Add NFSd network traffic counters.

* Add RPC metrics.

* Add V2 requests metrics.

* Add NFSv3 metrics.

* Add NFSv4 metrics.

* Update reply cache comment.

* Update help text.
2018-02-12 17:56:05 +01:00
Tobias Schmidt
9a5bd5f8e4
Merge pull request #815 from prometheus/debug-log
Fix log level regression in #533
2018-02-07 16:33:14 +01:00
Brian Brazil
1072f2868d Fix log level regression in #533 2018-02-07 15:16:20 +00:00
Brian Brazil
7e41a2b279 Ignore /var/lib/docker by default. (#814)
The node exporter runs unprivileged, so it cannot statfs any filesystems
under this directory causing log spam.  In addition there tends to be
high churn in the filesystems here (as it's basically application
monitoring) which can cause high cardinality and in one case caused
Prometheus's index symbol table to get very large.
Accordingly this should be ignored to reduce log spam and avoid
performance issues. The filesystems themselves can in principle be
monitored via container oriented exporters, and the underlying
filesystems will still be monitored.
2018-02-06 17:10:59 +01:00
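The effect of ignoring a directory tree by default can be sketched with a mount point filter. This is a minimal illustration only: the pattern, the variable `ignoredMountPoints`, and the helper `isIgnored` are hypothetical and are not taken from the commit itself.

```go
package main

import (
	"fmt"
	"regexp"
)

// ignoredMountPoints is a hypothetical pattern used for illustration; the
// exporter's real default may differ. It matches the listed directories
// themselves and anything mounted beneath them.
var ignoredMountPoints = regexp.MustCompile(`^/(dev|proc|sys|var/lib/docker)($|/)`)

// isIgnored reports whether a mount point should be skipped.
func isIgnored(mountPoint string) bool {
	return ignoredMountPoints.MatchString(mountPoint)
}

func main() {
	for _, mp := range []string{
		"/",
		"/home",
		"/proc",
		"/var/lib/docker/overlay2/abc/merged",
	} {
		fmt.Printf("%-40s ignored=%v\n", mp, isIgnored(mp))
	}
}
```

Anchoring the pattern with `($|/)` keeps a sibling path such as `/var/lib/dockerx` from being excluded by accident.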
tobald
2978728b00 Fix apt.sh syntax (#811)
This patch fixes:

./apt.test: command substitution: line 19: syntax error near unexpected token `|'
./apt.test: command substitution: line 19: `  | /usr/bin/sort   | /usr/bin/uniq -c   | awk '{ gsub(/\\\\/,
2018-02-05 20:43:25 +01:00
Ralf Horstmann
29ac809e48 Use unified CPU metric description on OpenBSD (#810) 2018-02-01 23:59:19 +01:00
Derek Marcotte
fde5d2c6c9 Remove unsafe typecasts from sysctl_bsd getStructTimeval. (#741)
There is a simpler way.
2018-02-01 18:43:40 +01:00
Ben Kochie
14d60958d6
Unify CPU collector conventions (#806)
* Unify CPU collector conventions

Add a common CPU metric description.
* All collectors use the same `nodeCpuSecondsDesc`.
* All collectors drop the `cpu` prefix for `cpu` label values.

* Fix subsystem string in cpu_freebsd.

* Fix Linux CPU freq label names.
2018-02-01 18:42:20 +01:00
Ralf Horstmann
e3c76b1f0c Add OpenBSD CPU collector (#805) 2018-02-01 18:33:49 +01:00
Tom Wilkie
05d14ef9ee
Merge pull request #807 from tomwilkie/systemd-timers
Export systemd timers last trigger seconds.
2018-02-01 13:05:56 +00:00
Tom Wilkie
6833eec187 Fix tests. 2018-01-31 15:22:17 +00:00
Tom Wilkie
0316bacceb Only use one dbus connection, required some refactoring. 2018-01-31 15:19:18 +00:00
Tom Wilkie
a7fd6b8743 Export systemd timer last trigger sec. 2018-01-31 15:07:04 +00:00
Ben Kochie
f9e91156d0
Update vendoring (#801)
* Update vendor github.com/godbus/dbus@v4.1.0

* Update vendor github.com/golang/protobuf/proto

* Update vendor github.com/mdlayher/netlink/...

* Update vendor github.com/prometheus/client_golang/prometheus/...

* Update vendor github.com/prometheus/client_model/go

* Update vendor github.com/prometheus/common/...

* Update vendor github.com/prometheus/procfs/...

* Update vendor github.com/sirupsen/logrus@v1.0.4

* Update vendor golang.org/x/...

* Update vendor gopkg.in/alecthomas/kingpin.v2

* Remove obsolete vendor github.com/mdlayher/netlink/genetlink
2018-01-25 18:20:39 +01:00
Shevchenko Vitaliy
4ed49e73fb Escape double quotes in device model family (#772) 2018-01-24 11:35:14 +01:00
Ben Kochie
111e3af437
Remove obsolete megacli collector. (#798)
This collector has been replaced by the textfile collector tool
`storcli.py`.
2018-01-23 11:25:42 +01:00
Ben Kochie
1ad5ba4dc7
Fix smartmon.sh bugs (#792)
* Fix smartmon.sh info label consistency.

* Fix parsing of SMART-ID attributes <= 99.
2018-01-22 16:51:20 +01:00
Julius Volz
6cac74f0e0
Add unit suffix to textfile collector mtime metric (#796) 2018-01-22 14:02:19 +01:00
Brian Brazil
a98067a294 Make metrics better follow guidelines (#787)
* Improve stat linux metric names.

cpu is no longer used.

* node_cpu -> node_cpu_seconds_total for Linux

* Improve filesystem metric names with units

* Improve units and names of linux disk stats

Remove sector metrics, the bytes metrics cover those already.

* Infiniband counters should end in _total

* Improve timex metric names, convert to more normal units.

See
3c073991eb/kernel/time/ntp.c (L909)
for what stabil means, looks like a moving average of some form.

* Update test fixture

* For meminfo metrics that had "kB" units, add _bytes

* Interrupts counter should have _total
2018-01-17 17:55:55 +01:00
Ben Kochie
b4d7ba119a
Add fixture for ppc64le (#785)
* Add support for per-architecture fixtures.
* Add output for ppc64le.
2018-01-11 13:56:19 +01:00
Ben Kochie
bc38ffc538
Update collect[] param documentation (#784)
Improve recommendations and wording around advanced use of the collect[]
param.

Remove example that causes users to copy-and-paste it.
2018-01-10 15:16:33 +01:00
Bruce Lee
8d3484d0ca Update storcli.py (#783) 2018-01-09 09:10:30 +01:00
Nick Owens
0629a081db multiply page size after float64 coercion to avoid signed integer overflow (#780) 2018-01-08 15:36:49 +01:00
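The overflow described in the commit above can be reproduced in a small sketch. The function names and the use of `int32` are assumptions chosen for illustration; the commit does not say which integer width the collector used.

```go
package main

import "fmt"

// pagesToBytesInt multiplies in the integer domain first. If the product
// does not fit in int32, it wraps around silently before the conversion.
func pagesToBytesInt(pages, pageSize int32) float64 {
	return float64(pages * pageSize)
}

// pagesToBytesFloat converts each operand to float64 before multiplying,
// so the product is computed in floating point and cannot wrap.
func pagesToBytesFloat(pages, pageSize int32) float64 {
	return float64(pages) * float64(pageSize)
}

func main() {
	// 2^20 pages of 4096 bytes is 2^32 bytes, which exceeds the int32 range.
	pages, pageSize := int32(1<<20), int32(4096)
	fmt.Println("integer multiply:", pagesToBytesInt(pages, pageSize))
	fmt.Println("float64 multiply:", pagesToBytesFloat(pages, pageSize))
}
```

The only difference between the two helpers is where the conversion happens relative to the multiplication.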
Franz Pletz
d432f9857e Use uint64 in the ZFS collector (#714)
ZFS metrics can also be unsigned 64-bit integers that won't fit in
int64 and cause the whole collector to fail.
2018-01-06 12:36:55 +01:00
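A minimal sketch of why the signed parse fails, using only the Go standard library. The helper names `parseSigned` and `parseUnsigned` are hypothetical and do not come from the collector.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseSigned parses a decimal value as int64. Values above the int64
// maximum produce a range error.
func parseSigned(s string) (float64, error) {
	v, err := strconv.ParseInt(s, 10, 64)
	return float64(v), err
}

// parseUnsigned parses a decimal value as uint64, which covers the full
// unsigned 64-bit range.
func parseUnsigned(s string) (float64, error) {
	v, err := strconv.ParseUint(s, 10, 64)
	return float64(v), err
}

func main() {
	const raw = "18446744073709551615" // largest unsigned 64-bit value

	_, err := parseSigned(raw)
	fmt.Println("ParseInt error: ", err)

	v, err := parseUnsigned(raw)
	fmt.Println("ParseUint value:", v, "error:", err)
}
```

If a single parse error aborts the whole update, one oversized value is enough to make every metric from that collector disappear, which matches the failure the commit describes.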
zloo
ae280f2b04 Add Prometheus 2.0 compatible example rules file - new YAML format (#739) 2018-01-04 12:31:25 +01:00
Derek Marcotte
477fe4665a Move FreeBSD/DragonflyBSD out of meminfo add kvm. (#547)
* Move FreeBSD/DragonflyBSD out of meminfo add kvm.

This gives us SwapUsed, and everything under one roof.

* Fix typos per review.

* Update to use newer API.

* Remove premature optimization per PR feedback.
2018-01-04 12:23:26 +01:00
Tobias Schmidt
052422ec61 Fix panic by updating github.com/ema/qdisc dependency (#778) 2018-01-04 12:13:02 +01:00
Sevag Hanssian
4329b0a86b Add summary metrics for systemd exporter (#765) 2018-01-04 11:49:36 +01:00
Ben Kochie
8f9c8a060d Update README
Add OpenBSD to supported list for meminfo collector[0].

[0]: https://github.com/prometheus/node_exporter/pull/724
2018-01-04 10:33:57 +01:00
Matthieu Guegan
d6ef10bb56 Add openbsd meminfo (#724)
* Implements meminfo collector for OpenBSD

This is a rework of #151.

* Fix CGO import

* Add some useful metrics

* Rename total -> size for normalization
2018-01-04 10:32:08 +01:00
Ben Kochie
7f6c59e198
Ignore more virtual filesystems (#775)
Add additional Linux virtual filesystem types to the default list.
2018-01-03 17:22:02 +01:00
Netmonk
2aa8d0eb0c [FIX] Exclude Linux proc from filesystem type regexp (#774)
* [FIX] Issue 63, error on excluding proc filesystem on linux, improving regexp

* [FIX] Reordering filter order
2018-01-03 11:40:32 +01:00
Julius Volz
f536857ac6
Fix e2e tests after textfile custom timestamp removal (#768) 2017-12-24 11:54:33 +01:00
Shubheksha Jalan
1f2458f42c Filter out textfile metrics correctly when using collect[] filters (#763)
* remove injection hook for textfile metrics, convert them to prometheus format

* add support for summaries

* add support for histograms

* add logic for handling inconsistent labels within a metric family for counter, gauge, untyped

* change logic for parsing the metrics textfile

* fix logic for adding missing labels

* Export time and error metrics for textfiles

* Add tests for new textfile collector, fix found bugs

* refactor Update() to split into smaller functions

* remove parseTextFiles(), fix import issue

* add mtime metric directly to channel, fix handling of mtime during testing

* rename variables related to labels

* refactor: add default case, remove if guard for metrics, remove extra loop and slice

* refactor: remove extra loop iterating over metric families

* test: add test case for different metric type, fix found bug

* test: add test for metrics with inconsistent labels

* test: add test for histogram

* test: add test for histogram with extra dimension

* test: add test for summary

* test: add test for summary with extra dimension

* remove unnecessary creation of protobuf

* nit: remove extra blank line
2017-12-23 20:21:58 +01:00
Ben Kochie
cd2a17176a
Add full make to CircleCI (#761)
* Add full make to CircleCI

Ensure end-to-end test is run.

* Fix go fmt error.

* Fix end-to-end output.
2017-12-21 16:24:23 +01:00
Mario Trangoni
a40f7e78da StorCli text collector: fix pylint issues and handle StorCli not installed (#758)
* StorCli text collector: fix pylint issues and handle StorCli not installed

* StorCli text collector: Add HELP and TYPE strings.
2017-12-12 18:48:06 +01:00
Filippo Giunchedi
af4cf20b46 apt.sh: handle multiple origins in apt-get output (#757)
It might happen that a given upgrade comes from multiple origins, in
which case the origins are separated by ", " and thus breaking
whitespace-based split. For example:

Inst package [1.2.3] (1.2.4 Debian:8.10/oldstable, Debian-Security:8/oldstable [amd64])

To work around this case, mangle the apt-get output to remove whitespace from
the origins list.
2017-12-12 10:45:59 +01:00
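The script itself is shell, but the idea can be sketched in Go: join a multi-origin list so that a whitespace split yields one origins field. The helper `originsField` is hypothetical, and treating the origins as the fifth whitespace-separated field is an assumption based on the example line in the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// originsField returns the origins column of an apt-get "Inst" line after
// joining multi-origin lists, or "" if the line has too few fields.
// Assumption: the origins are the fifth whitespace-separated field.
func originsField(line string) string {
	joined := strings.Replace(line, ", ", ",", -1)
	fields := strings.Fields(joined)
	if len(fields) < 5 {
		return ""
	}
	return fields[4]
}

func main() {
	line := "Inst package [1.2.3] (1.2.4 Debian:8.10/oldstable, Debian-Security:8/oldstable [amd64])"
	fmt.Println(originsField(line))
}
```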
Wei Li
1e9bb4ec3a textfile: fix duplicate metrics error (#738)
The textfile gatherer should only be added to gatherer list once.

Signed-off-by: Li Wei <liwei@anbutu.com>
2017-12-06 17:05:40 +01:00
Kristian Klausen
a96f1738b3 netdev: Change valueType to CounterValue (#749)
All of these metrics only go up, so the type should be counter.
This also adds _total to all the metric names.

Fix: #747
2017-12-06 13:58:35 +01:00
Derek Marcotte
1527789f76 Added text collector conversion for ipmitool output. (#746)
* Added text collector conversion for ipmitool output.

* Sort metrics before exporting, add namespace.

* Added HELP string, tidy up a bit.

* Make status a gauge.
2017-12-01 12:58:39 +01:00
Ben Kochie
2a80537547
Split out guest cpu metrics on Linux. (#744)
Linux "guest" metrics for VMs are already accounted for in node_cpu
`user` and `nice` metrics.  Separate these into their own metric to
avoid duplication of data.
2017-11-23 15:04:47 +01:00
Karsten Weiss
a8d7d1101a cpu: Support processor-less (memory-only) NUMA nodes (#734)
* cpu: Support processor-less (memory-only) NUMA nodes

Processor-less (memory-only) NUMA nodes exist e.g. in systems that use
Intel Optane drives for RAM expansion using Intel Memory Drive
Technology (IMDT).

IMDT RAM expansion supports two modes:

* "Unify Remote Memory domains": present a processor-less (memory-only)
  NUMA domain, which is the default
* "Expand local memory domains": to expand each processor’s memory domain
  with a portion of the memory made available by Optane and IMDT

This commit fixes a crash in the first case (when "cpulist" is empty).

Here's an example of such a system:

$ numastat -m|head -n5

Per-node system memory usage (in MBs):
                          Node 0          Node 1          Node 2           Total
                 --------------- --------------- --------------- ---------------
MemTotal               118239.56       130816.00       464384.00       713439.56

$ for i in {0..2}; do echo -n "$i: " ; cat /sys/bus/node/devices/node$i/cpulist ; done
0: 0-7,16-23
1: 8-15,24-31
2:

$ /opt/vsmp/bin/vsmpversion -vvv
Memory Drive Technology: 8.2.1455.74 (Sep 28 2017 13:09:59)
System configuration:
    Boards:      3
       1 x Proc. + I/O + Memory
       2 x NVM devices (Intel SSDPED1K375GAQ)
    Processors:  2, Cores: 16, Threads: 32
        Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz Stepping 01
    Memory (MB): 713472 (of 977450), Cache: 251416, Private: 12562
       1 x 249088MB   [262036/   678/12270]
       1 x 232192MB   [357707/125369/  146]  82:00.0#1
       1 x 232192MB   [357707/125369/  146]  83:00.0#1

* cpu: rename some variables (pkg => node)

* cpu: Use %v not %q in log.Debugf() format strings
2017-11-10 15:31:26 +01:00
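A minimal sketch of parsing a `cpulist` value such as the ones shown in the commit above, with the empty case handled explicitly. The function `parseCPUList` is hypothetical and is not the collector's actual code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCPUList expands a sysfs cpulist string such as "0-7,16-23" into
// individual CPU numbers. An empty list, as reported by a memory-only
// NUMA node, yields no CPUs and no error.
func parseCPUList(s string) ([]int, error) {
	s = strings.TrimSpace(s)
	if s == "" {
		return nil, nil
	}
	var cpus []int
	for _, part := range strings.Split(s, ",") {
		bounds := strings.SplitN(part, "-", 2)
		first, err := strconv.Atoi(bounds[0])
		if err != nil {
			return nil, err
		}
		last := first
		if len(bounds) == 2 {
			if last, err = strconv.Atoi(bounds[1]); err != nil {
				return nil, err
			}
		}
		for c := first; c <= last; c++ {
			cpus = append(cpus, c)
		}
	}
	return cpus, nil
}

func main() {
	for _, raw := range []string{"0-7,16-23\n", "8-15,24-31\n", "\n"} {
		cpus, err := parseCPUList(raw)
		fmt.Printf("%q -> %d CPUs, err=%v\n", raw, len(cpus), err)
	}
}
```

Checking for the empty string before splitting avoids passing `""` to the integer parser, which would otherwise return an error for the processor-less node.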
Matt Layher
f6f9c8d6cc Add and use sysReadFile in hwmon collector (#728) 2017-11-07 07:49:37 +01:00
Ben Kochie
4d7aa57da0
Update vendoring (#722)
* Update vendor github.com/beevik/ntp@v0.2.0

* Update vendor github.com/mdlayher/netlink/...

* Update vendor github.com/mdlayher/wifi/...

Adds vendor github.com/mdlayher/genetlink

* Update vendor github.com/prometheus/common/...

* Update vendor github.com/prometheus/procfs/...

* Update vendor golang.org/x/sys/unix

* Update vendor golang.org/x/sys/windows
2017-11-02 12:30:34 +01:00
david
eb3a917bd8 Use host PID namespace in docker example (#672)
* Use host PID namespace in docker example

See https://github.com/prometheus/node_exporter/issues/671

* Update readme for readability

* Fix comments in readme
2017-11-02 12:07:40 +01:00
Nicholas Johns
defe2f373c Remove travis ci (#702)
This PR closes #690
2017-11-02 12:01:28 +01:00
Tobias Klauser
d73f1e60c4 Simplify Utsname string conversion (#716)
* Update golang.org/x/sys/unix

This allows the use of a simplified string conversion of Utsname members.

* Simplify Utsname string conversion

Use Utsname from golang.org/x/sys/unix, whose members are byte arrays
instead of int8/uint8 arrays. This simplifies the string
conversions of these members.
2017-11-02 11:57:14 +01:00
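With byte array members, converting a NUL-terminated field to a string needs no per-element cast. The sketch below uses only the standard library; the helper `cString` and the 65-byte field size are illustrative assumptions and not code from the commit.

```go
package main

import (
	"bytes"
	"fmt"
)

// cString returns the contents of b up to the first NUL byte, or all of b
// if it contains no NUL.
func cString(b []byte) string {
	if i := bytes.IndexByte(b, 0); i >= 0 {
		return string(b[:i])
	}
	return string(b)
}

func main() {
	// Stand-in for a fixed-size Utsname member; the size is an assumption.
	var release [65]byte
	copy(release[:], "4.14.0")
	fmt.Println(cString(release[:]))
}
```

With `int8` members the same conversion would need a loop that casts each element to `byte` first, which is the extra work the commit removes.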