Commit Graph

555 Commits

Author SHA1 Message Date
Tom Wilkie
a7fd6b8743 Export systemd timer last trigger sec. 2018-01-31 15:07:04 +00:00
Ben Kochie
111e3af437
Remove obsolete megacli collector. (#798)
This collector has been replaced by the textfile collector tool
`storcli.py`.
2018-01-23 11:25:42 +01:00
Julius Volz
6cac74f0e0
Add unit suffix to textfile collector mtime metric (#796) 2018-01-22 14:02:19 +01:00
Brian Brazil
a98067a294 Make metrics better follow guidelines (#787)
* Improve stat linux metric names.

cpu is no longer used.

* node_cpu -> node_cpu_seconds_total for Linux

* Improve filesystem metric names with units

* Improve units and names of linux disk stats

Remove sector metrics, the bytes metrics cover those already.

* Infiniband counters should end in _total

* Improve timex metric names, convert to more normal units.

See
3c073991eb/kernel/time/ntp.c (L909)
for what stabil means, looks like a moving average of some form.

* Update test fixture

* For meminfo metrics that had "kB" units, add _bytes

* Interrupts counter should have _total
2018-01-17 17:55:55 +01:00
Ben Kochie
b4d7ba119a
Add fixture for ppc64le (#785)
* Add support for per-architecture fixtures.
* Add output for ppc64le.
2018-01-11 13:56:19 +01:00
Nick Owens
0629a081db multiply page size after float64 coercion to avoid signed integer overflow (#780) 2018-01-08 15:36:49 +01:00
Franz Pletz
d432f9857e Use uint64 in the ZFS collector (#714)
ZFS metrics can also be unsigned 64-bit integers that won't fit in
int64 and causes the whole collector to fail.
2018-01-06 12:36:55 +01:00
Derek Marcotte
477fe4665a Move FreeBSD/DragonflyBSD out of meminfo add kvm. (#547)
* Move FreeBSD/DragonflyBSD out of meminfo add kvm.

This gives us SwapUsed, and everything under one roof.

* Fix typos per review.

* Update to use newer API.

* Remove premature optimization per PR feedback.
2018-01-04 12:23:26 +01:00
Sevag Hanssian
4329b0a86b Add summary metrics for systemd exporter (#765) 2018-01-04 11:49:36 +01:00
Matthieu Guegan
d6ef10bb56 Add openbsd meminfo (#724)
* Implements meminfo collector for OpenBSD

This is a rework of #151.

* Fix CGO import

* Add some useful metrics

* Rename total -> size for normalization
2018-01-04 10:32:08 +01:00
Ben Kochie
7f6c59e198
Ignore more virtual filesystems (#775)
Add additional Linux virtual filesystem types to the default list.
2018-01-03 17:22:02 +01:00
Netmonk
2aa8d0eb0c [FIX] Exclude Linux proc from filesystem type regexp (#774)
* [FIX] Issue 63, error on excluding proc filesystem on linux, improving regexp

* [FIX] Reordering filter order
2018-01-03 11:40:32 +01:00
Julius Volz
f536857ac6
Fix e2e tests after textfile custom timestamp removal (#768) 2017-12-24 11:54:33 +01:00
Shubheksha Jalan
1f2458f42c Filter out testfile metrics correctly when using collect[] filters (#763)
* remove injection hook for textfile metrics, convert them to prometheus format

* add support for summaries

* add support for histograms

* add logic for handling inconsistent labels within a metric family for counter, gauge, untyped

* change logic for parsing the metrics textfile

* fix logic to adding missing labels

* Export time and error metrics for textfiles

* Add tests for new textfile collector, fix found bugs

* refactor Update() to split into smaller functions

* remove parseTextFiles(), fix import issue

* add mtime metric directly to channel, fix handling of mtime during testing

* rename variables related to labels

* refactor: add default case, remove if guard for metrics, remove extra loop and slice

* refactor: remove extra loop iterating over metric families

* test: add test case for different metric type, fix found bug

* test: add test for metrics with inconsistent labels

* test: add test for histogram

* test: add test for histogram with extra dimension

* test: add test for summary

* test: add test for summary with extra dimension

* remove unnecessary creation of protobuf

* nit: remove extra blank line
2017-12-23 20:21:58 +01:00
Ben Kochie
cd2a17176a
Add full make to CircleCI (#761)
* Add full make to CircleCI

Ensure end-to-end test is run.

* Fix go fmt error.

* Fix end-to-end output.
2017-12-21 16:24:23 +01:00
Wei Li
1e9bb4ec3a textfile: fix duplicate metrics error (#738)
The textfile gatherer should only be added to gatherer list once.

Signed-off-by: Li Wei <liwei@anbutu.com>
2017-12-06 17:05:40 +01:00
Kristian Klausen
a96f1738b3 netdev: Change valueType to CounterValue (#749)
All the metric only goes up, so the type should be counter.
This also add _total to all the metric name.

Fix: #747
2017-12-06 13:58:35 +01:00
Ben Kochie
2a80537547
Split out guest cpu metrics on Linux. (#744)
Linux "guest" metrics for VMs are already accounted for in node_cpu
`user` and `nice` metrics.  Separate these into their own metric to
avoid duplication of data.
2017-11-23 15:04:47 +01:00
Karsten Weiss
a8d7d1101a cpu: Support processor-less (memory-only) NUMA nodes (#734)
* cpu: Support processor-less (memory-only) NUMA nodes

Processor-less (memory-only) NUMA nodes exist e.g. in systems that use
Intel Optane drives for RAM expansion using Intel Memory Drive
Technology (IMDT).

IMDT RAM expansion supports two modes:

* "Unify Remote Memory domains": present a processor-less (memory-only)
  NUMA domain, which is the default
* "Expand local memory domains": to expand each processor’s memory domain
  with a portion of the memory made available by Optane and IMDT

This commit fixes a crash in the first case (when "cpulist" is empty).

Here's an example of such a system:

$ numastat -m|head -n5

Per-node system memory usage (in MBs):
                          Node 0          Node 1          Node 2           Total
                 --------------- --------------- --------------- ---------------
MemTotal               118239.56       130816.00       464384.00       713439.56

$ for i in {0..2}; do echo -n "$i: " ; cat /sys/bus/node/devices/node$i/cpulist ; done
0: 0-7,16-23
1: 8-15,24-31
2:

$ /opt/vsmp/bin/vsmpversion -vvv
Memory Drive Technology: 8.2.1455.74 (Sep 28 2017 13:09:59)
System configuration:
    Boards:      3
       1 x Proc. + I/O + Memory
       2 x NVM devices (Intel SSDPED1K375GAQ)
    Processors:  2, Cores: 16, Threads: 32
        Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz Stepping 01
    Memory (MB): 713472 (of 977450), Cache: 251416, Private: 12562
       1 x 249088MB   [262036/   678/12270]
       1 x 232192MB   [357707/125369/  146]  82:00.0#1
       1 x 232192MB   [357707/125369/  146]  83:00.0#1

* cpu: rename some variables (pkg => node)

* cpu: Use %v not %q in log.Debugf() format strings
2017-11-10 15:31:26 +01:00
Matt Layher
f6f9c8d6cc Add and use sysReadFile in hwmon collector (#728) 2017-11-07 07:49:37 +01:00
Tobias Klauser
d73f1e60c4 Simplify Utsname string conversion (#716)
* Update golang.org/x/sys/unix

This allows to use simplified string conversion of Utsname members.

* Simplify Utsname string conversion

Use Utsname from golang.org/x/sys/unix which contains byte array
instead of int8/uint8 array members. This allows to simplify the string
conversions of these members.
2017-11-02 11:57:14 +01:00
Ben Kochie
ea250d73f4
Fix off by one in Linux interrupts collector (#721)
* Fix off by one in Linux interrupts collector

* Fix off by one in CPU column handler.
* Add test.

* Enable interrupts in end-to-end test.
2017-11-02 09:59:46 +01:00
Matt Layher
296b62acb7
netstat: return nothing when /proc/net/snmp6 not found 2017-10-31 15:26:32 -04:00
Derek Marcotte
0eecaa9547 Correct buffer_bytes > INT_MAX on BSD/amd64. (#712)
* Correct buffer_bytes > INT_MAX on BSD/amd64.

The sysctl vfs.bufspace returns either an int or a long, depending on
the value.  Large values of vfs.bufspace will result in error messages
like:

  couldn't get meminfo: cannot allocate memory

This will detect the returned data type, and cast appropriately.

* Added explicit length checks per feedback.

* Flatten Value() to make it easier to read.

* Simplify per feedback.

* Fix style.

* Doc updates.
2017-10-25 20:55:22 +02:00
Matt Layher
f9ad88fc03
xfs: expose correct fields, fix metric names 2017-10-20 18:41:51 -04:00
Siavash Safi
f3a7022602 Add collect[] parameter (#699)
* Add `collect[]` parameter

* Add TODo comment about staticcheck ignored

* Restore promhttp.HandlerOpts

* Log a warning and return HTTP error instead of failing

* Check collector existence and status, cleanups

* Fix warnings and error messages

* Don't panic, return error if collector registration failed

* Update README
2017-10-14 14:23:42 +02:00
Ben Kochie
deadfef4c9 Update vendoring (#685)
* Update vendor github.com/coreos/go-systemd/dbus@v15

* Update vendor github.com/ema/qdisc

* Update vendor github.com/godbus/dbus

* Update vendor github.com/golang/protobuf/proto

* Update vendor github.com/lufia/iostat

* Update vendor github.com/matttproud/golang_protobuf_extensions/pbutil@v1.0.0

* Update vendor github.com/prometheus/client_golang/...

* Update vendor github.com/prometheus/common/...

* Update vendor github.com/prometheus/procfs/...

* Update vendor github.com/sirupsen/logrus@v1.0.3

Adds vendor golang.org/x/crypto

* Update vendor golang.org/x/net/...

* Update vendor golang.org/x/sys/...

* Update end to end output.
2017-10-05 16:20:47 +02:00
Brett Vickers
b62c7bc0ad Updated vendored ntp package (#681)
The github.com/beevik/ntp package was recently updated with some
API changes that broke node_exporter. This commit fetches the
latest version of the ntp package and brings node_exporter in
line with the latest API.
2017-10-04 08:33:49 +02:00
Calle Pettersson
859a825bb8 Replace --collectors.enabled with per-collector flags (#640)
* Move NodeCollector into package collector

* Refactor collector enabling

* Update README with new collector enabled flags

* Fix out-of-date inline flag reference syntax

* Use new flags in end-to-end tests

* Add flag to disable all default collectors

* Track if a flag has been set explicitly

* Add --collectors.disable-defaults to README

* Revert disable-defaults flag

* Shorten flags

* Fixup timex collector registration

* Fix end-to-end tests

* Change procfs and sysfs path flags

* Fix review comments
2017-09-28 15:06:26 +02:00
Sami Kerola
3762191e66 Add timex collector (#664)
This collector is based on adjtimex(2) system call.  The collector returns
three values, status if time is synchronised, offset to remote reference,
and local clock frequency adjustment.

Values are taken from kernel time keeping data structures to avoid getting
involved how the synchronisation is implemented.  By that I mean one should
not care if time is update using ntpd, systemd.timesyncd, ptpd, and so on.
Since all time sync implementation will always end up telling to kernel what
is the status with time one can simply omit the software in between, and
look results of the syncing.  As a positive side effect this makes collector
very quick and conceptually specific, this does not monitor availability of
NTP server, or network in between, or dns resolution, and other unrelated
but necessary things.

Minimum set of values to keep eye on are the following three:

    The node_timex_sync_status tells if local clock is in sync with a remote
    clock.  Value is set to zero when synchronisation to a reliable server
    is lost, or a time sync software is misconfigured.

    The node_timex_offset_seconds tells how much local clock is off when
    compared to reference.  In case of multiple time references this value
    is outcome of RFC 5905 adjustment algorithm.  Ideally offset should be
    close to zero, and it depends about use case how large value is
    acceptable.  For example a typical web server is probably fine if offset
    is about 0.1 or less, but that would not be good enough for mobile phone
    base station operator.

    The node_timex_freq tells amount of adjustment to local clock tick
    frequency.  For example if offset is one second and growing the local
    clock will need instruction to tick quicker.  Number value itself is not
    very important, and occasional small adjustments are fine.  When
    frequency is unusually in stable one can assume quality of time stamps
    will not be accurate to very far in sub second range.  Obviously
    explaining why local clock frequency behaves like a passenger in roller
    coaster is different matter.  Explanations can vary from system load, to
    environmental issues such as a machine being physically too hot.

Rest of the measurements can help when debugging.  If you run a clock server
do probably want to collect and keep track of everything.

Pull-request: https://github.com/prometheus/node_exporter/pull/664
2017-09-19 07:54:06 -07:00
Leonid Evdokimov
c169b4b1c5 Add metrics from SNTPv4 packet to ntp collector & add ntpd sanity check (#655)
* Add metrics from SNTPv4 packet to ntp collector & add ntpd sanity check

1. Checking local clock against remote NTP daemon is bad idea, local
ntpd acting as a  client should do it better and avoid excessive load on
remote NTP server so the collector is refactored to query local NTP
server.

2. Checking local clock against remote one does not check local ntpd
itself. Local ntpd may be down or out of sync due to network issues, but
clock will be OK.

3. Checking NTP server using sanity of it's response is tricky and
depends on ntpd implementation, that's why common `node_ntp_sanity`
variable is exported.

* `govendor add golang.org/x/net/ipv4`, it is dependency of github.com/beevik/ntp

* Update github.com/beevik/ntp to include boring SNTP fix

* Use variable name from RFC5905

* ntp: move code to make export of raw metrics more explicit

* Move NTP math to `github.com/beevik/ntp`

* Make `golint` happy

* Add some brief docs explaining `ntp` #655 and `timex` #664 modules

* ntp: drop XXX comment that got its decision

* ntp: add `_seconds` suffix to relevant metrics

* Better `node_ntp_leap` comment

* s/node_ntp_reftime/node_ntp_reference_timestamp_seconds/ as requested by @discordianfish

* Extract subsystem name to const as suggested by @SuperQ
2017-09-19 10:36:14 +02:00
Karsten Weiss
b0d5c00832 cpu: Metric 'package_throttles_total' is per package. (#657)
* cpu: Metric 'package_throttles_total' is per package.

'package_throttles_total' is per package, not per cpu. This also reduces
the total number of cpu time series a lot (esp for multi core cpus).

* cpu: Better handling of a cpulist edge-case.

* cpu: Extract the package number from the directory name.

Do not rely on the range index.

* cpu: Add package_throttle_count for node0 cpu1

This file must be ignored by the cpu collector.
2017-09-07 23:24:18 +02:00
Matthias Rampke
e1f129c729 Use int64 throughout the ZFS collector.
This avoids issues with integer overflows on 32-bit architectures. The
Prometheus data format is float64, so regardless of the architecture we
should handle large numbers.

Fixes #629.
2017-08-21 16:40:16 +00:00
Ben Kochie
8839640cd1 Ignore wifi collector permission errors (#646)
Ignore the permission denined error when the wifi collector has no
permission to read metrics.
2017-08-18 10:19:48 +02:00
Calle Pettersson
dfe07eaae8 Switch to kingpin flags (#639)
* Switch to kingpin flags

* Fix logrus vendoring

* Fix flags in main tests

* Fix vendoring versions
2017-08-12 15:07:24 +02:00
Ben Kochie
46c31d8a7e Enable IPVS collector by default (#623)
* Silence error output when no IPVS present.
* Enable by default.
* Update end-to-end fixture.
* Update README.
2017-07-26 15:20:28 +02:00
Tobias Schmidt
515b5a933d Fix build tags of loadavg collector
The collector is only implemented for a subset of all operating systems
supported by go. Compilation will fail if attempted for another OS
target.
2017-07-20 15:13:58 -04:00
Tobias Schmidt
016d79535d Fix build tags of meminfo collector
The meminfo collector only supports darwin, dragonfly, freebsd and linux
and must not be included in other archtictures.
2017-07-20 14:37:10 -04:00
Andrea De Pasquale
1369763067 Change raid0 status line regexp for mdadm collector (#619) 2017-07-20 17:04:33 +02:00
Tobias Schmidt
921319c7eb Merge pull request #583 from knweiss/golint
Golint fixes
2017-07-10 23:49:36 +02:00
Aleksey Zhukov
7a914e58f2 Add parsing /proc/net/snmp6 file for netstat-linux (#615)
* Add parsing /proc/net/snmp6 file

* add /proc/net/snmp6 fixture

* fix e2e test

* gofmt

* remove unuser variable

* safe checks

* add tests

* change help format
2017-07-08 20:16:35 +02:00
Matt Layher
6e82fd1c56 Add XFS block mapping and block map B-tree stats (#575) 2017-07-07 07:27:52 +02:00
ideaship
8d90276283 Add bcache collector (#597)
* Add bcache collector for Linux

This collector gathers metrics related to the Linux block cache
(bcache) from sysfs.

* Removed commented out code

* Use project comment style

* Add _sectors to metric name to indicate unit

* Really use project comment style

* Rename bcache.go to bcache_linux.go

* Keep collector namespace clean

Rename:
- metric -> bcacheMetric
- periodStatsToMetrics -> bcachePeriodStatsToMetric

* Shorten slice initialization

* Change label names to backing_device, cache_device

* Remove five minute metrics (keep only total)

* Include units in additional metric names

* Enable bcache collector by default

* Provide metrics in seconds, not nanoseconds

* remove metrics with label "all"

* Add fixtures, update end-to-end for bcache collector

* Move fixtures/sys into tar.gz

This changeset moves the collector/fixtures/sys directory into
collector/fixtures/sys.tar.gz and tweaks the Makefile to unpack the
tarball before tests are run.

The reason for this change is that Windows does not allow colons in a
path (colons are present in some of the bcache fixture files), nor can
it (out of the box) deal with pathnames longer than 260 characters
(which we would be increasingly likely to hit if we tried to replace
colons with longer codes that are guaranteed not the turn up in regular
file names).

* Add ttar: plain text archive, replacement for tar

This changeset adds ttar, a plain text replacement for tar, and uses it
for the sysfs fixture archive. The syntax is loosely based on tar(1).

Using a plain text archive makes it possible to review changes without
downloading and extracting the archive. Also, when working on the repo,
git diff and git log become useful again, allowing a committer to verify
and track changes over time.

The code is written in bash, because bash is available out of the box on
all major flavors of Linux and on macOS. The feature set used is
restricted to bash version 3.2 because that is what Apple is still
shipping.

The programm also works on Windows if bash is installed. Obviously, it
does not solve the Windows limitations (path length limited to 260
characters, no symbolic links) that prompted the move to an archive
format in the first place.
2017-07-07 07:20:18 +02:00
kadota kyohei
a077024f51 add diskstats on Darwin (#593)
* Add diskstats collector for Darwin

* Update year in the header

* Update README.md

* Add github.com/lufia/iostat to vendored packages

* Change stats to follow naming guidelines

* Add a entry of github.com/lufia/iostat into vendor.json

* Remove /proc/diskstats from description
2017-07-06 13:51:24 +02:00
Rene Treffer
56bf8d4b2d Add link to kernel documentation for sysfs/cpufreq files 2017-06-27 11:25:06 +02:00
Rene Treffer
bcc3cd92b8 Fix cpufreq statistics by converting kHz to Hz 2017-06-27 11:05:55 +02:00
Ben Kochie
182810056f Fix Linux cpu errors (#606)
Make the Linux cpu collector soft-error on missing `cpufreq` and
`thermal_throttle` features.
2017-06-20 07:51:26 +02:00
Rene Treffer
2e9f1913b8 Move stat_linux to cpu_linux and add cpufreq stats (#548) 2017-06-13 11:21:53 +02:00
Emanuele Rocca
047003b6bb Add qdisc collector for Linux (#580)
* Add qdisc collector for Linux

This collector gathers basic queueing discipline metrics via netlink,
similarly to what `tc -s qdisc show` does.

* qdisc collector: nl-specific code moved, names fixed

- netlink-specific parts moved to github.com/ema/qdisc
- avoid using shortened names
- counters renamed into XXX_total

* Get rid of parseMessage error checking leftover

* Add github.com/ema/qdisc to vendored packages

* Update help texts and comments

* Add qdisc collector to README file

* qdisc collector end-to-end testing

* Update qdisc dependency to latest version

Update github.com/ema/qdisc dependency to revision 2c7e72d, which
includes unit testing.

* qdisc collector: rename "iface" label into "device"
2017-05-23 11:55:50 +02:00
Karsten Weiss
b2f4fd5776 Remove unused devstatCollector struct member 'bytes_total'.
This also fixes this golint issue:

devstat_freebsd.go:40:2: don't use underscores in Go names; struct field bytes_total should be bytesTotal
2017-05-14 19:51:53 +02:00
Jonas Große Sundrup
e6d031788f Correct typo (#582) 2017-05-14 19:46:23 +02:00
Karsten Weiss
bca09abf1c golint: Fix NewStatCollector() doc string. 2017-05-14 13:51:47 +02:00
Karsten Weiss
b3e7420a27 cpu_darwin.go: s/cpu_ticks/cpuTicks/g 2017-05-14 13:51:42 +02:00
Karsten Weiss
b05c7d8dab cpu_darwin.go: Fix doc strings. 2017-05-14 13:51:34 +02:00
Karsten Weiss
fff03c6c0c Fix NewTCPStatCollector doc string. 2017-05-14 13:23:57 +02:00
Karsten Weiss
6720cfdbfe golint: Fix comment on exported function NewDevstatCollector. 2017-05-14 13:21:39 +02:00
Karsten Weiss
b73af72853 Explicitly check for the rc 3 in call to getloadavg(). Reorder logic. 2017-05-14 13:07:54 +02:00
Karsten Weiss
af358ec800 golint fixes: if block ends with a return statement, so drop this else and outdent its block. 2017-05-14 12:55:44 +02:00
Karsten Weiss
732f839810 sysctl_bsd.go: golint fixes. Typo fix. 2017-05-14 12:51:57 +02:00
Robert Clark
58f50b31f2 Multiply port data XMIT/RCV metrics by 4 (#579)
According to Mellanox, it is standard practice that the port_xmit_data and port_rcv_data
files are split into 4 lanes. To get the actual transmit and receive values for each
port, the metric needs to be multiplied by 4.

Signed-Off-By: Robert Clark <robert.d.clark@hpe.com>
2017-05-12 07:28:53 +02:00
Ben Kochie
8f3cddf734 Merge pull request #568 from mdlayher/xfs-init
Initial XFS collector
2017-04-25 09:54:28 +02:00
Kai S
59f9b8c5c1 Handle nonexisting bonding_masters file (#569)
* silently ignore nonexisting bonding_masters file

Add an empty fixtures dir without a bonding_masters file to test.

* Moved the check to the Update() method

Dropped the empty test dir.
2017-04-24 23:19:17 +04:00
Matt Layher
1feb091b36
Initial XFS collector 2017-04-22 11:53:07 -04:00
Ben Kochie
e9aad0157c Merge pull request #550 from derekmarcotte/dm-boottime
Add exec_boot_time for freebsd, dragonfly
2017-04-22 09:18:05 +02:00
Derek Marcotte
5b557bf973 Fix metric name per review. 2017-04-21 16:25:31 -04:00
Derek Marcotte
db8ec9c6b4 Add exec_boot_time for freebsd, dragonfly
Adds new sysctl type, bsdSysctlTypeStructTimeval to enable parsing of
timevals from raw memory.
2017-04-21 10:23:19 -04:00
Daniele Sluijters
bb9d4ade0b uname_linux: Build for 32bit MIPS too
Since Go 1.8 32bit MIPS Big/Little Endian are supported assuming the
target runs Linux and the kernel either emulates an FPU or can access
the CPU one.

This allows the node_collector to build for mips and mipsle opening up
the possibility of running it on things like home routers
(DD-|Open|ASUS-)Wrt firmware usually has the necessary bits in place.
2017-04-20 13:30:40 +02:00
Brian Brazil
f291d2d6dd Get full resolution for node_time (#555) 2017-04-19 18:31:21 +01:00
Karsten Weiss
d9703ff7c6 edac: Fix typo in csrow label of node_edac_csrow_uncorrectable_errors_total metric. 2017-04-18 12:45:06 +02:00
Tobias Schmidt
266f0958d2 Merge pull request #561 from derekmarcotte/dm-fix-dfly-build
Fixes broken build on Dragonfly.
2017-04-17 17:31:12 +02:00
Derek Marcotte
83cecfa696 Fixes broken build on Dragonfly.
Undefined err:

84eaa8fecd/collector/devstat_dragonfly.go (L145)
2017-04-17 10:50:49 -04:00
Karsten Weiss
45ca8db352 Support the 'guest_nice' cpu mode of /proc/stat.
'guest_nice' is available since Linux 2.6.33.
2017-04-14 12:50:37 +02:00
Sam Kottler
6eafa51fa8 Add ARP collector for Linux (#540)
* Implement commonalities and linux support for ARP collection

* Add ARP collector to fixtures and run as part of e2e tests

* Bubble up scanner errors

* Use single return values where it makes sense

* Add missing annotation

* Move arp_common into arp_linux

* Add license header to arp_linux.go

* Address initial feedback

* Use strings.Fields instead of strings.Split

* Deal with scanner.Err() rather than throwing away errors

* Check for scan errors in-line before interacting with the entries map

* Don't interact with potentially empty text from scan

* Check for scan errors outside the scan loop

* Add comment about moving procfs parsing

* Add more direct comment

* Update initialism style to match go style guide

* Put function args on the same line

* Add TODO in front of comment about procfs extraction

* Guard against strings.Fields returning an empty slice

* Be more defensive about ARP table format and use upcase more broadly

* Enable the ARP collector by default

* Add ARP collector to the README

* Remove 'entry'
2017-04-11 17:45:19 +02:00
Tobias Schmidt
8aec44617a Remove Windows support
Use https://github.com/martinlindhe/wmi_exporter instead.
2017-04-10 23:27:23 -03:00
Tobias Schmidt
41a44a4d24 Merge pull request #532 from prometheus/grobie/remove-extra-file-check
mdadm: Remove extra file existence check
2017-03-31 05:35:12 +02:00
Ben Kochie
5f43211f67 Blacklist systemd scope units
Blacklist `scope` units from systemd collector by default.

These units are created with unique IDs programatically[0].  This leads to
huge cardinality problems.

[0]: https://www.freedesktop.org/software/systemd/man/systemd.scope.html
2017-03-23 14:02:46 +01:00
Tobias Schmidt
d290ea94b8 Fix export of stale device error metrics for unmounted filesystems
Instead of maintaining a counter metric for device errors in memory,
this change exports a gauge and uses const metrics to avoid leaking
metrics for unmounted filesystems.
2017-03-22 21:48:18 -03:00
Tobias Schmidt
7b93b52010 Fix lint issues on filesystem BSD implementation 2017-03-22 21:48:12 -03:00
Tobias Schmidt
445ed44082 mdadm: Remove extra file existence check 2017-03-22 10:11:19 -03:00
Johannes 'fish' Ziemke
9676f5f2dc Merge pull request #523 from roclark/support-legacy-infiniband
Add support for legacy InfiniBand drivers
2017-03-21 10:52:07 +01:00
Johannes 'fish' Ziemke
620e9937e6 Merge pull request #524 from mdlayher/wifi-expand
Expand wifi collector for more interface types
2017-03-21 10:32:44 +01:00
Juergen Hoetzel
aef2601cf6 Add missing dependency for static FreeBSD build 2017-03-20 16:59:45 +00:00
Matt Layher
2bfe410fb7
Expand wifi collector for more interface types 2017-03-20 12:25:01 -04:00
Robert Clark
3a5917dfdc Add support for legacy InfiniBand drivers
Older versions of the OFED drivers contain 64-bit variants of the port counters and are located in a directory named 'counters_ext'. This patch includes these older metrics that have since been deprecated with OFED 4.0.

Signed-Off-By: Robert Clark <robert.d.clark@hpe.com>
2017-03-20 10:37:21 -05:00
Tobias Schmidt
0400e437be Fix and simplify parsing of raid metrics
Fixes the wrong reporting of active+total disk metrics for inactive
raids. Also simplifies the code and removes a couple of redundant
comments.
2017-03-19 08:03:58 -03:00
Matt Layher
42c8a20545
Unexport wifiCollector metrics 2017-03-16 17:11:09 -04:00
Matt Layher
69368b7f9c Add synthetic node_wifi_station_info metric for BSS information 2017-03-16 16:24:23 -04:00
Brian Brazil
a02e469b07 Report collector success/failure and duration per scrape. (#516)
This is in line with best practices, and also saves us
63 timeseries on a default Linux setup.
2017-03-16 17:21:00 +00:00
Robert Clark
413e5af502 Skip metric files that don't exist
In case a metric file within the InfiniBand collector doesn't exist, skip the metric in order to allow collection of the remaining valid InfiniBand metrics.

Signed-Off-By: Robert Clark <robert.d.clark@hpe.com>
2017-03-09 11:05:36 -06:00
Derek Marcotte
72d8576185 Refactor meminfo_bsd.go to use sysctl_bsd.go (#501)
* Refactor meminfo_bsd.go to use sysctl_bsd.go

* Fixed spelling.
2017-03-07 21:54:28 -04:00
Ben Kochie
5d22d41ed7 Merge pull request #484 from prometheus/grobie/update-vendored-packages
Update vendored packages
2017-03-01 08:05:45 +01:00
Derek Marcotte
bdc2131332 Added node_memory_buffer, node_memory_swaptotal to meminfo_bsd (#451) 2017-03-01 01:36:02 -04:00
Tobias Schmidt
ce117d7a40 Update vendored packages 2017-02-28 18:20:24 -04:00
Tobias Schmidt
84eaa8fecd Remove more unnecessarily named return values 2017-02-28 17:33:46 -04:00
Derek Marcotte
5c28ab044d Add BSD exec statistics collector (#457)
* First pass of a sysctl_bsd source, exec_bsd + exec metrics

* Incorportate PR feedback, including removing pre-build descriptions, unit conversion callback.

* Remove redundant cached_description field, per PR feedback

* Incorporate PR feedback
2017-02-28 17:23:10 -04:00
Tobias Schmidt
1bd94074dd Delete unused code 2017-02-28 17:20:16 -04:00
Tobias Schmidt
922e74d58f Remove unnecessarily named return variables
Named return variables should only be used to describe the returned type
further, e.g. `err error` doesn't add any new information and is just
stutter.
2017-02-28 16:04:25 -04:00
Tobias Schmidt
084e585c2a Fix scanner usage without error handling 2017-02-28 16:04:25 -04:00
Tobias Schmidt
d1dfda86ee Fix wrong end-to-end expectation 2017-02-28 16:02:43 -04:00
Tobias Schmidt
abdebef47c Fix gofmt -s and spelling issues 2017-02-28 14:01:28 -04:00
Tobias Schmidt
195b4d596c Merge pull request #480 from prometheus/grobie/gosimple
Simplify go code
2017-02-28 13:59:01 -04:00
Tobias Schmidt
694294baf5 Remove unnecessary conversions 2017-02-28 13:57:49 -04:00
Tobias Schmidt
21e13c7f52 Simplify code 2017-02-28 13:54:27 -04:00
Tobias Schmidt
c703435790 Fix all open go lint and vet issues 2017-02-28 13:05:38 -04:00
Ben Kochie
38cd07ebb9 Merge pull request #450 from roclark/add-infiniband
infiniband: Add new collector for InfiniBand statistics
2017-02-16 14:33:19 +01:00
Ben Kochie
a097dd36b3 Merge pull request #459 from joehandzik/wip-zpool-io-cherrypick
ZFS Collector: Add zpool IO statistics
2017-02-16 08:16:55 +01:00
Thorhallur Sverrisson
19813d3e02 Changing datastructure for BuddyInfo 2017-02-15 10:15:44 -06:00
Thorhallur Sverrisson
5ab285e098 Adding buddyinfo to end to end test. 2017-02-15 10:15:44 -06:00
Thorhallur Sverrisson
55417d7688 Moving buddyinfo logic to procfs 2017-02-15 10:15:44 -06:00
Thorhallur Sverrisson
492c96f6b6 Moving buddyinfo_test.go to procfs library 2017-02-15 10:15:43 -06:00
Thorhallur Sverrisson
3ba15c1ddb Adding support for /proc/buddyinfo for linux free memory fragmentation.
/prod/buddyinfo returns data on the free blocks fragments available
for use from the kernel.  This data is useful when diagnosing
possible memory fragmentation.

More info can be found in:
* https://lwn.net/Articles/7868/
* https://andorian.blogspot.com/2014/03/making-sense-of-procbuddyinfo.html
2017-02-15 10:15:43 -06:00
Joe Handzik
bb8b3fca88 ZFS Collector: Add zpool IO statistics
Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>
2017-02-10 13:31:25 -06:00
Robert Clark
36f81282b7 Add unit tests for InfiniBand collector
Signed-Off-By: Robert Clark <robert.d.clark@hpe.com>
2017-02-07 11:09:08 -06:00
Robert Clark
4866adcb71 Add new collector for InfiniBand statistics
Add new metrics for the InfiniBand network protocol including the amount of packets sent and received, the number of times the link has been downed and how many times the link has recovered from an error state.

Signed-Off-By: Robert Clark <robert.d.clark@hpe.com>
2017-02-07 11:09:08 -06:00
Joe Handzik
8c23f5ff54 ZFS Collector: Convert dashes to underscores for metrics
This fixes #442, and prevents other ZFS metrics from slipping through in the future.

Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>
2017-01-31 14:11:56 -06:00
Ben Kochie
7cfa5e75b8 Merge pull request #439 from mdlayher/collector-staticcheck
Fix two staticcheck issues in IPVS collector tests
2017-01-31 08:53:10 -05:00
Ben Kochie
71362d45eb Merge pull request #432 from joehandzik/wip-zfs-zfetchstats
Update ZFS Collector with most non-zpool metrics
2017-01-31 08:52:41 -05:00
Joe Handzik
e5ee274a32 ZFS Collector: Move from camelcase to underscores for metric prefixes
Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>
2017-01-29 15:59:01 -06:00
Matt Layher
c8e546926a
Fix two staticcheck issues in IPVS collector tests 2017-01-27 15:20:36 -05:00
Joe Handzik
e213ccbc57 ZFS Collector: Refactor to use maps/slices and fewer globals
Removed all global types that were unnecessary, and refactored to use constructor-created values and inline values instead of globals.

Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>
2017-01-27 14:02:28 -06:00
Ben Kochie
5a6db5c8d2 Handle multiple NFS device mounts
It's possible to mount an NFS share in multiple locations.
* Duplicates contain the same metric values, so they can be ignored.
* Update fixture.
2017-01-24 13:44:08 +01:00
Joe Handzik
94fb93a9f3 ZFS Collector: Add dmu_tx functionality
Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>
2017-01-23 16:41:15 -06:00
Joe Handzik
07c7ae733a ZFS Collector: Add fm functionality
Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>
2017-01-23 16:31:22 -06:00
Joe Handzik
05048c067d ZFS Collector: Add xuio_stats functionality
Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>
2017-01-23 16:30:37 -06:00
Joe Handzik
3c9e779989 ZFS Collector: Add vdev_cache_stats functionality
Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>
2017-01-23 16:29:50 -06:00
Joe Handzik
a02ca9502c ZFS Collector: Add zil functionality
Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>
2017-01-23 16:29:00 -06:00
Joe Handzik
a3125ab4d9 ZFS Collector: Add zfetchstats functionality
Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>
2017-01-23 16:28:11 -06:00
Ben Kochie
acb495ccab Merge pull request #425 from mdlayher/wifi-update
Update vendored wifi, handle stations with missing info
2017-01-20 08:43:44 -05:00
Matt Layher
dfd661a633
Allow graceful failure in hwmon collector 2017-01-17 11:24:28 -05:00
Matt Layher
ca3f07feef
Update vendored wifi, handle stations with missing info 2017-01-17 00:54:18 -05:00
Ben Kochie
92537020a3 Fix runit collector flag typo. 2017-01-16 23:41:33 +01:00
Julius Volz
276112c7ef Merge pull request #418 from mdlayher/wifi-graceful-fail
Make wifi collector fail gracefully if metrics not available
2017-01-13 20:31:21 -05:00
Matt Layher
d3089f2ce8
Make wifi collector fail gracefully if metrics not available 2017-01-13 13:35:20 -05:00
Matt Layher
1e1775e761
Make ZFS collector fail gracefully when not available 2017-01-12 12:54:16 -05:00
Johannes 'fish' Ziemke
2884181cce Merge pull request #415 from mdlayher/mountstats-nfs-additional
Add NFS event metrics to mountstats collector
2017-01-12 14:08:21 +01:00
Matt Layher
e3f99e13b9
Add NFS event metrics to mountstats collector 2017-01-11 11:41:13 -05:00
Matt Layher
efa25665ec
Add initial wifi collector, bump netlink to fix 32-bit builds 2017-01-11 10:08:44 -05:00
Johannes 'fish' Ziemke
55170e8feb Merge pull request #411 from discordianfish/hwmon-move-label-metrics
Use filename as label, move 'label' to own metric
2017-01-10 12:21:18 +01:00
Ben Kochie
38a4a36061 Update end-to-end test. 2017-01-10 10:23:16 +01:00
Ben Kochie
b4fa10ca9d Add collector for Linux EDAC
Collect "Error detection and correction" metrics from memory
controllers.
* Supported on Linux only.
* Add basic fixtures.
* Enabled by default.
2017-01-10 10:14:19 +01:00
Johannes 'fish' Ziemke
6aef20f8d8 Use filename as label, move 'label' to own metric
This closes #406
2017-01-09 18:33:31 +01:00
Joe Handzik
e7442d6517 end-to-end-test.sh: Add zfs plugin
Enables fixture test and updates e2e-output.txt.

Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>
2017-01-08 11:13:35 -06:00
Corey Stewart
10ba27bf2c Remove FreeBSD support for zfs plugin.
This also involves removing zfs_zpool code for now.

Signed-Off-By: Corey Stewart <stewa169@purdue.edu>
Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>
2017-01-08 11:13:35 -06:00
Corey Stewart
a8c94d48e6 Style changes and cleanup
This patch makes stylistic changes to error strings, unexports method names by lower casing them, removes unused dataSetMetric, and adds copyright/licence information.

Signed-Off-By: Corey Stewart <stewa169@purdue.edu>
2017-01-08 10:23:58 -06:00
Christian Schwarz
f29f3873ea Add a collector for ZFS, currently focussed on ARC stats.
It is tested on FreeBSD 10.2-RELEASE and Linux (ZFS on Linux 0.6.5.4).

On FreeBSD, Solaris, etc. ZFS metrics are exposed through sysctls.
ZFS on Linux exposes the same metrics through procfs `/proc/spl/...`.

In addition to sysctl metrics, 'computed metrics' are exposed by
the collector, which are based on several sysctl values.
There is some conditional logic involved in computing these metrics
which cannot be easily mapped to PromQL.

Not all 92 ARC sysctls are exposed right now but this can be changed
with one additional LOC each.
2017-01-08 10:23:58 -06:00
Johannes 'fish' Ziemke
2e47fcb8c5 Only store relevant e2e output
This makes commits ligher/more readable when updating the output.
2017-01-06 12:36:26 +01:00
Johannes 'fish' Ziemke
ad2eb4a788 Use Gauge for megacli counters
Without refactoring this to use const metrics, we need to make this a
gauge to we can keep using Set() which was deprecated for counters.
2017-01-06 12:33:21 +01:00
Johannes 'fish' Ziemke
01a9a37556 Stop using deprecated SetMetricFamilyInjectionHook 2017-01-06 12:21:12 +01:00
Johannes 'fish' Ziemke
3e266e28b9 Merge pull request #397 from dominikh/freebsd-cpu
Collect CPU temperatures on FreeBSD
2017-01-05 17:32:48 +01:00
Johannes 'fish' Ziemke
fc1113cd11 Merge pull request #396 from dominikh/bsd-memleak
Don't leak or race in FreeBSD devstat collector
2017-01-05 17:31:57 +01:00