Node_Exporter/text_collector_examples
Matt Bostock 9e0aee8ae7 Add metrics exposing extended md RAID info (#958)
Add metrics that expose more information about MD RAID devices and
disks:

- the RAID level in use
- the RAID set that a disk belongs to

This allows for things like alerting on unusually high I/O
utilisation for a disk compared to other disks in the same RAID set,
which usually means the disk is failing, and for comparing
write/read latency across RAID sets.

Output looks like:

    node_md_disk_info{disk_device="/dev/dm-0", md_device="md1", md_set="A"} 1
    node_md_disk_info{disk_device="/dev/dm-3", md_device="md1", md_set="B"} 1
    node_md_disk_info{disk_device="/dev/dm-2", md_device="md1", md_set="A"} 1
    node_md_disk_info{disk_device="/dev/dm-1", md_device="md1", md_set="B"} 1
    node_md_disk_info{disk_device="/dev/dm-4", md_device="md1", md_set="A"} 1
    node_md_disk_info{disk_device="/dev/dm-5", md_device="md1", md_set="B"} 1
    node_md_info{md_device="md1", md_name="foo", raid_level="10", md_metadata_version="1.2"} 1
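As a rough illustration of how such "info" samples are shaped (a constant value of 1 with all of the information carried in labels), here is a minimal Python sketch. It is not the `md_info.sh` script from this commit; the helper names `escape_label_value` and `format_info_metric` are hypothetical, and the comma-plus-space separator simply mirrors the sample output above.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote and newline, as the text exposition format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def format_info_metric(name: str, labels: dict) -> str:
    """Render one info-style sample: the value is always 1, the labels carry the data."""
    rendered = ", ".join(
        f'{key}="{escape_label_value(str(val))}"' for key, val in labels.items()
    )
    return f"{name}{{{rendered}}} 1"


# Hypothetical usage, reusing the label values from the output above.
line = format_info_metric(
    "node_md_disk_info",
    {"disk_device": "/dev/dm-0", "md_device": "md1", "md_set": "A"},
)
```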

The `node_md_info` metric, which gives additional information about the
RAID array, is intentionally separate to avoid adding all of those
labels to each disk. If you need to query using the labels contained in
`node_md_info`, you can do that using PromQL:
https://www.robustperception.io/how-to-have-labels-for-machine-roles/
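The join itself is written in PromQL, as the linked article describes. The Python sketch below only mimics the matching logic on plain label dictionaries, to show why keeping `node_md_info` separate loses nothing: its labels can be attached to each disk by matching on `md_device`. The function name `join_info_labels` and its parameters are hypothetical.

```python
def join_info_labels(disk_samples, md_info_samples, on="md_device", copy=("raid_level",)):
    """Copy selected labels from the matching array info sample onto each disk sample."""
    # Index the per-array info samples by the shared label.
    info_by_key = {info[on]: info for info in md_info_samples}
    joined = []
    for disk in disk_samples:
        info = info_by_key.get(disk[on])
        if info is None:
            continue  # no matching array: the disk sample is dropped from the result
        merged = dict(disk)
        merged.update({label: info[label] for label in copy})
        joined.append(merged)
    return joined


# Label sets taken from the sample output above.
disks = [{"disk_device": "/dev/dm-0", "md_device": "md1", "md_set": "A"}]
arrays = [{"md_device": "md1", "md_name": "foo", "raid_level": "10",
           "md_metadata_version": "1.2"}]
result = join_info_labels(disks, arrays)
```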

I looked at adding the array UUID, but there's no sysfs entry for it and
I'm not sure there's a strong use case for it.

This patch to add a sysfs entry for the UUID was apparently not
accepted:
https://www.spinics.net/lists/raid/msg40667.html

Add these metrics as a textfile script rather than adding them to the Go
'md' module as they're perhaps less commonly useful. If lots of people
find them useful, we can later rewrite this in Go.

Signed-off-by: Matt Bostock <mbostock@cloudflare.com>
2018-08-18 08:57:51 +00:00
apt.sh Fix apt.sh syntax (#811) 2018-02-05 20:43:25 +01:00
deleted_libraries.py Add metric for outdated libraries (#957) 2018-05-25 18:20:42 +02:00
directory-size.sh Fix metric name in directory size text collector example 2018-05-19 21:11:46 +02:00
ipmitool Fix spelling of celsius in IPMI example script (#967) 2018-06-08 19:21:19 +02:00
md_info.sh Add metrics exposing extended md RAID info (#958) 2018-08-18 08:57:51 +00:00
ntpd_metrics.py Add ntpd metrics from ntpq rv 2017-02-14 16:20:53 +01:00
README.md Document use of atomic wrapper (#781) 2018-02-27 19:46:01 +01:00
smartmon.sh Add scsi smart data to prometheus exporter (#862) 2018-07-04 00:30:20 +02:00
storcli.py Update storcli.py (#783) 2018-01-09 09:10:30 +01:00

Text collector example scripts

These scripts are examples to be used with the Node Exporter Textfile Collector.

To use these scripts, we recommend using `sponge` to write the output atomically, so that the exporter never reads a partially written file.

    <collector_script> | sponge <output_file>

`sponge` is part of the moreutils package.

For more information see: https://github.com/prometheus/node_exporter#textfile-collector
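For collectors written in Python, a similar effect can be had without `sponge` by writing to a temporary file in the same directory and renaming it over the target. The sketch below is one way to do that, not something taken from the scripts in this directory; `write_atomically` is a hypothetical helper, and it assumes the temporary file and the target are on the same filesystem so that the rename is atomic.

```python
import os
import tempfile


def write_atomically(path: str, content: str) -> None:
    """Write content to path so that readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file deliberately does not end in .prom, so the textfile
    # collector (which reads *.prom files) ignores it while it is being written.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(content)
        # mkstemp creates the file with mode 0600; relax it in case the
        # exporter runs as a different user (an assumption about the setup).
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```

The rename step is what provides the atomicity; writing directly to the `.prom` file would leave a window in which the exporter could scrape half of the output.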