When there are no SMART-compatible devices (on a Raspberry Pi, for
example), an error is returned, but the return code is still 0:
`# scan_smart_devices: glob(3) aborted matching pattern /dev/discs/disc*`
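A rough sketch of the kind of filtering that avoids treating such a
diagnostic line as a device, written in Python for illustration only. It
assumes scan output in the `<device> -d <type> # comment` layout;
`valid_devices` is a hypothetical helper, not code from the script.

```python
def valid_devices(scan_output):
    """Return (device, type) pairs for scan lines that name a /dev node."""
    devices = []
    for line in scan_output.splitlines():
        fields = line.split()
        # Diagnostics such as "# scan_smart_devices: ..." do not start
        # with /dev/, so they are skipped here.
        if not fields or not fields[0].startswith("/dev/"):
            continue
        # Assumed layout: "<device> -d <type> # comment".
        has_type = len(fields) > 2 and fields[1] == "-d"
        devices.append((fields[0], fields[2] if has_type else ""))
    return devices


sample = (
    "# scan_smart_devices: glob(3) aborted matching pattern /dev/discs/disc*\n"
    "/dev/sda -d sat # /dev/sda [SAT], ATA device\n"
)
print(valid_devices(sample))
```

With only the diagnostic line as input, the function returns an empty
list, so a caller can tell "no devices" apart from a real device entry.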
* Remove unused `disks` variable.
* Filter for only valid `/dev` devices.
* Always try to return smartmon_device_info metric
Sometimes the 'model family' field is not returned by `smartctl` because
a disk is not in the disk database for the version of smartmontools
installed on the system.
In those cases, the device model and serial number are still returned (at
least as far as I have observed).
Rework the logic to prefer the 'vendor' field first and, if it is not
present, to always output a `smartmon_device_info` metric, even if some
labels have empty values.
On the box I'm testing this on, where previously no metric was returned,
it now returns:
# HELP smartmon_device_info SMART metric device_info
# TYPE smartmon_device_info gauge
smartmon_device_info{disk="/dev/sda",type="sat",model_family="",device_model="INTEL REDACTED",serial_number="REDACTED",firmware_version="REDACTED"} 1
smartmon_device_info{disk="/dev/sdb",type="sat",model_family="",device_model="INTEL REDACTED",serial_number="REDACTED",firmware_version="REDACTED"} 1
smartmon_device_info{disk="/dev/sdc",type="sat",model_family="",device_model="INTEL REDACTED",serial_number="REDACTED",firmware_version="REDACTED"} 1
smartmon_device_info{disk="/dev/sdd",type="sat",model_family="",device_model="INTEL REDACTED",serial_number="REDACTED",firmware_version="REDACTED"} 1
smartmon_device_info{disk="/dev/sde",type="sat",model_family="",device_model="INTEL REDACTED",serial_number="REDACTED",firmware_version="REDACTED"} 1
smartmon_device_info{disk="/dev/sdf",type="sat",model_family="",device_model="INTEL REDACTED",serial_number="REDACTED",firmware_version="REDACTED"} 1
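For illustration, a minimal Python sketch of the "always emit the metric,
leave missing labels empty" behaviour. It assumes `smartctl --info` prints
`Key: value` lines with the keys listed in `LABELS`; `device_info_metric`
is a hypothetical name, the 'vendor' handling is left out, and label
values are not escaped.

```python
# Label name in the metric -> field name assumed in `smartctl --info` output.
LABELS = [
    ("model_family", "Model Family"),
    ("device_model", "Device Model"),
    ("serial_number", "Serial Number"),
    ("firmware_version", "Firmware Version"),
]


def device_info_metric(disk, dev_type, info_output):
    """Format one smartmon_device_info line, leaving missing fields empty."""
    fields = {}
    for line in info_output.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # only lines that actually contain "Key: value"
            fields[key.strip()] = value.strip()
    labels = [("disk", disk), ("type", dev_type)]
    # fields.get(..., "") is what keeps the metric present when a field
    # such as "Model Family" is absent.
    labels += [(name, fields.get(key, "")) for name, key in LABELS]
    body = ",".join('{}="{}"'.format(k, v) for k, v in labels)
    return "smartmon_device_info{" + body + "} 1"


info = (
    "Device Model:     EXAMPLE MODEL\n"
    "Serial Number:    EXAMPLE123\n"
    "Firmware Version: 1.0\n"
)
print(device_info_metric("/dev/sda", "sat", info))
```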
* Add trailing newline
Because POSIX:
https://stackoverflow.com/a/729795
"%d" in awk will truncate values at 2^31. S.M.A.R.T. values can exceed that, thus use a floating point notation instead to encode larger values (at the possible cost of some precision).
Collect metrics from the StorCLI utility on the health of MegaRAID
hardware RAID controllers and write them to stdout so that they can be
used by the textfile collector.
We parse the JSON output that StorCLI provides.
Script must be run as root or with appropriate capabilities for storcli
to access the RAID card.
Designed to run under Python 2.7, using the system Python provided with
many Linux distributions.
The metrics look like this:
mbostock@host:~$ sudo ./storcli.py
megaraid_status_code 0
megaraid_controllers_count 1
megaraid_emergency_hot_spare{controller="0"} 1
megaraid_scheduled_patrol_read{controller="0"} 1
megaraid_virtual_drives{controller="0"} 1
megaraid_drive_groups{controller="0"} 1
megaraid_virtual_drives_optimal{controller="0"} 1
megaraid_degraded{controller="0"} 0
megaraid_battery_backup_healthy{controller="0"} 1
megaraid_ports{controller="0"} 8
megaraid_failed{controller="0"} 0
megaraid_drive_groups_optimal{controller="0"} 1
megaraid_healthy{controller="0"} 1
megaraid_physical_drives{controller="0"} 24
megaraid_controller_info{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 1
mbostock@host:~$
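A simplified sketch of the JSON-to-metrics step, for illustration only.
The JSON layout in `sample` is a made-up placeholder and not StorCLI's
actual schema; `metric_line` and `controller_metrics` are hypothetical
names. The code avoids Python 3 only syntax so that it also runs under
Python 2.7.

```python
import json


def metric_line(name, value, labels=None):
    """Format one sample as megaraid_<name>{label="value"} <value>."""
    text = "megaraid_" + name
    if labels:
        pairs = ", ".join(
            '{}="{}"'.format(k, v) for k, v in sorted(labels.items())
        )
        text += "{" + pairs + "}"
    return "{} {}".format(text, value)


def controller_metrics(document):
    """Turn the placeholder JSON document into a list of metric lines."""
    controllers = json.loads(document)["controllers"]
    lines = [metric_line("controllers_count", len(controllers))]
    for index, controller in enumerate(controllers):
        labels = {"controller": str(index)}
        for name in sorted(controller):
            # Booleans become 0/1 so every sample is numeric.
            lines.append(metric_line(name, int(controller[name]), labels))
    return lines


# Placeholder input; real StorCLI output is structured differently.
sample = '{"controllers": [{"degraded": false, "ports": 8}]}'
print("\n".join(controller_metrics(sample)))
```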
Add a utility to parse the output of `smartctl`.
* Scans all disks.
* Prints metrics for `smartctl --info`.
* Prints metrics for `smartctl --attributes`.
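As an illustration of the attribute parsing, here is a small Python
sketch. It assumes the usual ten-column `smartctl --attributes` table
(ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED,
WHEN_FAILED, RAW_VALUE). The metric names and the `attribute_metrics`
helper are hypothetical and may not match what the utility prints.

```python
# Metric suffix -> column index in the assumed attribute table.
COLUMNS = (("value", 3), ("worst", 4), ("threshold", 5), ("raw_value", 9))


def attribute_metrics(disk, attributes_output):
    """Return metric lines for every numeric attribute row."""
    lines = []
    for line in attributes_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not an attribute row.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name = fields[1].lower().replace("-", "_")
        labels = 'disk="{}",smart_id="{}"'.format(disk, fields[0])
        for suffix, column in COLUMNS:
            # Only purely numeric cells are emitted.
            if fields[column].isdigit():
                lines.append("smartmon_{}_{}{{{}}} {}".format(
                    name, suffix, labels, int(fields[column])))
    return lines


sample = (
    "ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE"
    "      UPDATED  WHEN_FAILED RAW_VALUE\n"
    "  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail"
    "  Always       -       0\n"
)
print("\n".join(attribute_metrics("/dev/sda", sample)))
```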