Fix SmartAttributeWarning alert (#375)
* Fix SmartAttributeWarning alert

* Fix alert unit test
Deezzir authored Dec 13, 2024
1 parent a86afe4 commit d7d6c32
Showing 2 changed files with 6 additions and 5 deletions.
4 changes: 2 additions & 2 deletions src/prometheus_alert_rules/smart.yaml
@@ -88,7 +88,7 @@ groups:
- alert: SmartAttributeWarning
# based on https://www.backblaze.com/blog/what-smart-stats-indicate-hard-drive-failures/
- expr: smartctl_device_attribute{attribute_id=~"5|187|188|197|198"} > 0
+ expr: smartctl_device_attribute{attribute_id=~"5|187|188|197|198", attribute_value_type="raw"} > 0
for: 2m
labels:
severity: warning
@@ -124,4 +124,4 @@ groups:
The NVMe drive has reached 90% of its estimated lifetime.
Note: A value of 100 does not indicate failure. For more details, visit https://charmhub.io/hardware-observer/docs/metrics-and-alerts-smart
VALUE = {{ $value }}
- LABELS = {{ $labels }}
+ LABELS = {{ $labels }}
7 changes: 4 additions & 3 deletions tests/unit/test_alert_rules/test_smart.yaml
@@ -168,7 +168,7 @@ tests:
- interval: 1m
input_series:
- - series: 'smartctl_device_attribute{device="sda", attribute_id="5", attribute_name="Reallocated_Sectors_Count", instance="ubuntu-2"}'
+ - series: 'smartctl_device_attribute{device="sda", attribute_id="5", attribute_name="Reallocated_Sectors_Count", instance="ubuntu-2", attribute_value_type="raw"}'
values: '2x10'

alert_rule_test:
@@ -181,13 +181,14 @@ tests:
device: sda
attribute_id: 5
attribute_name: Reallocated_Sectors_Count
+ attribute_value_type: raw
exp_annotations:
summary: SMART device attribute correlating with drive failure has its raw value greater than zero. (instance ubuntu-2)
description: |
SMART raw value for attribute "Reallocated_Sectors_Count" with id "5"
on device "sda" is greater than 0.
VALUE = 2
- LABELS = map[__name__:smartctl_device_attribute attribute_id:5 attribute_name:Reallocated_Sectors_Count device:sda instance:ubuntu-2]
+ LABELS = map[__name__:smartctl_device_attribute attribute_id:5 attribute_name:Reallocated_Sectors_Count attribute_value_type:raw device:sda instance:ubuntu-2]
- interval: 1m
input_series:
@@ -230,4 +231,4 @@ tests:
The NVMe drive has reached 90% of its estimated lifetime.
Note: A value of 100 does not indicate failure. For more details, visit https://charmhub.io/hardware-observer/docs/metrics-and-alerts-smart
VALUE = 95
- LABELS = map[__name__:smartctl_device_percentage_used device:nvme instance:ubuntu-4]
+ LABELS = map[__name__:smartctl_device_percentage_used device:nvme instance:ubuntu-4]
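
Note: the new `attribute_value_type="raw"` matcher restricts SmartAttributeWarning to raw attribute values. A complementary negative test could check that a non-raw series with the same attribute id no longer triggers the alert. The fragment below is a minimal sketch in the same promtool unit-test format, intended to sit under the existing `tests:` key. The label value `attribute_value_type="value"`, the sample value `100`, and the `eval_time` of 5m are assumptions chosen for illustration and are not part of this commit.

```yaml
  # Hypothetical negative case (not in the commit): a non-raw series
  # for attribute 5 stays above zero for the whole test window.
  - interval: 1m
    input_series:
      # attribute_value_type="value" is an assumed non-raw label value.
      - series: 'smartctl_device_attribute{device="sda", attribute_id="5", attribute_name="Reallocated_Sectors_Count", instance="ubuntu-2", attribute_value_type="value"}'
        values: '100x10'

    alert_rule_test:
      # 5m is past the rule's "for: 2m", so the alert would be firing
      # by now if the series matched the rule's selector.
      - eval_time: 5m
        alertname: SmartAttributeWarning
        # An empty list means no alert with this name is expected.
        exp_alerts: []
```

With the old expression, which matched on `attribute_id` alone, this series would satisfy `> 0` and the case would fail. With the new matcher the series is excluded, so the case should pass.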
