storcli.py: Add cachevault status #202

Open
wants to merge 2 commits into
base: master
from


jcpunk
Contributor

@jcpunk jcpunk commented Jan 29, 2024

Adds metrics for the cachevault status.

Hardware tested: LSI MegaRAID SAS-3 3108 [Invader] (rev 02)

@dswarbrick
Member

@SuperQ I need a second opinion here. What does the Big Book of Prometheus Best Practices say about this sort of thing? Should we go with three different metric names, or a single cv_state metric, with the state contained within a label?

@dswarbrick dswarbrick self-assigned this Feb 13, 2024
@dswarbrick dswarbrick requested a review from SuperQ February 13, 2024 18:38
@jcpunk
Contributor Author

jcpunk commented Feb 15, 2024

FWIW: https://github.com/prometheus-community/systemd_exporter/tree/main provides:

systemd_unit_state{name="sysinit.target",state="activating",type="target"} 0
systemd_unit_state{name="sysinit.target",state="active",type="target"} 1
systemd_unit_state{name="sysinit.target",state="deactivating",type="target"} 0
systemd_unit_state{name="sysinit.target",state="failed",type="target"} 0
systemd_unit_state{name="sysinit.target",state="inactive",type="target"} 0

@dswarbrick
Member

There are essentially three ways we can go about this. For example, if a CacheVault is degraded, we could expose:

cv_optimal{controller="0",cvidx="1"} 0
cv_degraded{controller="0",cvidx="1"} 1
cv_failed{controller="0",cvidx="1"} 0

or

cv_state{controller="0",cvidx="1",state="optimal"} 0
cv_state{controller="0",cvidx="1",state="degraded"} 1
cv_state{controller="0",cvidx="1",state="failed"} 0

or merely

cv_state{controller="0",cvidx="1",state="degraded"} 1

The first two methods are largely the same, although I would argue that the second method is slightly more user-friendly, as it would allow the contents of the state label to be used verbatim in Grafana dashboards with a very simple query.

The third method will leave stale series visible for up to 5 minutes whenever the state changes, because of Prometheus' default lookback window and because a series effectively disappears when its state label changes.
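As a rough illustration of the second method, here is a minimal sketch that renders one `cv_state` sample per known state, with only the current state set to 1. The function name `render_cv_state`, the `CV_STATES` tuple, and the lower-casing of the state string are assumptions made for this example; they are not taken from storcli.py or from this PR's diff.

```python
# Hypothetical list of states, taken from the examples in this thread.
CV_STATES = ("optimal", "degraded", "failed")


def render_cv_state(controller: str, cvidx: str, current_state: str) -> list[str]:
    """Render one cv_state sample per known state; only the current state is 1."""
    # Assumption: the reported state may differ in case or padding.
    current = current_state.strip().lower()
    lines = []
    for state in CV_STATES:
        value = 1 if state == current else 0
        lines.append(
            f'cv_state{{controller="{controller}",cvidx="{cvidx}",state="{state}"}} {value}'
        )
    return lines


if __name__ == "__main__":
    # Example: CacheVault 1 on controller 0 reports "Degraded".
    print("\n".join(render_cv_state("0", "1", "Degraded")))
```

Because every state is always emitted, the set of series stays the same when the state changes and only the values flip, which avoids the disappearing-series behaviour of the third method.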

@jcpunk jcpunk force-pushed the storcli-cachevault branch from f0bf28e to 799b73e Compare February 16, 2024 14:26
@jcpunk
Contributor Author

jcpunk commented Feb 16, 2024

Updated to use the second example output format (a single cv_state metric with one series per state).

@jcpunk
Contributor Author

jcpunk commented May 31, 2024

Any further thoughts on this?

@dswarbrick
Member

Waiting for input / review from @SuperQ

@jcpunk
Contributor Author

jcpunk commented Dec 5, 2024

Any further thoughts on this?
