Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid using InfoMetricFamily when possible #87

Open
gabrielcocenza opened this issue Dec 13, 2024 · 3 comments
Open

Avoid using InfoMetricFamily when possible #87

gabrielcocenza opened this issue Dec 13, 2024 · 3 comments

Comments

@gabrielcocenza
Copy link
Member

gabrielcocenza commented Dec 13, 2024

I think it's not a good practice to just dump content using InfoMetricFamily and then need to have to query using labels.

E.g:
canonical/hardware-observer-operator#354 shows that the ssacli_controller is giving false alerts.

It's cleaner in my opinion if we return 1 for OK 0 for DOWN and -1 for unknown or others.

This can simplify expressions, so instead of using something like:

ssacli_controller_info{status!~"^(OK|NOT CONFIGURED)$"} we could just use ssacli_controller_info == 0

Moreover, I think it's easier to identify if ssacli changed the output if we observe that all metrics are returning -1 instead of checking the status. Always returning 1 with InfoMetricFamily is way less expressive

Copy link

Thank you for reporting us your feedback!

The internal ticket has been created: https://warthogs.atlassian.net/browse/SOLENG-992.

This message was autogenerated

@aieri
Copy link

aieri commented Dec 13, 2024

for clarity, after discussing, the proposal is to encode rich status (i.e. strings) into metric values (integers) without dropping the human readable labels. For example, we could see the following in prometheus:

ssacli_controller_info{status = "OK"} = 1
ssacli_controller_info{status = "NOT CONFIGURED"} = -1

(specific value mapping TBD)

@aieri
Copy link

aieri commented Dec 13, 2024

we should be careful in how we handle upgrades:

  • hardware-observer revision and hardware-exporter revision should match (one provides the metrics, the other the alerts)
  • ensure that the old alerts are removed from COS upon upgrade

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants