-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement prometheus metrics #2
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, my main concern is, as usual, the prometheus client library and how it allows to just inject the metrics from anywhere
it's convenient to some point, but makes refactoring a PITA
i would not block on it, since there's no real solution, unless you want to reinvent a library
} | ||
} | ||
} | ||
impl ToString for ConsensusUpdateStatus { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI from https://doc.rust-lang.org/std/string/trait.ToString.html
This trait is automatically implemented for any type which implements the Display trait. As such, ToString shouldn’t be implemented directly: Display should be implemented instead, and you get the ToString implementation for free.
for request in server.incoming_requests() { | ||
let mut response = Vec::<u8>::new(); | ||
let metric_families = registry.gather(); | ||
// TODO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what should be done here?
record_request_duration( | ||
&self.config.chain_id, | ||
&request_type, | ||
request_start.elapsed(), | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if i see correctly, there are several points where the function can early return
and when it does, no time is recorded
also, IIUC, the metrics will not be reset if an error happens, so the metric will be "stuck" at the same value.
It's an all time "problem" of how prometheus client thinks we should export metrics, so it's just FYI.
@@ -40,6 +40,16 @@ impl StartCommand { | |||
process::exit(1); | |||
}); | |||
|
|||
if let Some(config) = &APP.config().metrics { | |||
let address = config.bind_address.clone(); | |||
info!("Starting up prometheus server on {}", address); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ehh, 1/5 i would maybe call it
info!("Starting up prometheus server on {}", address); | |
info!("Starting up metrics exporter on {}", address); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or prometheus exporter
like it says below
You can pass around the metric objects, but I find that worse -- the entire callstack has to pass the object around to record a metric. For this, I find the Singleton approach nicer
How? |
I prefer data-driven approach, e.g. when all metrics are applied within the same function, and not throughout the codebase. e.g.
well, if you don't try to make it more data-driven, then it's probably not a big problem. But then you are creating a strong coupling between how data is created and how metrics are derived |
I've tried to make this PR the least intrusive possible, though there's about 50 lines of mapping enum values to strings -- given previous comments about the extent of these types of implementations, maybe we should drop the enums and pass strings around?
The
prometheus
feature is also not gated, there's a runtime check for the server only.