metrics, pprof: support reloading services with SIGHUP #3016
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3016      +/-   ##
==========================================
- Coverage   22.87%   22.85%   -0.03%
==========================================
  Files         791      791
  Lines       58688    58734      +46
==========================================
- Hits        13425    13422       -3
- Misses      44366    44414      +48
- Partials      897      898       +1
This is a service interruption and we can easily avoid it.
In this case it can be suppressed, |
Force-pushed: 1520f03 to 055c9dd
Do I understand correctly that it is also necessary to overwrite this variable and all its derivatives if |
Likely so, it should be possible to enable/disable the service with SIGHUP.
Force-pushed: 055c9dd to 0d24df4
I have made sure that the metrics are completely reloaded in all services where they are used, but I'm not at all sure that I did it right.
Looking at the amount of changes, I suspect we're doing something wrong. We're using the global metric registry, and yet we have a service that collects all metrics into a single place. This makes zero sense to me. In general, we still want to use the default registry since it adds Go/process-specific things that we want to export. This means each package can define the metrics it wants locally and add data to them irrespective of settings (or just with some local flag). Services would then just be HTTP windows onto things provided by packages, and it'd be trivial to restart them.
So my suggestion is to refactor `pkg/metrics` out of the way completely (moving respective metrics to appropriate packages) and then solve the restart problem easily.
Or maybe at least, if we're not refactoring it now, we can just leave |
I feel like it is the simpler way in general, but the idea of local metrics for every package still scares me a little, just because they will be hard to control (I mean, you will never see all the metrics in a single place). My gut feeling is close to what @End-rey did in this PR. However, @roman-khimov, I agree that we are fighting against the lib. I don't have a strict opinion here. If a decision is required, I'd say KISS should win, and local metrics in every service should be the better choice overall.
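For reference, a minimal sketch of the "local metrics, services as HTTP windows" idea from the review above. It is an illustration only: to stay dependency-free it uses the standard library's `expvar` global registry as a stand-in for the Prometheus default registry, and `blocksProcessed`, `processBlock`, `startMetrics` and `stopMetrics` are hypothetical names, not code from this PR.

```go
package main

import (
	"context"
	"expvar"
	"net"
	"net/http"
	"time"
)

// blocksProcessed is a hypothetical package-local metric. It lives in the
// global expvar registry and is updated whether or not any HTTP service is up.
var blocksProcessed = expvar.NewInt("blocks_processed")

// processBlock stands in for real work done by the package that owns the metric.
func processBlock() { blocksProcessed.Add(1) }

// startMetrics opens an HTTP "window" onto the global registry.
// It returns the server and the address it is actually bound to.
func startMetrics(addr string) (*http.Server, string, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, "", err
	}
	mux := http.NewServeMux()
	mux.Handle("/metrics", expvar.Handler())
	srv := &http.Server{Handler: mux}
	go func() { _ = srv.Serve(ln) }() // returns http.ErrServerClosed after Shutdown
	return srv, ln.Addr().String(), nil
}

// stopMetrics closes the window; the metric itself keeps its value.
func stopMetrics(srv *http.Server) error {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	return srv.Shutdown(ctx)
}
```

Because the counter is owned by its package and not by the server, a reload reduces to `stopMetrics` followed by `startMetrics` with the new address. With the Prometheus client, the analogous pieces would presumably be metrics registered in the default registry and `promhttp.Handler()` as the handler.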
Logs:
```
prometheus service started successfully
pprof service started successfully
```
used to appear after these services were shut down. Now they do not appear at all.
Signed-off-by: Andrey Butusov <[email protected]>
Add consts for the metric and profiler names. Make `c.veryLastClosers` a map. Signed-off-by: Andrey Butusov <[email protected]>
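One way to read this commit: keying closers by service name lets a reload stop and re-register a single service without touching the others. The sketch below is an assumption about the general shape, not the PR's actual code; the `closers` type, its methods, and the constant values are hypothetical.

```go
package main

import "sort"

// Hypothetical service names; the real constants in the PR may differ.
const (
	metricsName  = "prometheus"
	profilerName = "pprof"
)

// closers maps a service name to the function that shuts that service down.
type closers map[string]func()

// set registers (or overwrites) the closer for the named service.
func (c closers) set(name string, closer func()) { c[name] = closer }

// stop runs and removes the closer for one service, if it is registered.
func (c closers) stop(name string) {
	if closer, ok := c[name]; ok {
		closer()
		delete(c, name)
	}
}

// closeAll runs every remaining closer in name order (map iteration order is
// random, so the names are sorted first) and returns the names it closed.
func (c closers) closeAll() []string {
	names := make([]string, 0, len(c))
	for name := range c {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		c[name]()
		delete(c, name)
	}
	return names
}
```

On SIGHUP a reload would then call `stop(metricsName)`, start the new server, and `set` its closer again, while final shutdown still goes through `closeAll`.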
Force-pushed: 0d24df4 to 8fb2539
To make it simple to reload the metrics service and to enable/disable it at runtime, always initialize the metrics collector and collect data, even in local mode and even if it is not exposed via HTTP. Signed-off-by: Andrey Butusov <[email protected]>
Reload prometheus and pprof services, if the config is updated. Closes #1868. Signed-off-by: Andrey Butusov <[email protected]>
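The "if the config is updated" condition can be made explicit by comparing the old and new settings before touching a running service. This is a sketch of one possible policy, not the logic merged in this PR; `serviceConfig`, its fields, and `decide` are hypothetical.

```go
package main

// serviceConfig is a hypothetical config section for the metrics or pprof service.
type serviceConfig struct {
	Enabled bool
	Address string
}

// action is what a reload should do with one service.
type action int

const (
	keep    action = iota // nothing relevant changed, leave the service alone
	start                 // service was disabled and is now enabled
	stop                  // service was enabled and is now disabled
	restart               // still enabled, but the listen address changed
)

// decide compares the previous and the freshly read config for one service.
func decide(old, cur serviceConfig) action {
	switch {
	case !old.Enabled && !cur.Enabled:
		return keep
	case !old.Enabled && cur.Enabled:
		return start
	case old.Enabled && !cur.Enabled:
		return stop
	case old.Address != cur.Address:
		return restart
	default:
		return keep
	}
}
```

With this shape, a SIGHUP that does not change the relevant section returns `keep`, so the HTTP endpoint is not interrupted.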
Force-pushed: 8fb2539 to b79f58f
Made an "always collect" mode, so the node only reloads the metrics server on SIGHUP.
Closes #1868.
Is it right that we should reload the services even if the config is not updated?
There is also this linter error:
contextcheck Function `preRunAndLog->preRunAndLog$2->Shutdown` should pass the context parameter
Do I need to pass the context properly, or is there some other way?
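On the context question, one common pattern is to derive the shutdown context from the caller's context instead of creating a fresh `context.Background()`. The sketch below assumes Go 1.21+ (for `context.WithoutCancel`); `shutdownWith` and `demo` are hypothetical helpers, and whether this satisfies `contextcheck` for `preRunAndLog` in particular is not verified here.

```go
package main

import (
	"context"
	"net/http"
	"time"
)

// shutdownWith stops srv using a context derived from the caller's context.
func shutdownWith(ctx context.Context, srv *http.Server, timeout time.Duration) error {
	// Keep the parent's values but drop its cancellation: at shutdown time the
	// parent is often already cancelled, which would otherwise make Shutdown
	// give up on active connections immediately instead of waiting for them.
	ctx, cancel := context.WithTimeout(context.WithoutCancel(ctx), timeout)
	defer cancel()
	return srv.Shutdown(ctx)
}

// demo shuts down a server with a parent context that is already cancelled.
func demo() error {
	parent, cancel := context.WithCancel(context.Background())
	cancel() // simulate "the node is already stopping"
	srv := &http.Server{}
	return shutdownWith(parent, srv, time.Second)
}
```

The timeout still bounds how long graceful shutdown may take, so a stuck connection cannot block the reload forever.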