-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
telemetry worker: flush data after stops #515
Open
cataphract
wants to merge
1
commit into
main
Choose a base branch
from
glopes/flush-data-after-stop
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -242,10 +242,10 @@ impl TelemetryWorker { | |
Lifecycle(Start) => { | ||
if !self.data.started { | ||
self.deadlines | ||
.schedule_event(LifecycleAction::FlushData) | ||
.schedule_event(LifecycleAction::FlushMetricAggr) | ||
.unwrap(); | ||
self.deadlines | ||
.schedule_event(LifecycleAction::FlushMetricAggr) | ||
.schedule_event(LifecycleAction::FlushData) | ||
.unwrap(); | ||
self.data.started = true; | ||
} | ||
|
@@ -266,7 +266,7 @@ impl TelemetryWorker { | |
.unwrap(); | ||
} | ||
Lifecycle(FlushData) => { | ||
if !self.data.started { | ||
if !(self.data.started || self.config.restartable) { | ||
return CONTINUE; | ||
} | ||
let batch = self.build_observability_batch(); | ||
|
@@ -297,7 +297,9 @@ impl TelemetryWorker { | |
self.log_err(&e); | ||
} | ||
self.data.started = false; | ||
self.deadlines.clear_pending(); | ||
if !self.config.restartable { | ||
self.deadlines.clear_pending(); | ||
} | ||
return BREAK; | ||
} | ||
CollectStats(stats_sender) => { | ||
|
@@ -342,10 +344,10 @@ impl TelemetryWorker { | |
Err(err) => self.log_err(&err), | ||
} | ||
self.deadlines | ||
.schedule_event(LifecycleAction::FlushData) | ||
.schedule_event(LifecycleAction::FlushMetricAggr) | ||
.unwrap(); | ||
self.deadlines | ||
.schedule_event(LifecycleAction::FlushMetricAggr) | ||
.schedule_event(LifecycleAction::FlushData) | ||
.unwrap(); | ||
self.data.started = true; | ||
} | ||
|
@@ -369,7 +371,7 @@ impl TelemetryWorker { | |
.unwrap(); | ||
} | ||
Lifecycle(FlushData) => { | ||
if !self.data.started { | ||
if !(self.data.started || self.config.restartable) { | ||
return CONTINUE; | ||
} | ||
let mut batch = self.build_app_events_batch(); | ||
|
@@ -459,7 +461,9 @@ impl TelemetryWorker { | |
.await; | ||
|
||
self.data.started = false; | ||
self.deadlines.clear_pending(); | ||
if !self.config.restartable { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same here. |
||
self.deadlines.clear_pending(); | ||
} | ||
return BREAK; | ||
} | ||
CollectStats(stats_sender) => { | ||
|
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be enough to only include
self.data.started = false;
inside the if statement as well, and leave the exit early checks?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it depends. Do we want Start, Stop, Stop to generate two stops? Because that's what PHP ends up generating. The second stop is a noop, but if I moved the assignment
self.data.started = false
under the conditionif !self.config.restartable
, then the stops would be effectiveThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the exit early checks are still in the code, then how would the second
Stop
be effective?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I understand, what you're proposing is that I reset
started = false
only if!restartable
. In that casestarted
would stay true forever once there is a start. So the early checkwould never be hit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you're right. I was confusing the early checks.
There is something in the logic that feels a bit broken. The
FlushData
is also protected by this!self.data.started
check. Should it work after a restartable stop?The things that
Stop
does, should they happen when the first request ends, or when the worker stops?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My guess is yes, otherwise there is no way to send the data that's collected after the stop (
build_observability_batch
is called only from the handlers of Stop and FlushData), at least not without an intervening start+stop.That is a good point. The final flush of the metrics should happen when the worker stops, not when handling Stop. I guess at some point they were the same, but then the restart thing was introduced. But regardless, once we have a way to send metrics after a Stop, for that happen periodic flushes should still happen. So FlushData shouldn't be skipped or unscheduled after a Stop.