-
Notifications
You must be signed in to change notification settings - Fork 540
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
slo: include all request cancellations within SLO #4355
slo: include all request cancellations within SLO #4355
Conversation
for case 2, I think the Handlers should pass correct error to let me update the handlers to handle these late cancellation. |
df58c34
to
f8de330
Compare
@electron0zero Adding some more thoughts about the default behavior of counting canceled requests within SLO: I think we could also support an operator that wants to exclude them from SLO by adding an additional label that indicates if the request was completed or canceled like An additional benefit is that this would give us metrics on canceled requests in general, which I don't think we have today. Could be useful. I think this should be feasible and localized to the postSLOHook method where it is checking for cancelations already. But let me know if there are considerations I missed. |
f8de330
to
0ba43c1
Compare
48b2d02
to
055e6af
Compare
What this PR does:
We count all request cancellations outside SLO, but that's not right because request cancellations are done by users and should fall within SLO.
Request cancellations are now exposed under
result
label intempo_query_frontend_queries_total
andtempo_query_frontend_queries_within_slo_total
withcompleted
orcanceled
values to differentiate between completed and canceled requests.operators can filter
tempo_query_frontend_queries_within_slo_total
withresult=completed
to get the old behaviour of the metric where cancellations where excluded from the metric.with the result label, we also expose the information about the cancelled requests, and you can filter on
result=canceled
to see cancelled requests.I also added a span and span events in the http and gRPC collector to expose more details of the pipeline.
Which issue(s) this PR fixes:
Fixes #
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]