Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exemplar fixes #4366

Merged
merged 2 commits into from
Nov 22, 2024
Merged

Exemplar fixes #4366

merged 2 commits into from
Nov 22, 2024

Conversation

mdisibio
Copy link
Contributor

@mdisibio mdisibio commented Nov 21, 2024

What this PR does:
Fixes several issues with exemplars that were found when testing: grafana/grafana#96859

  1. quantile_over_time(duration) was returning exemplars in nanos instead of float seconds like the samples. This is fixed as well as for any other function like avg_over_time(duration).
  2. Finishes the TODO for the other functions that were capturing placeholder exemplars but without values (always zero):
  • rate()/count_over_time() exemplars are attached to the final sample value. See screenshot below
  • min/max_over_time() use the value from the span
  1. avg_over_time was accidentally dropping exemplars because of struct address issue.
  2. Actually randomly samples a span. Maybe it's my test dataset but the current approach of Spans[0] wasn't working well. The first span in my spansets was always the root span, so quantile_over_time(duration, 0.5) was always choosing the span with the longest duration as the exemplar and it was flying outside the graph. Randomly sampling alleviates this problem.
image

TODOs
This is a lot better but there are still some things we should revisit:

  1. span ID? I really want {status=error} | rate() to link directly to the failing spans like search. But for now we aren't reading the span ID column, which would greatly increase query overhead.
  2. min/max_over_time() exemplars are kind of funny, they are any random span that was considered. I think the ideal behavior is for the exemplar to be the span that was the minimum or maximum.

Which issue(s) this PR fixes:
Fixes #

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Copy link
Member

@mapno mapno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, nice fixes

@mdisibio mdisibio merged commit 29ef5ab into grafana:main Nov 22, 2024
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants