-
Notifications
You must be signed in to change notification settings - Fork 526
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add TraceQL query hint to retrieve most recent results ordered by trace start time #4238
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Joe Elliott <[email protected]>
Signed-off-by: Joe Elliott <[email protected]>
Signed-off-by: Joe Elliott <[email protected]>
Signed-off-by: Joe Elliott <[email protected]>
f3bc8f9
to
19c0cab
Compare
Signed-off-by: Joe Elliott <[email protected]>
Signed-off-by: Joe Elliott <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for updating the docs! Updates look good.
} | ||
i.headBlockMtx.RUnlock() | ||
if err := anyErr.Load(); err != nil { | ||
return nil, err | ||
} | ||
if combiner.Count() >= maxResults { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One benefit of this early return was that it didn't have to take the blocks mutex if the search was satisfied by the head block. Is that still possible if the query isn't using the mostrecent hint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, that check has been moved here. for the anyCombiner
the time is ignored and it's just the old limit check.
|
||
if c.Count() == c.keepMostRecent && c.keepMostRecent > 0 { | ||
// if this is older than the oldest element, bail | ||
if c.OldestTimestampNanos() > new.StartTimeUnixNano { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a feeling that this function is correct, but I need help understanding a bit more. Wondering if it should check the new timestamp cutoff before merging with existing? Or generally: how the timestamp/sorting works when merging. If we get an older partial snippet of an existing entry, it will overwrite the start time to be older. In that case should the result be kicked out? If the trace had been fully combined into one snippet (perfect compaction), it seems like it would have been kicked out (properly counted as older than other results).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, these are good callouts. i wrote this logic thinking about the combiner being used in the engine, but the truth is there is all kinds of weird behavior with fractured traces when used in the query frontend. i don't know if there are good answers for all these edge cases without reconsiderations at the storage layer. this may be a feature that is always best effort.
This is weird:
- Add metadata for trace id 1234 to the most recent list
- It gets pushed out by more recent traces
- Receive a new shard for 1234 that re-qualifies it for the most recent list
- We correctly return 1234 but have lost some information about it
Or this:
- Receive 10 "most recent" traces and return a result to the client
- There is a trace shard earlier in the search window which pushes one of the returned results out of the most recent list.
- We have incorrectly returned a trace that should not have been in the list.
Signed-off-by: Joe Elliott <[email protected]>
Signed-off-by: Joe Elliott <[email protected]>
Signed-off-by: Joe Elliott <[email protected]>
What this PR does
Adds a query hint
with (most_recent=true)
that causes Tempo to perform a more intense search to return the most recent traces ordered by start time. This is currently implemented as a query hint b/c (depending on the circumstances) it can have a quite large impact on time to return results in both streaming grpc and discrete http paths. A TraceQL hint was chosen over a query param b/c it is desired longer term that this is the default behavior and it will be easier to remove a query hint then a query parameter.This change is implemented at the query frontend, query engine and ingester search levels. In all cases it is required to store the most recent results and continue searching instead of immediately returning once a limit is hit.
Other Changes
Which issue(s) this PR fixes:
This PR will give a way to always return the most recent results ordered by start time which partially addresses #3777, #3109 and #2659.
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]