-
Notifications
You must be signed in to change notification settings - Fork 250
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(tracing): rework capturing and use it for all requests #5049
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
WASM Query Engine file Size
|
CodSpeed Performance ReportMerging #5049 will not alter performanceComparing Summary
|
aqrln
force-pushed
the
rework-tracing
branch
3 times, most recently
from
November 25, 2024 14:32
3539cbd
to
e74fb4c
Compare
aqrln
force-pushed
the
rework-tracing
branch
2 times, most recently
from
December 2, 2024 12:55
1006ad1
to
79b8d8d
Compare
aqrln
force-pushed
the
rework-tracing
branch
2 times, most recently
from
December 10, 2024 12:37
db6f925
to
5917c3b
Compare
aqrln
force-pushed
the
rework-tracing
branch
from
December 12, 2024 12:18
bb2734a
to
f05d8f0
Compare
aqrln
force-pushed
the
rework-tracing
branch
from
December 13, 2024 16:31
f27e8d8
to
25bde41
Compare
aqrln
commented
Dec 13, 2024
aqrln
commented
Dec 13, 2024
aqrln
commented
Dec 13, 2024
aqrln
force-pushed
the
rework-tracing
branch
from
December 14, 2024 08:59
898fada
to
7ff8b5c
Compare
jkomyno
reviewed
Dec 16, 2024
jkomyno
reviewed
Dec 16, 2024
jkomyno
reviewed
Dec 16, 2024
jkomyno
reviewed
Dec 16, 2024
jkomyno
reviewed
Dec 16, 2024
jkomyno
reviewed
Dec 16, 2024
jkomyno
reviewed
Dec 16, 2024
jkomyno
approved these changes
Dec 16, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work! Thanks for the additional context in call.
aqrln
added a commit
that referenced
this pull request
Dec 16, 2024
Fixup #5049 and rename one missed span attribute. The tests on the client were updated to expect the new name but the attribute was only updated in one of two places.
aqrln
added a commit
that referenced
this pull request
Dec 16, 2024
Fixup #5049 and rename one missed span attribute. The tests on the client were updated to expect the new name but the attribute was only updated in one of two places.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
When exporting the engine traces to the client, we used to use the
tracing_opentelemetry
crate to converttracing
spans to OpenTelemetry spans and feed them to a custom OpenTelemetrySpanProcessor
that, again, converted them from OpenTelemetry spans to our custom format to be serialized and sent to the client and then re-created as OpenTelemetry spans there. This approach had numerous problems:SpanProcessor
, we can just as well converttracing
spans to that same format in atracing_subscriber
Layer
. The dependency onopentelemetry
was also the reason we didn't have tracing support in WASM.Span
constructor) of a specific OTel SDK implementation (@opentelemetry/sdk-trace-base
) in@prisma/instrumentation
, which made us completely incompatible with certain vendor-specific SDKs not based onsdk-trace-base
and caused dependency hell issues for the compatible ones because we and the application had to have matching versions. And we couldn't easily re-map those spans by re-creating them withtracing.startActiveSpan
and discard the engine IDs without buffering and keeping additional data structures in memory which are tricky to maintain and clean up because we couldn't know if and when we received all spans for a request.tracing
Subscriber
corresponding to the correct engine instance (which was crucial for logs working correctly), OpenTelemetry instrumentation was global, so tracing broke with multiplePrismaClient
instances in a single app.Additionally, for Accelerate, we had what we called "capturing mechanism", that was a different
SpanProcessor
which was listenting for spans and events and associating them with requests by trace ID, to be included as part of the response to the client. Capturing the spans in this way for the normal, local engine is a nice solution to the above problems but the following problems needed to be solved:tracing
span IDs are not globally unique: they can be reused after the span is closed, so while they can be relied on to distinguish the spans in the same trace, root span IDs from different points in time could repeat. This wouldn't be a problem for Accelerate in practice, but for library engine we'd want to make a separate endpoint for fetching the spans by request ID, otherwise it's not possible to have theprisma:engine:response_json_serialization
span because the trace would have to be finished before preparing the response — this is a tradeoff we are willing to have for Accelerate but we want a more accurate timing in the library engine. This means we need the request IDs to be unique for a very long period of time.This PR reworks the capturing mechanism to be a
tracing_subscriber
Layer
, introducesRequestId
in the API and makes it the only way to retrieve tracing data used in all types of engines. It also gets rid of OpenTelemetry dependencies, does some refactoring and makes the tracing logic leaner and more efficient.Additionally, this PR enables tracing support in
query-engine-wasm
. This adds 10 KB gzipped which is okay now that Cloudflare allows 3 MB instead of 1 MB on free accounts.Closes: https://github.com/prisma/team-orm/issues/1337
Closes: https://github.com/prisma/team-orm/issues/1336
Closes: https://github.com/prisma/team-orm/issues/1338
Close: https://github.com/prisma/team-orm/issues/1309
Fixes: prisma/prisma#16791
Fixes: prisma/prisma#16980
Fixes: prisma/prisma#17853
Fixes: prisma/prisma#19580
Fixes: prisma/prisma#20779
The corresponding client-side changes will also fix the following issues:
TypeError: parentTracer.getSpanLimits is not a function
prisma#14887Span
constructor prisma#16309prisma:engine
spans do not respectsuppressTracing()
prisma#17953tracing: engine
spans don't pass throughSampler
prisma#19088parentTracer.getSpanLimits is not a function
prisma#21397opentelemetry-sdk-trace-base
(e.g. Datadog tracer) prisma#22450@prisma/instrumentation
dependencies peer dependencies prisma#24373