We are planning to add OpenTelemetry to ApostropheCMS. OpenTelemetry is an open-source framework for tracing, metrics and logging.
For our first iteration we'll be adding support for tracing the time taken by various operations in Apostrophe as part of any request.
Here's the list of things we plan to add "spans" for within OpenTelemetry's traces.
OpenTelemetry is pretty amazing because it can automatically associate all of these spans with the correct request, and much more as well: it can collect data from MongoDB queries, Express middleware, etc. and associate that with the correct request too. And with tools like Jaeger you can visualize that, very similar to the Chrome network panel.
In fact, what OpenTelemetry can do "out of the box" if you turn on all of its Node instrumentation is so complete that we are able to focus just on the very Apostrophe-specific aspects. Here is our tech design:
OpenTelemetry Tech Design (Phase 1)
Stability note: the OpenTelemetry APIs for logging and metrics are still in beta, but our supporter's primary concern is tracing, which is stable. OpenTelemetry has officially committed to those tracing specs for three years, and given their simplicity we are comfortable that we won't have any trouble migrating later if we have to.
TERMINOLOGY
OpenTelemetry provides logs, metrics, and traces. Here is an architectural overview.
A log is text logged at a particular point in time, with metadata. You might view logs in a log browsing and searching application.
A metric is a counter ("total requests ever"), measure ("total requests in the last hour"), or observer ("requests active right now"). You might see it on a dashboard or graphed over time.
A trace captures information about an entire request, in our case an HTTP request. Within a trace, a span is a single operation spanning a period of time. Apostrophe is responsible for tracing spans that are relevant to Apostrophe developers, and OpenTelemetry will automatically contextualize them in the current trace. There are diagrams here.
PRIORITIES
While OpenTelemetry is suitable for logging, metrics and tracing, our client is mainly concerned with traces. That means we need to push traces forward first.
The details we are most concerned with are for server performance. Frontend performance measurement is already covered by many tools that can operate independently of any tech design choices made in Apostrophe.
Also, a huge amount of information about Node.js applications can already be added to request traces with OpenTelemetry, including MongoDB queries, middleware execution times, etc. We don't have to cover those, only what is unique to Apostrophe. Any spans we trace will automatically become part of what is reported by OpenTelemetry for the appropriate HTTP request.
IMPORTANT SPANS TO INCLUDE IN TRACES
OpenTelemetry has built-in support for traces in much of Node.js including Express and MongoDB. We would add new spans to cover what's important to Apostrophe.
We should wrap spans around the following, presented in chronological order as we move through the request:
page.serveGetPage, which is responsible for fetching the page (@apostrophecms/page:getPage)
Each promise event's execution as a whole, as modulename:event:eventname
Each individual promise event handler, as modulename:event:eventname:handler:handlermodule:handlername
page.sendPage, which renders the page, as @apostrophecms/page:sendPage
The specific page template in sendPage, as @apostrophecms/page:sendPage:templatemodule:templatename
insert and update calls at the Apostrophe model layer level, as modulename:insert and modulename:update
Queries at the Apostrophe model layer level, as modulename:query:toArray, modulename:query:toDistinct and modulename:query:toCount. MongoDB queries are already traced by OpenTelemetry but these spans will wrap those and add clarifying context
Async components, as modulename:component:componentname
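To make the naming convention above concrete, here is a minimal sketch of helpers that build those span names. The `spanNames` object and its functions are hypothetical and exist only for illustration, as are the `article`, `beforeSave` and similar example values; nothing here is part of Apostrophe itself.

```javascript
// Hypothetical helpers (not part of Apostrophe) that assemble span names
// following the colon-separated convention listed above.
const QUERY_METHODS = [ 'toArray', 'toDistinct', 'toCount' ];

const spanNames = {
  // The execution of a promise event as a whole
  event: (moduleName, eventName) => `${moduleName}:event:${eventName}`,
  // One individual handler of a promise event
  handler: (moduleName, eventName, handlerModule, handlerName) =>
    `${moduleName}:event:${eventName}:handler:${handlerModule}:${handlerName}`,
  // The specific page template rendered inside sendPage
  template: (templateModule, templateName) =>
    `@apostrophecms/page:sendPage:${templateModule}:${templateName}`,
  // Model layer writes
  insert: (moduleName) => `${moduleName}:insert`,
  update: (moduleName) => `${moduleName}:update`,
  // Model layer queries, limited to the three methods named in the design
  query: (moduleName, method) => {
    if (!QUERY_METHODS.includes(method)) {
      throw new Error(`Unsupported query method: ${method}`);
    }
    return `${moduleName}:query:${method}`;
  },
  // Async components
  component: (moduleName, componentName) =>
    `${moduleName}:component:${componentName}`
};

// Example values below are made up for illustration
console.log(spanNames.event('article', 'beforeSave'));
console.log(spanNames.handler('article', 'beforeSave', 'search', 'indexDoc'));
console.log(spanNames.query('article', 'toArray'));
```

Centralizing the names this way is only one option; its appeal is that every span of a given kind sorts and filters consistently in a viewer such as Jaeger.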
DEPENDENCIES TO ADD, DEPENDENCIES TO LEAVE OUT
Apostrophe, and other modules that want to create a tracer and add spans, should depend on @opentelemetry/api, NOT the OpenTelemetry SDK. The idea behind the API module is that if the SDK is present at project level, then what is traced by the API actually goes somewhere; otherwise it has no overhead.
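As a rough illustration of what depending only on the API module can look like, the sketch below wraps an async operation in a span. The names `createSpanWrapper` and `withSpan` are hypothetical, and passing the API module in as the `api` argument is a choice made for this example so the helper is easy to exercise; in a real module `api` would simply be `require('@opentelemetry/api')`. The calls used on it (`trace.getTracer`, `tracer.startActiveSpan`, `span.recordException`, `span.setStatus`, `span.end` and `SpanStatusCode.ERROR`) are part of that package's public API.

```javascript
// A minimal sketch, not Apostrophe's actual implementation.
// `api` is assumed to be the object exported by @opentelemetry/api.
function createSpanWrapper(api, tracerName) {
  const tracer = api.trace.getTracer(tracerName);
  // Returns a helper that runs `fn` inside a span named `spanName`
  return function withSpan(spanName, fn) {
    // startActiveSpan makes the new span the current one for everything
    // `fn` does, and returns whatever the callback returns (a promise here)
    return tracer.startActiveSpan(spanName, async (span) => {
      try {
        return await fn(span);
      } catch (err) {
        // Record the failure on the span, then let the error propagate
        span.recordException(err);
        span.setStatus({
          code: api.SpanStatusCode.ERROR,
          message: err.message
        });
        throw err;
      } finally {
        // Always end the span so its duration is captured
        span.end();
      }
    });
  };
}
```

When no SDK has been registered at project level, the API hands back a no-op tracer, so `fn` still runs and its result is still returned; the span just isn't recorded anywhere.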
DELIVERABLES
Tracing of the new spans identified above.
A proof of concept, for instance added to the testbed project and enabled when an environment variable is present, that works with a local installation of Jaeger as mentioned below (for dev QA).
That proof of concept should properly configure a service name to show up in Jaeger.
Documentation of how to actually enable OpenTelemetry in a project, with pointing the output at Jaeger as an example.
See also the opentelemetry-poc branch of apostrophe, which currently contains the SDK as a dependency and code to initialize it (that is wrong of course and should be moved out to project level), but also demonstrates using the API module to trace all promise events (which is correct and can be extended to also trace individual handlers as explained above).
Dev QA that this also works in an assembly project without surprises. There shouldn't be a difference, and the hostname is already traced so no special multisite work should be needed, but we need to check that it works with several sites producing traces.
WAIT. HOW IS THIS EVEN POSSIBLE?
Experienced async developers will be wondering how it is possible for OpenTelemetry's simple span API to work, since req is not passed everywhere, and without that it's unclear how each span becomes associated with the right web request. The answer is that it relies on the async_hooks module and implements a mechanism similar to AsyncLocalStorage. This is an amazing feature of newer Node.js releases. However, it also comes with a 20% performance hit (it was worse in older versions), which is why it should be used only when OpenTelemetry is actually active, and not all the time in production. That's why we have no plans to use it for other purposes, like exposing req to template helper functions, etc. For that, you should continue to use solutions like our async components.
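The mechanism described above can be seen in miniature with Node's own AsyncLocalStorage. This is a sketch of the general idea, not OpenTelemetry's code: `requestContext`, `deepHelper`, `handleRequest` and `demo` are names made up for the example, and the "requests" are just strings.

```javascript
const { AsyncLocalStorage } = require('node:async_hooks');

// One store per "request"; nothing is passed down through function arguments
const requestContext = new AsyncLocalStorage();

// Stands in for code deep in the call stack that never receives `req`
async function deepHelper() {
  // Yield to the event loop so the two requests below genuinely interleave
  await new Promise((resolve) => setImmediate(resolve));
  // The store set by run() is still visible after the await
  return requestContext.getStore().requestId;
}

// Everything started inside run(), including async continuations,
// sees the store object passed as the first argument
function handleRequest(requestId) {
  return requestContext.run({ requestId }, () => deepHelper());
}

// Two overlapping requests; each helper call resolves with its own id
function demo() {
  return Promise.all([ handleRequest('req-1'), handleRequest('req-2') ]);
}

demo().then((ids) => console.log(ids));
```

A span API can use the same kind of store to find the current trace and parent span, which is how a span started without any reference to req can still be attached to the right request.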