Add OpenTelemetry converter to export Prometheus metrics #97
I also believe that outputting data using OpenTelemetry with the right attribute semantics could later be used to correlate Kepler's data with application data.
Typically we use tracing to measure the latency across function calls. Could you please explain in more detail why you need tracing?
When I mention OpenTelemetry (OTel), it's not just about tracing. Metrics themselves can be produced in the OTel format. This seems to me much more flexible than outputting metrics as time series in the Prometheus format. The main advantage is to easily integrate and correlate with app-generated metrics.
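For concreteness, producing metrics in the OTel format from Go could look roughly like the sketch below. This is a minimal, hypothetical example using the opentelemetry-go SDK, not Kepler's actual code: the collector endpoint, instrument name, and attribute names are all assumptions.

```go
package main

import (
	"context"
	"log"

	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc"
	"go.opentelemetry.io/otel/metric"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

func main() {
	ctx := context.Background()

	// Push metrics over OTLP/gRPC to a local OTel Collector
	// (endpoint is an assumption, not Kepler's actual deployment).
	exporter, err := otlpmetricgrpc.New(ctx,
		otlpmetricgrpc.WithEndpoint("localhost:4317"),
		otlpmetricgrpc.WithInsecure(),
	)
	if err != nil {
		log.Fatal(err)
	}

	provider := sdkmetric.NewMeterProvider(
		sdkmetric.WithReader(sdkmetric.NewPeriodicReader(exporter)),
	)
	defer func() { _ = provider.Shutdown(ctx) }()

	meter := provider.Meter("kepler")

	// Hypothetical instrument; the unit is metadata, not part of the name.
	energy, err := meter.Float64Counter("container.energy",
		metric.WithUnit("J"),
		metric.WithDescription("Energy consumed by the container"),
	)
	if err != nil {
		log.Fatal(err)
	}

	// Attributes play the role of Prometheus labels and allow
	// correlation with application-generated metrics.
	energy.Add(ctx, 1.5, metric.WithAttributes(
		attribute.String("container.name", "nginx"),
		attribute.String("k8s.pod.name", "web-0"),
	))
}
```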
It's actually on our roadmap to support exporting metrics in different formats, not just Prometheus... We could discuss this in more detail; it would be nice if you could create a Google doc detailing the ideas so that everyone can give feedback.
I totally support the idea of replacing the Prometheus metrics with OpenTelemetry metrics. Then they can be exported anywhere (including to Prometheus) through the OpenTelemetry Collector. It would make things more open and platform-agnostic.
Wait a minute, why does Kepler need OpenTelemetry or, in general, distributed tracing?
It's good to support different output formats, but may I know what the difference is between OpenTelemetry and Prometheus? I hope this is the correct document. If the document above is correct, can anyone help find a sample where Prometheus consumes the OpenMetrics format? Otherwise, it looks like a one-way path from Prometheus to OpenMetrics. Hence, to avoid misunderstanding, we'd better rename this issue to "add OpenMetrics support"?
OpenTelemetry (OTel) is not just about tracing. It includes metrics and logs... More to come in the future.
@sallyom has an early PoC of this.
I would like to see the PoC and evaluate it further, to see whether it's ready for us at the implementation level or not. Hence, if migrating to OTel means too much effort and too many dependencies, I would rather wait until OTel is ready with a UI/dashboard, to make sure Kepler's users get the same UX with Prometheus and OTel.
@SamYuan1990 OTel won't natively have dashboards, a UI, etc. OTel defines data structures and protocols for metrics, logs and traces. It provides SDKs so that app developers can emit metrics, logs and traces that can then be consumed by any OpenTelemetry-supported backend and UI: Prometheus + Grafana, or Datadog, or New Relic, or Splunk, or Dynatrace, etc. OTel also provides a "collector", whose role is mostly to act as a proxy, relaying metrics, logs and traces from one place to another. You can use OpenTelemetry in Kepler to export OTel metrics, which will be pushed to an OTel Collector running on the side (like a wagon), which will in turn export these metrics to Prometheus. This way, it's 100% compatible with the current architecture, and you don't need to rewrite your Grafana dashboards. The benefit is that the user can easily configure the OpenTelemetry Collector to push metrics to other backends as well (Datadog, New Relic, etc.). To answer your points:
Last but not least: it is important to follow semantic conventions. For example, instead of exporting the metric in its current Prometheus-style form, in OpenTelemetry you would rather create it as an instrument that follows the naming and unit conventions:
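Building on the sketch a few comments up (same imports and meter), the contrast might look like the snippet below. The original inline examples are not reproduced in this thread, so the metric and attribute names here are purely hypothetical, not Kepler's real ones.

```go
// Prometheus style: unit and type are baked into the metric name, e.g.
//   kepler_container_energy_joules_total{container_name="nginx"} 42
//
// OpenTelemetry style: a dotted, convention-friendly name, with the unit
// ("J") carried as instrument metadata and labels expressed as attributes.
energy, err := meter.Float64Counter(
	"kepler.container.energy", // hypothetical name, not Kepler's real metric
	metric.WithUnit("J"),
	metric.WithDescription("Energy consumed by the container"),
)
if err != nil {
	log.Fatal(err)
}
energy.Add(ctx, 42, metric.WithAttributes(
	attribute.String("container.name", "nginx"),
))
```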
When exported to Prometheus (using either of the OpenTelemetry Collector Contrib exporters for Prometheus), this metric is converted back to a Prometheus-compatible name and type. Hope this helps understand OpenTelemetry!
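As a concrete illustration of that pipeline, a minimal OpenTelemetry Collector configuration could look like the sketch below. This is an assumption-laden example, not a configuration from the Kepler project: the ports are placeholders, and it pairs the Collector's OTLP receiver with the Contrib Prometheus exporter.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # Kepler would push OTLP metrics here (placeholder)

exporters:
  prometheus:
    endpoint: 0.0.0.0:8889       # Prometheus scrapes the Collector here (placeholder)

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
```

Prometheus then scrapes the Collector instead of Kepler directly, so existing dashboards keep working; adding or swapping backends (Datadog, New Relic, etc.) becomes a configuration change only.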
LGTM. BTW, do you know if Prometheus has any plans to consume OTel metrics directly?
I tried https://github.com/open-telemetry/opentelemetry-go/blob/main/example/prometheus/main.go and https://github.com/open-telemetry/opentelemetry-go/blob/main/example/view/main.go. It seems that if we use OTel, it's nearly the same as Prometheus? Ref: https://opentelemetry.io/docs/reference/specification/metrics/data-model/#point-kinds
@SamYuan1990 OpenTelemetry can collect metrics from Prometheus directly. So we don't need to export OpenTelemetry metrics, right?
Yeah... but in the past, as discussed offline with @rootfs, if we are going to run Kepler on edge nodes (edge computing), we'd better support OpenTelemetry metrics. For edge nodes, it's better to use remote push.
Hmm, I was not aware of this use case.
@marceloamaral In general, we all agree it's better to use the open standard that most vendors agreed on than just one specific technology. It will make the integration with the rest of the world much smoother, and it should not add any friction when interacting with the Prometheus world. I understand that switching from a Prometheus-based codebase to OpenTelemetry is quite a challenge, though! Trivia: Did you know that OpenTelemetry takes its roots in OpenMetrics (among others), which derives directly from Prometheus? 😉
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I don't think this is stale.
Thanks! @brunobat
Yes, in the latest release they added native OTLP ingestion: https://github.com/prometheus/prometheus/releases/tag/v2.47.0
👋 Prometheus team member here. For information, the Prometheus community agreed that the Prometheus client libraries will support exporting OTLP metrics directly, per the Sep 30th 2023 Prometheus developer summit notes.
There was some discussion in the community meeting about the overhead of the Prometheus and OTLP clients.
Some further experiments suggest that Prometheus Agent + remote write (RW) is less CPU-hungry than setting up OTel Collector + OTLP or OTel Collector + RW (results here), thanks to @danielm0hr. @bertysentry, has your team conducted similar benchmarks? At the same time, I would like to add that OTel SDK instrumentation still supports Prometheus, and the scope of this integration is not limited to setting up OTel Collector + RW, but rather to instrumenting Kepler using an open protocol (and not Prometheus metrics) that supports metrics vendors other than Prometheus. Using Prometheus as a backend is not affected by this integration.
I did some benchmarks in the past that show a CPU overhead on the OTel side when dealing with the OTel Prometheus receiver and exporter. But it did better in memory and network, while it also depends on the configuration of the collector. I cannot confirm that the CPU usage was higher in an OTel in/out scenario than in Prometheus scrape + RW. Unfortunately, I do not have much time to make the setup and the results available in a way that is as understandable as https://github.com/danielm0hr/edge-metrics-measurements.
I think that we are talking about different use cases here:
1. Exporting Kepler's existing Prometheus metrics to an OpenTelemetry-compatible pipeline or backend.
2. Instrumenting Kepler itself with the OpenTelemetry SDK instead of the Prometheus client library.
IIUC the first use case can be accomplished today with the OTEL collector scraping metrics from the /metrics endpoint (and hopefully Prometheus should be able to support this natively in the future). IMHO the second case would deserve careful evaluation because the Kepler exporter has some unique characteristics/challenges in terms of instrumentation (discussed in #439 and #365 (comment)).
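For reference, the first use case roughly corresponds to a Collector configuration like the sketch below. This is an assumption-heavy illustration, not a configuration shipped by Kepler: the scrape target, port, and backend endpoint are placeholders.

```yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: kepler
          static_configs:
            - targets: ["kepler-exporter:9102"]   # Kepler's /metrics endpoint (port is an assumption)

exporters:
  otlphttp:
    endpoint: https://otel-backend.example.com    # any OTLP-compatible backend (placeholder)

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      exporters: [otlphttp]
```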
@simonpasquier The idea would be to use the OpenTelemetry SDK everywhere we can to produce OTLP metrics instead of Prometheus metrics. Of course, one can use OpenTelemetry's receiver for Prometheus to export the metrics to another OpenTelemetry-supporting backend, but that's an added step along the way that we could remove. The Prometheus server can now ingest OTLP metrics natively. This means that Kepler can use OpenTelemetry to send OTLP metrics and still use Prometheus as a backend, without any extra step, with no OpenTelemetry Collector required at all, and therefore no performance hit either.
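A minimal sketch of that direct path, assuming Prometheus runs with the otlp-write-receiver feature flag enabled and listens on localhost:9090 (flag name and endpoint should be checked against the Prometheus version in use; none of this is Kepler's actual code):

```go
package main

import (
	"context"
	"log"
	"time"

	"go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

func main() {
	ctx := context.Background()

	// Push OTLP metrics straight to Prometheus' native OTLP endpoint,
	// with no OpenTelemetry Collector in between.
	exporter, err := otlpmetrichttp.New(ctx,
		otlpmetrichttp.WithEndpoint("localhost:9090"),
		otlpmetrichttp.WithURLPath("/api/v1/otlp/v1/metrics"),
		otlpmetrichttp.WithInsecure(),
	)
	if err != nil {
		log.Fatal(err)
	}

	provider := sdkmetric.NewMeterProvider(
		sdkmetric.WithReader(sdkmetric.NewPeriodicReader(exporter,
			sdkmetric.WithInterval(30*time.Second))),
	)
	defer func() { _ = provider.Shutdown(ctx) }()

	// Any instruments created from this meter are now delivered to
	// Prometheus directly over OTLP.
	_ = provider.Meter("kepler")
}
```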
The performance issue I'm referring to was with the Prometheus client_golang library, and one would need to verify that the OTEL SDK provides good performance given the very special nature of the Kepler exporter.