pkg/translator/prometheus: Allow not normalizing UTF8 characters #35469
I'm still confused about whether we just want to split the configuration of UTF-8 and suffixes into separate things, or whether we want to configure full metric normalization through one single configuration option 🤔
Hmmmm, I'm a bit confused with the direction now that I've read the translation package a bit more. We already have a feature flag for normalization. We have two flags actually:
The first one is still Alpha, and is about not allowing labels starting with
The second one is more relevant to the work here. It's in beta stage, which means it's turned on by default, and if I'm reading the code correctly the only difference it makes is adding unit/type suffixes to the code IF
opentelemetry-collector-contrib/pkg/translator/prometheus/normalize_name.go, lines 87 to 92 in 3033832
So what's the plan here? Should we add yet another feature-flag? Or slightly change the behavior of the existing one to not drop UTF-8 characters when normalization is disabled? If we decide to continue with the existing feature-gate, I believe our plan changes to eventually deprecate the flag instead of promoting to stable 🤔
At a high level, we should try and delegate + control all UTF-8 behavior through the new feature gate.
Not sure what that implies for
I've changed the implementation, moving the new feature gate `allowUTF8` to the translator package. The code is not even compiling right now, but I wanted to push the changes to share a few things I'm facing.
}
tokens = append(tokens, metric.Name())

if addMetricSuffixes && normalizeNameGate.IsEnabled() {
This part here feels awkward: two feature flags that work against each other.
`NormalizeNameGate` tells us whether the user wants Prometheus normalization or not, which means UTF-8 characters turn into underscores, units become part of the metric name, and type suffixes are also added.
If `NormalizeName` and `AllowUTF8` are both enabled at the same time, I guess we implement partial normalization (allow UTF-8 but still add suffixes)? When adding the suffixes, should we use dots or underscores to separate them from the metric name?
I'm inclined not to add yet another feature gate and to just stop removing forbidden runes when `normalizeNameGate` is disabled 🤔
The normalization of Otel metrics when exporting to Prometheus performs 3 operations:
- Remove unsupported runes in Prometheus
- Append the unit as a suffix
- Follow some additional Prometheus conventions (`_total` suffix for counters, etc.)
Allowing UTF8 means we can skip step 1 in the above list, and only this one.
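To make step 1 concrete, here is a minimal sketch of what "remove unsupported runes" could look like and how an allow-UTF-8 switch would bypass it. `sanitizeName` is a hypothetical helper written for illustration, not the translator package's actual function, and it skips details such as leading digits and collapsing repeated underscores.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeName is a hypothetical helper, not the translator package's API.
// With allowUTF8 set to false it replaces every rune outside [a-zA-Z0-9_:]
// with '_' (step 1 of the list above). With allowUTF8 set to true it returns
// the name untouched.
func sanitizeName(name string, allowUTF8 bool) string {
	if allowUTF8 {
		return name
	}
	return strings.Map(func(r rune) rune {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9', r == '_', r == ':':
			return r
		}
		return '_'
	}, name)
}

func main() {
	fmt.Println(sanitizeName("http.server.duration", false)) // http_server_duration
	fmt.Println(sanitizeName("http.server.duration", true))  // http.server.duration
}
```

Steps 2 and 3 (unit and type suffixes) are deliberately left out here, since allowing UTF-8 is not supposed to touch them.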
That means that we'll have different behaviors for `NormalizeNameGate`, right? Feels weird to me, I'd prefer that they don't interact with each other 😬
No strong opinions though: since `NormalizeNameGate` is on by default (beta) and the new gate will be off by default (alpha), we won't be introducing breaking changes immediately.
To me we should have:
| normalizeNameGate | allowUTF8 | Operations |
|---|---|---|
| false | false | Remove unsupported runes, don't follow Prometheus conventions |
| false | true | Do nothing, leave metric names as is |
| true | false | Current default behavior: remove runes unsupported in Prometheus 2.x and follow Prometheus conventions (unit suffixes, `_total` suffix, `_ratio`, etc.) |
| true | true | Leave all runes in place AND follow Prometheus conventions: unit suffixes, `_total`, `_ratio`, etc. |
Make sure the code behaves as in the above table and we should be fine! 😊
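The table above can be sketched as two independent checks, one per gate. This is only an illustration under assumptions: `buildName` is a hypothetical function, the unit is passed in already expanded to a word such as `seconds`, the suffix is joined with an underscore (the separator question is still open in this thread), and the `_total` / `_ratio` rules are omitted for brevity.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// unsupported matches every character outside the legacy Prometheus name set.
var unsupported = regexp.MustCompile("[^a-zA-Z0-9_:]")

// buildName is a hypothetical illustration of the four rows in the table.
// Each gate controls exactly one operation, so the gates never override
// each other.
func buildName(name, unitWord string, normalizeName, allowUTF8 bool) string {
	if !allowUTF8 {
		// allowUTF8 == false: replace unsupported runes with '_'.
		name = unsupported.ReplaceAllString(name, "_")
	}
	if normalizeName && unitWord != "" && !strings.HasSuffix(name, "_"+unitWord) {
		// normalizeName == true: follow Prometheus conventions (unit suffix only here).
		name += "_" + unitWord
	}
	return name
}

func main() {
	for _, normalize := range []bool{false, true} {
		for _, utf8 := range []bool{false, true} {
			fmt.Printf("normalizeName=%t allowUTF8=%t -> %s\n",
				normalize, utf8, buildName("http.server.duration", "seconds", normalize, utf8))
		}
	}
}
```

Keeping the two conditions separate is what makes the combination predictable: a test can simply iterate over all four rows.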
I've just added a test that verifies their interaction :)
I got that whenever model.NameValidationScheme wasn't set to model.UTF8Validation and I passed a name with dots.
I guess using init() to check the feature-flag state isn't the best approach here. I can't tell what is initialized first.
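For context on the ordering problem: in Go, every `init()` runs before `main()`, so a value that is only set once `main` has parsed the command line is still at its default when `init()` reads it. The sketch below uses a plain boolean as a stand-in for the feature gate (`gateEnabled`, `cachedAtInit`, and `allowUTF8` are hypothetical names, not the collector's `featuregate` API) and compares caching at init time with reading at use time.

```go
package main

import "fmt"

// gateEnabled stands in for a feature gate that is only set once the
// command line has been processed. Hypothetical; not the featuregate API.
var gateEnabled bool

// cachedAtInit captures the gate state during package initialization.
var cachedAtInit bool

func init() {
	// Runs before main, so it can only ever see the default value.
	cachedAtInit = gateEnabled
}

// allowUTF8 reads the gate at call time instead of caching it.
func allowUTF8() bool {
	return gateEnabled
}

func main() {
	gateEnabled = true // simulates the gate being enabled during startup

	fmt.Println("cached in init():", cachedAtInit) // false
	fmt.Println("read at use time:", allowUTF8())  // true
}
```

Reading the gate where the decision is made avoids depending on initialization order at all.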
Ok, should be ready for review again. I've done manual tests with the prometheus exporter and it's working:
cmdline:
configfile:
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  debug:
    verbosity: detailed
  prometheus:
    endpoint: 0.0.0.0:8889
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus, debug]
Otel metrics endpoint:
$ curl -H 'Accept: application/openmetrics-text; escaping=allow-utf-8' localhost:8889/metrics
# HELP "run.total" The number of times the iteration ran
# TYPE "run.total" counter
{"run.total","attr.A"="chocolate.cake","attr.B"="raspberry","attr.C"="vanilla",job="test-service"} 10
$ curl -H 'Accept: application/openmetrics-text' localhost:8889/metrics
# HELP run_total The number of times the iteration ran
# TYPE run_total counter
run_total{attr_A="chocolate.cake",attr_B="raspberry",attr_C="vanilla",job="test-service"} 10
I still want to test remotewriteexporter, hopefully still this week.
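The two curl calls above differ only in the `escaping=allow-utf-8` parameter of the `Accept` header. As a rough sketch of how a server could detect that parameter, the snippet below parses each media range with the standard library. `wantsUTF8` is a hypothetical helper for illustration; the real exporter relies on the Prometheus client libraries for content negotiation.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// wantsUTF8 reports whether any media range in an Accept header carries the
// escaping=allow-utf-8 parameter. Hypothetical helper, not the exporter's code.
func wantsUTF8(accept string) bool {
	for _, part := range strings.Split(accept, ",") {
		_, params, err := mime.ParseMediaType(strings.TrimSpace(part))
		if err != nil {
			continue // skip media ranges that cannot be parsed
		}
		if params["escaping"] == "allow-utf-8" {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(wantsUTF8("application/openmetrics-text; escaping=allow-utf-8")) // true
	fmt.Println(wantsUTF8("application/openmetrics-text"))                       // false
}
```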
Urgh, I was hoping to test remotewrite now, but after rebasing the PR now that 0.111.0 was released I can't build the binary anymore 🤔
arthursens$ make otelcontribcol
cd ./cmd/otelcontribcol && GO111MODULE=on CGO_ENABLED=0 go build -trimpath -o ../../bin/otelcontribcol_darwin_arm64 \
-tags "" .
# github.com/open-telemetry/opentelemetry-collector-contrib/cmd/otelcontribcol
runtime.main_main·f: function main is undeclared in the main package
make: *** [otelcontribcol] Error 1
Signed-off-by: Arthur Silva Sens <[email protected]>
Just adding some context here: Initially, we thought that enabling UTF-8 was something that overrides the behaviors on all exporters that use the translation package. After discussion with the OpenTelemetry-Prometheus Working Group, we realized that that's not the case. The
I do remember that we wanted to explore allowing different exporters to provide config options that would control the translation, but I still have one doubt: Should we have config+feature flag? Or just config?
I think it is fine to just have config. Maybe add a comment saying that it is experimental if you want.
Ok, I think I'll split this into two PRs then: 1 - Add optionality to the translator API.
Here is the 1st PR: #35904
This PR was marked stale due to lack of activity. It will be closed in 14 days.
@@ -60,6 +60,10 @@ Given the example, metrics will be available at `https://1.2.3.4:1234/metrics`.

OpenTelemetry metric names and attributes are normalized to be compliant with Prometheus naming rules. [Details on this normalization process are described in the Prometheus translator module](../../pkg/translator/prometheus/).

Prometheus 2.55.0 introduced support for UTF-8 characters behind the feature-flag `utf8-names`. Prometheus 3.0.0 and later accept UTF-8 by default. This means that name and attribute normalization is not required if you're using those versions. To allow UTF-8 characters to be exposed without normalization, start the collector with the feature gate: `--feature-gates=pkg.translator.prometheus.allow_utf8`.

The scraper must include `scaping=allow-utf-8` in the `Accept` header for UTF-8 characters to be exposed.
- The scraper must include `scaping=allow-utf-8` in the `Accept` header for UTF-8 characters to be exposed.
+ The scraper must include `escaping=allow-utf-8` in the `Accept` header for UTF-8 characters to be exposed.
I also suggest we make sure to follow Prometheus' now official guide on UTF-8 and OTLP metrics ingestion. Their OTLP ingestion has a
We should probably do something equivalent, instead of combining flags. Especially since the behavior in Prometheus will change again with no suffixing required once they add unit and description metadata.
We've discussed this a few times in the OTel Prometheus SIG calls. We all prefer what you described, but we were a bit concerned about breaking changes to the current config 😬 Recently we've also talked about moving the translator package into a separate package, because it has been super difficult to work on and maintain this package and its duplicate in the Prometheus repository. We're seeing optimizations being done on only one side, different maintainers choosing different configuration options, etc. Are you aware of the Prometheus SIG calls @bertysentry? As one of the code owners here, your opinion would be super helpful :)
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
Closed as inactive. Feel free to reopen if this PR is still being worked on. |
Description:
Update the Prometheus translation package to optionally allow UTF-8 characters in metric and label names. Motivated by Prometheus 2.55.0 and 3.0.0 finally accepting UTF8 characters 🙂
Link to tracking Issue:
Fixes #35459
Testing:
Beyond the unit tests, I'm doing manual tests with this branch.
exporter/prometheus:
Metrics that don't have UTF-8 characters are exposed as is; metrics that do have them follow the new exposition format.
exporter/prometheusremotewrite
TODO
Documentation: