Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg/translator/prometheus: Allow not normalizing UTF8 characters #35469

Closed

Conversation

ArthurSens
Copy link
Member

@ArthurSens ArthurSens commented Sep 27, 2024

Description:

Update the Prometheus translation package to optionally allow UTF-8 characters in metric and label names. Motivated by Prometheus 2.55.0 and 3.0.0 finally accepting UTF8 characters 🙂

Link to tracking Issue:

Fixes #35459

Testing:

Beyond the unit tests, I'm doing manual tests with this branch.

exporter/prometheus:

$ curl -H 'Accept: application/openmetrics-text; escaping=allow-utf-8' localhost:8889/metrics
# HELP "run.total" The number of times the iteration ran
# TYPE "run.total" counter
{"run.total","attr.A"="chocolate.cake","attr.B"="raspberry","attr.C"="vanilla",job="test-service"} 10

$ curl -H 'Accept: application/openmetrics-text' localhost:8889/metrics
# HELP run_total The number of times the iteration ran
# TYPE run_total counter
run_total{attr_A="chocolate.cake",attr_B="raspberry",attr_C="vanilla",job="test-service"} 10

Metrics that don't have UTF8 characters are exposed as is, metrics that have follow the new exposition format

exporter/prometheusremotewrite

TODO

Documentation:

@ArthurSens
Copy link
Member Author

I'm still confused if we just want to split the configuration of UTF-8 and suffixes into separate things of if we want to configure full metric normalization into one single configuration option 🤔

@ArthurSens
Copy link
Member Author

ArthurSens commented Sep 27, 2024

Hmmmm, I'm a bit confused with the direction now that I've read the translation package a bit more.

We already have a feature flag for normalization. We have two flags actually:

  • pkg.translator.prometheus.PermissiveLabelSantitization: Controls whether to change labels starting with _ to key_.
  • pkg.translator.prometheus.NormalizeName: Controls whether metrics names are automatically normalized to follow Prometheus naming convention

The first one is still Alpha, and is about not allowing labels starting with _. I wonder if this is still a problem now that UTF-8 is supported. I don't mean to remove the flag because Prometheus < 2.55.0 still have this problem, but maybe we can plan an eventual deprecation.

The second one is more relevant to the work here. It's in beta stage, which means it's turned on by default, and if I'm reading the code correctly the only difference it makes is adding unit/type suffixes to the code IF addMetricSuffixes is enabled as well. If suffixes are disabled, the feature flag still don't have control if forbidden runes are removed or not, it is always removed:

if addMetricSuffixes && normalizeNameGate.IsEnabled() {
return normalizeName(metric, namespace)
}
// Simple case (no full normalization, no units, etc.), we simply trim out forbidden chars
metricName = RemovePromForbiddenRunes(metric.Name())

So what's the plan here? Should we add yet another feature-flag? Or slightly change the behavior of the existing one to not drop UTF-8 characters when normalization is disabled?

If we decide to continue with the existing feature-gate, I believe our plan changes to eventually deprecate the flag instead of promoting to stable 🤔

@dashpole
Copy link
Contributor

At a high level, we should try and delegate + control all UTF-8 behavior through the new feature gate. pkg.translator.prometheus.NormalizeName should only add suffixes, and should use model.EscapeName() if we need to do any escaping. Ideally, the escaping scheme should be configurable.

@dashpole
Copy link
Contributor

Not sure what that implies for pkg.translator.prometheus.PermissiveLabelSantitization, since it depends on whether starting with a _ is allowed with UTF-8 (I don't think it is?).

@ArthurSens ArthurSens force-pushed the promexporter-allowutf8 branch from afc3592 to e382b41 Compare September 28, 2024 00:22
@ArthurSens ArthurSens force-pushed the promexporter-allowutf8 branch 4 times, most recently from 16ce50a to c6164f8 Compare September 28, 2024 00:35
Copy link
Member Author

@ArthurSens ArthurSens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed the implementation moving the new feature gate allowUTF8 to the translator package. The code is not even compiling right now, but I wanted to push the changes to share a few things I'm facing

}
tokens = append(tokens, metric.Name())

if addMetricSuffixes && normalizeNameGate.IsEnabled() {
Copy link
Member Author

@ArthurSens ArthurSens Sep 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part here feels awkward. Two feature flags that go against each other.

NormalizeNameGate is used to tell if the user wants Prometheus Normalization or not, which means utf8 turns into underscore, units become part of the metric name and type suffixes are also added.

If NormalizeName and AllowUTF8 are both enabled at the same time, I guess we implement partial normalization (allow utf8 but still add suffixes)? When adding the suffixes, should we use dots or underscores to separate from the metric name?

I'm inclining to not have yet another feature gate and just stop removing forbidden runes when normalizeNameGate is disabled 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The normalization of Otel metrics when exporting to Prometheus performs 3 operations:

  1. Remove unsupported runes in Prometheus
  2. Append the unit as a suffix
  3. Follow some additional Prometheus conventions (_total suffix for counters, etc.)

Allowing UTF8 means we can skip step 1 in the above list, and only this one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That means that we'll have different behaviors for NormalizeNameGate, right? Feels weird to me, I'd prefer that they don't interact with each other 😬

No strong opinions though, since NormalizeNameGate is on by default(beta) and the new gate will be off by default (alpha) we won't be introducing breaking changes immediately

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me we should have:

normalizeNameGate allowUTF8 Operations
false false Remove unsupported runes, don't follow Prometheus conventions
false true Do nothing, leave metric names as is
true false Current default behavior: remove unsupported runes in Prometheus 2.x, and follow Prometheus conventions, like unit suffixes, _total suffix, _ratio, etc.)
true true Let all runes in place AND follow Prometheus conventions: unit suffixes, _total , _ratio, etc.

Make sure the code behaves as in the above table and we should be fine! 😊

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just added a test that verifies their interaction :)

@ArthurSens ArthurSens force-pushed the promexporter-allowutf8 branch from c6164f8 to eaf5214 Compare October 1, 2024 17:40
@github-actions github-actions bot requested a review from rapphil October 1, 2024 17:40
@ArthurSens ArthurSens force-pushed the promexporter-allowutf8 branch from eaf5214 to d01f73e Compare October 1, 2024 17:46
@ArthurSens ArthurSens marked this pull request as ready for review October 1, 2024 17:49
@ArthurSens ArthurSens requested a review from a team as a code owner October 1, 2024 17:49
@ArthurSens ArthurSens force-pushed the promexporter-allowutf8 branch from d01f73e to 3bfbc3c Compare October 1, 2024 18:05
@ArthurSens ArthurSens requested a review from jsuereth as a code owner October 1, 2024 18:05
@github-actions github-actions bot added the exporter/googlemanagedprometheus Google Managed Prometheus exporter label Oct 1, 2024
@github-actions github-actions bot requested review from aabmass and damemi October 1, 2024 18:05
@dashpole
Copy link
Contributor

dashpole commented Oct 1, 2024

I got that whenever model.NameValidationScheme wasn't set to model.UTF8Validation and I passed a name with dots.

@ArthurSens
Copy link
Member Author

ArthurSens commented Oct 1, 2024

I got that whenever model.NameValidationScheme wasn't set to model.UTF8Validation and I passed a name with dots.

I guess using init() to check the feature-flag state isn't the best approach here. I can't tell what is initialized first

@ArthurSens ArthurSens marked this pull request as draft October 2, 2024 17:51
@ArthurSens ArthurSens force-pushed the promexporter-allowutf8 branch 2 times, most recently from 1f49d11 to e9f0df5 Compare October 3, 2024 19:20
@ArthurSens ArthurSens marked this pull request as ready for review October 3, 2024 19:20
@ArthurSens
Copy link
Member Author

ArthurSens commented Oct 3, 2024

Ok, should be ready for review again.

I've done manual tests with prometheus exporter and it's working:

cmdline:

make otelcontribcol
./bin/otelcontribcol_darwin_arm64 --feature-gates=pkg.translator.prometheus.allowUTF8 --config otelcol.yaml

configfile:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters:
  debug:
    verbosity: detailed
  prometheus:
    endpoint: 0.0.0.0:8889

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus, debug]

Otel metrics endpoin:

$ curl -H 'Accept: application/openmetrics-text; escaping=allow-utf-8' localhost:8889/metrics
# HELP "run.total" The number of times the iteration ran
# TYPE "run.total" counter
{"run.total","attr.A"="chocolate.cake","attr.B"="raspberry","attr.C"="vanilla",job="test-service"} 10

$ curl -H 'Accept: application/openmetrics-text' localhost:8889/metrics
# HELP run_total The number of times the iteration ran
# TYPE run_total counter
run_total{attr_A="chocolate.cake",attr_B="raspberry",attr_C="vanilla",job="test-service"} 10

I still want to test remotewriteexporter, hopefully still this week

@ArthurSens ArthurSens force-pushed the promexporter-allowutf8 branch 3 times, most recently from 2e05a62 to 72f14fc Compare October 4, 2024 17:02
@ArthurSens
Copy link
Member Author

Urgh, I was hoping to test remotewrite now, but after rebasing the PR now that 0.111.0 was released I can't build the binary anymore 🤔

arthursens$ make otelcontribcol
cd ./cmd/otelcontribcol && GO111MODULE=on CGO_ENABLED=0 go build -trimpath -o ../../bin/otelcontribcol_darwin_arm64 \
                -tags "" .
# github.com/open-telemetry/opentelemetry-collector-contrib/cmd/otelcontribcol
runtime.main_main·f: function main is undeclared in the main package
make: *** [otelcontribcol] Error 1

@ArthurSens ArthurSens force-pushed the promexporter-allowutf8 branch from 72f14fc to a56c8dd Compare October 4, 2024 17:09
@ArthurSens ArthurSens force-pushed the promexporter-allowutf8 branch from a56c8dd to 615b0aa Compare October 8, 2024 12:42
@ArthurSens
Copy link
Member Author

Just adding some context here: Initially, we thought that enabling UTF-8 was something that overrides the behaviors on all exporters that use the translation package. After discussion with the OpenTelemetry-Prometheus Working Group, we realized that that's not the case. The prometheus/common global variable modifies the validation made by the original package, but we can still control the translation itself.


I do remember that we wanted to explore allowing different exporters to provide config options that would control the translation, but I still have one doubt:

Should we have config+feature flag? Or just config?

@dashpole
Copy link
Contributor

I think it is fine to just have config. Maybe add a comment saying that it is experimental if you want.

@ArthurSens
Copy link
Member Author

I think it is fine to just have config. Maybe add a comment saying that it is experimental if you want.

Ok, I think I'll split this into two PRs then:

1 - Add optionality to the translator API.
2 - Update exporters with this extra option.

@ArthurSens
Copy link
Member Author

Here is the 1st PR: #35904

Copy link
Contributor

github-actions bot commented Nov 5, 2024

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Nov 5, 2024
@dashpole dashpole removed the Stale label Nov 11, 2024
@@ -60,6 +60,10 @@ Given the example, metrics will be available at `https://1.2.3.4:1234/metrics`.

OpenTelemetry metric names and attributes are normalized to be compliant with Prometheus naming rules. [Details on this normalization process are described in the Prometheus translator module](../../pkg/translator/prometheus/).

Prometheus 2.55.0 introduced support for UTF-8 characters behind the feature-flag `utf8-names`. Prometheus 3.0.0 and later accept UTF-8 by default. This means that name and attribute normalization is not required if you're using those versions. To allow UTF-8 characters to be exposed without normalization, start the collector with the feature gate: `--feature-gates=pkg.translator.prometheus.allow_utf8`.

The scraper must include `scaping=allow-utf-8` in the `Accept` header for UTF-8 characters to be exposed.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The scraper must include `scaping=allow-utf-8` in the `Accept` header for UTF-8 characters to be exposed.
The scraper must include `escaping=allow-utf-8` in the `Accept` header for UTF-8 characters to be exposed.

@bertysentry
Copy link
Contributor

I also suggest we make sure to follow Prometheus' now official guide on UTF-8 and OTLP metrics ingestion.

Their OTLP ingestion has a translation_strategy configuration variable, that can be set to:

  • UnderscoreEscapingWithSuffixes (the default, conforming to Prometheus 2.x)
  • NoUTF8EscapingWithSuffixes (the new behavior you're implementing here)

We should probably do something equivalent, instead of combining flags. Especially since the behavior in Prometheus will change again with no suffixing required once they add unit and description metadata.

@ArthurSens
Copy link
Member Author

We've discussed this a few times in the Otel's Prometheus SIG calls, we all prefer what you described but we were a bit concerned with breaking changes with the current config 😬

Recently we've also talked about moving the translator package into a separate package because it's has been super difficult to work and maintain this package and its duplicate in the Prometheus repository. We're seeing optimizations being done in only one side, different maintainers choosing different configuration options, etc.

Are you aware of the Prometheus SIG calls @bertysentry? As one of the code owners here your opinion would be super helpful :)

Copy link
Contributor

github-actions bot commented Dec 8, 2024

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Dec 8, 2024
Copy link
Contributor

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@github-actions github-actions bot closed this Dec 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[exporter/prometheus] Stop translating metrics to the traditional "Prometheus format"
5 participants