Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What to do about config options & schema versions #1483

Open
hannahramadan opened this issue Oct 15, 2024 · 10 comments
Open

What to do about config options & schema versions #1483

hannahramadan opened this issue Oct 15, 2024 · 10 comments
Labels

Comments

@hannahramadan
Copy link

Looking for some guidance.

The OTel Ruby agent is adding a configuration option (and I believe already has some) that are tied to older semconv naming conventions. When introducing new configs options and perhaps cleaning up old ones, I'm looking for guidance on what the config should be named/ if both should be recognized.

For example, a config would exist if users wanted to receive information on the table name of sql queries. The old semantic convention for this is db.sql.table, and the new name is db.collection.name. The configuration might look like db_sql_table and db_collection_name, which the option to omit or include.

Should we support both db_sql_table and db_collection_name and have them named as such or is there a better path forward.

@trask
Copy link
Member

trask commented Oct 15, 2024

hi @hannahramadan!

it sounds like OTEL_SEMCONV_STABILITY_OPT_IN is what you're look for, check out https://github.com/open-telemetry/semantic-conventions/blob/main/docs/database/README.md:

Existing database instrumentations that are using v1.24.0 of this document (or prior):

  • SHOULD NOT change the version of the database conventions that they emit by default until the database semantic conventions are marked stable. Conventions include, but are not limited to, attributes, metric and span names, and unit of measure.
  • SHOULD introduce an environment variable OTEL_SEMCONV_STABILITY_OPT_IN in the existing major version which is a comma-separated list of values. If the list of values includes:
    • database - emit the new, stable database conventions, and stop emitting the old experimental database conventions that the instrumentation emitted previously.
    • database/dup - emit both the old and the stable database conventions, allowing for a seamless transition.
    • The default behavior (in the absence of one of these values) is to continue emitting whatever version of the old experimental database conventions the instrumentation was emitting previously.
    • Note: database/dup has higher precedence than database in case both values are present
  • SHOULD maintain (security patching at a minimum) the existing major version for at least six months after it starts emitting both sets of conventions.
  • SHOULD drop the environment variable in the next major version.

@hannahramadan
Copy link
Author

Thank you for the quick response @trask ◡̈

Outside of controlling which conventions to use for all database conventions, we want to provide users with a config option whether to send a specific attribute or not. The concern is around tying config options to a specific schema version.

Should users be able to configure, for example, both db_sql_table: omit or db_collection_name: omit and have the attribute omitted, no matter the schema version? We were weary of needing to create so many configs for both schema versions.

@trask
Copy link
Member

trask commented Oct 15, 2024

We were weary of needing to create so many configs for both schema versions.

yeah, I'd personally suggest sticking with the simple OTEL_SEMCONV_STABILITY_OPT_IN, and users can always customize via span processors if they have really specific needs. this is what we're doing in Java at least.

@joaopgrassi joaopgrassi added the question Further information is requested label Oct 21, 2024
@joaopgrassi
Copy link
Member

joaopgrassi commented Oct 21, 2024

It seems like OTEL_SEMCONV_STABILITY_OPT_IN is the way to go, and also agree with @trask about using others ways to filter that the user have to control/configure instead of the SDK.

@hannahramadan does that help you? Can we close this issue?

@hannahramadan
Copy link
Author

Hi @trask, @joaopgrassi - We may be talking about different things or I'm misunderstanding—thanks for bearing with me!

From what I understand, OTEL_SEMCONV_STABILITY_OPT_IN is a way for users to decide which conventions they want to send: new, old, or both.

I'm curious about what should happen if someone wants to omit one specific attribute from an instrumentation via config.

In my example, collecting db.sql.table/ db.collection.name can have an impact on performance, so users should be able to turn on/off its collection. Downstream options are unideal because of the performance hit that would've already been taken.

I may be complicating the question, but what I'm wondering is it makes sense to create configs that reflect both semantic conventions. For example, both the following options should work to include this attribute (will be omitted by default), no matter what OTEL_SEMCONV_STABILITY_OPT_IN is configured like.

OpenTelemetry::SDK.configure do |c| 
  c.use 'OpenTelemetry::Instrumentation::Mysql2', { db_sql_table: :include }
end 
OpenTelemetry::SDK.configure do |c| 
  c.use 'OpenTelemetry::Instrumentation::Mysql2', { db_collection_name: :include }
end 

@trask
Copy link
Member

trask commented Oct 23, 2024

In my example, collecting db.sql.table/ db.collection.name can have an impact on performance, so users should be able to turn on/off its collection. Downstream options are unideal because of the performance hit that would've already been taken.

is this because you are parsing the sql statements?

if so, Java also has that concern also, and so we have a configuration property that can be used to opt-out of parsing (and sanitization): otel.instrumentation.jdbc.statement-sanitizer.enabled

I would definitely support something standardized for opt-ing out of database query parsing / sanitization

@hannahramadan
Copy link
Author

hannahramadan commented Oct 24, 2024

@trask - yes exactly. We are parsing sql statements.

To confirm, Java has a single config called statement-sanitizer, and in addition to sql sanitization, it also controls the parsing of sql for db.sql.table/ db.collection.name (and presumably anything else that needs to parse sql)?

@trask
Copy link
Member

trask commented Oct 24, 2024

To confirm, Java has a single config called statement-sanitizer, and in addition to sql sanitization, it also controls the parsing of sql for db.sql.table/ db.collection.name (and presumably anything else that needs to parse sql)?

correct (though not saying it's necessarily the best way of doing it, I think there's a bit of historical bias there b/c we started with it only affecting statement sanitization, and later we started extracting table and operation names at the same time as sanitization)

@hannahramadan
Copy link
Author

Thank you @trask! We (Ruby) are talking about how to tackle this and I'd be glad to followup if the consensus is something you're interested in. I'll close this issue and so appreciate the information and help!

@trask trask added area:db and removed question Further information is requested labels Nov 16, 2024
@trask
Copy link
Member

trask commented Nov 16, 2024

I'm going to re-open and link to #1450, as we want to start documenting config options in semconv

@trask trask reopened this Nov 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: No status
Development

No branches or pull requests

4 participants