-
Notifications
You must be signed in to change notification settings - Fork 187
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Http client and server span default collection behavior for url.full
and url.query
attributes
#860
Comments
There is nothing officially documented, but I believe this #128 (comment) and #128 (comment) summarize the approach we follow now:
None of this is specific to |
Might be worthy to have dedicated columns for each attribute indicating the potential concerns from privacy/security perspective. For example:
|
This was discussed today in both the semantic convention and maintainer meetings. The feedback from these meetings was:
Two potential options stand out if we want to mention a specific sanitization format:
I will bring this discussion to the Specification meeting tomorrow in order to gather more community feedback, but please feel free to also post any concerns here. |
I looked through the semver spec and wasn't able to find verbiage that supports this. I may have just missed it, but in the interest of clarity can you please point me to the wording? While I agree with the end result, I think we should be clear on if this is something semver explicitly allows, or something we're explicitly violating semver for. If it is the latter, the communication of the change is even more important. In particular, we should be clear if this is going to be a policy for OTel moving forward. |
SemVer is specific to public API surface. If we try to apply it to semantic conventions then presence of the attribute with specific name and type is a public contract. The value of the attribute is not and it's allowed to change - https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/versioning-and-stability.md#semantic-conventions-stability |
In the maintainer meeting today, I (incorrectly) thought semver allowed breaking changes for security fixes, which is where @dyladan’s comment is coming from. Luckily though as @lmolkova points out, the OTel semantic conventions definition of stability specifically allows us to change attribute values 😅:
|
The OTel .NET instrumentation libraries breaking change is also resolved with this. (I was also incorrectly assuming security/critical bug fixes are allowed by semver without major v bump!) |
To my knowledge this is not true, the redaction processor was created specifically for this compliancy question |
My understanding of query parameters is that, while they can be abused to leak sensitive information, that is considered a mistake. Is there anything in the w3c that defines query parameter as a sensitive field? I would much rather redaction be an opt-in mode rather than the default. An opt-in option give users who want the extra level of security an option while not breaking end users who are in control of their instrumentation. |
Query params are commonly used for authentication. E.g. it's supported by AWS S3, Azure Blobs, Google Storage |
That feels like an edge case rather than the norm. While auth parameters can be used in query param, I have not heard that it is a norm. Those uses cases feel like a reason to make redaction an option, but not the standard. Any instrumentation for those uses cases could make a decision to opt-in to redaction by default. But I don't think every instrumentation library that records these http semantic conventions should be forced to redact by default via the semantic convention. |
I don't know that I'd agree that it's an edge case, but I would suggest that out of the universe of people using OpenTelemetry there are more cases of people who aren't passing in auth tokens (or, perhaps, who desire that functionality and would like those values to be unredacted) vs. people for whom it would be a surprise. I also feel like there are many existing solutions here in the ecosystem, e.g., redaction processor. Given that changing the default would be an explicit and implicit change in behavior that modifies years of precedent, I would be far more supportive of a new opt-in behavior to redact all values; Preferably, I would suggest that the spec requires all semconv attributes which contain potentially sensitive data to offer customizable redaction by default. |
I appreciate that this can be a source of leaking potentially sensitive information. Do we have some data or some other kind of evidence suggesting this is:
Right now, the justification for masking things because they could opt people into capture PII or sensitive information could extend to so many other things in the spec. For example, I talked with a customer last year who was concerned about how some of their URLs contain PII and they're captured automatically. They clearly got value out of capturing these values generally, but weren't aware of a way to redact them. The This basically met their expectations: they know there's inherent value in capturing this information, but they just need to know how to tweak controls so that things don't leave their network. I would propose that this should be the guiding principle about capturing values which may have PII or sensitive data in them by default. And if there are instrumentations that, by default, include sensitive data that they really ought not to, then we should fix those instrumentations. |
The cost of leak can be extremely high even though they are rare. One of the common use-cases is to drop instrumentation into production app (i.e. enable OTel using cloud provider CLI/UX for a running app). For this case, I believe safe defaults are more important than precise telemetry. In the same way that users can discover redaction processor (which requires them to run collector), they can discover that some details are missing and enable them. But I think that the distro should be able to change instrumentation behavior. This allows instrumentations to play safe and distro to provide the right configuration story and defaults specific to the vendor. |
Respectfully, I don't feel like a solution that requires everyone to create a distro because the reference implementation does unexpected things (or things that makes the reference implementation less useful) is a good one. That strikes me as a great way to actually blunt the adoption of OpenTelemetry and reduce the amount of contributor-hours we get from vendors, since they'll need to instead use that engineering time to create and maintain their own distribution. |
I want to better understand the valid use cases that are being used as justification for this change. Putting a password in a query string would be considered a mistake, and wouldn't, in my opinion, justify the redaction by default - you could easily write a password to a log body and we wouldn't consider redacting those. What about the authentication methods mentioned above (AWS S3, Azure Blobs, Google Storage) makes it safe to use query string for authentication but not to record the values? If it is because the tokens can expire, doesn't that make them safe to record? I am genuinely ignorant about these authentication methods and they are in conflict with my current mental model of URL security best practices. |
I also want to challenge the notion that this is not a breaking change as defined by the Spec stability conventions. The Spec states:
This change as it is being proposed will certainly impact specialized tool and will need user interventions. I see that the spec also states:
and I guess I want to challenge the application of that rule to this situation. My interpretation of that statement is that the semantic convention doesn't guarantee any specific values for attributes that are not specific list, i.e., we don't promise that |
Finally, if this change does get merged as is, we MUST evangelize this change broadly. The average user isn't going to care that the spec could be interpreted to allow this change as non-breaking. All they are gonna see is failed queries, missed triggers, and broken dashboards after an instrumentation upgrade. That is a horrible user experience. Adding something to the semantic convention release notes is not sufficient, we must enforce a rule that instrumentation release notes/change logs accurately reflect this change. |
Is there a provision for restoring the previous behavior, for all the active users who are depending upon the current standard values? |
having a consistent way for users to select their desired behavior seems important whether we change the default behavior or not there's a couple of related comments also on the PR:
|
+100 |
That's exactly what I experienced when we upgraded the AspNetCore instrumentation library from 1.8.0 to 1.8.1 🙃 |
Unfortunately, it's also possible to have |
I wonder why I have to set particular environment variables in order to opt out and there is no C# API (during the DI container configuration) which I could use instead? |
Area(s)
area:url
Is your change request related to a problem? Please describe.
The HTTP client and server spans have
url.full
andurl.query
attributes, respectively, tagged as Required. These attributes can contain sensitive information, as mentioned in the spec. Even though the spec states,Sensitive content provided SHOULD be scrubbed when instrumentations can identify it
, it isn't possible to identify this in a deterministic way.https://github.com/open-telemetry/semantic-conventions/blob/main/docs/http/http-spans.md
This is an issue because users can onboard to the instrumentation library without being aware of the data collected by the library, resulting in a leak of sensitive information.
Describe the solution you'd like
The attributes that have the possibility to contain sensitive information are not collected by default. Changing this now will be breaking change, an alternate is to have a consistent opt-out option defined in the specification to opt out of the default behavior.
Describe alternatives you've considered
As a mitigation, users can provide processors to scrub information. However this is likely to be a reactive thing, we should have a safe-by-default stance instead.
Additional context
Related issue: #128
The text was updated successfully, but these errors were encountered: