OpenTelemetry support #5685

Open
knp-sap opened this issue Dec 6, 2024 · 3 comments
Labels: triage/in-progress (Issue triage is in progress)

Comments

@knp-sap

knp-sap commented Dec 6, 2024

As an operator, I would like to be able to ingest the SPIRE Server's audit logs via OpenTelemetry.

  • Subsystem: server
rturner3 added the triage/in-progress label Dec 10, 2024
@rturner3
Collaborator

@knp-sap Just to clarify, is there something that is blocking a SPIRE user from ingesting SPIRE audit logs with an OpenTelemetry log collector?

There are some open questions in my mind:

  • What changes would need to go into SPIRE to support OpenTelemetry? Does this need to be done in SPIRE code or can it be solved with a separate log scraper that enriches/reformats SPIRE logs to match the required format?
  • Are logs sent synchronously or asynchronously using OpenTelemetry? Trying to understand the potential performance impact on SPIRE.
  • Would adding support for some custom log fields also solve this problem?

@knp-sap
Author

knp-sap commented Dec 11, 2024

> is there something that is blocking a SPIRE user from ingesting SPIRE audit logs with an OpenTelemetry log collector?

No, a user can leverage the File Log Receiver to collect the logs. Unfortunately, this is not accepted in my organization due to compliance reasons (e.g., container logs not being up to the standard of audit logs).

> What changes would need to go into SPIRE to support OpenTelemetry? Does this need to be done in SPIRE code ... ?

The SPIRE code needs to be changed.

An MVP for the audit logs could be:

  • Implementing a custom logrus hook that sends the logs via HTTP or gRPC.
  • SPIRE users being able to configure the target that receives the audit logs.

A proper implementation would be to actually use the OpenTelemetry APIs and SDKs (https://opentelemetry.io/docs/languages/go/getting-started/).

> ... can it be solved with a separate log scraper that enriches/reformats SPIRE logs to match the required format?

I wouldn't say it's about the format of the SPIRE Server's audit logs.

> Are logs sent synchronously or asynchronously using OpenTelemetry? Trying to understand the potential performance impact on SPIRE.

It depends on the implementation, but an asynchronous setup should be possible.
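
As a rough illustration of what an asynchronous setup could look like, the sketch below puts a bounded queue between the logging call and the sender. It is an assumption about one possible design, not how SPIRE or the OpenTelemetry SDK works. `AsyncSink`, `NewAsyncSink`, `Enqueue`, and `Dropped` are hypothetical names, and `send` stands for whatever actually ships the record.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// AsyncSink decouples callers from a slow sender using a bounded queue.
type AsyncSink struct {
	dropped int64 // accessed atomically; kept first for 64-bit alignment
	queue   chan string
	wg      sync.WaitGroup
}

// NewAsyncSink starts one background worker that passes queued lines to send.
func NewAsyncSink(capacity int, send func(string)) *AsyncSink {
	s := &AsyncSink{queue: make(chan string, capacity)}
	s.wg.Add(1)
	go func() {
		defer s.wg.Done()
		for line := range s.queue {
			send(line)
		}
	}()
	return s
}

// Enqueue never blocks: when the queue is full it counts a drop and returns false.
// It must not be called after Close.
func (s *AsyncSink) Enqueue(line string) bool {
	select {
	case s.queue <- line:
		return true
	default:
		atomic.AddInt64(&s.dropped, 1)
		return false
	}
}

// Dropped reports how many lines were rejected because the queue was full.
func (s *AsyncSink) Dropped() int64 { return atomic.LoadInt64(&s.dropped) }

// Close stops accepting lines and waits until the queued ones have been sent.
func (s *AsyncSink) Close() {
	close(s.queue)
	s.wg.Wait()
}
```

The trade-off in this design is explicit: the caller is never slowed down, but a full queue means records are lost, which may or may not be acceptable for audit logs.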

> Would adding support for some custom log fields also solve this problem?

No.

@rturner3
Collaborator

rturner3 commented Jan 7, 2025

> Unfortunately, this is not accepted in my organization due to compliance reasons (e.g., container logs not being up to the standard of audit logs).

Hey @knp-sap, I just want to make sure I understand the constraints you're dealing with. Could you elaborate on this a bit? Are you saying that filesystem ACLs on your log files don't provide granular enough authorization in your environment, i.e., that the audit log files can't be trusted?

I have some concerns about the performance and reliability impact of publishing logs to an HTTP/gRPC endpoint:

  • In larger-scale environments, this could amount to a significant volume of log publishing requests.
  • If the log publishing endpoint is down for some reason, how would SPIRE handle it? For example, would audit logs be dropped after a certain number of retries, making the audit trail lossy?
  • If the log publishing endpoint is slow to respond (maybe it is running on a remote host), it could also impact the performance of SPIRE and the availability of audit logs.

I've typically seen OS streams or files used as the output medium for log data since they're more reliable/performant than network communication.
