Improved Telemetry Processing in Azure Functions #9961

Open
RohitRanjanMS opened this issue Apr 2, 2024 · 1 comment
@RohitRanjanMS
Member

This proposal outlines a new default approach to logging and telemetry capture in Azure Functions. It addresses the challenges and gaps in the current mechanism and aims to improve privacy compliance, improve the customer experience, and capture richer telemetry.

Challenges and Gaps

Azure Functions allows users to run event-triggered code without explicitly provisioning or managing infrastructure. While Azure Functions offers robust logging and monitoring capabilities, its current default implementation has known challenges, particularly in how it processes customer-generated telemetry.

Although customers can set up Application Insights directly on the worker to bypass the issues linked to the standard logging method, they frequently, perhaps unwittingly, opt for the default behavior.

In the default mode, the host process in Azure Functions is responsible for capturing and processing all logs, including those generated by the worker process and customer code. These logs are then sent to configured sinks such as Application Insights, Azure Monitor, and others. This architecture, while functional, presents several challenges:

  • Privacy and Compliance Risks: Processing customer data within the host poses privacy and compliance risks, contrary to the principle of minimizing data processing.
  • Degraded Customer Experience: The current logging mechanism can result in delayed log visibility, inaccuracies in timestamps, lack of support for distributed traces, and potential sequence inconsistencies.
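
The timestamp and visibility problems listed above can be illustrated with a toy model. This is only a sketch: it assumes, purely for illustration, that a relay stamps records when it flushes them rather than when they were emitted. The issue does not state that this is how the host behaves, and `relay_through_host`, `emit_directly`, and `FakeClock` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Record:
    message: str
    emitted_at: float   # when customer code produced the log
    recorded_at: float  # timestamp the sink ends up storing


class FakeClock:
    """Deterministic clock so the example does not depend on wall time."""

    def __init__(self) -> None:
        self.now = 0.0

    def advance(self, seconds: float) -> None:
        self.now += seconds


def relay_through_host(
    emitted: List[Tuple[str, float]], clock: FakeClock, flush_delay: float
) -> List[Record]:
    """Hypothetical relay: records are stamped at flush time (an assumption)."""
    clock.advance(flush_delay)  # logs sit in a buffer before being forwarded
    return [Record(msg, ts, clock.now) for msg, ts in emitted]


def emit_directly(emitted: List[Tuple[str, float]]) -> List[Record]:
    """Direct emission: the stored timestamp is the emission timestamp."""
    return [Record(msg, ts, ts) for msg, ts in emitted]


clock = FakeClock()
emitted: List[Tuple[str, float]] = []
for msg in ("start", "step 1", "done"):
    emitted.append((msg, clock.now))
    clock.advance(1.0)  # one second of work between log lines

relayed = relay_through_host(emitted, clock, flush_delay=5.0)
direct = emit_directly(emitted)

relayed_drift = [r.recorded_at - r.emitted_at for r in relayed]
direct_drift = [r.recorded_at - r.emitted_at for r in direct]
```

In this model every relayed record carries the flush time, so the three log lines become indistinguishable in time and their original spacing is lost, while the directly emitted records keep the timestamps from the moment they were produced.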


Goal

To overcome these challenges, a new default logging and telemetry mechanism is proposed. This mechanism will:

  • Direct Telemetry Emission: Use auto-instrumentation or code-less agents within the worker process to emit telemetry and logs directly to Application Insights (AI) or OpenTelemetry (OTel) endpoints, so that the host process no longer needs to handle customer data.
  • Enhanced Telemetry and Tracing: Support rich telemetry capture and distributed tracing by building on the capabilities of Application Insights and OpenTelemetry, giving customers more detailed and accurate insights into their applications.
  • Retention of Local Development Experience: Ensure that the new mechanism retains compatibility with local development practices, allowing developers to continue using familiar tools and processes.
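
As a rough sketch of what direct emission from the worker could look like, the following uses only Python's standard `logging` module. It is a hand-written stand-in for what an auto-instrumentation agent might wire up, not the proposed implementation: `DirectExportHandler` and `InMemoryEndpoint` are hypothetical names, and the in-memory list takes the place of a real Application Insights or OpenTelemetry exporter.

```python
import logging
from typing import Dict, List


class InMemoryEndpoint:
    """Stand-in for a telemetry endpoint (hypothetical; a real one would be a network exporter)."""

    def __init__(self) -> None:
        self.received: List[Dict[str, object]] = []

    def send(self, payload: Dict[str, object]) -> None:
        self.received.append(payload)


class DirectExportHandler(logging.Handler):
    """Hypothetical handler that ships records from the worker straight to an endpoint."""

    def __init__(self, endpoint: InMemoryEndpoint) -> None:
        super().__init__()
        self.endpoint = endpoint

    def emit(self, record: logging.LogRecord) -> None:
        self.endpoint.send({
            "message": record.getMessage(),
            "level": record.levelname,
            "timestamp": record.created,  # set when the record was created in the worker
            "logger": record.name,
        })


endpoint = InMemoryEndpoint()
logger = logging.getLogger("worker.sketch")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep records from also flowing to root-logger handlers
logger.addHandler(DirectExportHandler(endpoint))

# Customer code keeps using the ordinary logging API.
logger.info("order %s processed", 42)
```

Customer code is unchanged in this sketch. Only the handler attached in the worker decides where records go, which is also why the same code could keep writing to a console handler during local development.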

Benefits

  • Improved Privacy and Compliance: By minimizing the processing of customer data by the host, the proposed solution addresses key privacy and compliance concerns.
  • Real-time Telemetry and Insights: Direct emission of telemetry enables real-time monitoring and insights, enhancing the customer experience, especially for long-running functions.
  • Enhanced Tracing and Debugging: Support for distributed tracing improves troubleshooting and monitoring of complex, distributed applications.
@jviau
Contributor

jviau commented Apr 8, 2024

@RohitRanjanMS - how does this issue differ from #9273? Can we consolidate down to one issue? If they are indeed separate, we should open a telemetry epic issue and add a task list for all the individual telemetry issues.
