@dotcom-reliability-kit/opentelemetry

An OpenTelemetry client that's preconfigured for drop-in use in FT apps. This module is part of FT.com Reliability Kit.

Tip

OpenTelemetry is an open source observability framework that supports sending metrics, traces, and logs to a large variety of backends via a shared protocol. We try to abstract some of these concepts away with this module, but understanding OpenTelemetry will help you get it set up.

Usage

Install @dotcom-reliability-kit/opentelemetry as a dependency:

npm install --save @dotcom-reliability-kit/opentelemetry

Setup

You can set up OpenTelemetry in a number of ways. Each has pros and cons, which we outline in the sections below.

Automated setup with --require

You can completely avoid code changes by setting up OpenTelemetry using the Node.js --require command-line option:

node --require @dotcom-reliability-kit/opentelemetry/setup ./my-app.js

This will import our setup script before any of your code. OpenTelemetry will be configured with environment variables.

For environments where you can't modify the node command directly (e.g. AWS Lambda) you'll need to specify this using the NODE_OPTIONS environment variable set to --require @dotcom-reliability-kit/opentelemetry/setup.

Pros:
  • You don't need to consider placement in your JavaScript
  • No application code needs to be modified
  • Config options are managed for you through environment variables

Cons:
  • It may be easy to accidentally remove the --require option

Automated setup with require()

If you can't use --require, e.g. because your tooling won't allow it, then you can include the setup script directly in your code:

import '@dotcom-reliability-kit/opentelemetry/setup';
// or
require('@dotcom-reliability-kit/opentelemetry/setup');

OpenTelemetry will be configured with environment variables.

Warning

This must be the first import/require statement in your application for OpenTelemetry to be set up correctly.

Pros:
  • Very little application code needs to be modified
  • Config options are managed for you through environment variables

Cons:
  • It could be easy to accidentally import something else before OpenTelemetry
  • Some code may not be instrumented correctly if the instrumentation is done asynchronously

Manual setup

If you'd like to customise the OpenTelemetry config further and have control over what runs, you can import the module in your code:

import * as opentelemetry from '@dotcom-reliability-kit/opentelemetry';
// or
const opentelemetry = require('@dotcom-reliability-kit/opentelemetry');

Call the setup function, passing in configuration options:

Warning

This must be the first function called in your application for OpenTelemetry to be set up correctly (including before other import/require statements).

opentelemetry.setup({ /* ... */ });

Pros:
  • You have fuller control over the configuration

Cons:
  • It could be easy to accidentally add code above the function call, before OpenTelemetry has been set up
  • Some code may not be instrumented correctly if the instrumentation is done asynchronously
  • You need to manage config options yourself, which may result in inconsistencies between apps

This method returns any SDK instances created during setup. Calling this method a second time will return the same instances without rerunning setup.
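
The "set up once, then return the same instances" behaviour described above can be pictured with a small sketch. This is an illustration only, not the package's source code: `createSetupOnce` and `createInstances` are hypothetical names, and the returned object shape is made up for the example.

```javascript
// Illustration only: `createSetupOnce` and `createInstances` are hypothetical
// names, and this is not the package's source code.
function createSetupOnce(createInstances) {
    let instances;
    return function setup(options) {
        // Only the first call does any work. In this sketch, later calls
        // (and their options) are ignored and the original instances are
        // returned unchanged.
        if (instances === undefined) {
            instances = createInstances(options);
        }
        return instances;
    };
}

let runs = 0;
const setup = createSetupOnce((options) => {
    runs += 1;
    return { sdk: { options } };
});

const first = setup({ logInternals: true });
const second = setup({ logInternals: false });

console.log(first === second); // true
console.log(runs); // 1
```

In practice this suggests that the options you pass on the first call are the ones that matter, so it's safest to call setup in exactly one place.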

Sending custom metrics

Many metrics are taken care of by OpenTelemetry's auto-instrumentation (e.g. HTTP request data), but you sometimes need to send your own metrics. We expose the OpenTelemetry getMeter method (documentation) which allows you to do this.

In your code, load in the getMeter function:

import { getMeter } from '@dotcom-reliability-kit/opentelemetry';
// or
const { getMeter } = require('@dotcom-reliability-kit/opentelemetry');

You can now use it in the same way as the built-in OpenTelemetry equivalent. For more information, see the OpenTelemetry Meter documentation.

// Assumes that `app` is an Express application instance
const meter = getMeter('my-app');
const hitCounter = meter.createCounter('my-app.hits');

app.get('/', (request, response) => {
    hitCounter.add(1);
    response.send('Thanks for visiting');
});
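
Counters are not the only instrument a meter can create. As a hedged sketch, the helper below times an async task and records the duration on a histogram. `recordDuration` is a hypothetical helper (it is not part of this package) and it accepts any object with a `record(value, attributes)` method, which is the shape of an OpenTelemetry histogram.

```javascript
// `recordDuration` is a hypothetical helper, not part of the package. It
// accepts any object with a `record(value, attributes)` method, which is the
// shape of an OpenTelemetry histogram.
async function recordDuration(histogram, attributes, task) {
    const start = performance.now();
    try {
        return await task();
    } finally {
        // Record elapsed milliseconds whether the task succeeded or failed
        histogram.record(performance.now() - start, attributes);
    }
}
```

You could pass it a histogram created with `getMeter('my-app').createHistogram('my-app.task.duration')`, where the metric name is only an example.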

Running in production

Production metrics

To send metrics in production, you'll need an API Gateway key and the URL of the FT's official metrics collector. You can find this information in Tech Hub.

See configuration options for information on how to pass the keys and URL into your app via environment variables.

Production tracing

Warning

Tracing is not supported centrally yet and these instructions assume your team or group will be setting up their own collector.

To use this package in production you'll need a Collector that can receive traces over HTTP. This could be something you run (e.g. the AWS Distro for OpenTelemetry) or a third-party service.

Having traces collected centrally will give you a good view of how your production application is performing, allowing you to debug issues more effectively.

OpenTelemetry can generate a huge amount of data which, depending on where you send it, can become very expensive. In production environments where you don't have control over the traffic volume of your app, you'll likely need to sample your traces. This package automatically samples traces (at 5% by default).
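
To make the effect of a sample percentage concrete, here is a simplified sketch of a ratio-based sampling decision derived from the trace ID. It assumes a trace-ID-based approach, which is one common way to sample, and it is not this package's implementation. `shouldSampleTrace` is a hypothetical helper.

```javascript
// Simplified illustration of ratio-based trace sampling. This is NOT the
// package's implementation; `shouldSampleTrace` is a hypothetical helper.
function shouldSampleTrace(traceId, samplePercentage) {
    // Clamp the percentage into the 0-100 range
    const percentage = Math.min(100, Math.max(0, samplePercentage));

    // Use the first 8 hex characters (32 bits) of the trace ID as a number
    // between 0 and 0xffffffff. Deriving the decision from the trace ID means
    // every span in the same trace gets the same answer.
    const bucket = Number.parseInt(traceId.slice(0, 8), 16);
    const threshold = (percentage / 100) * 0x100000000;

    return bucket < threshold;
}

// Trace IDs are 32 lowercase hex characters
console.log(shouldSampleTrace('00000000aaaaaaaaaaaaaaaaaaaaaaaa', 5)); // true
console.log(shouldSampleTrace('ffffffffaaaaaaaaaaaaaaaaaaaaaaaa', 5)); // false
console.log(shouldSampleTrace('ffffffffaaaaaaaaaaaaaaaaaaaaaaaa', 100)); // true
```

With evenly distributed trace IDs, a percentage of 5 keeps roughly one trace in twenty, and 100 keeps everything.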

Running locally

Local metrics

We don't recommend trying to get a metrics Collector set up locally, but you should still import OpenTelemetry in local development. If the environment variables are not present then we'll instrument all your code but not send anything. This means that what you run in development is closer to what you run in production.

Local tracing

If you want to debug specific performance issues then setting up a local Collector can help you. You shouldn't be sending traces in local development to your production backend as this could make it harder to debug real production issues. You probably also don't want to sample traces in local development – you'll want to collect all traffic because the volume will be much lower.

Running a backend

To view traces locally, you'll need a backend for them to be sent to. In this example we'll be using Jaeger via Docker. You'll need Docker (or a compatible alternative) to be set up first.

Jaeger maintains a useful guide for this.

Sending traces to your local backend

Once your backend is running you'll need to make some configuration changes.

You'll need to set the tracing endpoint to use Jaeger's tracing endpoint on port 4318 (OTLP/HTTP). E.g. http://localhost:4318/v1/traces.

You'll also need to disable sampling by setting the sample percentage to 100, so that every trace is exported.

Assuming you're using one of the automated setups, environment variables could be set like this:

OPENTELEMETRY_TRACING_ENDPOINT=http://localhost:4318/v1/traces \
OPENTELEMETRY_TRACING_SAMPLE_PERCENTAGE=100 \
npm start
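
If you use the manual setup instead, the same values can be passed as options. This is a minimal sketch using the documented tracing.endpoint and tracing.samplePercentage options. It assumes the package is installed and that Jaeger is listening on its default OTLP/HTTP port.

```javascript
// Must run before any other import/require statements in your application
const opentelemetry = require('@dotcom-reliability-kit/opentelemetry');

opentelemetry.setup({
    tracing: {
        // Jaeger's OTLP/HTTP endpoint from the example above
        endpoint: 'http://localhost:4318/v1/traces',
        // Export every trace rather than the default 5%
        samplePercentage: 100
    }
});
```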

Run your application and perform some actions. Open up the Jaeger interface (http://localhost:16686). You should start to see traces appear.

Implementation details

Here are some details about how we've implemented OpenTelemetry, to help you avoid gotchas and to document some of the decisions we made:

  • We don't send traces for paths that we frequently poll or that will create unnecessary noise/cost. We ignore paths like /__gtg, /__health, and /favicon.ico. For the full list, visit lib/index.js.

  • We don't instrument file system operations because we don't find these useful. If you would like traces for file system operations then let us know and we can add a configuration option.

  • This is less an implementation detail of ours and more a note on the OpenTelemetry Node.js SDK: native ES Modules cannot be auto-instrumented without the --experimental-loader Node.js option. Documentation is here.
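
The path filtering in the first point can be sketched as a simple lookup. This is a hypothetical illustration rather than the code in lib/index.js: `IGNORED_PATHS` and `isIgnoredPath` are made-up names, and only the three paths named above are included.

```javascript
// Hypothetical sketch: the three paths below come from the list above; the
// full list lives in lib/index.js and is not reproduced here.
const IGNORED_PATHS = new Set(['/__gtg', '/__health', '/favicon.ico']);

function isIgnoredPath(requestUrl) {
    // Request URLs may include a query string, so parse out just the pathname.
    // The base URL is only there so that relative URLs can be parsed.
    const { pathname } = new URL(requestUrl, 'http://localhost');
    return IGNORED_PATHS.has(pathname);
}

console.log(isIgnoredPath('/__gtg')); // true
console.log(isIgnoredPath('/__health?verbose=1')); // true
console.log(isIgnoredPath('/')); // false
```

Matching on the exact pathname, rather than on a prefix or substring, avoids accidentally dropping traces for application routes that merely contain one of these strings.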

Configuration options

Depending on the way you set up OpenTelemetry, you can configure it either via environment variables or via an options object.

For the automated setups (with --require or with require()) you'll need to use environment variables, e.g.

EXAMPLE=true npm start

For the manual setup, you'll need to use an options object, e.g.

opentelemetry.setup({
    example: true
});

options.authorizationHeader

Deprecated. This will still work but has been replaced with options.tracing.authorizationHeader, which is now the preferred way to set this option.

options.logInternals

Boolean indicating whether to log internal OpenTelemetry warnings and errors. Defaults to false.

options.metrics

An object containing other metrics-specific configurations. Defaults to undefined which means that OpenTelemetry metrics will not be sent.

options.metrics.endpoint

A URL to send OpenTelemetry metrics to. E.g. http://localhost:4318/v1/metrics. Defaults to undefined which means that OpenTelemetry metrics will not be sent.

Environment variable: OPENTELEMETRY_METRICS_ENDPOINT
Option: metrics.endpoint (String)

options.metrics.apiGatewayKey

Set the X-OTel-Key HTTP header in requests to the central API-Gateway-backed OpenTelemetry metrics collector. Defaults to undefined.

Environment variable: OPENTELEMETRY_API_GATEWAY_KEY
Option: metrics.apiGatewayKey (String)

options.tracing

An object containing other tracing-specific configurations. Defaults to undefined which means that OpenTelemetry traces will not be sent.

options.tracing.endpoint

A URL to send OpenTelemetry traces to. E.g. http://localhost:4318/v1/traces. Defaults to undefined which means that OpenTelemetry traces will not be sent.

Environment variable: OPENTELEMETRY_TRACING_ENDPOINT
Option: tracing.endpoint (String)

options.tracing.authorizationHeader

Set the Authorization HTTP header in requests to the OpenTelemetry tracing collector. Defaults to undefined.

Environment variable: OPENTELEMETRY_AUTHORIZATION_HEADER
Option: tracing.authorizationHeader (String)

options.tracing.samplePercentage

The percentage of traces to send to the exporter. Defaults to 5 which means that 5% of traces will be exported.

Environment variable: OPENTELEMETRY_TRACING_SAMPLE_PERCENTAGE
Option: tracing.samplePercentage (Number)
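
To summarise how the environment variables above line up with the options object, here is a hedged sketch. `optionsFromEnvironment` is a hypothetical helper and is not how the package's setup script is necessarily written. It only uses the variable and option names documented above.

```javascript
// Sketch of how the documented environment variables line up with the
// documented options. `optionsFromEnvironment` is a hypothetical helper, not
// part of the package.
function optionsFromEnvironment(env = process.env) {
    const options = {};

    if (env.OPENTELEMETRY_METRICS_ENDPOINT) {
        options.metrics = {
            endpoint: env.OPENTELEMETRY_METRICS_ENDPOINT,
            apiGatewayKey: env.OPENTELEMETRY_API_GATEWAY_KEY
        };
    }

    if (env.OPENTELEMETRY_TRACING_ENDPOINT) {
        options.tracing = {
            endpoint: env.OPENTELEMETRY_TRACING_ENDPOINT,
            authorizationHeader: env.OPENTELEMETRY_AUTHORIZATION_HEADER,
            // Environment variables are always strings, so convert to a number
            samplePercentage: env.OPENTELEMETRY_TRACING_SAMPLE_PERCENTAGE
                ? Number(env.OPENTELEMETRY_TRACING_SAMPLE_PERCENTAGE)
                : undefined
        };
    }

    return options;
}
```

With the manual setup, the result could be passed straight to `opentelemetry.setup()`, which keeps the naming consistent with apps that use the automated setups.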

OTEL_ environment variables

OpenTelemetry itself can be configured through OTEL_-prefixed environment variables (documentation).

Caution

We strongly advise against using these. The power of this module is consistency, and any application-specific changes should be considered carefully. If you use these environment variables we won't offer support if things break.

Migrating

Consult the Migration Guide if you're trying to migrate to a later major version of this package.

Contributing

See the central contributing guide for Reliability Kit.

License

Licensed under the MIT license.
Copyright © 2024, The Financial Times Ltd.