Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Instrumenting connectivity routines and export spans with OpenTelemetry #140

Closed
wants to merge 6 commits into from

Conversation

amircybersec
Copy link
Contributor

This is draft PR on instrumenting connectivity routines in connectivity.go (dial, read, write, etc) and export signals (traces and errors in this case) to an OTLP HTTP Receiver or STDOUT.

You can run tests in connectivity_test.go (> go test) to collect traces for various test cases.

To receive traces exported by OTLP span exporter, you can install Jaeger UI locally (or remotely) using the following docker command:

docker run -d --name jaeger -p 16686:16686 -p 4318:4318 jaegertracing/all-in-one:latest

The UI web dashboard can be access at http://localhost:16686

The screenshot below shows a captured trace:

Screenshot 2023-12-13 at 1 36 47 PM

@amircybersec amircybersec changed the title Instrumenting Connectivity Routines and export spans with OpenTelemetry Instrumenting connectivity routines and export spans with OpenTelemetry Dec 13, 2023
@amircybersec
Copy link
Contributor Author

@fortuna I refactored the code so that only instrumentation part of the code (minimum necessary) remains inside connectivity.go. This requires importing only one package go.opentelemetry.io/otel.

The rest of setup operations is done in the application code. I moved those into tracing_utils_test.go file which is only used in the tests.

Developers can follow a similar pattern showed in tracing_utils_test.go to setup their own traceProvider, Exporter, etc in the application code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can see in this file how otel brings significant bloat to the SDK. We need to make it independent of large frameworks.

@@ -22,6 +22,8 @@ import (
"syscall"
"time"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You’re still depending on otel. We’ll need to stick to it forever with this approach. Even though you’re not exposing the types, there’s dependency on the otel behavior and globals.

I suggest you create a new trace package that defines the interfaces we need. It can mimic otel or the standard library trace. Then the consumer can provide whatever implementation they prefer.

Copy link
Contributor Author

@amircybersec amircybersec Dec 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the packages added in go.mod are not used in the package code (SDK core) and are just used in the tests (connectivity_test.go and tracing_utils_test.go). When building a go app with the SDK, they will be excluded. Only the following three packages are required in the SDK:

go.opentelemetry.io/otel v1.21.0
go.opentelemetry.io/otel/metric v1.21.0 ---> we can probably get rid of this one too
go.opentelemetry.io/otel/trace v1.21.0

The rest will go into the application code but they are appearing in go.mod because they were used in tests.

I can investigate alternative approaches to further slim down the bloat.

Here's the full list of packages that are requires for go build and go install operations.

➜  x git:(amir-telemetry) ✗ go list -m all
github.com/Jigsaw-Code/outline-sdk/x
github.com/Jigsaw-Code/outline-sdk v0.0.10
github.com/creack/pty v1.1.9
github.com/davecgh/go-spew v1.1.1
github.com/eycorsican/go-tun2socks v1.16.11
github.com/go-logr/logr v1.3.0
github.com/go-logr/stdr v1.2.2
github.com/google/go-cmp v0.6.0
github.com/google/gopacket v1.1.19
github.com/kr/pretty v0.3.1
github.com/kr/pty v1.1.1
github.com/kr/text v0.2.0
github.com/miekg/dns v1.1.54
github.com/pkg/diff v0.0.0-20210226163009-20ebb0f2a09e
github.com/pmezard/go-difflib v1.0.0
github.com/riobard/go-bloom v0.0.0-20200614022211-cdc8013cb5b3
github.com/rogpeppe/go-internal v1.10.0
github.com/shadowsocks/go-shadowsocks2 v0.1.5
github.com/songgao/water v0.0.0-20190725173103-fd331bda3f4b
github.com/stretchr/objx v0.5.0
github.com/stretchr/testify v1.8.4
github.com/vishvananda/netlink v1.1.0
github.com/vishvananda/netns v0.0.0-20191106174202-0a2b9b5464df
github.com/yuin/goldmark v1.4.13
go.opentelemetry.io/otel v1.21.0
go.opentelemetry.io/otel/metric v1.21.0
go.opentelemetry.io/otel/trace v1.21.0
golang.org/x/crypto v0.14.0
golang.org/x/exp/shiny v0.0.0-20230817173708-d852ddb80c63
golang.org/x/image v0.11.0
golang.org/x/mobile v0.0.0-20230905140555-fbe1c053b6a9
golang.org/x/mod v0.12.0
golang.org/x/net v0.17.0
golang.org/x/sync v0.3.0
golang.org/x/sys v0.14.0
golang.org/x/term v0.13.0
golang.org/x/text v0.13.0
golang.org/x/tools v0.12.1-0.20230818130535-1517d1a3ba60
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c
gopkg.in/yaml.v3 v3.0.1

Removing the connectivity_test.go and tracing_utils_test.go and running go mod tidy removes most of the dependencies that are not needed for core instrumentation part of the SDK:

Screenshot 2023-12-16 at 11 58 18 AM

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fortuna there's specific language in OpenTelemetry Docs for our specific use-case:

"If you’re instrumenting a library, only install the OpenTelemetry API package for your language. Your library will not emit telemetry on its own. It will only emit telemetry when it is part of an app that uses the OpenTelemetry SDK."

conn, dialErr := connect(ctx)
defer dialSpan.End()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn’t be deferred. You should mark the end of the dial span here.

writeErr := dnsConn.WriteMsg(&dnsRequest)
defer writeSpan.End()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to fix all those deferred span ends

@@ -49,13 +51,29 @@ func (err *TestError) Unwrap() error {
// TestResolverStreamConnectivity uses the given [transport.StreamEndpoint] to connect to a DNS resolver and resolve the test domain.
// The context can be used to set a timeout or deadline, or to pass values to the dialer.
func TestResolverStreamConnectivity(ctx context.Context, resolver transport.StreamEndpoint, testDomain string) (time.Duration, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in the meeting, let's make this return the span information, something like:

  • Dial
    • start, end timestamp
    • error
  • Write
    • start, end timestamp
    • error
  • Read
    • start, end timestamp
    • error

Later, we can make users of the API report that in whatever system they want (including otel).

For now, let's remove all the otel logic. We can add to the app later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fortuna I am implementing this by defining Tracer and Span interfaces such that they will be compatible with otel types.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don’t really need that for now. We need to think about the data processing side. Will a genetic structure make it harder?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I am aiming for (i.e. generic span) with the caveat of making interfaces compatible with otel. It is almost done. I need to do some testing and will commit changes shortly. We can use existing report package for collecting the data or developers can use otel if they wish -- connectivity package won't have any dependencies on otel

@amircybersec
Copy link
Contributor Author

I am closing this PR for now. Perhaps we can revisit use of OpenTelmetry and/or spans down the road.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants