-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Instrumenting connectivity routines and export spans with OpenTelemetry #140
Conversation
@fortuna I refactored the code so that only instrumentation part of the code (minimum necessary) remains inside The rest of setup operations is done in the application code. I moved those into Developers can follow a similar pattern showed in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can see in this file how otel
brings significant bloat to the SDK. We need to make it independent of large frameworks.
@@ -22,6 +22,8 @@ import ( | |||
"syscall" | |||
"time" | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You’re still depending on otel. We’ll need to stick to it forever with this approach. Even though you’re not exposing the types, there’s dependency on the otel behavior and globals.
I suggest you create a new trace package that defines the interfaces we need. It can mimic otel or the standard library trace. Then the consumer can provide whatever implementation they prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the packages added in go.mod
are not used in the package code (SDK core) and are just used in the tests (connectivity_test.go
and tracing_utils_test.go
). When building a go app with the SDK, they will be excluded. Only the following three packages are required in the SDK:
go.opentelemetry.io/otel v1.21.0
go.opentelemetry.io/otel/metric v1.21.0 ---> we can probably get rid of this one too
go.opentelemetry.io/otel/trace v1.21.0
The rest will go into the application code but they are appearing in go.mod
because they were used in tests.
I can investigate alternative approaches to further slim down the bloat.
Here's the full list of packages that are requires for go build
and go install
operations.
➜ x git:(amir-telemetry) ✗ go list -m all
github.com/Jigsaw-Code/outline-sdk/x
github.com/Jigsaw-Code/outline-sdk v0.0.10
github.com/creack/pty v1.1.9
github.com/davecgh/go-spew v1.1.1
github.com/eycorsican/go-tun2socks v1.16.11
github.com/go-logr/logr v1.3.0
github.com/go-logr/stdr v1.2.2
github.com/google/go-cmp v0.6.0
github.com/google/gopacket v1.1.19
github.com/kr/pretty v0.3.1
github.com/kr/pty v1.1.1
github.com/kr/text v0.2.0
github.com/miekg/dns v1.1.54
github.com/pkg/diff v0.0.0-20210226163009-20ebb0f2a09e
github.com/pmezard/go-difflib v1.0.0
github.com/riobard/go-bloom v0.0.0-20200614022211-cdc8013cb5b3
github.com/rogpeppe/go-internal v1.10.0
github.com/shadowsocks/go-shadowsocks2 v0.1.5
github.com/songgao/water v0.0.0-20190725173103-fd331bda3f4b
github.com/stretchr/objx v0.5.0
github.com/stretchr/testify v1.8.4
github.com/vishvananda/netlink v1.1.0
github.com/vishvananda/netns v0.0.0-20191106174202-0a2b9b5464df
github.com/yuin/goldmark v1.4.13
go.opentelemetry.io/otel v1.21.0
go.opentelemetry.io/otel/metric v1.21.0
go.opentelemetry.io/otel/trace v1.21.0
golang.org/x/crypto v0.14.0
golang.org/x/exp/shiny v0.0.0-20230817173708-d852ddb80c63
golang.org/x/image v0.11.0
golang.org/x/mobile v0.0.0-20230905140555-fbe1c053b6a9
golang.org/x/mod v0.12.0
golang.org/x/net v0.17.0
golang.org/x/sync v0.3.0
golang.org/x/sys v0.14.0
golang.org/x/term v0.13.0
golang.org/x/text v0.13.0
golang.org/x/tools v0.12.1-0.20230818130535-1517d1a3ba60
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c
gopkg.in/yaml.v3 v3.0.1
Removing the connectivity_test.go
and tracing_utils_test.go
and running go mod tidy
removes most of the dependencies that are not needed for core instrumentation part of the SDK:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fortuna there's specific language in OpenTelemetry Docs for our specific use-case:
"If you’re instrumenting a library, only install the OpenTelemetry API package for your language. Your library will not emit telemetry on its own. It will only emit telemetry when it is part of an app that uses the OpenTelemetry SDK."
conn, dialErr := connect(ctx) | ||
defer dialSpan.End() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn’t be deferred. You should mark the end of the dial span here.
writeErr := dnsConn.WriteMsg(&dnsRequest) | ||
defer writeSpan.End() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to fix all those deferred span ends
@@ -49,13 +51,29 @@ func (err *TestError) Unwrap() error { | |||
// TestResolverStreamConnectivity uses the given [transport.StreamEndpoint] to connect to a DNS resolver and resolve the test domain. | |||
// The context can be used to set a timeout or deadline, or to pass values to the dialer. | |||
func TestResolverStreamConnectivity(ctx context.Context, resolver transport.StreamEndpoint, testDomain string) (time.Duration, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed in the meeting, let's make this return the span information, something like:
- Dial
- start, end timestamp
- error
- Write
- start, end timestamp
- error
- Read
- start, end timestamp
- error
Later, we can make users of the API report that in whatever system they want (including otel).
For now, let's remove all the otel logic. We can add to the app later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fortuna I am implementing this by defining Tracer and Span interfaces such that they will be compatible with otel types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don’t really need that for now. We need to think about the data processing side. Will a genetic structure make it harder?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what I am aiming for (i.e. generic span) with the caveat of making interfaces compatible with otel. It is almost done. I need to do some testing and will commit changes shortly. We can use existing report package for collecting the data or developers can use otel if they wish -- connectivity package won't have any dependencies on otel
I am closing this PR for now. Perhaps we can revisit use of OpenTelmetry and/or spans down the road. |
This is draft PR on instrumenting connectivity routines in connectivity.go (dial, read, write, etc) and export signals (traces and errors in this case) to an OTLP HTTP Receiver or STDOUT.
You can run tests in
connectivity_test.go
(> go test
) to collect traces for various test cases.To receive traces exported by OTLP span exporter, you can install Jaeger UI locally (or remotely) using the following docker command:
➜
docker run -d --name jaeger -p 16686:16686 -p 4318:4318 jaegertracing/all-in-one:latest
The UI web dashboard can be access at
http://localhost:16686
The screenshot below shows a captured trace: