-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Instrumenting connectivity routines and export spans with OpenTelemetry #140
Changes from all commits
fb26612
b787b58
a52168a
e979e15
1be2659
2c7d4d9
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -22,6 +22,8 @@ import ( | |
"syscall" | ||
"time" | ||
|
||
"go.opentelemetry.io/otel" | ||
|
||
"github.com/Jigsaw-Code/outline-sdk/transport" | ||
"github.com/miekg/dns" | ||
) | ||
|
@@ -49,13 +51,29 @@ func (err *TestError) Unwrap() error { | |
// TestResolverStreamConnectivity uses the given [transport.StreamEndpoint] to connect to a DNS resolver and resolve the test domain. | ||
// The context can be used to set a timeout or deadline, or to pass values to the dialer. | ||
func TestResolverStreamConnectivity(ctx context.Context, resolver transport.StreamEndpoint, testDomain string) (time.Duration, error) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As discussed in the meeting, let's make this return the span information, something like:
Later, we can make users of the API report that in whatever system they want (including otel). For now, let's remove all the otel logic. We can add to the app later. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @fortuna I am implementing this by defining Tracer and Span interfaces such that they will be compatible with otel types. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We don’t really need that for now. We need to think about the data processing side. Will a genetic structure make it harder? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's what I am aiming for (i.e. generic span) with the caveat of making interfaces compatible with otel. It is almost done. I need to do some testing and will commit changes shortly. We can use existing report package for collecting the data or developers can use otel if they wish -- connectivity package won't have any dependencies on otel |
||
return testResolver(ctx, resolver.Connect, testDomain) | ||
tracer := otel.Tracer("TestResolverStreamConnectivity") | ||
ctx, span := tracer.Start(ctx, "TestResolverStreamConnectivity") | ||
defer span.End() | ||
fmt.Println("TestResolverStreamConnectivity") | ||
duration, err := testResolver(ctx, resolver.Connect, testDomain) | ||
if err != nil { | ||
fmt.Println("TestResolverStreamConnectivity error") | ||
span.RecordError(err) | ||
} | ||
return duration, err | ||
} | ||
|
||
// TestResolverPacketConnectivity uses the given [transport.PacketEndpoint] to connect to a DNS resolver and resolve the test domain. | ||
// The context can be used to set a timeout or deadline, or to pass values to the listener. | ||
func TestResolverPacketConnectivity(ctx context.Context, resolver transport.PacketEndpoint, testDomain string) (time.Duration, error) { | ||
return testResolver(ctx, resolver.Connect, testDomain) | ||
tracer := otel.Tracer("TestResolverPacketConnectivity") | ||
ctx, span := tracer.Start(ctx, "TestResolverPacketConnectivity") | ||
defer span.End() | ||
duration, err := testResolver(ctx, resolver.Connect, testDomain) | ||
if err != nil { | ||
span.RecordError(err) | ||
} | ||
return duration, err | ||
} | ||
|
||
func isTimeout(err error) bool { | ||
|
@@ -75,6 +93,10 @@ func makeTestError(op string, err error) error { | |
} | ||
|
||
func testResolver[C net.Conn](ctx context.Context, connect func(context.Context) (C, error), testDomain string) (time.Duration, error) { | ||
tracer := otel.Tracer("testResolver") | ||
ctx, parentSpan := tracer.Start(ctx, "testResolver") | ||
defer parentSpan.End() | ||
|
||
deadline, ok := ctx.Deadline() | ||
if !ok { | ||
// Default deadline is 5 seconds. | ||
|
@@ -86,8 +108,11 @@ func testResolver[C net.Conn](ctx context.Context, connect func(context.Context) | |
} | ||
testTime := time.Now() | ||
testErr := func() error { | ||
ctx, dialSpan := tracer.Start(ctx, "dial") | ||
conn, dialErr := connect(ctx) | ||
defer dialSpan.End() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This shouldn’t be deferred. You should mark the end of the dial span here. |
||
if dialErr != nil { | ||
dialSpan.RecordError(dialErr) | ||
return makeTestError("dial", dialErr) | ||
} | ||
defer conn.Close() | ||
|
@@ -96,13 +121,19 @@ func testResolver[C net.Conn](ctx context.Context, connect func(context.Context) | |
|
||
var dnsRequest dns.Msg | ||
dnsRequest.SetQuestion(dns.Fqdn(testDomain), dns.TypeA) | ||
ctx, writeSpan := tracer.Start(ctx, "write") | ||
writeErr := dnsConn.WriteMsg(&dnsRequest) | ||
defer writeSpan.End() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You need to fix all those deferred span ends |
||
if writeErr != nil { | ||
writeSpan.RecordError(writeErr) | ||
return makeTestError("write", writeErr) | ||
} | ||
|
||
_, readSpan := tracer.Start(ctx, "read") | ||
_, readErr := dnsConn.ReadMsg() | ||
defer readSpan.End() | ||
if readErr != nil { | ||
readSpan.RecordError(readErr) | ||
// An early close on the connection may cause a "unexpected EOF" error. That's an application-layer error, | ||
// not triggered by a syscall error so we don't capture an error code. | ||
// TODO: figure out how to standardize on those errors. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,76 @@ | ||
// tracing_utils.go in the connectivity package | ||
|
||
package connectivity | ||
|
||
import ( | ||
"context" | ||
"encoding/json" | ||
"fmt" | ||
"log" | ||
|
||
"go.opentelemetry.io/otel" | ||
"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp" | ||
"go.opentelemetry.io/otel/sdk/resource" | ||
"go.opentelemetry.io/otel/sdk/trace" | ||
semconv "go.opentelemetry.io/otel/semconv/v1.4.0" | ||
// necessary imports | ||
) | ||
|
||
// JSONStdoutExporter, initTracing, and other related functions go here | ||
type JSONStdoutExporter struct{} | ||
|
||
// func (e *JSONStdoutExporter) ExportSpans(ctx context.Context, spans []trace.ReadOnlySpan) error { | ||
// for _, span := range spans { | ||
// fmt.Printf("Span: %s, Duration: %v\n", span.Name(), span.EndTime().Sub(span.StartTime())) | ||
// } | ||
// return nil | ||
// } | ||
|
||
func (e *JSONStdoutExporter) ExportSpans(ctx context.Context, spans []trace.ReadOnlySpan) error { | ||
for _, span := range spans { | ||
fmt.Printf("Span: %s, Duration: %v\n", span.Name(), span.EndTime().Sub(span.StartTime())) | ||
jsonSpan, err := json.Marshal(span) | ||
if err != nil { | ||
return err | ||
} | ||
fmt.Println(span) | ||
fmt.Println(string(jsonSpan)) | ||
} | ||
return nil | ||
} | ||
|
||
func (e *JSONStdoutExporter) Shutdown(ctx context.Context) error { | ||
// Perform any cleanup if necessary | ||
return nil | ||
} | ||
|
||
// exporter := &JSONStdoutExporter{} | ||
// exporter, err := stdouttrace.New(stdouttrace.WithPrettyPrint()) | ||
// exporter, err := otlptracegrpc.New(ctx, otlptracegrpc.WithEndpoint("localhost:4317")) | ||
|
||
func initTracing() *trace.TracerProvider { | ||
collectorURL := "localhost:4318" // Default URL if not specified | ||
|
||
ctx := context.Background() | ||
exporter, err := otlptracehttp.New( | ||
ctx, | ||
otlptracehttp.WithEndpoint(collectorURL), | ||
otlptracehttp.WithInsecure(), // Use WithTLSCredentials for a secure connection | ||
) | ||
if err != nil { | ||
log.Fatalf("failed to create exporter: %v", err) | ||
} | ||
tp := trace.NewTracerProvider( | ||
trace.WithBatcher(exporter), | ||
trace.WithSampler(trace.AlwaysSample()), | ||
trace.WithResource(resource.NewWithAttributes( | ||
semconv.SchemaURL, | ||
semconv.ServiceNameKey.String("Outline Connectivity Tester"), // Explicitly set service name | ||
// Add other attributes as needed | ||
)), | ||
// Additional configurations like resources, sampler, etc. | ||
) | ||
otel.SetTracerProvider(tp) | ||
|
||
return tp | ||
} |
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You can see in this file how |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You’re still depending on otel. We’ll need to stick to it forever with this approach. Even though you’re not exposing the types, there’s dependency on the otel behavior and globals.
I suggest you create a new trace package that defines the interfaces we need. It can mimic otel or the standard library trace. Then the consumer can provide whatever implementation they prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the packages added in
go.mod
are not used in the package code (SDK core) and are just used in the tests (connectivity_test.go
andtracing_utils_test.go
). When building a go app with the SDK, they will be excluded. Only the following three packages are required in the SDK:The rest will go into the application code but they are appearing in
go.mod
because they were used in tests.I can investigate alternative approaches to further slim down the bloat.
Here's the full list of packages that are requires for
go build
andgo install
operations.Removing the
connectivity_test.go
andtracing_utils_test.go
and runninggo mod tidy
removes most of the dependencies that are not needed for core instrumentation part of the SDK:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fortuna there's specific language in OpenTelemetry Docs for our specific use-case:
"If you’re instrumenting a library, only install the OpenTelemetry API package for your language. Your library will not emit telemetry on its own. It will only emit telemetry when it is part of an app that uses the OpenTelemetry SDK."