Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve trace_context performance #4721

Merged
merged 9 commits into from
Nov 29, 2023
Merged

Conversation

xiehuc
Copy link
Contributor

@xiehuc xiehuc commented Nov 16, 2023

Propagation is essential logic, regardless of whether it hits sampling. In the past, the implementation of Propagation did not prioritize memory management, allocating a lot of temporary heap memory. This caused its efficiency to be low. In some scenarios, too much CPU was wasted.
image
For example, even if it doesn't hit sampling, startServerSpan consumes 25% of the entire request CPU.

Typical server span Propagation:
image

Typical client span Propagation
image

It can be seen that the poor performance of Propagation is mainly due to regular expressions and Sprintf.

This CR can significantly improve Propagation by optimizing it in the following ways:

Use string.Builder to pre-allocate memory
Use hex.Decode to a stack-based [N]byte array instead of hex.DecodeString, which temporarily allocates a heap-based string.
Use strings.Cut instead of strings.Split
Avoid using regex
All unit tests pass. Benchmarks:

New logic:

goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/propagation
cpu: Intel(R) Xeon(R) Platinum 8255C CPU @ 2.50GHz
BenchmarkInject/SampledSpanContext-16            3489816               344.4 ns/op            96 B/op          3 allocs/op
BenchmarkInject/WithoutSpanContext-16           19206630                60.12 ns/op           16 B/op          1 allocs/op
BenchmarkExtract/Sampled-16                      2094048               576.1 ns/op           160 B/op          4 allocs/op
BenchmarkExtract/BogusVersion-16                 8635071               137.9 ns/op            16 B/op          1 allocs/op
BenchmarkExtract/FutureAdditionalData-16         2073709               577.8 ns/op           160 B/op          4 allocs/op
PASS
ok      go.opentelemetry.io/otel/propagation    7.692s

Old logic:

goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/propagation
cpu: Intel(R) Xeon(R) Platinum 8255C CPU @ 2.50GHz
BenchmarkInject/SampledSpanContext-16            1289794               926.0 ns/op           224 B/op         11 allocs/op
BenchmarkInject/WithoutSpanContext-16           18479284                60.12 ns/op           16 B/op          1 allocs/op
BenchmarkExtract/Sampled-16                       760831              1423 ns/op             320 B/op          6 allocs/op
BenchmarkExtract/BogusVersion-16                 6141056               191.4 ns/op            16 B/op          1 allocs/op
BenchmarkExtract/FutureAdditionalData-16          704563              1586 ns/op             320 B/op          6 allocs/op
PASS
ok      go.opentelemetry.io/otel/propagation    6.938s

It can be seen that the performance has improved by about 2-3 times.

* using strings.Builder instead of fmt.Sprint

* using strings.Cut instead of strings.Split

* using hex.Decode instead of hex.DecodeString

* avoid use regexp
Copy link

linux-foundation-easycla bot commented Nov 16, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

Copy link

codecov bot commented Nov 16, 2023

Codecov Report

Merging #4721 (770fb29) into main (204be61) will increase coverage by 0.1%.
The diff coverage is 100.0%.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##            main   #4721     +/-   ##
=======================================
+ Coverage   81.8%   81.9%   +0.1%     
=======================================
  Files        224     224             
  Lines      18113   18116      +3     
=======================================
+ Hits       14817   14847     +30     
+ Misses      3000    2982     -18     
+ Partials     296     287      -9     
Files Coverage Δ
propagation/trace_context.go 96.6% <100.0%> (+31.1%) ⬆️

Copy link
Member

@hanyuancheung hanyuancheung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM👍

@MrAlias
Copy link
Contributor

MrAlias commented Nov 16, 2023

Similar to #4722, please use benchstat to make comparisons between benchmarks in your description.

Also, the increases in allocations is concerning. The impact on GC is not captured in the benchmarks but will be when it is run on real systems.

@xiehuc
Copy link
Contributor Author

xiehuc commented Nov 17, 2023

Similar to #4722, please use benchstat to make comparisons between benchmarks in your description.

Also, the increases in allocations is concerning. The impact on GC is not captured in the benchmarks but will be when it is run on real systems.

allocation is on stack, less GC than old implement,

@xiehuc
Copy link
Contributor Author

xiehuc commented Nov 17, 2023

goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/propagation
cpu: Intel(R) Xeon(R) Platinum 8255C CPU @ 2.50GHz
                                │ /tmp/old.txt │            /tmp/new.txt             │
                                │    sec/op    │   sec/op     vs base                │
Inject/SampledSpanContext-16       916.0n ± 0%   341.2n ± 0%  -62.75% (p=0.000 n=10)
Inject/WithoutSpanContext-16       61.79n ± 1%   59.72n ± 1%   -3.35% (p=0.000 n=10)
Extract/Sampled-16                1398.0n ± 0%   574.8n ± 0%  -58.88% (p=0.000 n=10)
Extract/BogusVersion-16            190.4n ± 0%   138.6n ± 0%  -27.23% (p=0.000 n=10)
Extract/FutureAdditionalData-16   1553.0n ± 0%   574.8n ± 1%  -62.99% (p=0.000 n=10)
geomean                            471.9n        247.7n       -47.50%

                                │ /tmp/old.txt │             /tmp/new.txt             │
                                │     B/op     │    B/op     vs base                  │
Inject/SampledSpanContext-16       224.00 ± 0%   96.00 ± 0%  -57.14% (p=0.000 n=10)
Inject/WithoutSpanContext-16        16.00 ± 0%   16.00 ± 0%        ~ (p=1.000 n=10) ¹
Extract/Sampled-16                  320.0 ± 0%   160.0 ± 0%  -50.00% (p=0.000 n=10)
Extract/BogusVersion-16             16.00 ± 0%   16.00 ± 0%        ~ (p=1.000 n=10) ¹
Extract/FutureAdditionalData-16     320.0 ± 0%   160.0 ± 0%  -50.00% (p=0.000 n=10)
geomean                             89.90        57.51       -36.03%
¹ all samples are equal

                                │ /tmp/old.txt │             /tmp/new.txt             │
                                │  allocs/op   │ allocs/op   vs base                  │
Inject/SampledSpanContext-16       11.000 ± 0%   3.000 ± 0%  -72.73% (p=0.000 n=10)
Inject/WithoutSpanContext-16        1.000 ± 0%   1.000 ± 0%        ~ (p=1.000 n=10) ¹
Extract/Sampled-16                  6.000 ± 0%   4.000 ± 0%  -33.33% (p=0.000 n=10)
Extract/BogusVersion-16             1.000 ± 0%   1.000 ± 0%        ~ (p=1.000 n=10) ¹
Extract/FutureAdditionalData-16     6.000 ± 0%   4.000 ± 0%  -33.33% (p=0.000 n=10)
geomean                             3.308        2.169       -34.43%
¹ all samples are equal

@xiehuc
Copy link
Contributor Author

xiehuc commented Nov 17, 2023

original benchmark result
old.txt
new.txt

@xiehuc
Copy link
Contributor Author

xiehuc commented Nov 17, 2023

@MrAlias please check again, thanks

@MrAlias
Copy link
Contributor

MrAlias commented Nov 17, 2023

It seems like I was confusing old vs new in my last comment. The benchstat output was helpful in clarifying this.

propagation/trace_context.go Outdated Show resolved Hide resolved
propagation/trace_context.go Outdated Show resolved Hide resolved
propagation/trace_context.go Outdated Show resolved Hide resolved
propagation/trace_context.go Outdated Show resolved Hide resolved
@xiehuc
Copy link
Contributor Author

xiehuc commented Nov 18, 2023

@MrAlias all suggestion modified, please check again, thanks

propagation/trace_context.go Outdated Show resolved Hide resolved
@xiehuc
Copy link
Contributor Author

xiehuc commented Nov 29, 2023

newest code benchmark result

goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/propagation
cpu: Intel(R) Xeon(R) Platinum 8255C CPU @ 2.50GHz
                                │ /tmp/old.txt │            /tmp/new2.txt            │
                                │    sec/op    │   sec/op     vs base                │
Inject/SampledSpanContext-16       916.0n ± 0%   335.7n ± 1%  -63.35% (p=0.000 n=10)
Inject/WithoutSpanContext-16       61.79n ± 1%   60.83n ± 0%   -1.55% (p=0.000 n=10)
Extract/Sampled-16                1398.0n ± 0%   586.4n ± 1%  -58.05% (p=0.000 n=10)
Extract/BogusVersion-16            190.4n ± 0%   137.0n ± 0%  -28.07% (p=0.000 n=10)
Extract/FutureAdditionalData-16   1553.0n ± 0%   579.0n ± 0%  -62.72% (p=0.000 n=10)
geomean                            471.9n        248.6n       -47.32%

                                │ /tmp/old.txt │            /tmp/new2.txt             │
                                │     B/op     │    B/op     vs base                  │
Inject/SampledSpanContext-16       224.00 ± 0%   96.00 ± 0%  -57.14% (p=0.000 n=10)
Inject/WithoutSpanContext-16        16.00 ± 0%   16.00 ± 0%        ~ (p=1.000 n=10) ¹
Extract/Sampled-16                  320.0 ± 0%   160.0 ± 0%  -50.00% (p=0.000 n=10)
Extract/BogusVersion-16             16.00 ± 0%   16.00 ± 0%        ~ (p=1.000 n=10) ¹
Extract/FutureAdditionalData-16     320.0 ± 0%   160.0 ± 0%  -50.00% (p=0.000 n=10)
geomean                             89.90        57.51       -36.03%
¹ all samples are equal

                                │ /tmp/old.txt │            /tmp/new2.txt             │
                                │  allocs/op   │ allocs/op   vs base                  │
Inject/SampledSpanContext-16       11.000 ± 0%   3.000 ± 0%  -72.73% (p=0.000 n=10)
Inject/WithoutSpanContext-16        1.000 ± 0%   1.000 ± 0%        ~ (p=1.000 n=10) ¹
Extract/Sampled-16                  6.000 ± 0%   4.000 ± 0%  -33.33% (p=0.000 n=10)
Extract/BogusVersion-16             1.000 ± 0%   1.000 ± 0%        ~ (p=1.000 n=10) ¹
Extract/FutureAdditionalData-16     6.000 ± 0%   4.000 ± 0%  -33.33% (p=0.000 n=10)
geomean                             3.308        2.169       -34.43%
¹ all samples are equal

@xiehuc
Copy link
Contributor Author

xiehuc commented Nov 29, 2023

new2.txt

@xiehuc
Copy link
Contributor Author

xiehuc commented Nov 29, 2023

@MrAlias all comment fixed, please check again, thanks

@MrAlias MrAlias merged commit 0405492 into open-telemetry:main Nov 29, 2023
24 checks passed
@MrAlias MrAlias added this to the v1.22.0 milestone Jan 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants