Optimized stolarsky_mean
#2274
base: main
Conversation
Review checklist

This checklist is meant to assist creators of PRs (to let them know what reviewers will typically look for) and reviewers (to guide them in a structured review process). Items do not need to be checked explicitly for a PR to be eligible for merging.

Purpose and scope
Code quality
Documentation
Testing
Performance
Verification

Created with ❤️ by the Trixi.jl community.
Codecov Report

All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@           Coverage Diff            @@
##             main    #2274    +/-   ##
========================================
  Coverage   96.88%   96.88%
========================================
  Files         490      490
  Lines       39491    39496       +5
========================================
+ Hits        38260    38265       +5
  Misses       1231     1231

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
This is an interesting (and ingenious) way to rewrite the expression and possibly improve performance. Would it be worthwhile to also benchmark it on Roci (or some other machine) just to see its influence?
Co-authored-by: Andrew Winters <[email protected]>
Thanks!
Hendrik made me realize that for integers the
For real numbers:
For integers:
So, basically, just by precomputing the power functions a 50% speedup is gained.
Results on the university machine (Goldstein): new version:
old version:
There's still a roughly 17% improvement. Benchmarks for Goldstein:
Integers:
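For context, a minimal sketch of the rewrite discussed here (mirroring the change in the diff below): since y^gamma == y^(gamma - 1) * y, only two powers need to be computed and reused, instead of evaluating four powers in the original expression.

# Minimal sketch of the rewritten branch (standalone, not the full Trixi.jl function).
function stolarsky_mean_rewrite_sketch(x, y, gamma)
    xg = x^(gamma - 1)   # computed once and reused below
    yg = y^(gamma - 1)   # computed once and reused below
    # equivalent to (gamma - 1) / gamma * (y^gamma - x^gamma) / (y^(gamma - 1) - x^(gamma - 1))
    return (gamma - 1) * (yg * y - xg * x) / (gamma * (yg - xg))
end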
@@ -412,8 +412,15 @@ Given ε = 1.0e-4, we use the following algorithm.
         c3 = convert(RealT, -1 / 21) * (2 * gamma * (gamma - 2) - 9) * c2
         return 0.5f0 * (x + y) * @evalpoly(f2, 1, c1, c2, c3)
     else
-        return (gamma - 1) / gamma * (y^gamma - x^gamma) /
-               (y^(gamma - 1) - x^(gamma - 1))
+        if isinteger(gamma)
Suggested change:
-        if isinteger(gamma)
+        if gamma isa Integer

isinteger also checks the values of floating-point numbers.
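To illustrate the distinction (a small standalone sketch, not PR code): isinteger is a value check, while isa Integer is a type check, so only the latter guarantees that gamma actually has an integer type.

# Sketch: value check vs. type check.
isinteger(2.0)    # true  -- a Float64 holding an integer value passes the value check
isinteger(1.4)    # false
2.0 isa Integer   # false -- the type check only accepts integer types
2 isa Integer     # true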
That seems much faster, but I had to remove the specific parametrization. That would also imply changing the struct so that, for instance, 2 can be accepted instead of 2.0.
julia> @inline function stolarsky_mean_1(x::RealT, y::RealT, gamma::RealT) where {RealT <: Real}
epsilon_f2 = convert(RealT, 1.0e-4)
f2 = (x * (x - 2 * y) + y * y) / (x * (x + 2 * y) + y * y) # f2 = f^2
if f2 < epsilon_f2
# convenience coefficients
c1 = convert(RealT, 1 / 3) * (gamma - 2)
c2 = convert(RealT, -1 / 15) * (gamma + 1) * (gamma - 3) * c1
c3 = convert(RealT, -1 / 21) * (2 * gamma * (gamma - 2) - 9) * c2
return 0.5f0 * (x + y) * @evalpoly(f2, 1, c1, c2, c3)
else
if isinteger(gamma)
yg = y^(gamma-1)
xg = x^(gamma-1)
else
yg = exp((gamma-1)*log(y))
xg = exp((gamma-1)*log(x))
end
return (gamma - 1) * (yg*y - xg*x) / (gamma * ( yg - xg))
end
end
stolarsky_mean_1 (generic function with 1 method)
julia> @inline function stolarsky_mean_2(x::Real, y::Real, gamma::Real)
epsilon_f2 = convert(Real, 1.0e-4)
f2 = (x * (x - 2 * y) + y * y) / (x * (x + 2 * y) + y * y) # f2 = f^2
if f2 < epsilon_f2
# convenience coefficients
c1 = convert(Real, 1 / 3) * (gamma - 2)
c2 = convert(Real, -1 / 15) * (gamma + 1) * (gamma - 3) * c1
c3 = convert(Real, -1 / 21) * (2 * gamma * (gamma - 2) - 9) * c2
return 0.5f0 * (x + y) * @evalpoly(f2, 1, c1, c2, c3)
else
if gamma isa Integer
yg = y^(gamma-1)
xg = x^(gamma-1)
else
yg = exp((gamma-1)*log(y))
xg = exp((gamma-1)*log(x))
end
return (gamma - 1) * (yg*y - xg*x) / (gamma * (yg - xg) )
end
end
stolarsky_mean_2 (generic function with 1 method)
julia> @benchmark value = stolarsky_mean_1($(Ref(300.1))[], $(Ref(410.7))[], $(Ref(1.7))[])
BenchmarkTools.Trial: 10000 samples with 997 evaluations per sample.
Range (min … max): 19.647 ns … 27.185 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 19.982 ns ┊ GC (median): 0.00%
Time (mean ± σ): 20.086 ns ± 0.438 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▃▅▆▇████▇▆▃ ▁ ▁▁▁▁ ▂
▄▇███████████▇▆▄▄▅▆▅▅▅▆▆▅▄▄▅▆▆▇██▇█████████████▇▅▅▅▅▅▆▆▅▆▄▆ █
19.6 ns Histogram: log(frequency) by time 22.1 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> @benchmark value = stolarsky_mean_2($(Ref(300.1))[], $(Ref(410.7))[], $(Ref(1.7))[])
BenchmarkTools.Trial: 10000 samples with 997 evaluations per sample.
Range (min … max): 19.394 ns … 35.163 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 19.588 ns ┊ GC (median): 0.00%
Time (mean ± σ): 19.620 ns ± 0.326 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▃▅▇██▇▄
▂▂▃▅▇████████▅▃▂▂▂▁▁▁▁▂▂▁▁▂▁▂▁▁▂▂▁▁▂▁▁▂▁▁▁▁▁▁▁▁▁▂▂▂▂▂▂▂▂▂▂▂ ▃
19.4 ns Histogram: frequency by time 20.7 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> @benchmark value = stolarsky_mean_1($(Ref(300.1))[], $(Ref(410.7))[], $(Ref(1.0))[])
BenchmarkTools.Trial: 10000 samples with 999 evaluations per sample.
Range (min … max): 8.078 ns … 20.871 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 9.662 ns ┊ GC (median): 0.00%
Time (mean ± σ): 9.584 ns ± 0.374 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▁ ▂▂▅▄▆ ▄▃ ▃▅▅▅█▇ ▂ ▂▁▃▃▅ ▂
▂▂▃▆▄▆▄▇▆▇█▇▅▅▅▆▇█▇▆█████████▇▅██▄██████▄▆█▅▇█████▅▄▇▅▃▆▄▄ █
8.08 ns Histogram: log(frequency) by time 10.5 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> @benchmark value = stolarsky_mean_2($(Ref(300.1))[], $(Ref(410.7))[], $(Ref(1))[])
BenchmarkTools.Trial: 10000 samples with 1000 evaluations per sample.
Range (min … max): 2.500 ns … 7.277 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 2.536 ns ┊ GC (median): 0.00%
Time (mean ± σ): 2.543 ns ± 0.107 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▄ █ ▁▁▆ ▆ ▆ ▇ ▁
▁▁▁▁▁▁▁▁▃▂▂▃▃▇▅▅▅█▆▇▇██████████▇▇▇██▇▇▆█▆▆▆█▄▄▃▃▄▂▂▂▁▁▁▁▁ ▄
2.5 ns Histogram: frequency by time 2.57 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> @benchmark value = stolarsky_mean_1($(Ref(300.1))[], $(Ref(410.7))[], $(Ref(2.0))[])
BenchmarkTools.Trial: 10000 samples with 999 evaluations per sample.
Range (min … max): 9.267 ns … 17.994 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 10.538 ns ┊ GC (median): 0.00%
Time (mean ± σ): 10.502 ns ± 0.383 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▃▄▃ ▁▂▅▃▄ ▃ ▅▄▆▆█▂ ▂▂▂▂▅▂ ▂
▃▁▆▇▄▁▃▁▆▇▅▆█████████▅▆█▇███████▆█▇███████▅▇▆▆▆█▇▇█▇▅▆▇▆▅▆▆ █
9.27 ns Histogram: log(frequency) by time 11.8 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> @benchmark value = stolarsky_mean_2($(Ref(300.1))[], $(Ref(410.7))[], $(Ref(2))[])
BenchmarkTools.Trial: 10000 samples with 1000 evaluations per sample.
Range (min … max): 5.276 ns … 11.283 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 5.684 ns ┊ GC (median): 0.00%
Time (mean ± σ): 5.751 ns ± 0.224 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▇█▅ ▄▅▂ ▂
▄▁▁▁▁▁▁▁▁▁▆█▄▁▁▁▁▁▁▁▁▁▁████▄▁▁▁▁▁▁▃▁▇▇▅▄▁▁▁▃▃▁▁▅▆███▇▄▃▃▄▃ █
5.28 ns Histogram: log(frequency) by time 6.24 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
Can you please change the code

@inline stolarsky_mean(x::Real, y::Real, gamma::Real) = stolarsky_mean(promote(x, y, gamma)...)

a few lines above to

@inline stolarsky_mean(x::Real, y::Real, gamma::Real) = stolarsky_mean(promote(x, y)..., gamma)

and then check the gamma isa Integer option?
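For reference, a small standalone sketch (not from the PR) of what this change does: only x and y are promoted to a common floating-point type, while gamma keeps its own type.

# Sketch with assumed example values: promote only x and y, leave gamma untouched.
x, y, gamma = 300.1, 410.7f0, 2
xp, yp = promote(x, y)          # both become Float64
typeof.((xp, yp, gamma))        # (Float64, Float64, Int64): gamma keeps its type
typeof.(promote(x, y, gamma))   # (Float64, Float64, Float64): gamma would be promoted as well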
I get the following error when trying that. As far as I understand, that creates an infinite recursion: the parametrized method requires all three arguments to have the same type, so after promoting only x and y, a call with an integer gamma still dispatches to the generic Real method, which then calls itself instead of switching to the "main" one.
julia> function stolarsky_mean(x::RealT, y::RealT, gamma::RealT) where {RealT <: Real}
epsilon_f2 = convert(RealT, 1.0e-4)
f2 = (x * (x - 2 * y) + y * y) / (x * (x + 2 * y) + y * y) # f2 = f^2
if f2 < epsilon_f2
# convenience coefficients
c1 = convert(RealT, 1 / 3) * (gamma - 2)
c2 = convert(RealT, -1 / 15) * (gamma + 1) * (gamma - 3) * c1
c3 = convert(RealT, -1 / 21) * (2 * gamma * (gamma - 2) - 9) * c2
return 0.5f0 * (x + y) * @evalpoly(f2, 1, c1, c2, c3)
else
if gamma isa Integer
yg = y^(gamma - 1)
xg = x^(gamma - 1)
else
yg = exp((gamma - 1) * log(y)) # equivalent to y^(gamma - 1) but faster for non-integer exponents
xg = exp((gamma - 1) * log(x)) # equivalent to x^(gamma - 1) but faster for non-integer exponents
end
return (gamma - 1) / gamma * (yg * y - xg * x) /
(yg - xg)
end
end
stolarsky_mean (generic function with 2 methods)
julia> stolarsky_mean(x::Real, y::Real, gamma::Real) = stolarsky_mean(promote(x, y)..., gamma)
stolarsky_mean (generic function with 2 methods)
julia> @benchmark value = stolarsky_mean($(Ref(300.1))[], $(Ref(410.7))[], $(Ref(2))[])
ERROR: StackOverflowError:
Stacktrace:
[1] stolarsky_mean(x::Float64, y::Float64, gamma::Int64) (repeats 79984 times)
@ Main .\REPL[31]:1
julia> stolarsky_mean(301.2, 421.2, 2)
ERROR: StackOverflowError:
Stacktrace:
[1] stolarsky_mean(x::Float64, y::Float64, gamma::Int64) (repeats 79984 times)
@ Main .\REPL[31]:1
A workaround would be something like:
stolarsky_mean(x::Real, y::Real, gamma::Real) = stolarsky_mean(promote(x, y)..., gamma)
function stolarsky_mean(x::RealT, y::RealT, gamma::Real) where {RealT <: Real}
epsilon_f2 = convert(RealT, 1.0e-4)
f2 = (x * (x - 2 * y) + y * y) / (x * (x + 2 * y) + y * y) # f2 = f^2
if f2 < epsilon_f2
# convenience coefficients
c1 = convert(RealT, 1 / 3) * (gamma - 2)
c2 = convert(RealT, -1 / 15) * (gamma + 1) * (gamma - 3) * c1
c3 = convert(RealT, -1 / 21) * (2 * gamma * (gamma - 2) - 9) * c2
return 0.5f0 * (x + y) * @evalpoly(f2, 1, c1, c2, c3)
else
if gamma isa Integer
yg = y^(gamma - 1)
xg = x^(gamma - 1)
else
yg = exp((gamma - 1) * log(y)) # equivalent to y^(gamma - 1) but faster for non-integer exponents
xg = exp((gamma - 1) * log(x)) # equivalent to x^(gamma - 1) but faster for non-integer exponents
end
return (gamma - 1) / gamma * (yg * y - xg * x) /
(yg - xg)
end
end
but again, shouldn't I allow the equations struct to accept different types and not promote gamma?
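As a hedged sketch of what that could look like (all names below are hypothetical and not the actual Trixi.jl equations struct): gamma could get its own type parameter so it is stored without promotion.

# Hypothetical sketch only -- not the Trixi.jl struct definition.
struct MyPolytropicEquations{RealT <: Real, GammaT <: Real}
    kappa::RealT   # hypothetical additional parameter
    gamma::GammaT  # may be stored as an Integer (e.g. 2) or as a float (e.g. 1.4)
end

eq_int   = MyPolytropicEquations(1.0, 2)    # gamma stays an Int64
eq_float = MyPolytropicEquations(1.0, 1.4)  # gamma is a Float64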
What I meant was basically the second option:
julia> stolarsky_mean(x::Real, y::Real, gamma::Real) = stolarsky_mean(promote(x, y)..., gamma)
stolarsky_mean (generic function with 1 method)
julia> function stolarsky_mean(x::RealT, y::RealT, gamma::Real) where {RealT <: Real}
epsilon_f2 = convert(RealT, 1.0e-4)
f2 = (x * (x - 2 * y) + y * y) / (x * (x + 2 * y) + y * y) # f2 = f^2
if f2 < epsilon_f2
# convenience coefficients
c1 = convert(RealT, 1 / 3) * (gamma - 2)
c2 = convert(RealT, -1 / 15) * (gamma + 1) * (gamma - 3) * c1
c3 = convert(RealT, -1 / 21) * (2 * gamma * (gamma - 2) - 9) * c2
return 0.5f0 * (x + y) * @evalpoly(f2, 1, c1, c2, c3)
else
if gamma isa Integer
yg = y^(gamma - 1)
xg = x^(gamma - 1)
else
yg = exp((gamma - 1) * log(y)) # equivalent to y^(gamma - 1) but faster for non-integer exponents
xg = exp((gamma - 1) * log(x)) # equivalent to x^(gamma - 1) but faster for non-integer exponents
end
return (gamma - 1) * (yg * y - xg * x) / (gamma * (yg - xg))
end
end
stolarsky_mean (generic function with 2 methods)
julia> stolarsky_mean(300.1, 410.7, 2)
355.40000000000003
julia> stolarsky_mean(300.1, 410.7, 2.0)
355.39999999999986
julia> stolarsky_mean(300.1, 410.7f0, 2.0)
355.4000061035154
julia> stolarsky_mean(300.1, 410.7f0, 2)
355.40000610351564
Using this approach, gamma can be an Integer and will not be promoted.
Perfect, thank you! Sorry for the misunderstanding, I will introduce that.
return (gamma - 1) / gamma * (yg * y - xg * x) /
       (yg - xg)
Suggested change:
-        return (gamma - 1) / gamma * (yg * y - xg * x) /
-               (yg - xg)
+        return (gamma - 1) / gamma * (yg * y - xg * x) / (yg - xg)
Does this formatting work now?
And does it make a difference if you avoid an additional division by rewriting this part as something like
Suggested change:
-        return (gamma - 1) / gamma * (yg * y - xg * x) /
-               (yg - xg)
+        return (gamma - 1) * (yg * y - xg * x) / (gamma * (yg - xg))

?
It looks like there's not a big difference. It is slightly faster for reals, but slightly slower for integers.
julia> @inline function stolarsky_mean_1(x::RealT, y::RealT, gamma::RealT) where {RealT <: Real}
epsilon_f2 = convert(RealT, 1.0e-4)
f2 = (x * (x - 2 * y) + y * y) / (x * (x + 2 * y) + y * y) # f2 = f^2
if f2 < epsilon_f2
# convenience coefficients
c1 = convert(RealT, 1 / 3) * (gamma - 2)
c2 = convert(RealT, -1 / 15) * (gamma + 1) * (gamma - 3) * c1
c3 = convert(RealT, -1 / 21) * (2 * gamma * (gamma - 2) - 9) * c2
return 0.5f0 * (x + y) * @evalpoly(f2, 1, c1, c2, c3)
else
if isinteger(gamma)
yg = y^(gamma-1)
xg = x^(gamma-1)
else
yg = exp((gamma-1)*log(y))
xg = exp((gamma-1)*log(x))
end
return (gamma - 1)/gamma * (yg*y - xg*x) / (yg - xg)
end
end
stolarsky_mean_1 (generic function with 1 method)
julia> @inline function stolarsky_mean_2(x::RealT, y::RealT, gamma::RealT) where {RealT <: Real}
epsilon_f2 = convert(RealT, 1.0e-4)
f2 = (x * (x - 2 * y) + y * y) / (x * (x + 2 * y) + y * y) # f2 = f^2
if f2 < epsilon_f2
# convenience coefficients
c1 = convert(RealT, 1 / 3) * (gamma - 2)
c2 = convert(RealT, -1 / 15) * (gamma + 1) * (gamma - 3) * c1
c3 = convert(RealT, -1 / 21) * (2 * gamma * (gamma - 2) - 9) * c2
return 0.5f0 * (x + y) * @evalpoly(f2, 1, c1, c2, c3)
else
if isinteger(gamma)
yg = y^(gamma-1)
xg = x^(gamma-1)
else
yg = exp((gamma-1)*log(y))
xg = exp((gamma-1)*log(x))
end
return (gamma - 1) * (yg*y - xg*x) / (gamma * (yg - xg) )
end
end
stolarsky_mean_2 (generic function with 1 method)
julia> @benchmark value = stolarsky_mean_1($(Ref(300.1))[], $(Ref(410.7))[], $(Ref(2.0))[])
BenchmarkTools.Trial: 10000 samples with 999 evaluations per sample.
Range (min … max): 8.965 ns … 101.965 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 9.815 ns ┊ GC (median): 0.00%
Time (mean ± σ): 9.794 ns ± 1.531 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▁▂ ▃█▄ ▆▇ ▃ ▆▁ ▂▁ ▁
▃▂▃▃██▆▅▅███▇▇▇▆▄▄▅▅██▇▄▆▇██▆▅▇██▄▄▅▆██▅▆██▆▅▄▅▆▇▇▅▅▇▄▄▃▅▄▆ █
8.96 ns Histogram: log(frequency) by time 11.4 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> @benchmark value = stolarsky_mean_2($(Ref(300.1))[], $(Ref(410.7))[], $(Ref(2.0))[])
BenchmarkTools.Trial: 10000 samples with 999 evaluations per sample.
Range (min … max): 9.183 ns … 19.411 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 10.492 ns ┊ GC (median): 0.00%
Time (mean ± σ): 10.481 ns ± 0.363 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▁▂▁▁▁▅▄▄ ▄▂ ▁▁█▆▇▁ ▂▁▁ ▁▂ ▁ ▂
▄▁▁▁▃▅▄▄▃██▆▅▆█████████▇██▇██████▇██▇▇▆███▆▇█████▇▆▆▆▇▇▇▇▇▇ █
9.18 ns Histogram: log(frequency) by time 11.8 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> @benchmark value = stolarsky_mean_1($(Ref(300.1))[], $(Ref(410.7))[], $(Ref(1.4))[])
BenchmarkTools.Trial: 10000 samples with 997 evaluations per sample.
Range (min … max): 19.662 ns … 45.032 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 19.761 ns ┊ GC (median): 0.00%
Time (mean ± σ): 19.815 ns ± 0.562 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▃█▂
▂▄███▄▂▂▂▁▂▂▂▁▁▁▂▁▁▂▂▂▂▁▂▁▁▁▂▁▁▁▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂ ▂
19.7 ns Histogram: frequency by time 21.4 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> @benchmark value = stolarsky_mean_2($(Ref(300.1))[], $(Ref(410.7))[], $(Ref(1.4))[])
BenchmarkTools.Trial: 10000 samples with 997 evaluations per sample.
Range (min … max): 19.621 ns … 27.873 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 19.987 ns ┊ GC (median): 0.00%
Time (mean ± σ): 20.087 ns ± 0.432 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▁▄▆▇█████▇▅▂ ▁ ▁▁▂▁▁ ▃
▃▆████████████▇▆▃▆▆▅▅▄▅▆▆▄▆▆▅▆▇▇█▇████████████▇▆▆▅▃▆▅▄▆▄▅▄▆ █
19.6 ns Histogram: log(frequency) by time 22.1 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> @benchmark value = stolarsky_mean_1($(Ref(300.1))[], $(Ref(410.7))[], $(Ref(1.7))[])
BenchmarkTools.Trial: 10000 samples with 997 evaluations per sample.
Range (min … max): 19.658 ns … 27.584 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 19.765 ns ┊ GC (median): 0.00%
Time (mean ± σ): 19.884 ns ± 0.437 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▁▆█▇▄ ▁▁ ▁
██████▃▅▅▃▄▅▅▄▅▄▄▅▃▅▄▄▅▅▄▅▅▅▆▅▇██▆▇▆▇▆▇███▇█▆▅▅▂▃▅▅▄▅▅▆▃▄▃▄ █
19.7 ns Histogram: log(frequency) by time 21.8 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> @benchmark value = stolarsky_mean_2($(Ref(300.1))[], $(Ref(410.7))[], $(Ref(1.7))[])
BenchmarkTools.Trial: 10000 samples with 997 evaluations per sample.
Range (min … max): 19.642 ns … 42.111 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 19.983 ns ┊ GC (median): 0.00%
Time (mean ± σ): 20.010 ns ± 0.352 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▁▂▅▅▇█▇▇▅▂
▂▂▂▃▃▅▅▇███████████▅▃▂▂▂▂▁▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂ ▃
19.6 ns Histogram: frequency by time 21.2 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
I am mainly looking at the median results. There, the first option seems to be best all the time?
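As a side note, and only as a sketch assuming the same setup as the benchmarks above: BenchmarkTools can compare the median estimates of the two variants directly.

# Sketch: compare median estimates (assumes stolarsky_mean_1/2 are defined as above).
using BenchmarkTools, Statistics
t1 = @benchmark stolarsky_mean_1($(Ref(300.1))[], $(Ref(410.7))[], $(Ref(1.7))[])
t2 = @benchmark stolarsky_mean_2($(Ref(300.1))[], $(Ref(410.7))[], $(Ref(1.7))[])
judge(median(t1), median(t2))   # classifies the time difference as improvement, regression, or invariant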
I agree with you; I will also compare full runs to see if there are any noticeable differences.
Co-authored-by: Hendrik Ranocha <[email protected]>
The Stolarsky mean will come in handy in TrixiAtmo. Here is a faster version:
Simulation of EC polytropic Euler with the previous Stolarsky mean:
Results for the optimized version:
Thus, on my machine, there's an 18% improvement.