ODE update function generated using symbolics much slower than naive user implementation? #2372
Comments
This seems to be discretization dependent... if

```julia
julia> b_sym
BenchmarkTools.Trial: 10000 samples with 651 evaluations.
 Range (min … max):  188.833 ns …   2.616 μs  ┊ GC (min … max):  0.00% … 90.32%
 Time  (median):     196.710 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):   226.175 ns ± 218.425 ns  ┊ GC (mean ± σ):  10.78% ± 10.06%
 Memory estimate: 928 bytes, allocs estimate: 2.

julia> b_usr
BenchmarkTools.Trial: 10000 samples with 398 evaluations.
 Range (min … max):  243.714 ns …   9.236 μs  ┊ GC (min … max):  0.00% … 96.53%
 Time  (median):     249.991 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):   313.016 ns ± 642.649 ns  ┊ GC (mean ± σ):  17.87% ±  8.37%
 Memory estimate: 896 bytes, allocs estimate: 8.
```

Whereas any refinement of this gives, e.g.:

```julia
julia> b_sym
BenchmarkTools.Trial: 10000 samples with 9 evaluations.
 Range (min … max):  2.067 μs … 257.579 μs  ┊ GC (min … max): 0.00% … 97.63%
 Time  (median):     2.127 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.352 μs ±   3.887 μs  ┊ GC (mean ± σ):  2.80% ±  1.69%
 Memory estimate: 1.83 KiB, allocs estimate: 36.

julia> b_usr
BenchmarkTools.Trial: 10000 samples with 379 evaluations.
 Range (min … max):  257.026 ns …  11.460 μs  ┊ GC (min … max):  0.00% … 91.71%
 Time  (median):     271.844 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):   363.252 ns ± 631.237 ns  ┊ GC (mean ± σ):  18.52% ± 10.18%
 Memory estimate: 1.41 KiB, allocs estimate: 8.
```

And it seems the number of allocations in the symbolically generated (SG) function grows with the dimensionality of the refinement.
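(For context, a minimal sketch of how such a comparison might be set up with BenchmarkTools; `prob_sym`, `user_rhs!`, `u0`, and `p` are hypothetical names standing in for the generated `ODEProblem`, the hand-written RHS, the state vector, and the parameters.)

```julia
using BenchmarkTools

du = similar(u0)                          # pre-allocated output buffer

# in-place RHS generated by MTK/Symbolics, taken from the ODEProblem
b_sym = @benchmark $(prob_sym.f)($du, $u0, $p, 0.0)

# hand-written naive RHS with the same call signature
b_usr = @benchmark $user_rhs!($du, $u0, $p, 0.0)
```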
The change in behavior happens right at
I debugged through the entire function generation for both values. With no differences in function generation through MTK or Symbolics, I'll close this issue, as I suspect there is something deeper at work here.
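(One way to make such a comparison, as a hedged sketch: `generate_function` returns the expressions MTK builds for the RHS, assuming `sys` is the simplified `ODESystem`; the exact keyword arguments can vary between versions.)

```julia
using ModelingToolkit

# Returns a pair of expressions (out-of-place, in-place) for the RHS;
# these can be printed and diffed against the hand-written implementation.
f_oop, f_iip = generate_function(sys)
println(f_iip)
```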
Describe the bug 🐞
Question about MTK performance: I would like to use MTK for finite-volume work and wrote some simple code using upwind fluxes. The update equation is incredibly simple (a sketch of the kind of update is shown below), but there is an order-of-magnitude difference between the symbolically generated code and the naive user code.
I'd like to eventually use this for much more complicated functions (where the compiler can create more efficient implementations), but am currently trying to understand the origin of the difference in execution speed.
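(To make the setup concrete, here is a minimal sketch of the kind of naive first-order upwind update the issue refers to; the variable names, periodic boundary, and positive advection speed are assumptions, not the original MRE.)

```julia
# First-order upwind finite-volume update for du/dt + a*du/dx = 0,
# with a > 0, periodic boundary, and N cells of width dx (all names hypothetical).
function upwind_rhs!(du, u, p, t)
    a, dx = p
    N = length(u)
    @inbounds for i in 1:N
        im1 = i == 1 ? N : i - 1           # periodic left neighbour
        du[i] = -a * (u[i] - u[im1]) / dx  # upwind flux difference
    end
    return nothing
end
```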
Expected behavior
I expected `structural_simplify` to optimize the memory allocations. For example, the naive user code creates multiple temporary vectors and concatenates them to return the updated state; I want to leverage compiler knowledge before execution to reduce the temporaries created. In reality, the symbolically generated function allocates more than the user-defined function (and is also an order of magnitude slower).
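(A minimal sketch of how the symbolic counterpart might be built with MTK, again with hypothetical names and a small fixed `N`; the symbolic-array syntax is version-dependent, and `structural_simplify` plus `ODEProblem` then produce the generated in-place RHS whose performance is being compared.)

```julia
using ModelingToolkit

N = 8                                # number of cells (assumed)
@variables t
@parameters a dx
@variables u(t)[1:N]
D = Differential(t)
u = collect(u)                       # scalarize so each cell is an individual state

# Same periodic first-order upwind discretization, written as symbolic equations.
eqs = [D(u[i]) ~ -a * (u[i] - u[i == 1 ? N : i - 1]) / dx for i in 1:N]

@named sys = ODESystem(eqs, t)
sys = structural_simplify(sys)
prob_sym = ODEProblem(sys, u .=> rand(N), (0.0, 1.0), [a => 1.0, dx => 1 / N])
```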
Minimal Reproducible Example 👇
Output
Environment (please complete the following information):
- `using Pkg; Pkg.status()`
- `using Pkg; Pkg.status(; mode = PKGMODE_MANIFEST)`
- `versioninfo()`