Implement subcell limiting for non-conservative systems #1670

Merged
merged 49 commits into from
Oct 31, 2023


amrueda
Contributor

@amrueda amrueda commented Oct 10, 2023

This PR extends the subcell limiting strategies implemented in #1476 to non-conservative systems.

To apply subcell limiting, the DGSEM is written in flux-differencing form. For non-conservative systems, this form requires each non-conservative term to be expressed as the product of a local factor and a symmetric factor. I have included an example using the GLM-MHD equations, where the Powell and GLM non-conservative terms have been recast in the form "local * symmetric".
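
To illustrate why the "local * symmetric" form matters, here is a minimal 1D sketch in plain Python. It is not the code of this PR: the two-point functions `flux_symmetric`, `noncons_local`, and `noncons_symmetric` are hypothetical placeholders, and the operator is the standard 3-node Legendre-Gauss-Lobatto one. The sketch assumes "local" means "depends on one node only" and "symmetric" means "unchanged when the two nodes are swapped". Under these assumptions, the symmetric factor can be accumulated into staggered (subcell interface) values that telescope to zero, and the local factor is applied afterwards, which gives different left and right values at each subcell interface.

```python
# Illustrative sketch only (plain Python, not Trixi.jl code).
# 3 Legendre-Gauss-Lobatto nodes (polydeg = 2) on [-1, 1].
weights = [1 / 3, 4 / 3, 1 / 3]
D = [[-1.5, 2.0, -0.5],
     [-0.5, 0.0, 0.5],
     [0.5, -2.0, 1.5]]
n = len(weights)

# S = 2Q - B with Q = M D and B = diag(-1, 0, 1); S is skew-symmetric.
S = [[2 * weights[j] * D[j][k] for k in range(n)] for j in range(n)]
S[0][0] += 1.0
S[n - 1][n - 1] -= 1.0


def flux_symmetric(ul, ur):
    # Hypothetical symmetric two-point flux (central flux for f(u) = u^2 / 2).
    return 0.25 * (ul**2 + ur**2)


def noncons_local(u):
    # Hypothetical local factor: depends on a single node.
    return u


def noncons_symmetric(ul, ur):
    # Hypothetical symmetric factor: unchanged when ul and ur are swapped.
    return 0.5 * (ul + ur)


def staggered(u, two_point):
    # Running sums of S[j][k] * two_point(u[j], u[k]);
    # entry j is the value at the interface between subcells j - 1 and j.
    acc = [0.0]
    for j in range(n):
        acc.append(acc[-1] + sum(S[j][k] * two_point(u[j], u[k]) for k in range(n)))
    return acc


u = [1.0, 2.0, 4.0]
fhat = staggered(u, flux_symmetric)     # conservative staggered fluxes
ghat = staggered(u, noncons_symmetric)  # staggered sums of the symmetric factor

# The local factor makes the interface values differ between the two sides.
fhat_L = [noncons_local(u[j]) * ghat[j + 1] for j in range(n)]  # interface (j, j+1), seen from node j
fhat_R = [noncons_local(u[j]) * ghat[j] for j in range(n)]      # interface (j-1, j), seen from node j

# Node-wise non-conservative volume term, computed directly and as a subcell difference.
direct = [
    sum(S[j][k] * noncons_local(u[j]) * noncons_symmetric(u[j], u[k]) for k in range(n))
    for j in range(n)
]
subcell = [fhat_L[j] - fhat_R[j] for j in range(n)]
```

In this sketch, `fhat[-1]` and `ghat[-1]` vanish up to round-off because `S` is skew-symmetric and the accumulated two-point functions are symmetric, and `direct` agrees with `subcell` node by node. A non-conservative term that is not of product form would not allow the symmetric part to be accumulated in this way.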

…ative systems

-> A working version of this implementation is added for the GLM-MHD system
-> The flux-differencing formula requires non-conservative terms of the
   form (local * symmetric)... I modified equations/ideal_glm_mhd_2d.jl and
   solvers/dgsem_tree/dg_2d.jl to make it work
-> In this first implementation, we only use the Powell term and
   deactivate the GLM term
…he modified Powell source term. This was needed due to incompatibility on non-conforming meshes.
@github-actions
Contributor

github-actions bot commented Oct 10, 2023

Review checklist

This checklist is meant to assist creators of PRs (to let them know what reviewers will typically look for) and reviewers (to guide them in a structured review process). Items do not need to be checked explicitly for a PR to be eligible for merging.

Purpose and scope

  • The PR has a single goal that is clear from the PR title and/or description.
  • All code changes represent a single set of modifications that logically belong together.
  • No more than 500 lines of code are changed or there is no obvious way to split the PR into multiple PRs.

Code quality

  • The code can be understood easily.
  • Newly introduced names for variables etc. are self-descriptive and consistent with existing naming conventions.
  • There are no redundancies that can be removed by simple modularization/refactoring.
  • There are no leftover debug statements or commented code sections.
  • The code adheres to our conventions and style guide, and to the Julia guidelines.

Documentation

  • New functions and types are documented with a docstring or top-level comment.
  • Relevant publications are referenced in docstrings (see example for formatting).
  • Inline comments are used to document longer or unusual code sections.
  • Comments describe intent ("why?") and not just functionality ("what?").
  • If the PR introduces a significant change or new feature, it is documented in NEWS.md.

Testing

  • The PR passes all tests.
  • New or modified lines of code are covered by tests.
  • New or modified tests run in less than 10 seconds.

Performance

  • There are no type instabilities or memory allocations in performance-critical parts.
  • If the PR intent is to improve performance, before/after time measurements are posted in the PR.

Verification

  • The correctness of the code was verified using appropriate tests.
  • If new equations/methods are added, a convergence test has been run and the results
    are posted in the PR.

Created with ❤️ by the Trixi.jl community.

@codecov

codecov bot commented Oct 10, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (0f49e5b) 90.84% compared to head (e836b7f) 82.73%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1670      +/-   ##
==========================================
- Coverage   90.84%   82.73%   -8.11%     
==========================================
  Files         430      431       +1     
  Lines       34437    34672     +235     
==========================================
- Hits        31282    28683    -2599     
- Misses       3155     5989    +2834     
Flag Coverage Δ
unittests 82.73% <100.00%> (-8.11%) ⬇️


Files Coverage Δ
...tree_2d_dgsem/elixir_mhd_shockcapturing_subcell.jl 100.00% <100.00%> (ø)
src/Trixi.jl 43.48% <ø> (ø)
...llbacks_stage/subcell_limiter_idp_correction_2d.jl 94.12% <100.00%> (ø)
src/equations/equations.jl 98.11% <100.00%> (+0.07%) ⬆️
src/equations/ideal_glm_mhd_2d.jl 98.80% <100.00%> (+14.51%) ⬆️
src/solvers/dgsem_tree/containers_2d.jl 57.14% <100.00%> (-39.49%) ⬇️
src/solvers/dgsem_tree/dg_2d_subcell_limiters.jl 99.56% <100.00%> (+0.71%) ⬆️
src/solvers/dgsem_tree/subcell_limiters_2d.jl 94.74% <100.00%> (ø)
src/time_integration/methods_SSP.jl 83.70% <100.00%> (+0.36%) ⬆️

... and 140 files with indirect coverage changes


@amrueda
Contributor Author

amrueda commented Oct 11, 2023

I ran a convergence test (5 different meshes, polydeg = 3) with a modified version of examples/tree_2d_dgsem/elixir_mhd_alfven_wave.jl in which I activated the positivity limiter. The l2 and linf errors match those produced by VolumeIntegralShockCapturingHG and VolumeIntegralFluxDifferencing to machine precision. Here are the convergence tables:

l2
rho                 rho_v1              rho_v2              rho_v3              rho_e               B1                  B2                  B3                  psi
error     EOC       error     EOC       error     EOC       error     EOC       error     EOC       error     EOC       error     EOC       error     EOC       error     EOC
5.51e-03  -         4.77e-04  -         4.77e-04  -         9.35e-04  -         2.84e-04  -         3.73e-04  -         3.73e-04  -         9.33e-04  -         3.80e-04  -
9.09e-04  2.60      5.81e-05  3.04      5.81e-05  3.04      8.11e-05  3.53      2.34e-05  3.60      2.09e-05  4.16      2.09e-05  4.16      4.08e-05  4.51      9.42e-06  5.33
1.11e-04  3.03      5.94e-06  3.29      5.94e-06  3.29      8.54e-06  3.25      1.32e-06  4.15      1.47e-06  3.83      1.47e-06  3.83      2.16e-06  4.24      4.65e-07  4.34
9.46e-06  3.56      4.82e-07  3.62      4.82e-07  3.62      7.11e-07  3.59      8.35e-08  3.98      1.19e-07  3.63      1.19e-07  3.63      1.67e-07  3.69      2.81e-08  4.05
3.17e-07  4.90      2.10e-08  4.52      2.10e-08  4.52      2.85e-08  4.64      4.19e-09  4.32      1.28e-08  3.21      1.28e-08  3.21      1.80e-08  3.22      1.74e-09  4.01

mean      3.52      mean      3.62      mean      3.62      mean      3.75      mean      4.01      mean      3.71      mean      3.71      mean      3.92      mean      4.43
----------------------------------------------------------------------------------------------------
linf
rho                 rho_v1              rho_v2              rho_v3              rho_e               B1                  B2                  B3                  psi
error     EOC       error     EOC       error     EOC       error     EOC       error     EOC       error     EOC       error     EOC       error     EOC       error     EOC
1.33e-02  -         1.97e-03  -         1.97e-03  -         2.68e-03  -         1.09e-03  -         1.99e-03  -         1.99e-03  -         3.71e-03  -         1.14e-03  -
1.86e-03  2.83      1.80e-04  3.45      1.80e-04  3.45      2.45e-04  3.45      8.61e-05  3.66      1.28e-04  3.95      1.28e-04  3.95      2.31e-04  4.00      3.86e-05  4.88
2.68e-04  2.80      1.73e-05  3.38      1.73e-05  3.38      2.61e-05  3.23      5.46e-06  3.98      9.23e-06  3.80      9.23e-06  3.80      1.40e-05  4.05      2.37e-06  4.02
2.35e-05  3.51      1.39e-06  3.64      1.39e-06  3.64      2.71e-06  3.27      3.56e-07  3.94      6.15e-07  3.91      6.15e-07  3.91      8.90e-07  3.97      1.47e-07  4.01
9.69e-07  4.60      9.39e-08  3.88      9.39e-08  3.88      9.04e-08  4.90      2.40e-08  3.89      4.71e-08  3.71      4.71e-08  3.71      6.77e-08  3.72      9.15e-09  4.00

mean      3.44      mean      3.59      mean      3.59      mean      3.71      mean      3.87      mean      3.84      mean      3.84      mean      3.94      mean      4.23
----------------------------------------------------------------------------------------------------
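
For readers who want to check the EOC columns, the following short Python sketch recomputes them for the l2 error of rho. It assumes that each mesh in the sequence halves the cell size of the previous one (the `refinement_factor` parameter is a name introduced here, not something stated in the tables) and that the EOC is the logarithm of the error ratio in base `refinement_factor`.

```python
import math

# l2 errors for rho, copied from the first table (coarsest to finest mesh).
errors_rho_l2 = [5.51e-03, 9.09e-04, 1.11e-04, 9.46e-06, 3.17e-07]


def experimental_orders(errors, refinement_factor=2.0):
    """EOC between consecutive meshes, assuming the cell size shrinks by
    `refinement_factor` from one mesh to the next."""
    return [
        math.log(coarse / fine) / math.log(refinement_factor)
        for coarse, fine in zip(errors, errors[1:])
    ]


eocs = experimental_orders(errors_rho_l2)
mean_eoc = sum(eocs) / len(eocs)
print([round(e, 2) for e in eocs], round(mean_eoc, 2))
```

Because the tabulated errors are rounded to three significant digits, the recomputed values can differ from the table in the last printed digit.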

@amrueda
Contributor Author

amrueda commented Oct 11, 2023

When running the convergence test documented above, the following performance summary was obtained with the novel subcell limiting for non-conservative systems and one thread:

  ────────────────────────────────────────────────────────────────────────────────────────
                Trixi.jl                        Time                    Allocations
                                       ───────────────────────   ────────────────────────
           Tot / % measured:                 125s /  97.4%           3.03GiB /  99.9%

 Section                       ncalls     time    %tot     avg     alloc    %tot      avg
 ────────────────────────────────────────────────────────────────────────────────────────
 rhs!                           4.87k     103s   84.6%  21.2ms   11.5KiB    0.0%    2.43B
   volume integral              4.87k    84.0s   68.9%  17.3ms   2.20KiB    0.0%    0.46B
     calcflux_fhat!             19.9M    45.9s   37.6%  2.30μs     0.00B    0.0%    0.00B
     calcflux_fv!               19.9M    22.0s   18.1%  1.10μs     0.00B    0.0%    0.00B
     calcflux_antidiffusive!    19.9M    8.47s    6.9%   425ns     0.00B    0.0%    0.00B
     ~volume integral~          4.87k    7.67s    6.3%  1.58ms   2.20KiB    0.0%    0.46B
   interface flux               4.87k    11.7s    9.6%  2.40ms     0.00B    0.0%    0.00B
   prolong2interfaces           4.87k    3.37s    2.8%   693μs     0.00B    0.0%    0.00B
   surface integral             4.87k    2.04s    1.7%   419μs     0.00B    0.0%    0.00B
   reset ∂u/∂t                  4.87k    1.33s    1.1%   273μs     0.00B    0.0%    0.00B
   Jacobian                     4.87k    593ms    0.5%   122μs     0.00B    0.0%    0.00B
   ~rhs!~                       4.87k   24.6ms    0.0%  5.05μs   9.33KiB    0.0%    1.96B
   prolong2boundaries           4.87k   1.97ms    0.0%   406ns     0.00B    0.0%    0.00B
   prolong2mortars              4.87k   1.83ms    0.0%   376ns     0.00B    0.0%    0.00B
   mortar flux                  4.87k   1.44ms    0.0%   296ns     0.00B    0.0%    0.00B
   boundary flux                4.87k    266μs    0.0%  54.6ns     0.00B    0.0%    0.00B
   source terms                 4.87k    220μs    0.0%  45.2ns     0.00B    0.0%    0.00B
 a posteriori limiter           4.87k    12.1s   10.0%  2.49ms   1.47KiB    0.0%    0.31B
   blending factors             4.87k    7.01s    5.8%  1.44ms      752B    0.0%    0.15B
     positivity                 4.87k    6.41s    5.3%  1.32ms     0.00B    0.0%    0.00B
     ~blending factors~         4.87k    602ms    0.5%   124μs      752B    0.0%    0.15B
   ~a posteriori limiter~       4.87k    5.12s    4.2%  1.05ms      752B    0.0%    0.15B
 I/O                              165    3.64s    3.0%  22.1ms   2.17GiB   71.6%  13.4MiB
   save solution                  164    3.63s    3.0%  22.1ms   2.17GiB   71.5%  13.5MiB
   get element variables          164   7.10ms    0.0%  43.3μs   1.44MiB    0.0%  8.98KiB
   ~I/O~                          165   2.00ms    0.0%  12.1μs    379KiB    0.0%  2.30KiB
   save mesh                      164   59.1μs    0.0%   360ns     0.00B    0.0%    0.00B
 calculate dt                   1.62k    2.15s    1.8%  1.32ms     0.00B    0.0%    0.00B
 analyze solution                  18    888ms    0.7%  49.3ms    880MiB   28.4%  48.9MiB
 ────────────────────────────────────────────────────────────────────────────────────────

The summary obtained with VolumeIntegralShockCapturingHG and one thread is:

 ────────────────────────────────────────────────────────────────────────────────────
              Trixi.jl                      Time                    Allocations      
                                   ───────────────────────   ────────────────────────
         Tot / % measured:              49.9s /  94.0%           3.03GiB /  99.9%    

 Section                   ncalls     time    %tot     avg     alloc    %tot      avg
 ────────────────────────────────────────────────────────────────────────────────────
 rhs!                       4.87k    41.2s   87.8%  8.47ms   11.5KiB    0.0%    2.43B
   volume integral          4.87k    25.2s   53.6%  5.17ms   2.20KiB    0.0%    0.46B
     pure DG                4.87k    21.8s   46.5%  4.48ms     0.00B    0.0%    0.00B
     blending factors       4.87k    3.23s    6.9%   663μs     0.00B    0.0%    0.00B
     ~volume integral~      4.87k    103ms    0.2%  21.1μs   2.20KiB    0.0%    0.46B
     blended DG-FV          4.87k   1.04ms    0.0%   213ns     0.00B    0.0%    0.00B
   interface flux           4.87k    11.3s   24.0%  2.32ms     0.00B    0.0%    0.00B
   prolong2interfaces       4.87k    1.77s    3.8%   364μs     0.00B    0.0%    0.00B
   surface integral         4.87k    1.70s    3.6%   350μs     0.00B    0.0%    0.00B
   reset ∂u/∂t              4.87k    754ms    1.6%   155μs     0.00B    0.0%    0.00B
   Jacobian                 4.87k    556ms    1.2%   114μs     0.00B    0.0%    0.00B
   ~rhs!~                   4.87k   17.1ms    0.0%  3.51μs   9.33KiB    0.0%    1.96B
   prolong2mortars          4.87k   2.41ms    0.0%   496ns     0.00B    0.0%    0.00B
   prolong2boundaries       4.87k   2.28ms    0.0%   468ns     0.00B    0.0%    0.00B
   source terms             4.87k    716μs    0.0%   147ns     0.00B    0.0%    0.00B
   mortar flux              4.87k    687μs    0.0%   141ns     0.00B    0.0%    0.00B
   boundary flux            4.87k    371μs    0.0%  76.3ns     0.00B    0.0%    0.00B
 I/O                          165    2.96s    6.3%  17.9ms   2.17GiB   71.6%  13.4MiB
   save solution              164    2.84s    6.0%  17.3ms   2.17GiB   71.5%  13.5MiB
   get element variables      164    115ms    0.2%   704μs   1.41MiB    0.0%  8.80KiB
   ~I/O~                      165   1.89ms    0.0%  11.5μs    379KiB    0.0%  2.30KiB
   save mesh                  164   77.8μs    0.0%   474ns     0.00B    0.0%    0.00B
 calculate dt               1.62k    2.08s    4.4%  1.28ms     0.00B    0.0%    0.00B
 analyze solution              18    676ms    1.4%  37.5ms    881MiB   28.4%  48.9MiB
 ────────────────────────────────────────────────────────────────────────────────────

The volume integral with subcell limiting is clearly much more expensive than with VolumeIntegralShockCapturingHG (84.0s vs. 25.2s, about 3.3 times slower). Some extra cost was expected because the new volume integral computes two non-conservative terms (Powell and GLM separately) instead of one. However, this seems like an extreme slowdown. I tried to improve the performance, but so far this is the best I have achieved.
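
As a rough quantification, the slowdown factors can be computed directly from the two timer summaries above. This is only arithmetic on the reported timings (the dictionary names are introduced here for illustration), not a new measurement.

```python
# Timings in seconds, copied from the two timer summaries above.
subcell_limiting = {"total": 125.0, "rhs!": 103.0, "volume integral": 84.0}
shock_capturing_hg = {"total": 49.9, "rhs!": 41.2, "volume integral": 25.2}

# Slowdown of subcell limiting relative to VolumeIntegralShockCapturingHG.
slowdown = {
    section: subcell_limiting[section] / shock_capturing_hg[section]
    for section in subcell_limiting
}

# calcflux_fhat! (45.9 s) compared with the "pure DG" volume kernel (21.8 s).
kernel_ratio = 45.9 / 21.8

for section, factor in slowdown.items():
    print(f"{section}: {factor:.2f}x")
print(f"calcflux_fhat! vs. pure DG: {kernel_ratio:.2f}x")
```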

@amrueda
Contributor Author

amrueda commented Oct 12, 2023

Thanks for your comments, @ranocha! I already applied your suggestions and replied to your questions.

Do you have any idea what might be making my code slow? (see performance post)

It seems that my problem is in the function calcflux_fhat!, which takes a total of 45.9s to compute the DG volume integral with non-conservative terms for the test above. This is more than twice the time needed by flux_differencing_kernel!, which takes 21.8s to compute the DG volume integral with non-conservative terms.

As I wrote above,

> Some extra cost was expected due to the fact that the new volume integral computes two non-conservative terms (Powell and GLM separately) instead of one. However, this seems like an extreme slowdown.

Do you see any performance issues in calcflux_fhat!, or do you have any idea how I can speed up that function?

@ranocha
Member

ranocha commented Oct 13, 2023

> Do you have any idea what might be making my code slow? (see performance post)
>
> It seems that my problem is in the function calcflux_fhat!, which takes a total of 45.9s to compute the DG volume integral with non-conservative terms for the test above. This is more than twice the time needed by flux_differencing_kernel!, which takes 21.8s to compute the DG volume integral with non-conservative terms.
>
> As I wrote above,
>
>> Some extra cost was expected due to the fact that the new volume integral computes two non-conservative terms (Powell and GLM separately) instead of one. However, this seems like an extreme slowdown.
>
> Do you see any performance issues in calcflux_fhat!, or do you have any idea how I can speed up that function?

Did you try starting Julia with the flag --check-bounds=no? It appears to make a significant difference for me in this case.

@amrueda
Contributor Author

amrueda commented Oct 16, 2023

> Did you try starting Julia with the flag --check-bounds=no? It appears to make a significant difference for me in this case.

Yes, the results I reported were obtained with --check-bounds=no.

Co-authored-by: Hendrik Ranocha <[email protected]>
ranocha
ranocha previously approved these changes Oct 25, 2023
Member

@ranocha ranocha left a comment


Thanks

@ranocha ranocha requested a review from sloede October 25, 2023 11:41
…ut a method

Co-authored-by: Michael Schlottke-Lakemper <[email protected]>
@ranocha ranocha requested a review from sloede October 26, 2023 06:37
Member

@sloede sloede left a comment


LGTM!

@amrueda amrueda merged commit 61c33b0 into trixi-framework:main Oct 31, 2023