Kernel optimization #22

Open
3 of 7 tasks
huiyuxie opened this issue Aug 22, 2024 · 1 comment
Labels: enhancement (New feature or request), performance (Improve performance)

Comments

huiyuxie commented Aug 22, 2024

Kernel optimization is already marked as TODO in this repo.

The tasks to check are:

  • Remove control flow divergence from boundary_flux_kernel!
  • Refactor set_diagonal_to_zero! in cuda_volume_integral! and check the precision using IdealGlmMhdEquations
  • Replace set_log_type("log_Base") and set_sqrt_type("sqrt_Base") with another approach
  • Decide whether a prototype for create_cache is needed
  • Possible change from isequal to simple math expression and compare performance
  • Use synchronization to combine close kernels
  • Combine two similar kernels in mortar kernel
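
For the first item, the general idea of removing a divergent branch can be sketched as below. This is a minimal CPU-side illustration under stated assumptions, not code from this repo: `select_branch`, `select_ifelse`, `select_arith`, and the `side` argument are hypothetical names, and the actual body of boundary_flux_kernel! may look quite different. On a GPU, threads of one warp that take different sides of an `if` are executed one path after the other, whereas `ifelse` or a Bool-weighted sum evaluates both operands and only selects between them.

```julia
# Branching form: lanes of a warp that disagree on `side` run both paths serially.
function select_branch(u_inner, u_outer, side)
    if side == 1
        return u_inner
    else
        return u_outer
    end
end

# Branch-free form: both arguments are already evaluated, `ifelse` only selects.
select_ifelse(u_inner, u_outer, side) = ifelse(side == 1, u_inner, u_outer)

# Arithmetic form: in Julia a Bool acts as an exact 0/1 weight.
function select_arith(u_inner, u_outer, side)
    w = side == 1
    return w * u_inner + (!w) * u_outer
end

# All three forms should agree for ordinary finite inputs.
for side in (1, 2)
    @assert select_branch(1.5, -2.0, side) ==
            select_ifelse(1.5, -2.0, side) ==
            select_arith(1.5, -2.0, side)
end
```

The compiler may already turn a simple branch like this into a select, so whether either rewrite helps in the real kernel has to be measured.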
huiyuxie self-assigned this on Aug 22, 2024
huiyuxie added the enhancement label on Aug 22, 2024
huiyuxie changed the title from "Control flow divergence in boundary_flux_kernel!" to "Kernel optimization" on Aug 26, 2024
huiyuxie added the good first issue label on Aug 26, 2024
huiyuxie added the performance label on Sep 3, 2024
huiyuxie commented

Possible change from isequal to simple math expression and compare performance - this one needs further checking with benchmarks, especially for 3D problems.
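
One way to prepare such a benchmark is to first confirm that the arithmetic replacement gives the same result as `isequal` for every direction. The sketch below assumes the 3D case uses an integer `orientation` in 1:3 to select one of three flux components. `onehot_isequal`, `onehot_arith`, and `pick` are hypothetical helpers for illustration, not functions from this repo, and the polynomial weights are just one possible "simple math expression".

```julia
# Reference: one-hot weights from `isequal`, with `orientation` assumed to be in 1:3.
onehot_isequal(o) = (isequal(o, 1), isequal(o, 2), isequal(o, 3))

# Candidate: the same weights from integer arithmetic
# (Lagrange basis polynomials on the nodes 1, 2, 3).
onehot_arith(o) = (div((2 - o) * (3 - o), 2),
                   (o - 1) * (3 - o),
                   div((o - 1) * (o - 2), 2))

# Pick the flux component that belongs to the given direction.
function pick(weights, fx, fy, fz)
    return weights[1] * fx + weights[2] * fy + weights[3] * fz
end

# The two variants must agree for every valid orientation before timing them.
for o in 1:3
    @assert onehot_arith(o) == onehot_isequal(o)
    @assert pick(onehot_arith(o), 0.5, -1.25, 3.0) ==
            pick(onehot_isequal(o), 0.5, -1.25, 3.0)
end
```

The two variants are not interchangeable for non-finite data: a `false` weight in Julia zeroes out a NaN, while an integer 0 weight propagates it. The timing itself would need to be run on the GPU with the real 3D kernels.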
