Refactor and add CUDA JIT compilation to voigt_profile()
#113
Codecov Report
@@            Coverage Diff             @@
##             main     #113      +/-   ##
==========================================
- Coverage   72.87%   69.37%   -3.50%
==========================================
  Files          21       21
  Lines         693      738      +45
==========================================
+ Hits          505      512       +7
- Misses        188      226      +38
Looks good to me!
*beep* *bop* All benchmarks:
    before          after         ratio
  [9c2da67c]      [02cdb293]
  13.4±0.04s      13.2±0.01s      0.98   run_stardis.BenchmarkRunStardis.time_run_stardis
* setting up radiation field structure * further work moving structures around * hook up radiation field to run_stardis() * fix bug with output * removing copied files * add back voigt test * further clean up stardis/base.py * remove transport files * update documentation * Add CUDA GPU JIT compilation to `faddeeva()` (#107) * Add cuda import * Add cuda boilerplate for faddeeva * Add square root of pi as constant to scope of entire file * Add faddeeva_cuda test * Vectorize faddeeva * Refactor faddeeva to be vectorized and work with cuda * Clean up faddeeva_cuda test * Rename variables to keep with convention * Refactor faddeeva_gpu and include associated tests * Add functionality for more datatypes for faddeeva_cuda Also add associated tests * Clean up faddeeva_gpu tests by testing numpy and cuda array types separately * Optimize faddeeva function to be branchless * Use cupy arrays instead of numba.cuda arrays Also call faddeeva_cuda with prespecified numbers of blocks and threads * Only import cupy if the machine has GPUs available * Typecast input to complex * Size should be of output array * Return cupy array by default * Change name of blackbody function. Hook up RadiationField.source_function to raytrace. 
* add units to stardis outputs (#122) * add units to stardis outputs * explicitly declare units to F_lambda * Benchmark internal STARDIS functions (#85) * Fix posting comment * Try addding token as env var * Put env var at correct place * fix benchmark comment bug * Benchmark raytrace function * Correct path for config soft link * Increase timeout time for raytrace benchmark * Add code to benchmark calc_alpha_line_at_nu * Add push trigger for all branches temporarily * Group all the benchmarking code under one class * Add tracing_lambdas attribute to benchmark class * Update benchmark setup * Fix breaking benchmark pipeline (#128) * Refactor and add CUDA JIT compilation to `voigt_profile()` (#113) * Add pi as a constant with file scope * Use real attribute of datatype rather than numpy call * Vectorize voigt_profile and add tests * Change indenting to improve readability * Typecast inputs to float and remove tests that include complex inputs * Add cuda versions of voigt_profile along with associated tests * Return cupy array by default * Size should be of output array * Fix typo in mathematical formula for `voigt_profile()` * Fix unit tests * Add CUDA JIT to `calc_n_effective()` (#119) * Add basic test for `calc_doppler_width()` * Refactor `calc_doppler_width()` and add test for vectorized implementation * Typecast to float * Add unwrapped cuda implementation of doppler_width Also typecast all global constants to float * Add wrapped cuda implementation of calc_doppler_width * Return cupy array by default * Add test for `calc_n_effective()` * Typecast inputs and refactor * Add cuda implementations and tests for `calc_n_effective()` * Return cupy array by default * Revert changes to `voigt_profile()` formula (#131) * setting up radiation field structure * further work moving structures around * hook up radiation field to run_stardis() * fix bug with output * add back voigt test * removing copied files * further clean up stardis/base.py * remove transport files * 
update documentation * Change name of blackbody function. Hook up RadiationField.source_function to raytrace. * fix imports * fix broadening merge * apply black * change benchmark paths --------- Co-authored-by: smokestacklightnin <[email protected]> Co-authored-by: light2802 <[email protected]>
* Restructure: Add radiation field (#123) * setting up radiation field structure * further work moving structures around * hook up radiation field to run_stardis() * fix bug with output * removing copied files * add back voigt test * further clean up stardis/base.py * remove transport files * update documentation * Add CUDA GPU JIT compilation to `faddeeva()` (#107) * Add cuda import * Add cuda boilerplate for faddeeva * Add square root of pi as constant to scope of entire file * Add faddeeva_cuda test * Vectorize faddeeva * Refactor faddeeva to be vectorized and work with cuda * Clean up faddeeva_cuda test * Rename variables to keep with convention * Refactor faddeeva_gpu and include associated tests * Add functionality for more datatypes for faddeeva_cuda Also add associated tests * Clean up faddeeva_gpu tests by testing numpy and cuda array types separately * Optimize faddeeva function to be branchless * Use cupy arrays instead of numba.cuda arrays Also call faddeeva_cuda with prespecified numbers of blocks and threads * Only import cupy if the machine has GPUs available * Typecast input to complex * Size should be of output array * Return cupy array by default * Change name of blackbody function. Hook up RadiationField.source_function to raytrace. 
* add units to stardis outputs (#122) * add units to stardis outputs * explicitly declare units to F_lambda * Benchmark internal STARDIS functions (#85) * Fix posting comment * Try addding token as env var * Put env var at correct place * fix benchmark comment bug * Benchmark raytrace function * Correct path for config soft link * Increase timeout time for raytrace benchmark * Add code to benchmark calc_alpha_line_at_nu * Add push trigger for all branches temporarily * Group all the benchmarking code under one class * Add tracing_lambdas attribute to benchmark class * Update benchmark setup * Fix breaking benchmark pipeline (#128) * Refactor and add CUDA JIT compilation to `voigt_profile()` (#113) * Add pi as a constant with file scope * Use real attribute of datatype rather than numpy call * Vectorize voigt_profile and add tests * Change indenting to improve readability * Typecast inputs to float and remove tests that include complex inputs * Add cuda versions of voigt_profile along with associated tests * Return cupy array by default * Size should be of output array * Fix typo in mathematical formula for `voigt_profile()` * Fix unit tests * Add CUDA JIT to `calc_n_effective()` (#119) * Add basic test for `calc_doppler_width()` * Refactor `calc_doppler_width()` and add test for vectorized implementation * Typecast to float * Add unwrapped cuda implementation of doppler_width Also typecast all global constants to float * Add wrapped cuda implementation of calc_doppler_width * Return cupy array by default * Add test for `calc_n_effective()` * Typecast inputs and refactor * Add cuda implementations and tests for `calc_n_effective()` * Return cupy array by default * Revert changes to `voigt_profile()` formula (#131) * setting up radiation field structure * further work moving structures around * hook up radiation field to run_stardis() * fix bug with output * add back voigt test * removing copied files * further clean up stardis/base.py * remove transport files * 
update documentation * Change name of blackbody function. Hook up RadiationField.source_function to raytrace. * fix imports * fix broadening merge * apply black * change benchmark paths --------- Co-authored-by: smokestacklightnin <[email protected]> Co-authored-by: light2802 <[email protected]> * fix voigt test that got overwritten by the rebase * add black to voigt.py * include fixto voight profile from pr 131 * fix benchmarks * further fix of benchmarks * remove blackbody import comment in radiation_field_solvers/base.py --------- Co-authored-by: smokestacklightnin <[email protected]> Co-authored-by: light2802 <[email protected]>
…#113)
* Add pi as a constant with file scope
* Use real attribute of datatype rather than numpy call
* Vectorize voigt_profile and add tests
* Change indenting to improve readability
* Typecast inputs to float and remove tests that include complex inputs
* Add cuda versions of voigt_profile along with associated tests
* Return cupy array by default
* Size should be of output array
* Fix typo in mathematical formula for `voigt_profile()`
* Fix unit tests
📝 Description
Type: 🚀 feature
This pull request refactors `voigt_profile()` and adds CUDA JIT compilation to the same function. Basic unit tests are also included. This pull request should be merged (hopefully immediately) after pull request #107.
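For readers unfamiliar with the quantity being refactored, here is a minimal pure-Python sketch of a Voigt profile. It is not the code from this pull request: the names `voigt_function` and `voigt_profile_sketch`, the argument order, and the textbook convention `a = gamma / (4 * pi * doppler_width)` are assumptions made for illustration, and the Faddeeva evaluation used in the repository is replaced here by a plain trapezoid quadrature of the convolution integral.

```python
import math

SQRT_PI = math.sqrt(math.pi)


def voigt_function(u, a, half_width=8.0, steps=4000):
    """Approximate H(a, u) = (a / pi) * integral of exp(-t^2) / ((u - t)^2 + a^2) dt.

    Hypothetical helper: a trapezoid rule on [-half_width, half_width].
    It needs a > 0 and loses accuracy once a approaches the step size.
    """
    if a <= 0.0:
        raise ValueError("a must be positive for this quadrature")
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        t = -half_width + k * h
        # End points carry half weight in the trapezoid rule.
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-t * t) / ((u - t) ** 2 + a * a)
    return a / math.pi * total * h


def voigt_profile_sketch(delta_nu, doppler_width, gamma):
    """Voigt profile at frequency offset delta_nu (assumed textbook convention)."""
    # Cast inputs to float, mirroring the "typecast inputs to float" commit.
    delta_nu = float(delta_nu)
    doppler_width = float(doppler_width)
    gamma = float(gamma)
    u = delta_nu / doppler_width                    # offset in Doppler widths
    a = gamma / (4.0 * math.pi * doppler_width)     # damping parameter
    return voigt_function(u, a) / (SQRT_PI * doppler_width)


if __name__ == "__main__":
    # Evaluate the profile on a few offsets; values are symmetric about zero.
    for offset in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(offset, voigt_profile_sketch(offset, 2.0, 4.0 * math.pi))
```

The per-element function is kept free of array logic on purpose: that shape is what makes it straightforward to map over many offsets, whether with a list comprehension as above or with a vectorizing or GPU JIT decorator as the pull request describes.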
🚦 Testing
How did you test these changes?
☑️ Checklist