Refactor and add CUDA JIT compilation to `voigt_profile()` #113

smokestacklightnin · 2023-08-09T00:10:32Z

📝 Description

Type: 🚀 feature

This pull request refactors `voigt_profile()` and adds CUDA JIT-compiled versions of the same function. Basic unit tests are also included.
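For readers unfamiliar with the function, here is a minimal CPU-only sketch of a vectorized Voigt profile built on the Faddeeva function. It is not the code from this pull request: it assumes the common parametrization by Gaussian standard deviation `sigma` and Lorentzian half-width `gamma`, it uses `scipy.special.wofz` in place of the project's own `faddeeva()`, and the name `voigt_profile_sketch` is hypothetical.

```python
import numpy as np
from scipy.special import wofz  # Faddeeva function w(z)

SQRT_2PI = np.sqrt(2.0 * np.pi)


def voigt_profile_sketch(x, sigma, gamma):
    """Voigt profile at offset x (hypothetical helper, not the stardis API).

    sigma: standard deviation of the Gaussian component (assumed convention)
    gamma: half-width at half-maximum of the Lorentzian component
    """
    # Typecast inputs to float arrays so scalars and arrays share one code path.
    x = np.asarray(x, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    gamma = np.asarray(gamma, dtype=float)

    # V(x) = Re[w(z)] / (sigma * sqrt(2*pi)), with z = (x + i*gamma) / (sigma * sqrt(2))
    z = (x + 1j * gamma) / (sigma * np.sqrt(2.0))
    return wofz(z).real / (sigma * SQRT_2PI)


offsets = np.linspace(-5.0, 5.0, 11)
profile = voigt_profile_sketch(offsets, 1.0, 0.5)
```

With `gamma = 0` the expression reduces to a plain Gaussian, which gives a quick sanity check for any implementation, CPU or GPU.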

This pull request should be merged (hopefully immediately) after pull request #107.

🚦 Testing

How did you test these changes?

Testing pipeline
Tested locally using an NVIDIA GPU

☑️ Checklist

I requested two reviewers for this pull request
I updated the documentation according to my changes

codecov · 2023-08-09T00:12:56Z

Codecov Report

Merging #113 (7e322be) into main (9c2da67) will decrease coverage by 3.50%.
The diff coverage is 16.32%.

❗ Current head 7e322be differs from pull request most recent head 02cdb29. Consider uploading reports for the commit 02cdb29 to get more accurate results.

@@            Coverage Diff             @@
##             main     #113      +/-   ##
==========================================
- Coverage   72.87%   69.37%   -3.50%     
==========================================
  Files          21       21              
  Lines         693      738      +45     
==========================================
+ Hits          505      512       +7     
- Misses        188      226      +38
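The percentages in the diff above can be reproduced from the hit and line counts. The sketch below assumes values are truncated (not rounded) to two decimals; that is an inference from the reported numbers, since rounding 512/738 would give 69.38% instead of 69.37%. The helper name is hypothetical.

```python
def coverage_basis_points(hits: int, lines: int) -> int:
    """Coverage in hundredths of a percent, truncated via integer division."""
    return hits * 10_000 // lines


before = coverage_basis_points(505, 693)  # main (9c2da67)
after = coverage_basis_points(512, 738)   # head (7e322be)
delta = after - before

print(f"{before / 100:.2f}% -> {after / 100:.2f}% ({delta / 100:+.2f}%)")
```

The line counts are also self-consistent: hits plus misses equal total lines on both sides (505 + 188 = 693 and 512 + 226 = 738).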

| Files Changed | Coverage Δ | |
|---|---|---|
| stardis/opacities/tests/test_voigt.py | `0.00% <0.00%> (ø)` | |
| stardis/opacities/voigt.py | `32.85% <33.33%> (+0.20%)` | ⬆️ |

vamads

Looks good to me!

tardis-bot · 2023-08-18T18:25:21Z

*beep* *bop*
Hi human,
I ran benchmarks as you asked comparing main (9c2da67) and the latest commit (02cdb29).
Here are the logs produced by ASV.
Results can also be downloaded as artifacts here.
Significantly changed benchmarks:

All benchmarks:

     before           after         ratio
   [9c2da67c]       [02cdb293]
     13.4±0.04s       13.2±0.01s     0.98  run_stardis.BenchmarkRunStardis.time_run_stardis

* setting up radiation field structure * further work moving structures around * hook up radiation field to run_stardis() * fix bug with output * removing copied files * add back voigt test * further clean up stardis/base.py * remove transport files * update documentation * Add CUDA GPU JIT compilation to `faddeeva()` (#107) * Add cuda import * Add cuda boilerplate for faddeeva * Add square root of pi as constant to scope of entire file * Add faddeeva_cuda test * Vectorize faddeeva * Refactor faddeeva to be vectorized and work with cuda * Clean up faddeeva_cuda test * Rename variables to keep with convention * Refactor faddeeva_gpu and include associated tests * Add functionality for more datatypes for faddeeva_cuda Also add associated tests * Clean up faddeeva_gpu tests by testing numpy and cuda array types separately * Optimize faddeeva function to be branchless * Use cupy arrays instead of numba.cuda arrays Also call faddeeva_cuda with prespecified numbers of blocks and threads * Only import cupy if the machine has GPUs available * Typecast input to complex * Size should be of output array * Return cupy array by default * Change name of blackbody function. Hook up RadiationField.source_function to raytrace. 
* add units to stardis outputs (#122) * add units to stardis outputs * explicitly declare units to F_lambda * Benchmark internal STARDIS functions (#85) * Fix posting comment * Try addding token as env var * Put env var at correct place * fix benchmark comment bug * Benchmark raytrace function * Correct path for config soft link * Increase timeout time for raytrace benchmark * Add code to benchmark calc_alpha_line_at_nu * Add push trigger for all branches temporarily * Group all the benchmarking code under one class * Add tracing_lambdas attribute to benchmark class * Update benchmark setup * Fix breaking benchmark pipeline (#128) * Refactor and add CUDA JIT compilation to `voigt_profile()` (#113) * Add pi as a constant with file scope * Use real attribute of datatype rather than numpy call * Vectorize voigt_profile and add tests * Change indenting to improve readability * Typecast inputs to float and remove tests that include complex inputs * Add cuda versions of voigt_profile along with associated tests * Return cupy array by default * Size should be of output array * Fix typo in mathematical formula for `voigt_profile()` * Fix unit tests * Add CUDA JIT to `calc_n_effective()` (#119) * Add basic test for `calc_doppler_width()` * Refactor `calc_doppler_width()` and add test for vectorized implementation * Typecast to float * Add unwrapped cuda implementation of doppler_width Also typecast all global constants to float * Add wrapped cuda implementation of calc_doppler_width * Return cupy array by default * Add test for `calc_n_effective()` * Typecast inputs and refactor * Add cuda implementations and tests for `calc_n_effective()` * Return cupy array by default * Revert changes to `voigt_profile()` formula (#131) * setting up radiation field structure * further work moving structures around * hook up radiation field to run_stardis() * fix bug with output * add back voigt test * removing copied files * further clean up stardis/base.py * remove transport files * 
update documentation * Change name of blackbody function. Hook up RadiationField.source_function to raytrace. * fix imports * fix broadening merge * apply black * change benchmark paths --------- Co-authored-by: smokestacklightnin <[email protected]> Co-authored-by: light2802 <[email protected]>

* Restructure: Add radiation field (#123) * setting up radiation field structure * further work moving structures around * hook up radiation field to run_stardis() * fix bug with output * removing copied files * add back voigt test * further clean up stardis/base.py * remove transport files * update documentation * Add CUDA GPU JIT compilation to `faddeeva()` (#107) * Add cuda import * Add cuda boilerplate for faddeeva * Add square root of pi as constant to scope of entire file * Add faddeeva_cuda test * Vectorize faddeeva * Refactor faddeeva to be vectorized and work with cuda * Clean up faddeeva_cuda test * Rename variables to keep with convention * Refactor faddeeva_gpu and include associated tests * Add functionality for more datatypes for faddeeva_cuda Also add associated tests * Clean up faddeeva_gpu tests by testing numpy and cuda array types separately * Optimize faddeeva function to be branchless * Use cupy arrays instead of numba.cuda arrays Also call faddeeva_cuda with prespecified numbers of blocks and threads * Only import cupy if the machine has GPUs available * Typecast input to complex * Size should be of output array * Return cupy array by default * Change name of blackbody function. Hook up RadiationField.source_function to raytrace. 
* add units to stardis outputs (#122) * add units to stardis outputs * explicitly declare units to F_lambda * Benchmark internal STARDIS functions (#85) * Fix posting comment * Try addding token as env var * Put env var at correct place * fix benchmark comment bug * Benchmark raytrace function * Correct path for config soft link * Increase timeout time for raytrace benchmark * Add code to benchmark calc_alpha_line_at_nu * Add push trigger for all branches temporarily * Group all the benchmarking code under one class * Add tracing_lambdas attribute to benchmark class * Update benchmark setup * Fix breaking benchmark pipeline (#128) * Refactor and add CUDA JIT compilation to `voigt_profile()` (#113) * Add pi as a constant with file scope * Use real attribute of datatype rather than numpy call * Vectorize voigt_profile and add tests * Change indenting to improve readability * Typecast inputs to float and remove tests that include complex inputs * Add cuda versions of voigt_profile along with associated tests * Return cupy array by default * Size should be of output array * Fix typo in mathematical formula for `voigt_profile()` * Fix unit tests * Add CUDA JIT to `calc_n_effective()` (#119) * Add basic test for `calc_doppler_width()` * Refactor `calc_doppler_width()` and add test for vectorized implementation * Typecast to float * Add unwrapped cuda implementation of doppler_width Also typecast all global constants to float * Add wrapped cuda implementation of calc_doppler_width * Return cupy array by default * Add test for `calc_n_effective()` * Typecast inputs and refactor * Add cuda implementations and tests for `calc_n_effective()` * Return cupy array by default * Revert changes to `voigt_profile()` formula (#131) * setting up radiation field structure * further work moving structures around * hook up radiation field to run_stardis() * fix bug with output * add back voigt test * removing copied files * further clean up stardis/base.py * remove transport files * 
update documentation * Change name of blackbody function. Hook up RadiationField.source_function to raytrace. * fix imports * fix broadening merge * apply black * change benchmark paths --------- Co-authored-by: smokestacklightnin <[email protected]> Co-authored-by: light2802 <[email protected]> * fix voigt test that got overwritten by the rebase * add black to voigt.py * include fixto voight profile from pr 131 * fix benchmarks * further fix of benchmarks * remove blackbody import comment in radiation_field_solvers/base.py --------- Co-authored-by: smokestacklightnin <[email protected]> Co-authored-by: light2802 <[email protected]>

…#113) * Add pi as a constant with file scope * Use real attribute of datatype rather than numpy call * Vectorize voigt_profile and add tests * Change indenting to improve readability * Typecast inputs to float and remove tests that include complex inputs * Add cuda versions of voigt_profile along with associated tests * Return cupy array by default * Size should be of output array * Fix typo in mathematical formula for `voigt_profile()` * Fix unit tests

smokestacklightnin marked this pull request as draft August 9, 2023 00:10

smokestacklightnin requested review from wkerzendorf, andrewfullard, jaladh-singhal, jvshields, isaacgsmith, epassaro and atharva-2001 August 9, 2023 00:10

smokestacklightnin force-pushed the gpu/opacities/voigt-profile/add-cuda branch 3 times, most recently from dde8a02 to 79f9446 Compare August 10, 2023 08:05

andrewfullard removed their request for review August 10, 2023 19:20

smokestacklightnin force-pushed the gpu/opacities/voigt-profile/add-cuda branch from da68531 to 42dbca0 Compare August 10, 2023 20:39

smokestacklightnin assigned vamads and smokestacklightnin Aug 16, 2023

smokestacklightnin added 10 commits August 16, 2023 18:33

Add pi as a constant with file scope

dd0a95e

Use real attribute of datatype rather than numpy call

3606b38

Vectorize voigt_profile and add tests

5c0e10a

Change indenting to improve readability

1210627

Typecast inputs to float and remove tests that include complex inputs

3bce273

Add cuda versions of voigt_profile along with associated tests

b7962fb

Return cupy array by default

915b9e6

Size should be of output array

7f1ae45

Fix typo in mathematical formula for voigt_profile()

683bc52

Fix unit tests

02cdb29

smokestacklightnin force-pushed the gpu/opacities/voigt-profile/add-cuda branch from 62f61d5 to 02cdb29 Compare August 17, 2023 01:33

smokestacklightnin marked this pull request as ready for review August 17, 2023 01:33

vamads approved these changes Aug 18, 2023

andrewfullard added the benchmarks Trigger benchmarks to run on this pr label Aug 18, 2023

jvshields approved these changes Aug 18, 2023

jvshields merged commit d3ab793 into tardis-sn:main Aug 18, 2023
7 checks passed
