Refactor and add CUDA JIT compilation to voigt_profile() #113

Merged

Conversation

smokestacklightnin
Contributor

@smokestacklightnin smokestacklightnin commented Aug 9, 2023

📝 Description

Type: 🚀 feature

This pull request refactors voigt_profile() and adds CUDA JIT compilation to the same function. Basic unit tests are also included.
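For readers unfamiliar with the quantity being computed, here is a minimal CPU-only sketch of a Voigt profile. It is not the code from this pull request: the names `faddeeva_real` and `voigt_profile_sketch` are hypothetical, the damping parameter is assumed to follow the common convention a = gamma / (4 * pi * doppler_width), and the real part of the Faddeeva function is obtained by plain Simpson quadrature rather than by the approximation and CUDA JIT kernels the PR works with.

```python
import math

SQRT_PI = math.sqrt(math.pi)


def faddeeva_real(u, a, t_max=8.0, n_steps=4000):
    """Approximate Re w(u + i*a) with Simpson's rule.

    Uses the integral representation
        Re w(u + i*a) = (2 / sqrt(pi)) * int_0^inf exp(-t^2 - 2*a*t) * cos(2*u*t) dt,
    truncated at t_max (the integrand is below exp(-64) there for a >= 0).
    n_steps must be even for Simpson's rule.
    """
    h = t_max / n_steps

    def integrand(t):
        return math.exp(-t * t - 2.0 * a * t) * math.cos(2.0 * u * t)

    total = integrand(0.0) + integrand(t_max)
    for k in range(1, n_steps):
        weight = 4.0 if k % 2 else 2.0
        total += weight * integrand(k * h)
    return (2.0 / SQRT_PI) * total * h / 3.0


def voigt_profile_sketch(delta_nu, doppler_width, gamma):
    """Voigt profile at frequency offset delta_nu (illustrative only).

    Assumed convention (not taken from the PR):
        u = delta_nu / doppler_width
        a = gamma / (4 * pi * doppler_width)
        phi = Re w(u + i*a) / (sqrt(pi) * doppler_width)
    """
    delta_nu = float(delta_nu)
    doppler_width = float(doppler_width)
    gamma = float(gamma)
    u = delta_nu / doppler_width
    a = gamma / (4.0 * math.pi * doppler_width)
    return faddeeva_real(u, a) / (SQRT_PI * doppler_width)


if __name__ == "__main__":
    # Line center of a pure Doppler (gamma = 0) profile with unit Doppler width.
    print(voigt_profile_sketch(0.0, 1.0, 0.0))
```

With `gamma = 0` the sketch reduces to a Gaussian, so the line-center value is 1 / (sqrt(pi) * doppler_width). A scalar loop like this is far too slow for production use, which is the motivation for vectorized and GPU versions.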

This pull request should be merged (hopefully immediately) after pull request #107.

🚦 Testing

How did you test these changes?

  • Testing pipeline
  • Local tests on an NVIDIA GPU

☑️ Checklist

  • I requested two reviewers for this pull request
  • I updated the documentation according to my changes

@codecov

codecov bot commented Aug 9, 2023

Codecov Report

Merging #113 (7e322be) into main (9c2da67) will decrease coverage by 3.50%.
The diff coverage is 16.32%.

❗ Current head 7e322be differs from the pull request's most recent head 02cdb29. Consider uploading reports for commit 02cdb29 to get more accurate results.

@@            Coverage Diff             @@
##             main     #113      +/-   ##
==========================================
- Coverage   72.87%   69.37%   -3.50%     
==========================================
  Files          21       21              
  Lines         693      738      +45     
==========================================
+ Hits          505      512       +7     
- Misses        188      226      +38     
Files Changed                            Coverage Δ
stardis/opacities/tests/test_voigt.py    0.00% <0.00%> (ø)
stardis/opacities/voigt.py               32.85% <33.33%> (+0.20%) ⬆️


@smokestacklightnin smokestacklightnin force-pushed the gpu/opacities/voigt-profile/add-cuda branch 3 times, most recently from dde8a02 to 79f9446 Compare August 10, 2023 08:05
@andrewfullard andrewfullard removed their request for review August 10, 2023 19:20
@smokestacklightnin smokestacklightnin force-pushed the gpu/opacities/voigt-profile/add-cuda branch from da68531 to 42dbca0 Compare August 10, 2023 20:39
@smokestacklightnin smokestacklightnin force-pushed the gpu/opacities/voigt-profile/add-cuda branch from 62f61d5 to 02cdb29 Compare August 17, 2023 01:33
@smokestacklightnin smokestacklightnin marked this pull request as ready for review August 17, 2023 01:33

@vamads vamads left a comment

Looks good to me!

@andrewfullard andrewfullard added the benchmarks Trigger benchmarks to run on this pr label Aug 18, 2023
@tardis-bot
Contributor

*beep* *bop*
Hi human,
I ran benchmarks as you asked comparing main (9c2da67) and the latest commit (02cdb29).
Here are the logs produced by ASV.
Results can also be downloaded as artifacts here.
Significantly changed benchmarks:

All benchmarks:

     before           after         ratio
   [9c2da67c]       [02cdb293]
     13.4±0.04s       13.2±0.01s     0.98  run_stardis.BenchmarkRunStardis.time_run_stardis

@jvshields jvshields merged commit d3ab793 into tardis-sn:main Aug 18, 2023
7 checks passed
jvshields added a commit that referenced this pull request Aug 21, 2023
* setting up radiation field structure

* further work moving structures around

* hook up radiation field to run_stardis()

* fix bug with output

* removing copied files

* add back voigt test

* further clean up stardis/base.py

* remove transport files

* update documentation

* Add CUDA GPU JIT compilation to `faddeeva()` (#107)

* Add cuda import

* Add cuda boilerplate for faddeeva

* Add square root of pi as constant to scope of entire file

* Add faddeeva_cuda test

* Vectorize faddeeva

* Refactor faddeeva to be vectorized and work with cuda

* Clean up faddeeva_cuda test

* Rename variables to keep with convention

* Refactor faddeeva_gpu and include associated tests

* Add functionality for more datatypes for faddeeva_cuda

Also add associated tests

* Clean up faddeeva_gpu tests by testing numpy and cuda array types separately

* Optimize faddeeva function to be branchless

* Use cupy arrays instead of numba.cuda arrays

Also call faddeeva_cuda with prespecified numbers of blocks and threads

* Only import cupy if the machine has GPUs available

* Typecast input to complex

* Size should be of output array

* Return cupy array by default

* Change name of blackbody function. Hook up RadiationField.source_function to raytrace.

* add units to stardis outputs (#122)

* add units to stardis outputs

* explicitly declare units to F_lambda

* Benchmark internal STARDIS functions (#85)

* Fix posting comment

* Try adding token as env var

* Put env var at correct place

* fix benchmark comment bug

* Benchmark raytrace function

* Correct path for config soft link

* Increase timeout time for raytrace benchmark

* Add code to benchmark calc_alpha_line_at_nu

* Add push trigger for all branches temporarily

* Group all the benchmarking code under one class

* Add tracing_lambdas attribute to benchmark class

* Update benchmark setup

* Fix breaking benchmark pipeline (#128)

* Refactor and add CUDA JIT compilation to `voigt_profile()` (#113)

* Add pi as a constant with file scope

* Use real attribute of datatype rather than numpy call

* Vectorize voigt_profile and add tests

* Change indenting to improve readability

* Typecast inputs to float and remove tests that include complex inputs

* Add cuda versions of voigt_profile along with associated tests

* Return cupy array by default

* Size should be of output array

* Fix typo in mathematical formula for `voigt_profile()`

* Fix unit tests

* Add CUDA JIT to `calc_n_effective()` (#119)

* Add basic test for `calc_doppler_width()`

* Refactor `calc_doppler_width()` and add test for vectorized implementation

* Typecast to float

* Add unwrapped cuda implementation of doppler_width

Also typecast all global constants to float

* Add wrapped cuda implementation of calc_doppler_width

* Return cupy array by default

* Add test for `calc_n_effective()`

* Typecast inputs and refactor

* Add cuda implementations and tests for `calc_n_effective()`

* Return cupy array by default

* Revert changes to `voigt_profile()` formula (#131)

* setting up radiation field structure

* further work moving structures around

* hook up radiation field to run_stardis()

* fix bug with output

* add back voigt test

* removing copied files

* further clean up stardis/base.py

* remove transport files

* update documentation

* Change name of blackbody function. Hook up RadiationField.source_function to raytrace.

* fix imports

* fix broadening merge

* apply black

* change benchmark paths

---------

Co-authored-by: smokestacklightnin <[email protected]>
Co-authored-by: light2802 <[email protected]>
jvshields added a commit to jvshields/stardis that referenced this pull request Aug 22, 2023
andrewfullard pushed a commit that referenced this pull request Sep 11, 2023
* Restructure: Add radiation field (#123)

* setting up radiation field structure

* further work moving structures around

* hook up radiation field to run_stardis()

* fix bug with output

* removing copied files

* add back voigt test

* further clean up stardis/base.py

* remove transport files

* update documentation

* Add CUDA GPU JIT compilation to `faddeeva()` (#107)

* Add cuda import

* Add cuda boilerplate for faddeeva

* Add square root of pi as constant to scope of entire file

* Add faddeeva_cuda test

* Vectorize faddeeva

* Refactor faddeeva to be vectorized and work with cuda

* Clean up faddeeva_cuda test

* Rename variables to keep with convention

* Refactor faddeeva_gpu and include associated tests

* Add functionality for more datatypes for faddeeva_cuda

Also add associated tests

* Clean up faddeeva_gpu tests by testing numpy and cuda array types separately

* Optimize faddeeva function to be branchless

* Use cupy arrays instead of numba.cuda arrays

Also call faddeeva_cuda with prespecified numbers of blocks and threads

* Only import cupy if the machine has GPUs available

* Typecast input to complex

* Size should be of output array

* Return cupy array by default

* Change name of blackbody function. Hook up RadiationField.source_function to raytrace.

* add units to stardis outputs (#122)

* add units to stardis outputs

* explicitly declare units to F_lambda

* Benchmark internal STARDIS functions (#85)

* Fix posting comment

* Try adding token as env var

* Put env var at correct place

* fix benchmark comment bug

* Benchmark raytrace function

* Correct path for config soft link

* Increase timeout time for raytrace benchmark

* Add code to benchmark calc_alpha_line_at_nu

* Add push trigger for all branches temporarily

* Group all the benchmarking code under one class

* Add tracing_lambdas attribute to benchmark class

* Update benchmark setup

* Fix breaking benchmark pipeline (#128)

* Refactor and add CUDA JIT compilation to `voigt_profile()` (#113)

* Add pi as a constant with file scope

* Use real attribute of datatype rather than numpy call

* Vectorize voigt_profile and add tests

* Change indenting to improve readability

* Typecast inputs to float and remove tests that include complex inputs

* Add cuda versions of voigt_profile along with associated tests

* Return cupy array by default

* Size should be of output array

* Fix typo in mathematical formula for `voigt_profile()`

* Fix unit tests

* Add CUDA JIT to `calc_n_effective()` (#119)

* Add basic test for `calc_doppler_width()`

* Refactor `calc_doppler_width()` and add test for vectorized implementation

* Typecast to float

* Add unwrapped cuda implementation of doppler_width

Also typecast all global constants to float

* Add wrapped cuda implementation of calc_doppler_width

* Return cupy array by default

* Add test for `calc_n_effective()`

* Typecast inputs and refactor

* Add cuda implementations and tests for `calc_n_effective()`

* Return cupy array by default

* Revert changes to `voigt_profile()` formula (#131)

* setting up radiation field structure

* further work moving structures around

* hook up radiation field to run_stardis()

* fix bug with output

* add back voigt test

* removing copied files

* further clean up stardis/base.py

* remove transport files

* update documentation

* Change name of blackbody function. Hook up RadiationField.source_function to raytrace.

* fix imports

* fix broadening merge

* apply black

* change benchmark paths

---------

Co-authored-by: smokestacklightnin <[email protected]>
Co-authored-by: light2802 <[email protected]>

* fix voigt test that got overwritten by the rebase

* add black to voigt.py

* include fix to voigt profile from PR 131

* fix benchmarks

* further fix of benchmarks

* remove blackbody import comment in radiation_field_solvers/base.py

---------

Co-authored-by: smokestacklightnin <[email protected]>
Co-authored-by: light2802 <[email protected]>
smokestacklightnin added a commit to smokestacklightnin/stardis that referenced this pull request Sep 20, 2023
…#113)

* Add pi as a constant with file scope

* Use real attribute of datatype rather than numpy call

* Vectorize voigt_profile and add tests

* Change indenting to improve readability

* Typecast inputs to float and remove tests that include complex inputs

* Add cuda versions of voigt_profile along with associated tests

* Return cupy array by default

* Size should be of output array

* Fix typo in mathematical formula for `voigt_profile()`

* Fix unit tests
smokestacklightnin added a commit to smokestacklightnin/stardis that referenced this pull request Sep 20, 2023
smokestacklightnin added a commit to smokestacklightnin/stardis that referenced this pull request Sep 20, 2023
Labels
benchmarks Trigger benchmarks to run on this pr
5 participants