GPU-based FFT + CMake cleanup + linalg::matrix_product #44

mhaseeb123 · 2023-10-20T03:36:43Z

What's new:

Added optimized single-source FFT code for CPUs and GPUs.
Cleaned up the CMake files. Tested with nvc++/23.7 and nvc++/23.1, with both -stdpar=multicore and -stdpar=gpu.
Removed the --gcc-toolchain flag from the CMake files to encourage use of a localrc file instead of the bug-prone flag. The localrc is already set up for nvc++/23.7; otherwise, export GCCLOCALRC=/path/to/localrc.
Added use of the std::experimental::linalg library for FFT validation.
Performance analysis of the FFT codes is pending until the PM GPUs free up. Hoping for the best though 🤞🏼
FIXME: clang-format is needed on all source and CMake files.

Note: The FFT apps give a linker error for libcublas with nvc++/23.1. I added a commented-out line to the FFT CMake file that can be uncommented to add the libcublas path if needed. Interestingly, nvc++/23.7 does not complain about libcublas.
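
For readers unfamiliar with the validation idea, here is a minimal sketch of checking an FFT against a reference DFT computed as a dense matrix-vector product. It is not the code from this PR: the names `dft_reference`, `fft_radix2`, and `max_abs_error` are hypothetical, the FFT is a plain serial recursive radix-2 version, and a hand-written loop stands in for the std::experimental::linalg call so the example compiles with any C++ compiler.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Naive O(N^2) DFT written as a dense matrix-vector product y = W * x,
// with W[k][n] = exp(-2*pi*i*k*n/N). Assumption: this hand-written loop
// stands in for the linalg-based product; the PR's actual code may differ.
std::vector<cplx> dft_reference(const std::vector<cplx>& x) {
    const std::size_t N = x.size();
    const double pi = std::acos(-1.0);
    std::vector<cplx> y(N, cplx{0.0, 0.0});
    for (std::size_t k = 0; k < N; ++k) {
        for (std::size_t n = 0; n < N; ++n) {
            // Reduce k*n modulo N so the angle stays small and accurate.
            const double angle = -2.0 * pi * static_cast<double>((k * n) % N) /
                                 static_cast<double>(N);
            y[k] += std::polar(1.0, angle) * x[n];
        }
    }
    return y;
}

// Serial recursive radix-2 Cooley-Tukey FFT (hypothetical helper).
// Precondition: x.size() is a power of two.
std::vector<cplx> fft_radix2(const std::vector<cplx>& x) {
    const std::size_t N = x.size();
    assert((N & (N - 1)) == 0 && "size must be a power of two");
    if (N <= 1) return x;

    // Split into even- and odd-indexed halves.
    std::vector<cplx> even(N / 2), odd(N / 2);
    for (std::size_t i = 0; i < N / 2; ++i) {
        even[i] = x[2 * i];
        odd[i] = x[2 * i + 1];
    }
    const auto E = fft_radix2(even);
    const auto O = fft_radix2(odd);

    // Butterfly: combine the two half-size transforms.
    const double pi = std::acos(-1.0);
    std::vector<cplx> y(N);
    for (std::size_t k = 0; k < N / 2; ++k) {
        const double angle = -2.0 * pi * static_cast<double>(k) /
                             static_cast<double>(N);
        const cplx t = std::polar(1.0, angle) * O[k];
        y[k] = E[k] + t;
        y[k + N / 2] = E[k] - t;
    }
    return y;
}

// Largest element-wise absolute difference between two vectors of equal size.
double max_abs_error(const std::vector<cplx>& a, const std::vector<cplx>& b) {
    double err = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        err = std::max(err, std::abs(a[i] - b[i]));
    }
    return err;
}
```

A validation run would then compare `fft_radix2(x)` with `dft_reference(x)` and accept the result when `max_abs_error` falls below a small tolerance chosen for the input size.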

weilewei

The code looks good overall to me! Thanks for the contribution!

One major concern is the extensive use of macros. As our codebase grows, we will probably need better documentation of how each macro is used. Ideally, the fewer macros, the better.
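
As a generic illustration of the concern (not taken from this PR), the sketch below contrasts a function-like macro with a typed `constexpr` alternative. The names `SQUARE_MACRO`, `square`, and `default_fft_size` are hypothetical; the macros actually used in the codebase are not shown here.

```cpp
#include <cstddef>

// Hypothetical function-like macro: the argument is pasted in textually,
// so an expression with side effects is evaluated twice.
#define SQUARE_MACRO(x) ((x) * (x))

// Typed alternative: the argument is evaluated exactly once, the function
// obeys scoping rules, and it can still be used in constant expressions.
template <typename T>
constexpr T square(T x) {
    return x * x;
}

// Hypothetical object-like macro replaced by a typed constant.
constexpr std::size_t default_fft_size = 1024;

static_assert(square(4) == 16, "square is usable at compile time");
```

The function template also appears by name in debuggers and compiler diagnostics, which makes it easier to document than a macro.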

CMakeLists.txt

apps/1d_stencil/stencil_stdexec.cpp

apps/fft/fft-serial.cpp

apps/fft/fft-stdexec.cpp

apps/fft/fft.hpp

apps/fft/fft-stdexec.cpp

mhaseeb123 added 2 commits October 18, 2023 19:50

fft gpu - perf analysis remains

46f1c4e

optimize fft, cmake cleanup, using linalg in DFT

af2a10d

mhaseeb123 requested a review from weilewei October 20, 2023 03:36

weilewei reviewed Oct 20, 2023

apps/fft/fft.hpp (resolved)

weilewei reviewed Oct 20, 2023

apps/fft/fft-stdexec.cpp (outdated, resolved)

updates for review on #44

99c4ce5

mhaseeb123 merged commit b2f02ec into main Oct 20, 2023
1 check failed

mhaseeb123 deleted the fft-gpu-new branch October 20, 2023 20:14
