This repository has been archived by the owner on Nov 2, 2023. It is now read-only.

GPU-based FFT + CMake cleanup + linalg::matrix_product #44

Merged: 3 commits on Oct 20, 2023

@mhaseeb123 (Owner) commented Oct 20, 2023

What's new:

  • Added optimized single-source FFT code for CPUs and GPUs.
  • Cleaned up the CMake files. Tested with nvc++/23.7 and nvc++/23.1, each with both -stdpar=multicore and -stdpar=gpu.
  • Removed the --gcc-toolchain flag from the CMake files to encourage use of a localrc file instead of the bug-prone flag. The localrc is already set up for nvc++/23.7; otherwise, export GCCLOCALRC=/path/to/localrc.
  • Added use of the std::experimental::linalg library for FFT validation.
  • Performance analysis of the FFT codes is pending until the PM GPUs free up for use. Hoping for the best though 🤞🏼
  • FIXME: clang-format is needed on all source and CMake files.

Note: The FFT apps give a linker error for libcublas with nvc++/23.1. I added a commented-out line in the FFT CMake file that can be uncommented to add the libcublas path if needed. Interestingly, nvc++/23.7 does not complain about libcublas.
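
The FFT validation mentioned in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: a plain nested loop stands in for the std::experimental::linalg matrix product, the FFT is a simple recursive radix-2 version, and the names fft_radix2, dft_reference, and max_abs_error are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Recursive radix-2 Cooley-Tukey FFT (illustrative only).
// The input length must be a power of two.
std::vector<cplx> fft_radix2(const std::vector<cplx>& x) {
  const std::size_t n = x.size();
  assert(n != 0 && (n & (n - 1)) == 0);
  if (n == 1) return x;

  // Split into even- and odd-indexed samples.
  std::vector<cplx> even(n / 2), odd(n / 2);
  for (std::size_t i = 0; i < n / 2; ++i) {
    even[i] = x[2 * i];
    odd[i] = x[2 * i + 1];
  }
  const std::vector<cplx> E = fft_radix2(even);
  const std::vector<cplx> O = fft_radix2(odd);

  // Butterfly step: combine the two half-size transforms.
  const double pi = std::acos(-1.0);
  std::vector<cplx> y(n);
  for (std::size_t k = 0; k < n / 2; ++k) {
    const cplx t = std::polar(1.0, -2.0 * pi * double(k) / double(n)) * O[k];
    y[k] = E[k] + t;
    y[k + n / 2] = E[k] - t;
  }
  return y;
}

// Reference DFT written as a dense matrix-vector product y = W * x,
// where W[k][j] = exp(-2*pi*i*k*j/n). This loop is a stand-in (assumption)
// for a linalg-style matrix product.
std::vector<cplx> dft_reference(const std::vector<cplx>& x) {
  const std::size_t n = x.size();
  const double pi = std::acos(-1.0);
  std::vector<cplx> y(n, cplx{0.0, 0.0});
  for (std::size_t k = 0; k < n; ++k) {
    for (std::size_t j = 0; j < n; ++j) {
      // Reduce k*j modulo n to keep the angle small; exp is periodic.
      const double angle = -2.0 * pi * double((k * j) % n) / double(n);
      y[k] += std::polar(1.0, angle) * x[j];
    }
  }
  return y;
}

// Largest element-wise absolute difference between two equal-length vectors.
double max_abs_error(const std::vector<cplx>& a, const std::vector<cplx>& b) {
  assert(a.size() == b.size());
  double err = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    err = std::max(err, std::abs(a[i] - b[i]));
  }
  return err;
}
```

For a power-of-two input x, a check such as max_abs_error(fft_radix2(x), dft_reference(x)) < 1e-9 would flag a broken FFT. The O(N^2) reference is only practical for small validation sizes.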

@mhaseeb123 mhaseeb123 requested a review from weilewei October 20, 2023 03:36
@weilewei (Collaborator) left a comment

The code looks good overall to me! Thanks for the contribution!

One major concern is the extensive use of macros. As our codebase grows, we will probably need better documentation of how each macro is used. Ideally, the fewer macros, the better.
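
To illustrate the point about macros, here is a small sketch of replacing a function-like macro and a constant macro with typed C++ equivalents. The names SQUARE_MACRO, square, and default_fft_size are hypothetical and do not come from this repository.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical function-like macro: no type checking, and the argument is
// expanded twice, so SQUARE_MACRO(i++) becomes ((i++) * (i++)), which
// modifies i twice without sequencing (undefined behavior).
#define SQUARE_MACRO(x) ((x) * (x))

// Typed alternative: the argument is evaluated exactly once, the function
// respects scopes and namespaces, and it is usable in constant expressions.
template <typename T>
constexpr T square(T x) {
  return x * x;
}

// Hypothetical named constant in place of something like "#define FFT_SIZE 1024".
constexpr std::size_t default_fft_size = 1024;

static_assert(square(4) == 16, "square is usable at compile time");
```

Unlike the macro, square(i++) evaluates its argument once, so the result and the final value of i are well defined.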

Review threads (all outdated and resolved):

  • CMakeLists.txt (2)
  • apps/1d_stencil/stencil_stdexec.cpp (1)
  • apps/fft/fft-serial.cpp (2)
  • apps/fft/fft-stdexec.cpp (3)
  • apps/fft/fft.hpp (3)

@mhaseeb123 mhaseeb123 merged commit b2f02ec into main Oct 20, 2023
1 check failed
@mhaseeb123 mhaseeb123 deleted the fft-gpu-new branch October 20, 2023 20:14