This PR inherits commits originally introduced in PR #67.
The discussion of some of the details can also be found there.
The list of changes is as follows:
1. Introduces a CMake build option for GPU support (specifically, CUDA
support) in RandLAPACK. This is enabled with ``-DRequireCUDA=ON``.
2. Introduces rl_cuda_kernels.cuh - a file containing various GPU utility
functions, including some BLAS- and LAPACK-level routines.
3. Introduces rl_cqrrpt_gpu.cuh - a GPU version of CQRRPT. Note that
since many parts of CQRRPT (including sketching) do not currently have
GPU versions, the data is offloaded to the GPU inside the algorithm. The
input data is expected to reside in CPU (host) memory.
4. Introduces rl_cqrrp_gpu.cuh - a GPU version of the CQRRP algorithm,
which accepts data already allocated on a GPU.
5. Includes tests for the functions from the above files, plus benchmarks
for the CQRRP algorithm that currently live in the test space. In the
future, these benchmarks should be moved into a separately built
benchmarking space. For now, we can avoid running them with the rest of
the tests by using `ctest --gtest_filter=-*bench*`.
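
For reference, the configure/build/test flow described in items 1 and 5 might look roughly like the sketch below. Only `-DRequireCUDA=ON` and the `-*bench*` filter pattern come from this PR; the `build` directory name, the test executable path, and the assumption that CTest-registered test names contain "bench" are illustrative guesses, and other required `-D` options (such as dependency paths) are omitted. `ctest --test-dir` needs CMake 3.20 or newer.

```shell
# Sketch only: assumes the commands run from the RandLAPACK source root and
# that any other required -D options (dependency paths, etc.) are added.
cmake -S . -B build -DRequireCUDA=ON   # enable the CUDA code paths from item 1
cmake --build build

# Run the tests but skip the benchmarks. -E excludes tests whose
# CTest-registered name matches the regex; this assumes those names
# contain "bench".
ctest --test-dir build -E bench

# Alternative: pass a GoogleTest filter directly to the test executable.
# The path below is hypothetical; the pattern is quoted so the shell does
# not expand '*'.
./build/test/RandLAPACK_tests --gtest_filter='-*bench*'
```

`--gtest_filter` is a GoogleTest flag read by the test executable itself, so the second form may be the more dependable way to apply the `-*bench*` pattern if plain `ctest` does not forward it.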
Issues #77 - #80 are related to this PR.
---------
Co-authored-by: Riley John Murray <[email protected]>
Co-authored-by: Max Melnichenko <[email protected]>
Co-authored-by: rileyjmurray <[email protected]>
Better yet, we remove benchmarks from the test infrastructure.