diff --git a/README.md b/README.md
index 7ef191c..f06c53b 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-# CUDSS.jl
+# CUDSS.jl: Julia interface for NVIDIA cuDSS
 
 [![docs-stable][docs-stable-img]][docs-stable-url]
 [![docs-dev][docs-dev-img]][docs-dev-url]
@@ -7,6 +7,16 @@
 [docs-dev-img]: https://img.shields.io/badge/docs-dev-purple.svg
 [docs-dev-url]: https://exanauts.github.io/CUDSS.jl/dev
 
+## Overview
+
+[CUDSS.jl](https://github.com/exanauts/CUDSS.jl) is a Julia interface to the NVIDIA [cuDSS](https://developer.nvidia.com/cudss) library.
+NVIDIA cuDSS provides three factorizations (LU, LDLᵀ, LLᵀ) for solving sparse linear systems on GPUs.
+
+### Why CUDSS.jl?
+
+Unlike other CUDA libraries that are commonly bundled together, cuDSS is currently in preview. For this reason, it is not included in [CUDA.jl](https://github.com/JuliaGPU/CUDA.jl).
+To maintain consistency with the naming conventions used for other CUDA libraries (such as CUBLAS, CUSOLVER, CUSPARSE, etc.), we have named this interface CUDSS.jl.
+
 ## Installation
 
 CUDSS.jl can be installed and tested through the Julia package manager:
@@ -19,12 +29,13 @@ pkg> test CUDSS
 
 ## Examples
 
+### Example 1: Sparse unsymmetric linear system with one right-hand side
+
 ```julia
 using CUDA, CUDA.CUSPARSE
 using CUDSS
 using SparseArrays
 
-# Solve an unsymmetric linear system with one right-hand side
 T = Float64
 n = 100
 A_cpu = sprand(T, n, n, 0.05) + I
@@ -44,19 +55,21 @@ cudss("solve", solver, x_gpu, b_gpu)
 r_gpu = b_gpu - A_gpu * x_gpu
 norm(r_gpu)
 ```
+
+### Example 2: Sparse symmetric linear system with multiple right-hand sides
+
 ```julia
 using CUDA, CUDA.CUSPARSE
 using CUDSS
 using SparseArrays
 
-# Solve a symmetric linear system with multiple right-hand sides
 T = Float64
 n = 100
 p = 5
-A_cpu = sprand(n, n, 0.05) + I
+A_cpu = sprand(T, n, n, 0.05) + I
 A_cpu = A_cpu + A_cpu'
-X_cpu = zeros(n, p)
-B_cpu = rand(n, p)
+X_cpu = zeros(T, n, p)
+B_cpu = rand(T, n, p)
 
 A_gpu = CuSparseMatrixCSR(A_cpu |> tril)
 X_gpu = CuMatrix(X_cpu)
@@ -72,6 +85,9 @@ cudss("solve", solver, X_gpu, B_gpu)
 R_gpu = B_gpu - CuSparseMatrixCSR(A_cpu) * X_gpu
 norm(R_gpu)
 ```
+
+### Example 3: Sparse Hermitian positive definite linear system with multiple right-hand sides
+
 ```julia
 using CUDA, CUDA.CUSPARSE
 using CUDSS
@@ -82,10 +98,10 @@ using SparseArrays
 T = ComplexF64
 n = 100
 p = 5
-A_cpu = sprand(n, n, 0.01)
+A_cpu = sprand(T, n, n, 0.01)
 A_cpu = A_cpu * A_cpu' + I
-X_cpu = zeros(n, p)
-B_cpu = rand(n, p)
+X_cpu = zeros(T, n, p)
+B_cpu = rand(T, n, p)
 
 A_gpu = CuSparseMatrixCSR(A_cpu |> triu)
 X_gpu = CuMatrix(X_cpu)
diff --git a/docs/src/index.md b/docs/src/index.md
index ce04a97..6f38eed 100644
--- a/docs/src/index.md
+++ b/docs/src/index.md
@@ -1,5 +1,11 @@
 # [CUDSS.jl documentation](@id Home)
 
+## Overview
+
+[CUDSS.jl](https://github.com/exanauts/CUDSS.jl) is a Julia interface to the NVIDIA [cuDSS](https://developer.nvidia.com/cudss) library.
+NVIDIA cuDSS provides three factorizations (LU, LDLᵀ, LLᵀ) for solving sparse linear systems on GPUs.
+For more details on using cuDSS, refer to the official [cuDSS documentation](https://docs.nvidia.com/cuda/cudss/index.html).
+
 ## Installation
 
 ```julia