Proper support for similar on CuSparseMats #2652

Merged 1 commit on Feb 18, 2025

kshyatt commented Feb 13, 2025

Should address #1667

@kshyatt added the "cuda array" and "bugfix" labels on Feb 13, 2025
@kshyatt requested a review from amontoison on February 13, 2025 04:23

github-actions bot commented Feb 13, 2025

Your PR requires formatting changes to meet the project's style guidelines.
Please consider running Runic (git runic master) to apply these changes.

Suggested changes:
diff --git a/lib/cusparse/array.jl b/lib/cusparse/array.jl
index b278b8129..81e34c159 100644
--- a/lib/cusparse/array.jl
+++ b/lib/cusparse/array.jl
@@ -247,21 +247,21 @@ Base.similar(Mat::CuSparseMatrixCSR, T::Type) = CuSparseMatrixCSR(copy(Mat.rowPt
 Base.similar(Mat::CuSparseMatrixBSR, T::Type) = CuSparseMatrixBSR(copy(Mat.rowPtr), copy(Mat.colVal), similar(nonzeros(Mat), T), Mat.blockDim, Mat.dir, nnz(Mat), size(Mat))
 Base.similar(Mat::CuSparseMatrixCOO, T::Type) = CuSparseMatrixCOO(copy(Mat.rowInd), copy(Mat.colInd), similar(nonzeros(Mat), T), size(Mat), nnz(Mat))
 
-Base.similar(Mat::CuSparseMatrixCSC, T::Type, N::Int, M::Int) =  CuSparseMatrixCSC(CUDA.zeros(Int32, 1), CUDA.zeros(Int32, 0), CuVector{T}(undef, 0), (N, M))
-Base.similar(Mat::CuSparseMatrixCSR, T::Type, N::Int, M::Int) =  CuSparseMatrixCSR(CUDA.zeros(Int32, 1), CUDA.zeros(Int32, 0), CuVector{T}(undef, 0), (N,M))
-Base.similar(Mat::CuSparseMatrixCOO, T::Type, N::Int, M::Int) =  CuSparseMatrixCOO(CUDA.zeros(Int32, 0), CUDA.zeros(Int32, 0), CuVector{T}(undef, 0), (N,M))
+Base.similar(Mat::CuSparseMatrixCSC, T::Type, N::Int, M::Int) = CuSparseMatrixCSC(CUDA.zeros(Int32, 1), CUDA.zeros(Int32, 0), CuVector{T}(undef, 0), (N, M))
+Base.similar(Mat::CuSparseMatrixCSR, T::Type, N::Int, M::Int) = CuSparseMatrixCSR(CUDA.zeros(Int32, 1), CUDA.zeros(Int32, 0), CuVector{T}(undef, 0), (N, M))
+Base.similar(Mat::CuSparseMatrixCOO, T::Type, N::Int, M::Int) = CuSparseMatrixCOO(CUDA.zeros(Int32, 0), CUDA.zeros(Int32, 0), CuVector{T}(undef, 0), (N, M))
 
-Base.similar(Mat::CuSparseMatrixCSC{Tv, Ti}, N::Int, M::Int) where {Tv, Ti} = similar(Mat, Tv, N, M) 
-Base.similar(Mat::CuSparseMatrixCSR{Tv, Ti}, N::Int, M::Int) where {Tv, Ti} = similar(Mat, Tv, N, M) 
-Base.similar(Mat::CuSparseMatrixCOO{Tv, Ti}, N::Int, M::Int) where {Tv, Ti} = similar(Mat, Tv, N, M) 
+Base.similar(Mat::CuSparseMatrixCSC{Tv, Ti}, N::Int, M::Int) where {Tv, Ti} = similar(Mat, Tv, N, M)
+Base.similar(Mat::CuSparseMatrixCSR{Tv, Ti}, N::Int, M::Int) where {Tv, Ti} = similar(Mat, Tv, N, M)
+Base.similar(Mat::CuSparseMatrixCOO{Tv, Ti}, N::Int, M::Int) where {Tv, Ti} = similar(Mat, Tv, N, M)
 
-Base.similar(Mat::CuSparseMatrixCSC, T::Type, dims::Tuple{Int, Int}) = similar(Mat, T, dims...) 
-Base.similar(Mat::CuSparseMatrixCSR, T::Type, dims::Tuple{Int, Int}) = similar(Mat, T, dims...) 
-Base.similar(Mat::CuSparseMatrixCOO, T::Type, dims::Tuple{Int, Int}) = similar(Mat, T, dims...) 
+Base.similar(Mat::CuSparseMatrixCSC, T::Type, dims::Tuple{Int, Int}) = similar(Mat, T, dims...)
+Base.similar(Mat::CuSparseMatrixCSR, T::Type, dims::Tuple{Int, Int}) = similar(Mat, T, dims...)
+Base.similar(Mat::CuSparseMatrixCOO, T::Type, dims::Tuple{Int, Int}) = similar(Mat, T, dims...)
 
-Base.similar(Mat::CuSparseMatrixCSC, dims::Tuple{Int, Int}) = similar(Mat, dims...) 
-Base.similar(Mat::CuSparseMatrixCSR, dims::Tuple{Int, Int}) = similar(Mat, dims...) 
-Base.similar(Mat::CuSparseMatrixCOO, dims::Tuple{Int, Int}) = similar(Mat, dims...) 
+Base.similar(Mat::CuSparseMatrixCSC, dims::Tuple{Int, Int}) = similar(Mat, dims...)
+Base.similar(Mat::CuSparseMatrixCSR, dims::Tuple{Int, Int}) = similar(Mat, dims...)
+Base.similar(Mat::CuSparseMatrixCOO, dims::Tuple{Int, Int}) = similar(Mat, dims...)
 
 Base.similar(Mat::CuSparseArrayCSR) = CuSparseArrayCSR(copy(Mat.rowPtr), copy(Mat.colVal), similar(nonzeros(Mat)), size(Mat))
 
diff --git a/test/libraries/cusparse.jl b/test/libraries/cusparse.jl
index 823165c25..1d72972fd 100644
--- a/test/libraries/cusparse.jl
+++ b/test/libraries/cusparse.jl
@@ -153,10 +153,10 @@ end
             @test size(similar(d_x, (3, 4))) == (3, 4)
             @test similar(d_x, Float32) isa CuSparseMatrixCSR{Float32}
         end
-        
+
         @testset "COO" begin
-            x = sprand(elty,m,n, 0.2)
-            d_x  = CuSparseMatrixCOO(x)
+            x = sprand(elty, m, n, 0.2)
+            d_x = CuSparseMatrixCOO(x)
             @test collect(d_x) == collect(x)
             @test similar(d_x) isa CuSparseMatrixCOO{elty}
             @test similar(d_x, (3, 4)) isa CuSparseMatrixCOO{elty}


github-actions bot left a comment

CUDA.jl Benchmarks

| Benchmark suite | Current: ef8f9af | Previous: c735d30 | Ratio |
|---|---|---|---|
| latency/precompile | 46590943591.5 ns | 46632963038 ns | 1.00 |
| latency/ttfp | 7008753919 ns | 6945348552 ns | 1.01 |
| latency/import | 3651469460 ns | 3627556110 ns | 1.01 |
| integration/volumerhs | 9624046.5 ns | 9624924.5 ns | 1.00 |
| integration/byval/slices=1 | 147045 ns | 146732 ns | 1.00 |
| integration/byval/slices=3 | 425236 ns | 425273.5 ns | 1.00 |
| integration/byval/reference | 145031 ns | 144844 ns | 1.00 |
| integration/byval/slices=2 | 285986 ns | 285928 ns | 1.00 |
| integration/cudadevrt | 103563.5 ns | 103266 ns | 1.00 |
| kernel/indexing | 14055 ns | 14149 ns | 0.99 |
| kernel/indexing_checked | 14776 ns | 14787 ns | 1.00 |
| kernel/occupancy | 630.9941520467836 ns | 643.4260355029586 ns | 0.98 |
| kernel/launch | 2017.2 ns | 2148.4444444444443 ns | 0.94 |
| kernel/rand | 14516 ns | 16462 ns | 0.88 |
| array/reverse/1d | 19888.5 ns | 19527 ns | 1.02 |
| array/reverse/2d | 25080 ns | 24746 ns | 1.01 |
| array/reverse/1d_inplace | 11372 ns | 10303 ns | 1.10 |
| array/reverse/2d_inplace | 13200 ns | 11908 ns | 1.11 |
| array/copy | 21004 ns | 20974 ns | 1.00 |
| array/iteration/findall/int | 159106 ns | 158927 ns | 1.00 |
| array/iteration/findall/bool | 139328 ns | 139468 ns | 1.00 |
| array/iteration/findfirst/int | 153717 ns | 154085 ns | 1.00 |
| array/iteration/findfirst/bool | 154809 ns | 155140 ns | 1.00 |
| array/iteration/scalar | 71086 ns | 71174 ns | 1.00 |
| array/iteration/logical | 207585.5 ns | 215769 ns | 0.96 |
| array/iteration/findmin/1d | 41284 ns | 41611 ns | 0.99 |
| array/iteration/findmin/2d | 94234.5 ns | 94406 ns | 1.00 |
| array/reductions/reduce/1d | 45800 ns | 40170 ns | 1.14 |
| array/reductions/reduce/2d | 50576.5 ns | 45595.5 ns | 1.11 |
| array/reductions/mapreduce/1d | 40950 ns | 37717 ns | 1.09 |
| array/reductions/mapreduce/2d | 48838.5 ns | 50935 ns | 0.96 |
| array/broadcast | 20825.5 ns | 20980 ns | 0.99 |
| array/copyto!/gpu_to_gpu | 13517 ns | 11688 ns | 1.16 |
| array/copyto!/cpu_to_gpu | 208316 ns | 209993 ns | 0.99 |
| array/copyto!/gpu_to_cpu | 243999 ns | 244189.5 ns | 1.00 |
| array/accumulate/1d | 108328 ns | 108613 ns | 1.00 |
| array/accumulate/2d | 80024 ns | 80233 ns | 1.00 |
| array/construct | 1296.85 ns | 1265 ns | 1.03 |
| array/random/randn/Float32 | 43291.5 ns | 44268 ns | 0.98 |
| array/random/randn!/Float32 | 26491 ns | 26318 ns | 1.01 |
| array/random/rand!/Int64 | 26944 ns | 27025 ns | 1.00 |
| array/random/rand!/Float32 | 8891.666666666666 ns | 8630 ns | 1.03 |
| array/random/rand/Int64 | 29836 ns | 29758 ns | 1.00 |
| array/random/rand/Float32 | 13198 ns | 12929 ns | 1.02 |
| array/permutedims/4d | 60846 ns | 61037 ns | 1.00 |
| array/permutedims/2d | 54854 ns | 55629 ns | 0.99 |
| array/permutedims/3d | 55949 ns | 56014 ns | 1.00 |
| array/sorting/1d | 2764445 ns | 2765122 ns | 1.00 |
| array/sorting/by | 3352239 ns | 3354655 ns | 1.00 |
| array/sorting/2d | 1081434 ns | 1084634.5 ns | 1.00 |
| cuda/synchronization/stream/auto | 1039 ns | 1031.1 ns | 1.01 |
| cuda/synchronization/stream/nonblocking | 6296.8 ns | 6343.8 ns | 0.99 |
| cuda/synchronization/stream/blocking | 815.6382978723404 ns | 793.3529411764706 ns | 1.03 |
| cuda/synchronization/context/auto | 1180.6 ns | 1192.4 ns | 0.99 |
| cuda/synchronization/context/nonblocking | 6630.4 ns | 6640 ns | 1.00 |
| cuda/synchronization/context/blocking | 940.936170212766 ns | 927.8409090909091 ns | 1.01 |

This comment was automatically generated by workflow using github-action-benchmark.

maleadt commented Feb 13, 2025

CI failures look related.

kshyatt commented Feb 13, 2025

I modified the test, since I think it was only passing because the behind-the-scenes similar was creating a host array 😬. I also modified the sized similar, because when N * M < nnz(Mat) the current setup is invalid.
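
The size constraint mentioned above can be sketched in plain Julia, without a GPU. This is only an illustration of the counting argument: `fits` is a hypothetical helper that does not exist in CUDA.jl, and the shapes and entry counts are made-up values. A matrix of size (N, M) has N * M positions, so the stored entries of a larger source matrix cannot simply be carried over to a smaller shape, whereas a result with zero stored entries (as the sized methods in the formatting diff above construct) always fits.

```julia
# Hypothetical helper for illustration only; it is not part of CUDA.jl.
# A matrix with dims = (N, M) has N * M positions, so it cannot hold
# more stored entries than that.
fits(stored::Integer, dims::Tuple{Int, Int}) = stored <= prod(dims)

old_dims = (10, 10)   # assumed shape of the source matrix
old_nnz  = 20         # assumed number of stored entries in the source
new_dims = (3, 4)     # shape requested from the sized `similar`

fits(old_nnz, old_dims)   # true:  20 <= 100
fits(old_nnz, new_dims)   # false: 20 > 12, the old entries cannot be reused
fits(0, new_dims)         # true:  zero stored entries fit any shape
```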

codecov bot commented Feb 13, 2025

Codecov Report

Attention: Patch coverage is 75.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 73.55%. Comparing base (c735d30) to head (ef8f9af).
Report is 8 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| lib/cusparse/array.jl | 75.00% | 3 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2652      +/-   ##
==========================================
- Coverage   73.56%   73.55%   -0.01%     
==========================================
  Files         158      158              
  Lines       15326    15335       +9     
==========================================
+ Hits        11274    11280       +6     
- Misses       4052     4055       +3     


kshyatt commented Feb 14, 2025

Look ok to merge?

@maleadt merged commit ffd75e8 into master on Feb 18, 2025
3 checks passed
@maleadt deleted the ksh/sparse_similar branch on February 18, 2025 11:43