Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with iterative refinement #21

Closed
amontoison opened this issue Mar 21, 2024 · 1 comment · Fixed by #36
Closed

Issue with iterative refinement #21

amontoison opened this issue Mar 21, 2024 · 1 comment · Fixed by #36
Labels
bug Something isn't working

Comments

@amontoison
Copy link
Member

amontoison commented Mar 21, 2024

using CUDA, CUDA.CUSPARSE
using CUDSS
using SparseArrays, LinearAlgebra

using Random
Random.seed!(666)

function example1(T::DataType, n::Int, ir::Int)
	A_cpu = sprand(T, n, n, 0.05) + I
	x_cpu = zeros(T, n)
	b_cpu = rand(T, n)

	A_gpu = CuSparseMatrixCSR(A_cpu)
	x_gpu = CuVector(x_cpu)
	b_gpu = CuVector(b_cpu)

	solver = CudssSolver(A_gpu, "G", 'F')
	cudss_set(solver, "ir_n_steps", ir)

	cudss("analysis", solver, x_gpu, b_gpu)
	cudss("factorization", solver, x_gpu, b_gpu)
	cudss("solve", solver, x_gpu, b_gpu)

	r_gpu = b_gpu - A_gpu * x_gpu
	norm(r_gpu)
end

function example2(T::DataType, n::Int, ir::Int)
	p = 5
	A_cpu = sprand(T, n, n, 0.05) + I
	A_cpu = A_cpu + A_cpu'
	X_cpu = zeros(T, n, p)
	B_cpu = rand(T, n, p)

	A_gpu = CuSparseMatrixCSR(A_cpu |> tril)
	X_gpu = CuMatrix(X_cpu)
	B_gpu = CuMatrix(B_cpu)

	structure = T <: Real ? "S" : "H"
	solver = CudssSolver(A_gpu, structure, 'L')
	cudss_set(solver, "ir_n_steps", ir)
	
	# Required with CUDSS 0.1.0
	(T <: Complex) && cudss_set(solver, "pivot_type", 'N')

	cudss("analysis", solver, X_gpu, B_gpu)
	cudss("factorization", solver, X_gpu, B_gpu)
	cudss("solve", solver, X_gpu, B_gpu)

	R_gpu = B_gpu - CuSparseMatrixCSR(A_cpu) * X_gpu
	norm(R_gpu)
end

function example3(T::DataType, n::Int, ir::Int)
	p = 5
	A_cpu = sprand(T, n, n, 0.01)
	A_cpu = A_cpu * A_cpu' + I
	X_cpu = zeros(T, n, p)
	B_cpu = rand(T, n, p)

	A_gpu = CuSparseMatrixCSR(A_cpu |> triu)
	X_gpu = CuMatrix(X_cpu)
	B_gpu = CuMatrix(B_cpu)

	structure = T <: Real ? "SPD" : "HPD"
	solver = CudssSolver(A_gpu, structure, 'U')
	cudss_set(solver, "ir_n_steps", ir)

	cudss("analysis", solver, X_gpu, B_gpu)
	cudss("factorization", solver, X_gpu, B_gpu)
	cudss("solve", solver, X_gpu, B_gpu)

	R_gpu = B_gpu - CuSparseMatrixCSR(A_cpu) * X_gpu
	norm(R_gpu)
end

# Size of the random linear systems
n = 100

for ir in (0,1,2)
	println("Number of iterations of iterative refinement: $ir")

	for T in (Float32, Float64, ComplexF32, ComplexF64)
		println("Precision: $T")

		r1 = example1(T, n, ir)
		println("Residual norm for example1: $r1")

		r2 = example2(T, n, ir)
		println("Residual norm for example2: $r2")

		r3 = example3(T, n, ir)
		println("Residual norm for example3: $r3")
	end

	println()
end
Number of iterations of iterative refinement: 0
Precision: Float32
Residual norm for example1: 3.692087e-5
Residual norm for example2: 0.0002999944
Residual norm for example3: 8.5040307e-7
Precision: Float64
Residual norm for example1: 3.999599835861191e-13
Residual norm for example2: 1.8113843282606318e-13
Residual norm for example3: 1.438189339051409e-15
Precision: ComplexF32
Residual norm for example1: 7.7133656e-5
Residual norm for example2: 0.0015232839
Residual norm for example3: 1.3667711e-6
Precision: ComplexF64
Residual norm for example1: 5.983645231861647e-14
Residual norm for example2: 3.5220521611201605e-12
Residual norm for example3: 2.6205255057235815e-15

Number of iterations of iterative refinement: 1
Precision: Float32
Residual norm for example1: 689.0252
Residual norm for example2: 35.675114
Residual norm for example3: 19.295319
Precision: Float64
Residual norm for example1: 17.012903059690192
Residual norm for example2: 128.73250852726014
Residual norm for example3: 19.50544156059542
Precision: ComplexF32
Residual norm for example1: 91.10225
Residual norm for example2: 169.85542
Residual norm for example3: 37.091965
Precision: ComplexF64
Residual norm for example1: 66.7120050629793
Residual norm for example2: 178.9033657822368
Residual norm for example3: 33.54613616599457

Number of iterations of iterative refinement: 2
Precision: Float32
Residual norm for example1: 398.21548
Residual norm for example2: 144.65608
Residual norm for example3: 53.373383
Precision: Float64
Residual norm for example1: 351.5959719021192
Residual norm for example2: 785.4054754458682
Residual norm for example3: 52.362171222079056
Precision: ComplexF32
Residual norm for example1: 69.65438
Residual norm for example2: 869.5622
Residual norm for example3: 134.29175
Precision: ComplexF64
Residual norm for example1: 4401.508407855525
Residual norm for example2: 2843.681478289806
Residual norm for example3: 117.71330055583913
@amontoison amontoison added the bug Something isn't working label Mar 21, 2024
@amontoison
Copy link
Member Author

amontoison commented Apr 19, 2024

The issue is fixed with the release 0.2.1:

Number of iterations of iterative refinement: 0
Precision: Float32
Residual norm for example1: 1.847538e-5
Residual norm for example2: 0.00021071904
Residual norm for example3: 8.379088e-7
Precision: Float64
Residual norm for example1: 3.340602287924343e-13
Residual norm for example2: 1.6954324066196217e-13
Residual norm for example3: 1.4466012900106264e-15
Precision: ComplexF32
Residual norm for example1: 5.618822e-5
Residual norm for example2: 0.001547979
Residual norm for example3: 1.3792666e-6
Precision: ComplexF64
Residual norm for example1: 8.186746897843183e-14
Residual norm for example2: 3.8232027439101534e-12
Residual norm for example3: 2.616020123074188e-15

Number of iterations of iterative refinement: 1
Precision: Float32
Residual norm for example1: 2.7082282e-5
Residual norm for example2: 3.2331143e-6
Residual norm for example3: 4.3392834e-7
Precision: Float64
Residual norm for example1: 1.3756521229420242e-15
Residual norm for example2: 1.9698506979498538e-14
Residual norm for example3: 7.778218420081461e-16
Precision: ComplexF32
Residual norm for example1: 4.42893e-6
Residual norm for example2: 1.5342164e-5
Residual norm for example3: 8.035846e-7
Precision: ComplexF64
Residual norm for example1: 6.0055395163591184e-15
Residual norm for example2: 3.628518735237144e-14
Residual norm for example3: 1.6393136883834072e-15

Number of iterations of iterative refinement: 2
Precision: Float32
Residual norm for example1: 2.8285526e-6
Residual norm for example2: 2.5343484e-6
Residual norm for example3: 4.0934142e-7
Precision: Float64
Residual norm for example1: 4.907009938517242e-15
Residual norm for example2: 1.1293433006802389e-14
Residual norm for example3: 7.538526237535183e-16
Precision: ComplexF32
Residual norm for example1: 1.4131082e-6
Residual norm for example2: 8.459211e-6
Residual norm for example3: 7.952513e-7
Precision: ComplexF64
Residual norm for example1: 1.2464818569137624e-14
Residual norm for example2: 3.1703465299248516e-14
Residual norm for example3: 1.3548190347378863e-15

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant