Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support GPU Simulations #225

Merged
merged 37 commits into from
May 15, 2024
Merged

Support GPU Simulations #225

merged 37 commits into from
May 15, 2024

Conversation

GeorgeR227
Copy link
Collaborator

Adding CUDA support into Decapodes. This adds a code target keyword into gensim that allows a user to output code that can target either the CPU or a Nvidia GPU with CUDA. This is done in a CUDA extension file.

@GeorgeR227 GeorgeR227 marked this pull request as ready for review May 3, 2024 13:18
@GeorgeR227
Copy link
Collaborator Author

Assuming that AlgebraicJulia/CombinatorialSpaces.jl#75 isn't altered in any drastic way, and it shouldn't be at this point, this PR is ready for review.

examples/diff_adv/wave.jl Outdated Show resolved Hide resolved
docs/src/cahn_hilliard.md Outdated Show resolved Hide resolved
docs/src/cahn_hilliard.md Show resolved Hide resolved
docs/src/cahn_hilliard.md Outdated Show resolved Hide resolved
docs/src/cahn_hilliard.md Outdated Show resolved Hide resolved
@@ -93,6 +93,40 @@ soln(0.9)
# ob.color = findnode(soln(t), :C)[point_map]
# end

function closest_point(p1, p2, dims)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this is used in flat_op

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing these should become an issue and a separate PR. I would consider removing this not in the scope of this PR.

Point3{Float64}(p_res...)
end

function flat_op(s::AbstractDeltaDualComplex2D, X::AbstractVector; dims=[Inf, Inf, Inf])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this is used in line:

v = flat_op(periodic_mesh, DualVectorField(velocity.(periodic_mesh[triangle_center(periodic_mesh),:dual_point])); dims=[30, 10, Inf])

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing these should become an issue and a separate PR. I would consider removing this not in the scope of this PR.

test/cuda_sims.jl Outdated Show resolved Hide resolved
end

@testset "Brusselator Float64" begin
Brusselator = @decapode begin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Canon

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turn this into an issue and another PR since there are tests in other files that should also use canon.

test/cuda_sims.jl Show resolved Hide resolved
@GeorgeR227
Copy link
Collaborator Author

@lukem12345 I've fixed the most relevant issues you brought up. I don't want merging this branch to get delayed behind examples, when there are other examples anyway that are old/broken, or documentation, that doesn't work with CUDA anyway, so I've removed any with issues and made a new branch with them we can fix and merge later.

@lukem12345 lukem12345 changed the title Cm/compiler refactor Support GPU Simulations May 6, 2024
GeorgeR227 and others added 22 commits May 14, 2024 09:44
This still needs a CUDA specific file that reimplements the default_dec_matrix_generate function with CUDA specific function implementations. We also need a better way to keep track of operator attributes. So like if it is a matrix, if it is diagonal/sparse, or if it can be done in-place or not.
This should be conditionally loaded when CUDA.jl is detected.
Also had a thought that maybe we should replace the explicit multiplication in making the contracted matrices with a function call that can have more complicated behavior.
Also introduced Krylov gmres solver for inverse hodge
Tested that it run on both Brusselator and Klausmeier. These together tests the use of:
1. Regular DEC matrices
2. Contracted DEC matrices
3. User provided operators
4. DEC wedge kernels
5. Parameters/Constants

A simulation using the inverse geometric hodge 1 should also be tested now.

One small bug that should be fixed is making the operators_CUDA into an extension. As of now, the similations will only work when using eval(gensim(...)), it will not with evalsim(...). The error is that "CuVector is not defined" and so I believe the reason for this is that the first places the code into a context with CUDA.jl while the second has it in the simulations.jl context, where CUDA.jl doesn't exist. Using an extension and placing the simulation hooks there should fix this.
Also some changes to how the CUDA hodge 1 is gotten, due to the Geometric variant not being diagonal.
Might use this for showcase
Also changed the code_target tags to lowercase.
Also updated CHT cuda sim.
This is to prevent some weird scalar indexing errors caused by diagonal and sparse matrix multiplications when in certain groups. For example in Diag * Sparse * Diag.
This include Heat, Brusselator and Klausmeier as well as Float32 variants for Heat and Klausmeier
This should be conditionally loaded when CUDA.jl is detected.
Also introduced Krylov gmres solver for inverse hodge
Tested that it run on both Brusselator and Klausmeier. These together tests the use of:
1. Regular DEC matrices
2. Contracted DEC matrices
3. User provided operators
4. DEC wedge kernels
5. Parameters/Constants

A simulation using the inverse geometric hodge 1 should also be tested now.

One small bug that should be fixed is making the operators_CUDA into an extension. As of now, the similations will only work when using eval(gensim(...)), it will not with evalsim(...). The error is that "CuVector is not defined" and so I believe the reason for this is that the first places the code into a context with CUDA.jl while the second has it in the simulations.jl context, where CUDA.jl doesn't exist. Using an extension and placing the simulation hooks there should fix this.
@lukem12345
Copy link
Member

Something went wonky with that last rebase. It looks like a whole lot of commits happened twice @GeorgeR227

@GeorgeR227
Copy link
Collaborator Author

Yeah I'm looking at the history now and it seems like there have been some duplicated commits from even further back, I think from the last rebase. Take note of 3fd418e and 659ff27. I'm thinking the history is messed up because it was already tangled with main's history. Notice that these new commits begin after the latest commit in main, namely 05376d9.

Required for CUDA code
@GeorgeR227 GeorgeR227 force-pushed the cm/compiler-refactor branch from 6ae30c6 to 9ea2d84 Compare May 14, 2024 20:11
@GeorgeR227 GeorgeR227 requested a review from lukem12345 May 14, 2024 20:49
Copy link
Member

@lukem12345 lukem12345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

State of the PR is a lot better now. Just change up those CPU and CUDA types then we are set. Remaining topics will be addressed in other PRs

@GeorgeR227
Copy link
Collaborator Author

@lukem12345 All tests are passing, including CUDA tests on HiPerGator. While the docs are failing I think this has been an ongoing issue and so should be addressed in the next PR.

@lukem12345 lukem12345 merged commit 724e180 into main May 15, 2024
6 of 8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants